cfdm.dataset_flatten

cfdm.dataset_flatten(input_ds, output_ds, strict=True, copy_data=True, group_dimension_search='closest_ancestor')

Create a flattened version of a grouped CF dataset.

The following dataset formats can be flattened: netCDF and Zarr.

CF coordinate variables

When a CF coordinate variable (i.e. a one-dimensional variable with the same name as its dimension) in the input dataset is in a different group to its corresponding dimension, the same variable in the output flattened dataset will no longer be a CF coordinate variable, as its name will be prefixed with a different group identifier than its dimension.

In such cases it is up to the user to apply the proximal and lateral search algorithms to the flattened dataset produced by dataset_flatten (i.e. the updated output_ds), in conjunction with the mappings defined in the newly created global attributes _flattener_variable_map and _flattener_dimension_map, to find which variables are acting as CF coordinate variables in the flattened dataset. See CF conventions section 2.7 Groups for details.

For example, if an input dataset has dimension lat in the root group and coordinate variable lat(lat) in group /group1, then the flattened dataset will contain dimension lat and variable group1__lat(lat), both in its root group. In this case, the _flattener_variable_map global attribute of the flattened dataset will contain the mapping 'group1__lat: /group1/lat', and the _flattener_dimension_map global attribute will contain the mapping 'lat: /lat'.
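As a minimal sketch of how the two mapping attributes might be used together, the following plain-Python code parses mapping entries of the form shown above and pairs each flattened variable with any flattened dimension whose original name it shares. It assumes each attribute is available as a list of 'flattened_name: /original/path' strings, and the helper names parse_flattener_map and basename are hypothetical. Matching on names is only a first filter, not the full proximal and lateral search: a candidate must still be checked to be one-dimensional and to span the matched dimension.

```python
def parse_flattener_map(entries):
    """Parse 'flattened_name: /original/path' strings into a dict.

    Assumption: the attribute value is a list of strings in this form.
    """
    mapping = {}
    for entry in entries:
        flat_name, _, original_path = entry.partition(": ")
        mapping[flat_name.strip()] = original_path.strip()
    return mapping


def basename(path):
    """Return the final component of a group path, e.g. '/group1/lat' -> 'lat'."""
    return path.rsplit("/", 1)[-1]


# The mappings from the example above
variable_map = parse_flattener_map(["group1__lat: /group1/lat"])
dimension_map = parse_flattener_map(["lat: /lat"])

# Flattened variables whose original name equals the original name of a
# dimension are candidates for acting as CF coordinate variables
candidates = {
    flat_var: flat_dim
    for flat_var, var_path in variable_map.items()
    for flat_dim, dim_path in dimension_map.items()
    if basename(var_path) == basename(dim_path)
}
print(candidates)
```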

Added in version (cfdm): 1.11.2.0

Parameters:
input_ds:

The dataset to be flattened. Must be an open dataset object with the same API as netCDF4.Dataset, h5netcdf.File, or zarr.Group.

output_ds: netCDF4.Dataset

A container for the flattened dataset, which is updated in-place with the flattened version of the input dataset.

strict: bool, optional

If True, the default, then failing to resolve a reference raises an exception. If False, a warning is issued instead and flattening continues.

copy_data: bool, optional

By default, copy_data is True and all data arrays from input_ds are copied to output_ds. If False then no data arrays are copied. Instead, each variable's data is represented by the fill value, without the arrays actually being created in memory or on disk.

group_dimension_search: str, optional

How to interpret a dimension name that contains no group-separator characters, such as dim (as opposed to group/dim, /group/dim, ../dim, etc.). The group_dimension_search parameter must be one of:

  • 'closest_ancestor'

    This is the default and is the behaviour defined by the CF conventions (section 2.7 Groups).

    Assume that the sub-group dimension is the same as the dimension with the same name and size in an ancestor group, if one exists. If multiple such dimensions exist, then the correspondence is with the dimension in the ancestor group that is closest to the sub-group (i.e. that is furthest away from the root group).

  • 'furthest_ancestor'

    This behaviour is different to that defined by the CF conventions (section 2.7 Groups).

    Assume that the sub-group dimension is the same as the one with the same name and size in an ancestor group, if one exists. If multiple such dimensions exist, then the correspondence is with the dimension in the ancestor group that is furthest away from the sub-group (i.e. that is closest to the root group).

  • 'local'

    This behaviour is different to that defined by the CF conventions (section 2.7 Groups).

    Assume that the sub-group dimension is different to any with the same name and size in all ancestor groups.
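The three options can be illustrated with a small plain-Python sketch. This is one reading of the descriptions above and not cfdm's implementation: the function resolve_dimension and the dims_by_group layout (a dict mapping absolute group paths to their dimension names and sizes) are hypothetical.

```python
def resolve_dimension(name, size, group, dims_by_group,
                      search="closest_ancestor"):
    """Return the path of the group whose dimension `name` is meant.

    group: absolute path of the sub-group, e.g. "/a/b".
    dims_by_group: hypothetical dict of {group path: {dimension name: size}}.
    """
    if search not in ("closest_ancestor", "furthest_ancestor", "local"):
        raise ValueError(f"Invalid group_dimension_search: {search!r}")

    if search == "local":
        # Never identify the dimension with one in an ancestor group
        return group

    # Ancestor group paths, ordered from closest to furthest (the root "/")
    parts = [p for p in group.split("/") if p]
    ancestors = [
        "/" + "/".join(parts[:i]) for i in range(len(parts) - 1, -1, -1)
    ]

    # Ancestors that define a dimension with the same name and size
    matches = [
        g for g in ancestors if dims_by_group.get(g, {}).get(name) == size
    ]
    if not matches:
        # No matching ancestor dimension: it is local to the sub-group
        return group

    return matches[0] if search == "closest_ancestor" else matches[-1]


dims = {"/": {"dim": 3}, "/a": {"dim": 3}, "/a/b": {"dim": 3}}
print(resolve_dimension("dim", 3, "/a/b", dims))
print(resolve_dimension("dim", 3, "/a/b", dims, "furthest_ancestor"))
print(resolve_dimension("dim", 3, "/a/b", dims, "local"))
```

With a dimension named dim of size 3 in every group, the three options resolve the dimension of /a/b to /a, / and /a/b respectively.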

Note

For a netCDF dataset, the group in which each dimension is defined is always unambiguous, so group_dimension_search may only take its default value of 'closest_ancestor', which applies the behaviour defined by the CF conventions (section 2.7 Groups).

For a Zarr dataset, there is no means of indicating whether identical dimension names that appear in different groups refer to the same dimension. Setting this parameter may therefore be necessary to interpret the dataset correctly when its dimensions are named in a manner that is inconsistent with the CF conventions (section 2.7 Groups).

Added in version (cfdm): 1.13.0.0

Returns:

None
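Because the function returns None, the flattened result is obtained from output_ds after the call. The following is a minimal usage sketch for the netCDF case, assuming that cfdm and netCDF4 are installed and that a file-backed netCDF4.Dataset opened for writing is an acceptable output container. The wrapper flatten_file and its path arguments are hypothetical, and the imports are deferred into the function only so that the sketch can be loaded without those libraries.

```python
def flatten_file(input_path, output_path, copy_data=True):
    """Flatten the grouped netCDF file at input_path into output_path.

    Hypothetical convenience wrapper around cfdm.dataset_flatten.
    """
    import cfdm
    import netCDF4

    with netCDF4.Dataset(input_path, "r") as input_ds, \
            netCDF4.Dataset(output_path, "w") as output_ds:
        # output_ds is updated in-place; nothing is returned
        cfdm.dataset_flatten(
            input_ds,
            output_ds,
            strict=True,
            copy_data=copy_data,
        )
```

Calling flatten_file("grouped.nc", "flat.nc") with hypothetical file names would then write the flattened dataset to flat.nc.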