cf.Field.compress
-----------------

Field.compress(method, axes=None, count_properties=None, index_properties=None, list_properties=None, inplace=False)

Compress the field construct.
Compression saves space by identifying and removing unwanted missing data. Such compression techniques store the data more efficiently and result in no precision loss.
The field construct data is compressed, along with any applicable metadata constructs.
Whether or not the field construct is compressed does not alter its functionality or external appearance.
When writing a compressed field construct to a dataset, space will be saved by the creation of compressed netCDF variables, along with the supplementary netCDF variables and attributes that are required for the encoding.
The following types of compression are available (see the method parameter):
- Ragged arrays for discrete sampling geometries (DSG). Three different types of ragged array representation are supported.
- Compression by gathering.
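As a minimal sketch of the write workflow described above (the output filename 'dsg_contiguous.nc' is hypothetical; cf.write is the standard cf function for writing field constructs to netCDF), compressing a field construct before writing it is all that is needed to produce the compressed netCDF encoding on disk:

>>> import cf
>>> f = cf.example_field(3)
>>> f.compress('contiguous', inplace=True)
>>> cf.write(f, 'dsg_contiguous.nc')  # hypothetical output path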
New in version 3.0.5.
Parameters:

- method: str
  The compression method. One of:
'contiguous'
Contiguous ragged array representation for DSG “point”, “timeSeries”, “trajectory” or “profile” features.
The field construct data must have exactly 2 dimensions for which the first (leftmost) dimension indexes each feature and the second (rightmost) dimension contains the elements for the features. Trailing missing data values in the second dimension are removed to create the compressed data.
'indexed'
Indexed ragged array representation for DSG “point”, “timeSeries”, “trajectory”, or “profile” features.
The field construct data must have exactly 2 dimensions for which the first (leftmost) dimension indexes each feature and the second (rightmost) dimension contains the elements for the features. Trailing missing data values in the second dimension are removed to create the compressed data.
'indexed_contiguous'
Indexed contiguous ragged array representation for DSG “timeSeriesProfile”, or “trajectoryProfile” features.
The field construct data must have exactly 3 dimensions for which the first (leftmost) dimension indexes each feature; the second (middle) dimension indexes each timeseries or trajectory; and the third (rightmost) dimension contains the elements for the timeseries or trajectories. Trailing missing data values in the third dimension are removed to create the compressed data.
'gathered'
Compression by gathering over any subset of the field construct data dimensions.
Not yet available.
- count_properties: dict, optional
  Provide properties to the count variable for contiguous ragged array representation or indexed contiguous ragged array representation.
  Parameter example: count_properties={'long_name': 'number of obs for this timeseries'}
- index_properties: dict, optional
  Provide properties to the index variable for indexed ragged array representation or indexed contiguous ragged array representation.
  Parameter example: index_properties={'long_name': 'which station this profile is for'}
- list_properties: dict, optional
  Provide properties to the list variable for compression by gathering.
  Parameter example: list_properties={'long_name': 'uncompression indices'}
- inplace: bool, optional
  If True then do the operation in-place and return None.
Returns:

- Field or None
  The compressed field construct, or None if the operation was in-place.

Examples:
>>> f = cf.example_field(3)
>>> print(f)
Field: precipitation_flux (ncvar%p)
-----------------------------------
Data            : precipitation_flux(cf_role=timeseries_id(4), ncdim%timeseries(9)) kg m-2 day-1
Auxiliary coords: time(cf_role=timeseries_id(4), ncdim%timeseries(9)) = [[1969-12-29 00:00:00, ..., 1970-01-07 00:00:00]]
                : latitude(cf_role=timeseries_id(4)) = [-9.0, ..., 78.0] degrees_north
                : longitude(cf_role=timeseries_id(4)) = [-23.0, ..., 178.0] degrees_east
                : height(cf_role=timeseries_id(4)) = [0.5, ..., 345.0] m
                : cf_role=timeseries_id(cf_role=timeseries_id(4)) = [b'station1', ..., b'station4']
                : long_name=some kind of station info(cf_role=timeseries_id(4)) = [-10, ..., -7]
>>> f.data.get_compression_type()
''
>>> f.compress('contiguous', inplace=True)
>>> f.data.get_compression_type()
'ragged contiguous'
>>> f.data.get_count()
<CF Count: (4) >
>>> print(f.data.get_count().array)
[3 7 5 9]
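As a hedged follow-up to the example above, the compressed field construct still presents the same uncompressed view to the user, as described earlier. The shapes shown are derived from the printed field (4 timeseries of up to 9 elements) and the counts [3 7 5 9], and the Data.compressed_array attribute is assumed here as the cfdm-level accessor for the underlying ragged array:

>>> f.array.shape  # uncompressed view is unchanged by compression
(4, 9)
>>> f.data.compressed_array.shape  # underlying ragged array: 3 + 7 + 5 + 9 elements
(24,)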
>>> f = cf.example_field(4)
>>> print(f)
Field: precipitation_flux (ncvar%p)
-----------------------------------
Data            : precipitation_flux(cf_role=timeseries_id(3), ncdim%timeseries(26), ncdim%profile_1(4)) kg m-2 day-1
Auxiliary coords: time(cf_role=timeseries_id(3), ncdim%timeseries(26)) = [[1970-01-04 00:00:00, ..., --]]
                : latitude(cf_role=timeseries_id(3)) = [-9.0, 2.0, 34.0] degrees_north
                : longitude(cf_role=timeseries_id(3)) = [-23.0, 0.0, 67.0] degrees_east
                : height(cf_role=timeseries_id(3)) = [0.5, 12.6, 23.7] m
                : altitude(cf_role=timeseries_id(3), ncdim%timeseries(26), ncdim%profile_1(4)) = [[[2.07, ..., --]]] km
                : cf_role=timeseries_id(cf_role=timeseries_id(3)) = [b'station1', b'station2', b'station3']
                : long_name=some kind of station info(cf_role=timeseries_id(3)) = [-10, -9, -8]
                : cf_role=profile_id(cf_role=timeseries_id(3), ncdim%timeseries(26)) = [[102, ..., --]]
>>> f.data.get_compression_type()
''
>>> g = f.compress('indexed_contiguous',
...                count_properties={'long_name': 'number of obs for each profile'})
>>> g.data.get_compression_type()
'ragged indexed contiguous'
>>> g.data.get_count()
<CF Count: long_name=number of obs for each profile(58) >
>>> g.data.get_index()
<CF Index: (58) >
>>> print(g.data.get_index().array)
[0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2]
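As a hedged sketch of the round trip described earlier (the filename 'timeseries_profile.nc' is hypothetical), writing the compressed field construct with cf.write and reading it back with cf.read is expected to preserve the ragged encoding, since compressed netCDF variables are kept in their compressed form by default:

>>> cf.write(g, 'timeseries_profile.nc')  # hypothetical output path
>>> h = cf.read('timeseries_profile.nc')[0]
>>> h.data.get_compression_type()
'ragged indexed contiguous'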