cf.Field.collapse

Field.collapse(method, axes=None, squeeze=False, mtol=1, weights=None, ddof=1, a=None, inplace=False, group=None, regroup=False, within_days=None, within_years=None, over_days=None, over_years=None, coordinate=None, group_by=None, group_span=None, group_contiguous=1, measure=False, scale=1, radius='earth', great_circle=False, verbose=None, remove_vertical_crs=True, _create_zero_size_cell_bounds=False, _update_cell_methods=True, i=False, _debug=False, **kwargs)
Collapse axes of the field.
Collapsing one or more dimensions reduces their size and replaces the data along those axes with representative statistical values. The result is a new field construct with consistent metadata for the collapsed values.
By default all axes with size greater than 1 are collapsed completely (i.e. to size 1) with a given collapse method.
 Example:
Find the minimum of the entire data:
>>> b = a.collapse('minimum')
The collapse can also be applied to any subset of the field construct’s dimensions. In this case, the domain axis and coordinate constructs for the non-collapsed dimensions remain the same. This is implemented either with the axes keyword, or with a CF-netCDF cell methods-like syntax for describing both the collapse dimensions and the collapse method in a single string. The latter syntax uses construct identities instead of netCDF dimension names to identify the collapse axes.
Statistics may be created to represent variation over one dimension or a combination of dimensions.
 Example:
Two equivalent techniques for creating a field construct of temporal maxima at each horizontal location:
>>> b = a.collapse('maximum', axes='T')
>>> b = a.collapse('T: maximum')
 Example:
Find the horizontal maximum, with two equivalent techniques.
>>> b = a.collapse('maximum', axes=['X', 'Y'])
>>> b = a.collapse('X: Y: maximum')
Variation over horizontal area may also be specified by the special identity ‘area’. This may be used for any horizontal coordinate reference system.
 Example:
Find the horizontal maximum using the special identity ‘area’:
>>> b = a.collapse('area: maximum')
Collapse methods
The following collapse methods are available (see https://ncas-cms.github.io/cf-python/analysis.html#collapse-methods for precise definitions):
Method                      Description
'maximum'                   The maximum of the values.
'minimum'                   The minimum of the values.
'maximum_absolute_value'    The maximum of the absolute values.
'minimum_absolute_value'    The minimum of the absolute values.
'mid_range'                 The average of the maximum and the minimum of the values.
'median'                    The median of the values.
'range'                     The absolute difference between the maximum and the minimum of the values.
'sum'                       The sum of the values.
'sum_of_squares'            The sum of the squares of values.
'sample_size'               The sample size, i.e. the number of non-missing values.
'sum_of_weights'            The sum of weights, as would be used for other calculations.
'sum_of_weights2'           The sum of squares of weights, as would be used for other calculations.
'mean'                      The weighted or unweighted mean of the values.
'mean_absolute_value'       The mean of the absolute values.
'mean_of_upper_decile'      The mean of the upper group of data values defined by the upper tenth of their distribution.
'variance'                  The weighted or unweighted variance of the values, with a given number of degrees of freedom.
'standard_deviation'        The weighted or unweighted standard deviation of the values, with a given number of degrees of freedom.
'root_mean_square'          The square root of the weighted or unweighted mean of the squares of the values.
'integral'                  The integral of values.
Data type and missing data
In all collapses, missing data array elements are accounted for in the calculation.
Any collapse method that involves a calculation (such as calculating a mean), as opposed to just selecting a value (such as finding a maximum), will return a field containing double precision floating point numbers. If this is not desired then the data type can be reset after the collapse with the dtype attribute of the field construct.
Collapse weights
The calculations of means, standard deviations and variances are, by default, not weighted. For weights to be incorporated in the collapse, the axes to be weighted must be identified with the weights keyword.
Weights are either derived from the field construct’s metadata (such as cell sizes), or may be provided explicitly in the form of other field constructs containing data of weights values. In either case, the weights actually used are those derived by the weights method of the field construct with the same weights keyword value. Collapsed axes that are not identified by the weights keyword are unweighted during the collapse operation.
 Example:
Create a weighted time average:
>>> b = a.collapse('T: mean', weights=True)
 Example:
Calculate the mean over the time and latitude axes, with weights only applied to the latitude axis:
>>> b = a.collapse('T: Y: mean', weights='Y')
 Example:
Alternative syntax for specifying area weights:
>>> b = a.collapse('area: mean', weights=True)
An alternative technique for specifying weights is to set the weights keyword to the output of a call to the weights method.
 Example:
Alternative syntax for specifying weights:
>>> b = a.collapse('area: mean', weights=a.weights('area'))
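The effect of weights on a mean can be illustrated without cf itself. The following is a minimal pure-Python sketch, not the library's implementation: the function name weighted_mean is hypothetical, and missing data are assumed to be represented by None.

```python
def weighted_mean(values, weights=None):
    """Mean of the non-missing values; None marks a missing value.

    With weights=None every value counts equally (unweighted mean).
    """
    if weights is None:
        weights = [1.0] * len(values)
    # Drop missing values together with their weights
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    if not pairs:
        return None  # nothing to average
    total_weight = sum(w for _, w in pairs)
    return sum(v * w for v, w in pairs) / total_weight


unweighted = weighted_mean([1.0, 2.0, 6.0])
weighted = weighted_mean([1.0, 2.0, 6.0], weights=[1, 1, 2])
```

Doubling the weight of the last element pulls the mean towards it, which is the behaviour expected when, for instance, a grid cell covers twice the area of its neighbours.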
Multiple collapses
Multiple collapses normally require multiple calls to collapse: one on the original field construct and then one on each interim field construct.
 Example:
Calculate the temporal maximum of the weighted areal means using two independent calls:
>>> b = a.collapse('area: mean', weights=True).collapse('T: maximum')
If preferred, multiple collapses may be carried out in a single call by using the CF-netCDF cell methods-like syntax (note that the colon (:) is only used after the construct identity that specifies each axis, and a space delimits the separate collapses).
 Example:
Calculate the temporal maximum of the weighted areal means in a single call, using the CF-netCDF cell methods-like syntax:
>>> b = a.collapse('area: mean T: maximum', weights=True)
Grouped collapses
A grouped collapse is one for which an axis is not collapsed completely to size 1. Instead the collapse axis is partitioned into non-overlapping groups and each group is collapsed to size 1. The resulting axis will generally have more than one element. For example, creating 10 annual means from a timeseries of 120 months would be a grouped collapse.
Selected statistics for overlapping groups can be calculated with the moving_window method.
The group keyword defines the size of the groups. Groups can be defined in a variety of ways, including with Query, TimeDuration and Data instances.
An element of the collapse axis can not be a member of more than one group, and may be a member of no groups. Elements that are not selected by the group keyword are excluded from the result.
 Example:
Create annual maxima from a time series, defining a year to start on 1st December.
>>> b = a.collapse('T: maximum', group=cf.Y(month=12))
 Example:
Find the maximum of each group of 6 elements along an axis.
>>> b = a.collapse('T: maximum', group=6)
 Example:
Create December, January, February maxima from a time series.
>>> b = a.collapse('T: maximum', group=cf.djf())
 Example:
Create maxima for each 3-month season of a timeseries (DJF, MAM, JJA, SON).
>>> b = a.collapse('T: maximum', group=cf.seasons())
 Example:
Calculate zonal means for the western and eastern hemispheres.
>>> b = a.collapse('X: mean', group=cf.Data(180, 'degrees'))
Groups can be further described with the group_span parameter (to include groups whose actual span is not equal to a given value) and the group_contiguous parameter (to include non-contiguous groups, or any contiguous group containing overlapping cells).
Climatological statistics
Climatological statistics may be derived from corresponding portions of the annual cycle in a set of years (e.g. the average January temperatures in the climatology of 1961-1990, where the values are derived by averaging the 30 Januarys from the separate years); or from corresponding portions of the diurnal cycle in a set of days (e.g. the average temperatures for each hour in the day for May 1997). A diurnal climatology may also be combined with a multi-annual climatology (e.g. the minimum temperature for each hour of the average day in May from a 1961-1990 climatology).
Calculation requires two or three collapses, depending on the quantity being created, all of which are grouped collapses. Each collapse method needs to indicate its climatological nature with one of the following qualifiers,
Method qualifier    Associated keyword
within years        within_years
within days         within_days
over years          over_years (optional)
over days           over_days (optional)
and the associated keyword specifies how the method is to be applied.
 Example:
Calculate the multi-annual average of the seasonal means:
>>> b = a.collapse('T: mean within years T: mean over years',
...                within_years=cf.seasons(), weights=True)
 Example:
Calculate the multi-annual variance of the seasonal minima. Note that the units of the result have been changed from ‘K’ to ‘K2’:
>>> b = a.collapse('T: minimum within years T: variance over years',
...                within_years=cf.seasons(), weights=True)
When collapsing over years, it is assumed by default that each portion of the annual cycle is collapsed over all years that are present. This is the case in the above two examples. It is possible, however, to restrict the years to be included, or group them into chunks, with the over_years keyword.
 Example:
Calculate the multi-annual average of the seasonal means in 5-year chunks:
>>> b = a.collapse(
...     'T: mean within years T: mean over years', weights=True,
...     within_years=cf.seasons(), over_years=cf.Y(5)
... )
 Example:
Calculate the multi-annual average of the seasonal means, restricting the years from 1963 to 1968:
>>> b = a.collapse(
...     'T: mean within years T: mean over years', weights=True,
...     within_years=cf.seasons(),
...     over_years=cf.year(cf.wi(1963, 1968))
... )
Similarly for collapses over days, it is assumed by default that each portion of the diurnal cycle is collapsed over all days that are present, but it is possible to restrict the days to be included, or group them into chunks, with the over_days keyword.
The calculation can be done with multiple collapse calls, which can be useful if the interim stages are needed independently, but be aware that the interim field constructs will have non-CF-compliant cell method constructs.
 Example:
Calculate the multi-annual maximum of the seasonal standard deviations with two separate collapse calls:
>>> b = a.collapse('T: standard_deviation within years',
...                within_years=cf.seasons(), weights=True)
>>> c = b.collapse('T: maximum over years')
New in version 1.0.
See also
bin, cell_area, convolution_filter, moving_window, radius, weights
 Parameters
 method: str
Define the collapse method. All of the axes specified by the axes parameter are collapsed simultaneously by this method. The method is given by one of the following strings (see https://ncas-cms.github.io/cf-python/analysis.html#collapse-methods for precise definitions):
method                      Weighted    Description
'maximum'                   Never       The maximum of the values.
'minimum'                   Never       The minimum of the values.
'maximum_absolute_value'    Never       The maximum of the absolute values.
'minimum_absolute_value'    Never       The minimum of the absolute values.
'mid_range'                 Never       The average of the maximum and the minimum of the values.
'median'                    Never       The median of the values.
'range'                     Never       The absolute difference between the maximum and the minimum of the values.
'sum'                       Never       The sum of the values.
'sum_of_squares'            Never       The sum of the squares of values.
'sample_size'               Never       The sample size, i.e. the number of non-missing values.
'sum_of_weights'            Never       The sum of weights, as would be used for other calculations.
'sum_of_weights2'           Never       The sum of squares of weights, as would be used for other calculations.
'mean'                      May be      The weighted or unweighted mean of the values.
'mean_absolute_value'       May be      The mean of the absolute values.
'mean_of_upper_decile'      May be      The mean of the upper group of data values defined by the upper tenth of their distribution.
'variance'                  May be      The weighted or unweighted variance of the values, with a given number of degrees of freedom.
'standard_deviation'        May be      The weighted or unweighted standard deviation of the values, with a given number of degrees of freedom.
'root_mean_square'          May be      The square root of the weighted or unweighted mean of the squares of the values.
'integral'                  Always      The integral of values.
Collapse methods that are “Never” weighted ignore the weights parameter, even if it is set.
Collapse methods that “May be” weighted will only be weighted if the weights parameter is set.
Collapse methods that are “Always” weighted require the weights parameter to be set.
An alternative form of providing the collapse method is to provide a CF cell methods-like string. In this case an ordered sequence of collapses may be defined and both the collapse methods and their axes are provided. The axes are interpreted as for the axes parameter, which must not also be set. For example:
>>> g = f.collapse(
...     'time: max (interval 1 hr) X: Y: mean dim3: sd')
is equivalent to:
>>> g = f.collapse('max', axes='time')
>>> g = g.collapse('mean', axes=['X', 'Y'])
>>> g = g.collapse('sd', axes='dim3')
Climatological collapses are carried out if a method string contains any of the modifiers 'within days', 'within years', 'over days' or 'over years'. For example, to collapse a time axis into multi-annual means of calendar monthly minima:
>>> g = f.collapse(
...     'time: minimum within years T: mean over years',
...     within_years=cf.M()
... )
which is equivalent to:
>>> g = f.collapse(
...     'time: minimum within years', within_years=cf.M())
>>> g = g.collapse('mean over years', axes='T')
 axes: (sequence of) str, optional
The axes to be collapsed, defined by those which would be selected by passing each given axis description to a call of the field construct’s domain_axis method. For example, for a value of 'X', the domain axis construct returned by f.domain_axis('X') is selected. If a selected axis has size 1 then it is ignored. By default all axes with size greater than 1 are collapsed.
 Parameter example:
axes='X'
 Parameter example:
axes=['X']
 Parameter example:
axes=['X', 'Y']
 Parameter example:
axes=['Z', 'time']
If the axes parameter has the special value 'area' then it is assumed that the X and Y axes are intended.
 Parameter example:
axes='area' is equivalent to axes=['X', 'Y'].
 Parameter example:
axes=['area', 'Z'] is equivalent to axes=['X', 'Y', 'Z'].
 weights: optional
Specify the weights for the collapse axes. The weights are, in general, those that would be returned by this call of the field construct’s weights method: f.weights(weights, axes=axes, measure=measure, scale=scale, radius=radius, great_circle=great_circle, components=True). See the axes, measure, scale, radius and great_circle parameters and cf.Field.weights for details, and note that the value of scale may be modified depending on the value of measure.
Note
By default weights is None, resulting in unweighted calculations.
If the alternative form of providing the collapse method and axes combined as a CF cell methods-like string via the method parameter has been used, then the axes parameter is ignored and the axes are derived from the method parameter. For example, if method is 'T: area: minimum' then this defines axes of ['T', 'area']. If method specifies multiple collapses, e.g. 'T: minimum area: mean' then this implies axes of 'T' for the first collapse, and axes of 'area' for the second collapse.
Note
Setting weights to True is generally a good way to ensure that all collapses are appropriately weighted according to the field construct’s metadata. In this case, if it is not possible to create weights for any axis then an exception will be raised.
However, care needs to be taken if weights is True when cell volume weights are desired. The volume weights will be taken from a “volume” cell measure construct if one exists, otherwise the cell volumes will be calculated as being proportional to the sizes of one-dimensional vertical coordinate cells. In the latter case if the vertical dimension coordinates do not define the actual height or depth thickness of every cell in the domain then the weights will be incorrect.
 Parameter example:
To specify weights based on the field construct’s metadata for all collapse axes use weights=True.
 Parameter example:
To specify weights based on cell areas use weights='area'.
 Parameter example:
To specify weights based on cell areas and linearly in time you could set weights=('area', 'T').
 measure: bool, optional
Create weights which are cell measures, i.e. which describe actual cell sizes (e.g. cell area) with appropriate units (e.g. metres squared). By default the weights are normalised and have arbitrary units.
Cell measures can be created for any combination of axes. For example, cell measures for a time axis are the time span for each cell with canonical units of seconds; cell measures for the combination of four axes representing time and three dimensional space could have canonical units of metres cubed seconds.
When collapsing with the 'integral' method, measure must be True, and the units of the weights are incorporated into the units of the returned field construct.
Note
Specifying cell volume weights via weights=['X', 'Y', 'Z'] or weights=['area', 'Z'] (or other equivalents) will produce an incorrect result if the vertical dimension coordinates do not define the actual height or depth thickness of every cell in the domain. In this case, weights='volume' should be used instead, which requires the field construct to have a “volume” cell measure construct.
If weights=True then care also needs to be taken, as a “volume” cell measure construct will be used if present, otherwise the cell volumes will be calculated using the size of the vertical coordinate cells.
New in version 3.0.2.
 scale: number or None, optional
If set to a positive number then scale the weights so that they are less than or equal to that number. If set to None the weights are not scaled. In general the default is for weights to be scaled to lie between 0 and 1; however if measure is True then the weights are never scaled and the value of scale is taken as None, regardless of its setting.
 Parameter example:
To scale all weights so that they lie between 0 and 10: scale=10.
New in version 3.0.2.
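One way to picture the scale parameter is as a rescaling that maps the largest weight onto the given number. The pure-Python sketch below assumes exactly that rule (division by the maximum weight); the description above only guarantees that the scaled weights do not exceed scale, so the rule and the function name scale_weights are illustrative assumptions.

```python
def scale_weights(weights, scale=1.0):
    """Rescale positive weights so that the largest one equals `scale`.

    scale=None leaves the weights untouched, mirroring the documented
    behaviour for unscaled weights.
    """
    if scale is None:
        return list(weights)
    if scale <= 0:
        raise ValueError("scale must be a positive number or None")
    largest = max(weights)
    # Assumed rule: proportional rescaling relative to the largest weight
    return [w * scale / largest for w in weights]


scaled = scale_weights([2.0, 4.0, 8.0], scale=10)
```

Because the rescaling is proportional, it changes the magnitude of the weights but not a weighted mean computed from them.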
 radius: optional
Specify the radius used for calculating the areas of cells defined in spherical polar coordinates. The radius is that which would be returned by this call of the field construct’s radius method: f.radius(radius). See cf.Field.radius for details.
By default radius is 'earth' which means that if and only if the radius cannot be found from the datums of any coordinate reference constructs, then the default radius is taken as 6371229 metres.
New in version 3.0.2.
 great_circle: bool, optional
If True then allow, if required, the derivation of i) area weights from polygon geometry cells by assuming that each cell part is a spherical polygon composed of great circle segments; and ii) line-length weights from line geometry cells by assuming that each line part is composed of great circle segments.
New in version 3.2.0.
 squeeze: bool, optional
If True then size 1 collapsed axes are removed from the output data array. By default the axes which are collapsed are retained in the result’s data array.
 mtol: number, optional
Set the fraction of input data elements which is allowed to contain missing data when contributing to an individual output data element. Where this fraction exceeds mtol, missing data is returned. The default is 1, meaning that a missing datum in the output array occurs when its contributing input array elements are all missing data. A value of 0 means that a missing datum in the output array occurs whenever any of its contributing input array elements are missing data. Any intermediate value is permitted.
 Parameter example:
To ensure that an output array element is a missing datum if more than 25% of its input array elements are missing data: mtol=0.25.
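The mtol rule can be sketched in a few lines of pure Python. This is an illustration of the rule as described, not the library's code: the name collapse_with_mtol is hypothetical, None stands in for missing data, and the input is assumed to be non-empty.

```python
def collapse_with_mtol(values, mtol=1.0, method=max):
    """Collapse `values` with `method`, honouring the mtol threshold.

    Returns None (missing) when the fraction of missing inputs exceeds
    mtol, or when there are no non-missing inputs at all.
    """
    present = [v for v in values if v is not None]
    fraction_missing = (len(values) - len(present)) / len(values)
    if not present or fraction_missing > mtol:
        return None
    return method(present)


all_allowed = collapse_with_mtol([1, None, 3, 4], mtol=1)      # one gap tolerated
none_allowed = collapse_with_mtol([1, None, 3, 4], mtol=0)     # any gap masks output
quarter = collapse_with_mtol([1, None, None, 4], mtol=0.25)    # 50% missing > 25%
```

With mtol=1 only an entirely missing input yields a missing output, while mtol=0 masks the output as soon as a single input is missing.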
 ddof: number, optional
The delta degrees of freedom in the calculation of a standard deviation or variance. The number of degrees of freedom used in the calculation is (N-ddof) where N represents the number of non-missing elements. By default ddof is 1, meaning the standard deviation and variance of the population is estimated according to the usual formula with (N-1) in the denominator to avoid the bias caused by the use of the sample mean (Bessel’s correction).
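The role of ddof in the denominator can be seen in a small unweighted example. This is a pure-Python sketch under the assumption that missing data are represented by None; the function name variance is illustrative and this is not the library's implementation.

```python
def variance(values, ddof=1):
    """Unweighted variance of the non-missing values.

    The sum of squared deviations is divided by (N - ddof), where N is
    the number of non-missing values.
    """
    present = [v for v in values if v is not None]
    n = len(present)
    if n - ddof <= 0:
        return None  # not enough data for the requested degrees of freedom
    mean = sum(present) / n
    return sum((v - mean) ** 2 for v in present) / (n - ddof)


data = [1.0, 2.0, 3.0, 4.0]
sample_variance = variance(data, ddof=1)      # divides by N - 1
population_variance = variance(data, ddof=0)  # divides by N
```

For the four values above the sum of squared deviations is 5.0, so ddof=0 gives 1.25 and ddof=1 gives 5/3.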
 coordinate: optional
Specify how the cell coordinate values for collapsed axes are placed. This has no effect on the cell bounds for the collapsed axes, which always represent the extrema of the input coordinates.
The coordinate parameter may be one of:
coordinate     Description
None           This is the default. If the collapse is a climatological time collapse over years or over days then assume a value of 'min', otherwise assume a value of 'mid_range'.
'mid_range'    An output coordinate is the mean of first and last input coordinate bounds (or the first and last coordinates if there are no bounds). This is the default.
'minimum'      An output coordinate is the minimum of the input coordinates.
'maximum'      An output coordinate is the maximum of the input coordinates.
 Parameter example:
coordinate='minimum'
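The placement options can be illustrated for a single collapsed axis. The sketch below is pure Python and makes two assumptions that are not stated above: every input cell has bounds, and "the mean of first and last input coordinate bounds" is read as the mean of the lower bound of the first cell and the upper bound of the last cell. The function name collapsed_cell is hypothetical.

```python
def collapsed_cell(coords, bounds, coordinate="mid_range"):
    """Return (coordinate value, (lower, upper)) for a fully collapsed axis.

    coords: coordinate value of each input cell
    bounds: (lower, upper) pair of each input cell, in axis order
    """
    # The output bounds are the extrema of the inputs, whatever the placement
    lower = min(b[0] for b in bounds)
    upper = max(b[1] for b in bounds)
    if coordinate == "mid_range":
        value = (bounds[0][0] + bounds[-1][1]) / 2
    elif coordinate == "minimum":
        value = min(coords)
    elif coordinate == "maximum":
        value = max(coords)
    else:
        raise ValueError(f"unknown coordinate option: {coordinate!r}")
    return value, (lower, upper)


coords = [15, 45, 75]
bounds = [(0, 30), (30, 60), (60, 90)]
mid = collapsed_cell(coords, bounds)               # value 45.0, bounds (0, 90)
low = collapsed_cell(coords, bounds, "minimum")    # value 15, same bounds
```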
 group: optional
A grouped collapse is one for which an axis is not collapsed completely to size 1. Instead, the collapse axis is partitioned into non-overlapping groups and each group is collapsed to size 1, independently of the other groups. The results of the collapses are concatenated so that the output axis has a size equal to the number of groups.
An element of the collapse axis can not be a member of more than one group, and may be a member of no groups. Elements that are not selected by the group parameter are excluded from the result.
The group parameter defines how the axis elements are partitioned into groups, and may be one of:
group    Description
Data
    Define groups by coordinate values that span the given range. The first group starts at the first coordinate bound of the first axis element (or its coordinate if there are no bounds) and spans the defined group size. Each subsequent group immediately follows the preceding one. By default each group contains the consecutive run of elements whose coordinate values lie within the group limits (see the group_by parameter).
    By default each element will be in exactly one group (see the group_by, group_span and group_contiguous parameters).
    By default groups may contain different numbers of elements.
    If no units are specified then the units of the coordinates are assumed.
TimeDuration
    Define groups by a time interval spanned by the coordinates. The first group starts at or before the first coordinate bound of the first axis element (or its coordinate if there are no bounds) and spans the defined group size. Each subsequent group immediately follows the preceding one. By default each group contains the consecutive run of elements whose coordinate values lie within the group limits (see the group_by parameter).
    By default each element will be in exactly one group (see the group_by, group_span and group_contiguous parameters).
    By default groups may contain different numbers of elements.
    The start of the first group may be before the first axis element, depending on the offset defined by the time duration. For example, if group=cf.Y(month=12) then the first group will start on the closest 1st December to the first axis element.
(sequence of) Query
    Define groups from elements whose coordinates satisfy the query condition. Multiple groups are created: one for each maximally consecutive run within the selected elements.
    If a sequence of Query is provided then groups are defined for each query.
    If a coordinate does not satisfy any of the query conditions then its element will not be in a group.
    By default groups may contain different numbers of elements.
    If no units are specified then the units of the coordinates are assumed.
    If an element is selected by two or more queries then the latest one in the sequence defines which group it will be in.
int
    Define groups that contain the given number of elements. The first group starts with the first axis element and spans the defined number of consecutive elements. Each subsequent group immediately follows the preceding one.
    By default each group has the defined number of elements, apart from the last group which may contain fewer elements (see the group_span parameter).
numpy.ndarray
    Define groups by selecting elements that map to the same value in the numpy array. The array must contain integers and have the same length as the axis to be collapsed, and its sequence of values corresponds to the axis elements. Each group contains the elements which correspond to a common non-negative integer value in the numpy array. Upon output, the collapsed axis is arranged in order of increasing group number. See the regroup parameter, which allows the creation of such a numpy.array for a given grouped collapse.
    The groups do not have to be in runs of consecutive elements; they may be scattered throughout the axis.
    An element which corresponds to a negative integer in the array will not be in any group.
 Parameter example:
To define groups of 10 kilometres: group=cf.Data(10, 'km').
 Parameter example:
To define groups of 5 days, starting and ending at midnight on each day: group=cf.D(5) (see cf.D).
 Parameter example:
To define groups of 1 calendar month, starting and ending at day 16 of each month: group=cf.M(day=16) (see cf.M).
 Parameter example:
To define groups of the season MAM in each year: group=cf.mam() (see cf.mam).
 Parameter example:
To define groups of the seasons DJF and JJA in each year: group=[cf.jja(), cf.djf()]. To define groups for seasons DJF, MAM, JJA and SON in each year: group=cf.seasons() (see cf.djf, cf.jja and cf.seasons).
 Parameter example:
To define groups for longitude elements less than or equal to 90 degrees and greater than 90 degrees: group=[cf.le(90, 'degrees'), cf.gt(90, 'degrees')] (see cf.le and cf.gt).
 Parameter example:
To define groups of 5 elements: group=5.
 Parameter example:
For an axis of size 8, create two groups, the first containing the first and last elements and the second containing the 3rd, 4th and 5th elements, whilst ignoring the 2nd, 6th and 7th elements: group=numpy.array([0, -1, 4, 4, 4, -1, -2, 0]).
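The integer-array form of group can be pictured as a label per axis element. The following pure-Python sketch uses plain lists in place of numpy arrays and hypothetical function names; it follows the rules described for the array form (negative labels are ungrouped, output is ordered by increasing group number) but is not the library's implementation.

```python
def groups_from_labels(labels):
    """Map integer labels to lists of element indices.

    Elements with a negative label belong to no group. Groups are
    returned in order of increasing label.
    """
    groups = {}
    for index, label in enumerate(labels):
        if label < 0:
            continue  # excluded from the result
        groups.setdefault(label, []).append(index)
    return [groups[label] for label in sorted(groups)]


def grouped_maximum(values, labels):
    """Collapse each group of `values` to its maximum."""
    return [
        max(values[i] for i in members)
        for members in groups_from_labels(labels)
    ]


labels = [0, -1, 4, 4, 4, -1, -2, 0]
values = [5, 9, 1, 8, 3, 7, 6, 2]
members = groups_from_labels(labels)     # [[0, 7], [2, 3, 4]]
maxima = grouped_maximum(values, labels)  # [5, 8]
```

Note that the groups need not be contiguous: the first group here is made of the first and last elements of the axis.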
 regroup: bool, optional
If True then, for grouped collapses, do not collapse the field construct, but instead return a numpy.array of integers which identifies the groups defined by the group parameter. Each group contains the elements which correspond to a common non-negative integer value in the numpy array. Elements corresponding to negative integers are not in any group. The array may subsequently be used as the value of the group parameter in a separate collapse.
For example:
>>> groups = f.collapse('time: mean', group=10, regroup=True)
>>> g = f.collapse('time: mean', group=groups)
is equivalent to:
>>> g = f.collapse('time: mean', group=10)
 group_by: optional
Specify how coordinates are assigned to the groups defined by the group, within_days or within_years parameters. Ignored unless one of these parameters is set to a Data or TimeDuration object.
The group_by parameter may be one of:
group_by    Description
None
    This is the default.
    If the groups are defined by the group parameter (i.e. collapses other than climatological time collapses) then assume a value of 'coords'.
    If the groups are defined by the within_days or within_years parameter (i.e. climatological time collapses) then assume a value of 'bounds'.
'coords'
    Each group contains the axis elements whose coordinate values lie within the group limits. Every element will be in a group.
'bounds'
    Each group contains the axis elements whose upper and lower coordinate bounds both lie within the group limits. Some elements may not be inside any group, either because the group limits do not coincide with coordinate bounds or because the group size is sufficiently small.
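The difference between the two assignment rules can be shown for a single group with given limits. This pure-Python sketch is illustrative only: the function name is hypothetical, and the treatment of values that fall exactly on a limit (half-open for 'coords', closed for 'bounds') is an assumption, since the description above does not specify it.

```python
def elements_in_group(coords, bounds, lower, upper, group_by="coords"):
    """Return the indices of the axis elements assigned to one group.

    coords: coordinate value of each element
    bounds: (lower, upper) cell bounds of each element
    lower, upper: the limits of the group
    """
    selected = []
    for index, (coord, (cell_lower, cell_upper)) in enumerate(zip(coords, bounds)):
        if group_by == "coords":
            # Only the coordinate value has to fall inside the limits
            inside = lower <= coord < upper
        elif group_by == "bounds":
            # The whole cell has to fit inside the limits
            inside = lower <= cell_lower and cell_upper <= upper
        else:
            raise ValueError(f"unknown group_by option: {group_by!r}")
        if inside:
            selected.append(index)
    return selected


coords = [5, 15, 25]
bounds = [(0, 10), (10, 20), (20, 30)]
by_coords = elements_in_group(coords, bounds, 5, 25, "coords")
by_bounds = elements_in_group(coords, bounds, 5, 25, "bounds")
```

Because the group limits (5 to 25) do not coincide with the cell bounds, the 'bounds' rule keeps only the middle cell, while the 'coords' rule keeps the first two.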
 group_span: optional
Specify how to treat groups that may not span the desired range. For example, when creating 3-month means, the group_span parameter can be used to allow groups which only contain 1 or 2 months of data.
By default, group_span is None. This means that only groups whose span equals the size specified by the definition of the groups are collapsed; unless the groups have been defined by one or more Query objects, in which case the default behaviour is to collapse all groups, regardless of their size.
In effect, the group_span parameter defaults to True unless the groups have been defined by one or more Query objects, in which case group_span defaults to False.
The different behaviour when the groups have been defined by one or more Query objects is necessary because a Query object can only define the composition of a group, and not its size (see the parameter examples below for how to specify a group span in this case).
Note
Prior to version 3.1.0, the default value of group_span was effectively False.
In general, the span of a group is the absolute difference between the lower bound of its first element and the upper bound of its last element. The only exception to this occurs if group_span is (by default or by explicit setting) an integer, in which case the span of a group is the number of elements in the group. See also the group_contiguous parameter for how to deal with groups that have gaps in their coverage.
The group_span parameter is only applied to groups defined by the group, within_days or within_years parameters, and is otherwise ignored.
The group_span parameter may be one of:
group_span    Description
None
    This is the default. Apply a value of True or False depending on how the groups have been defined.
True
    Ignore groups whose span is not equal to the size specified by the definition of the groups. Only applicable if the groups are defined by a Data, TimeDuration or int object, and this is the default in this case.
False
    Collapse all groups, regardless of their size. This is the default if the groups are defined by one or more Query objects.
Data
    Ignore groups whose span is not equal to the given size. If no units are specified then the units of the coordinates are assumed.
TimeDuration
    Ignore groups whose span is not equal to the given time duration.
int
    Ignore groups that contain fewer than the given number of elements.
 Parameter example:
To collapse into groups of 10km, ignoring any groups that span less than that distance: group=cf.Data(10, 'km'), group_span=True.
 Parameter example:
To collapse a daily timeseries into monthly groups, ignoring any groups that span less than 1 calendar month: group=cf.M(), group_span=True (see cf.M).
 Parameter example:
To collapse a timeseries into seasonal groups, ignoring any groups that span less than three months: group=cf.seasons(), group_span=cf.M(3) (see cf.seasons and cf.M).
 group_contiguous: int, optional
Specify how to treat groups whose elements are not contiguous or have overlapping cells. For example, when creating December to February means, the group_contiguous parameter can be used to allow groups which have no data for January.
A group is considered to be contiguous unless it has coordinates with bounds that do not coincide for adjacent cells. The definition may be expanded to include groups whose coordinate bounds overlap.
By default group_contiguous is 1, meaning that non-contiguous groups, and those whose coordinate bounds overlap, are not collapsed.
Note
Prior to version 3.1.0, the default value of group_contiguous was 0.
The group_contiguous parameter is only applied to groups defined by the group, within_days or within_years parameters, and is otherwise ignored.
The group_contiguous parameter may be one of:
group_contiguous    Description
0                   Allow non-contiguous groups, and those containing overlapping cells.
1                   This is the default. Ignore non-contiguous groups, as well as contiguous groups containing overlapping cells.
2                   Ignore non-contiguous groups, allowing contiguous groups containing overlapping cells.
 Parameter example:
To allow noncontiguous groups, and those containing overlapping cells: group_contiguous=0.
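The three settings can be read as a small decision rule on the bounds of adjacent cells. The sketch below is an illustration under assumptions, not cf's implementation: the function name is_collapsed is hypothetical, bounds are plain numbers (day offsets here), and a group that has both a gap and an overlap is assumed to count as noncontiguous.

```python
def is_collapsed(bounds, group_contiguous=1):
    """Return True if a group with these (lower, upper) cell bounds is kept.

    Hypothetical helper that mirrors the 0/1/2 table for illustration.
    """
    has_gap = False
    has_overlap = False
    for (_, upper), (lower, _) in zip(bounds, bounds[1:]):
        if lower > upper:
            has_gap = True      # adjacent bounds do not meet
        elif lower < upper:
            has_overlap = True  # adjacent cells overlap

    if group_contiguous == 0:
        return True
    if group_contiguous == 1:
        return not has_gap and not has_overlap
    if group_contiguous == 2:
        return not has_gap
    raise ValueError("group_contiguous must be 0, 1 or 2")


# Day offsets for a December to February group
complete_djf = [(0, 31), (31, 62), (62, 90)]
no_january = [(0, 31), (62, 90)]
overlapping = [(0, 31), (30, 62)]
```

With these inputs, no_january is kept only when group_contiguous is 0, and overlapping is kept for 0 and 2 but not for the default of 1.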
 within_days: optional
Define the groups for creating CF “within days” climatological statistics.
Each group contains elements whose coordinates span a time interval of up to one day. The results of the collapses are concatenated so that the output axis has a size equal to the number of groups.
Note
For CF compliance, a “within days” collapse should be followed by an “over days” collapse.
The within_days parameter defines how the elements are partitioned into groups, and may be one of:
within_days
Description
TimeDuration
Define the group size in terms of a time interval of up to one day. The first group starts at or before the first coordinate bound of the first axis element (or its coordinate if there are no bounds) and spans the defined group size. Each subsequent group immediately follows the preceding one. By default each group contains the consecutive run of elements whose coordinate cells lie within the group limits (see the group_by parameter).
Groups may contain different numbers of elements.
The start of the first group may be before the first axis element, depending on the offset defined by the time duration. For example, if group=cf.D(hour=12) then the first group will start on the closest midday to the first axis element.
Query, sequence of Query
Define groups from elements whose coordinates satisfy the query condition. Multiple groups are created: one for each maximally consecutive run within the selected elements.
If a sequence of Query is provided then groups are defined for each query.
Groups may contain different numbers of elements.
If no units are specified then the units of the coordinates are assumed.
If a coordinate does not satisfy any of the conditions then its element will not be in a group.
If an element is selected by two or more queries then the latest one in the sequence defines which group it will be in.
 Parameter example:
To define groups of 6 hours, starting at 00:00, 06:00, 12:00 and 18:00: within_days=cf.h(6) (see cf.h).
 Parameter example:
To define groups of 1 day, starting at 06:00: within_days=cf.D(1, hour=6) (see cf.D).
 Parameter example:
To define groups of 00:00 to 06:00 within each day, ignoring the rest of each day: within_days=cf.hour(cf.le(6)) (see cf.hour and cf.le).
 Parameter example:
To define groups of 00:00 to 06:00 and 18:00 to 24:00 within each day, ignoring the rest of each day: within_days=[cf.hour(cf.le(6)), cf.hour(cf.gt(18))] (see cf.gt, cf.hour and cf.le).
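To make the partitioning for a fixed time interval concrete, here is a minimal sketch using only the standard library. It is a simplification rather than cf's algorithm: the function name within_day_groups is hypothetical, and it works on point times without cell bounds, with windows assumed to start at 00:00.

```python
from datetime import datetime, timedelta


def within_day_groups(times, hours=6):
    """Group point times into windows of `hours`, starting at 00:00 each day.

    Hypothetical helper: the dictionary key is the start of each window.
    """
    groups = {}
    for t in times:
        window_start = t.replace(
            hour=(t.hour // hours) * hours, minute=0, second=0, microsecond=0
        )
        groups.setdefault(window_start, []).append(t)
    return groups


# Two days of hourly data
hourly = [datetime(2000, 1, 1) + timedelta(hours=h) for h in range(48)]
groups = within_day_groups(hourly)
```

Two days of hourly data produce eight 6-hour groups, so the collapsed axis in this sketch would have size 8, one element per group.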
 within_years: optional
Define the groups for creating CF “within years” climatological statistics.
Each group contains elements whose coordinates span a time interval of up to one calendar year. The results of the collapses are concatenated so that the output axis has a size equal to the number of groups.
Note
For CF compliance, a “within years” collapse should be followed by an “over years” collapse.
The within_years parameter defines how the elements are partitioned into groups, and may be one of:
within_years
Description
TimeDuration
Define the group size in terms of a time interval of up to one calendar year. The first group starts at or before the first coordinate bound of the first axis element (or its coordinate if there are no bounds) and spans the defined group size. Each subsequent group immediately follows the preceding one. By default each group contains the consecutive run of elements whose coordinate cells lie within the group limits (see the group_by parameter).
Groups may contain different numbers of elements.
The start of the first group may be before the first axis element, depending on the offset defined by the time duration. For example, if group=cf.Y(month=12) then the first group will start on the closest 1st December to the first axis element.
Query, sequence of Query
Define groups from elements whose coordinates satisfy the query condition. Multiple groups are created: one for each maximally consecutive run within the selected elements.
If a sequence of Query is provided then groups are defined for each query.
The first group may start outside of the range of coordinates (the start of the first group is controlled by parameters of the TimeDuration).
If group boundaries do not coincide with coordinate bounds then some elements may not be inside any group.
If the group size is sufficiently small then some elements may not be inside any group.
Groups may contain different numbers of elements.
 Parameter example:
To define groups of 90 days: within_years=cf.D(90) (see cf.D).
 Parameter example:
To define groups of 3 calendar months, starting on the 15th of a month: within_years=cf.M(3, day=15) (see cf.M).
 Parameter example:
To define groups for the season MAM within each year: within_years=cf.mam() (see cf.mam).
 Parameter example:
To define groups for February and for November to December within each year: within_years=[cf.month(2), cf.month(cf.ge(11))] (see cf.month and cf.ge).
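The "maximally consecutive run" rule for query-defined groups can be sketched in plain Python. This is an illustration only, not cf's implementation: the function name consecutive_runs is hypothetical, queries are modelled as ordinary predicates on the month number, and the rule that a later matching query overrides an earlier one is borrowed from the within_days description and assumed to apply here too.

```python
def consecutive_runs(months, queries):
    """Return lists of element positions, one list per group.

    Hypothetical helper: each element is labelled with the index of the
    last query it satisfies, then consecutive equal labels form a group.
    """
    labels = []
    for month in months:
        label = None
        for i, query in enumerate(queries):
            if query(month):
                label = i  # a later matching query overrides an earlier one
        labels.append(label)

    runs = []
    previous = None
    for position, label in enumerate(labels):
        if label is None:
            previous = None  # unselected elements break a run
            continue
        if label == previous:
            runs[-1].append(position)
        else:
            runs.append([position])
        previous = label
    return runs


# Two years of monthly data, with predicates standing in for
# cf.month(2) and cf.month(cf.ge(11))
months = list(range(1, 13)) * 2
runs = consecutive_runs(months, [lambda m: m == 2, lambda m: m >= 11])
```

For two years of monthly data this gives four groups: February and November to December of each year, with all other months left out of any group.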
 over_days: optional
Define the groups for creating CF “over days” climatological statistics.
By default (or if over_days is None) each group contains all elements for which the time coordinate cell lower bounds have a common time of day but different dates, and for which the time coordinate cell upper bounds also have a common time of day but different dates. The collapsed time axis will have a size equal to the number of groups that were found.
For example, elements corresponding to the two time coordinate cells
19991231 06:00:00/19991231 18:00:00
20000101 06:00:00/20000101 18:00:00
would be together in a group; and elements corresponding to the two time coordinate cells
19991231 00:00:00/20000101 00:00:00
20000101 00:00:00/20000102 00:00:00
would also be together in a different group.
Note
For CF compliance, an “over days” collapse should be preceded by a “within days” collapse.
The default groups may be split into smaller groups if the over_days parameter is one of:
over_days
Description
TimeDuration
Split each default group into smaller groups which span the given time duration, which must be at least one day.
Groups may contain different numbers of elements.
The start of the first group may be before the first axis element, depending on the offset defined by the time duration. For example, if group=cf.M(day=15) then the first group will start on the closest 15th of a month to the first axis element.
Query, sequence of Query
Split each default group into smaller groups whose coordinate cells satisfy the query condition.
If a sequence of Query is provided then groups are defined for each query.
Groups may contain different numbers of elements.
If a coordinate does not satisfy any of the conditions then its element will not be in a group.
If an element is selected by two or more queries then the latest one in the sequence defines which group it will be in.
 Parameter example:
To define groups for January and for June to December, ignoring all other months: over_days=[cf.month(1), cf.month(cf.wi(6, 12))] (see cf.month and cf.wi).
 Parameter example:
To define groups spanning 90 days: over_days=cf.D(90) or over_days=cf.h(2160) (see cf.D and cf.h).
 Parameter example:
To define groups that each span 3 calendar months, starting and ending at 06:00 on the first day of each month: over_days=cf.M(3, hour=6) (see cf.M).
 Parameter example:
To define groups that each span a calendar month: over_days=cf.M() (see cf.M).
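The default "over days" grouping amounts to keying each cell on the time of day of its lower and upper bounds. The sketch below is a minimal standard-library illustration of that idea, not cf's implementation; the function name over_days_groups is hypothetical, and the four cells are the ones quoted in the description of the default behaviour.

```python
from datetime import datetime


def over_days_groups(cells):
    """Group (lower, upper) cells by the time of day of both bounds.

    Hypothetical helper: the date part of each bound is ignored.
    """
    groups = {}
    for lower, upper in cells:
        key = (lower.time(), upper.time())
        groups.setdefault(key, []).append((lower, upper))
    return list(groups.values())


cells = [
    (datetime(1999, 12, 31, 6), datetime(1999, 12, 31, 18)),
    (datetime(2000, 1, 1, 6), datetime(2000, 1, 1, 18)),
    (datetime(1999, 12, 31, 0), datetime(2000, 1, 1, 0)),
    (datetime(2000, 1, 1, 0), datetime(2000, 1, 2, 0)),
]
day_groups = over_days_groups(cells)
```

The 06:00 to 18:00 cells fall into one group and the midnight to midnight cells into another, so the collapsed time axis in this sketch would have size 2.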
 over_years: optional
Define the groups for creating CF “over years” climatological statistics.
By default (or if over_years is None) each group contains all elements for which the time coordinate cell lower bounds have a common date of the year but different years, and for which the time coordinate cell upper bounds also have a common date of the year but different years. The collapsed time axis will have a size equal to the number of groups that were found.
For example, elements corresponding to the two time coordinate cells
19991201 00:00:00/20000101 00:00:00
20001201 00:00:00/20010101 00:00:00
would be together in a group.
Note
For CF compliance, an “over years” collapse should be preceded by a “within years” or “over days” collapse.
The default groups may be split into smaller groups if the over_years parameter is one of:
over_years
Description
TimeDuration
Split each default group into smaller groups which span the given time duration, which must be at least one day.
Groups may contain different numbers of elements.
The start of the first group may be before the first axis element, depending on the offset defined by the time duration. For example, if group=cf.Y(month=12) then the first group will start on the closest 1st December to the first axis element.
Query, sequence of Query
Split each default group into smaller groups whose coordinate cells satisfy the query condition.
If a sequence of Query is provided then groups are defined for each query.
Groups may contain different numbers of elements.
If a coordinate does not satisfy any of the conditions then its element will not be in a group.
If an element is selected by two or more queries then the latest one in the sequence defines which group it will be in.
 Parameter example:
An element with coordinate bounds {19990601 06:00:00, 19990901 06:00:00} matches an element with coordinate bounds {20000601 06:00:00, 20000901 06:00:00}.
 Parameter example:
An element with coordinate bounds {19991201 00:00:00, 20001201 00:00:00} matches an element with coordinate bounds {20001201 00:00:00, 20011201 00:00:00}.
 Parameter example:
To define groups spanning 10 calendar years: over_years=cf.Y(10) or over_years=cf.M(120) (see cf.M and cf.Y).
 Parameter example:
To define groups spanning 5 calendar years, starting and ending at 06:00 on 01 December of each year: over_years=cf.Y(5, month=12, hour=6) (see cf.Y).
 Parameter example:
To define one group spanning 1981 to 1990 and another spanning 2001 to 2005: over_years=[cf.year(cf.wi(1981, 1990)), cf.year(cf.wi(2001, 2005))] (see cf.year and cf.wi).
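The matching rule behind the default "over years" groups can be sketched by comparing the date of the year (month, day and time) of each cell bound while ignoring the year. This is a hedged standard-library illustration, not cf's implementation; the function name over_years_groups is hypothetical, and the cells are taken from the matching examples given for this parameter.

```python
from datetime import datetime


def over_years_groups(cells):
    """Group (lower, upper) cells by the date of the year of both bounds.

    Hypothetical helper: only month, day and time of day are compared.
    """
    groups = {}
    for lower, upper in cells:
        key = (
            (lower.month, lower.day, lower.time()),
            (upper.month, upper.day, upper.time()),
        )
        groups.setdefault(key, []).append((lower, upper))
    return list(groups.values())


cells = [
    (datetime(1999, 6, 1, 6), datetime(1999, 9, 1, 6)),
    (datetime(2000, 6, 1, 6), datetime(2000, 9, 1, 6)),
    (datetime(1999, 12, 1), datetime(2000, 12, 1)),
    (datetime(2000, 12, 1), datetime(2001, 12, 1)),
]
year_groups = over_years_groups(cells)
```

The two June to September cells match each other, as do the two December to December cells, giving two groups.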
 remove_vertical_crs: bool, optional
If True, the default, then remove a vertical coordinate reference construct and all of its domain ancillary constructs if any of its coordinate constructs or domain ancillary constructs span any collapse axes.
If False then only the vertical coordinate reference construct’s domain ancillary constructs that span any collapse axes are removed, but the vertical coordinate reference construct remains. This could result in compute_vertical_coordinates returning incorrect nonparametric vertical coordinate values.
New in version 3.14.1.
 inplace: bool, optional
If True then do the operation in-place and return None.
 i: deprecated at version 3.0.0
Use the inplace parameter instead.
kwargs: deprecated at version 3.0.0
 method:
 Returns
Field or numpy.ndarray
The collapsed field construct. Alternatively, if the regroup parameter is True then a numpy array is returned.
Examples
There are further worked examples in https://ncas-cms.github.io/cf-python/analysis.html#statistical-collapses