# cf.Field.bin

Field.bin(method, digitized, weights=None, measure=False, scale=None, mtol=1, ddof=1, radius='earth', return_indices=False, verbose=False)

Collapse the data values that lie in N-dimensional bins.

The data values of the field construct are binned according to how they correspond to the N-dimensional histogram bins of another set of variables (see cf.histogram for details), and each bin of values is collapsed with one of the collapse methods allowed by the method parameter.

The number of dimensions of the output binned data is equal to the number of field constructs provided by the digitized argument. Each such field construct defines a sequence of bins and provides, for each value of another field construct, the index of the bin to which that value belongs. There is no upper limit to the number of dimensions of the output binned data.

The output bins are defined by the exterior product of the one-dimensional bins of each digitized field construct. For example, if only one digitized field construct is provided then the output bins simply comprise its one-dimensional bins; if there are two digitized field constructs then the output bins comprise the two-dimensional matrix formed by all possible combinations of the two sets of one-dimensional bins; etc.

An output value for a bin is formed by collapsing (using the method given by the method parameter) the elements of the data for which the corresponding locations in the digitized field constructs, taken together, index that bin. Note that it may be the case that not all output bins are indexed by the digitized field constructs, and for these bins missing data is returned.
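The binning described above can be sketched in pure Python. This is only an illustration of the idea, not the cf-python implementation: the function name bin_collapse and its arguments are hypothetical, plain lists stand in for field constructs, and None stands in for missing data.

```python
from itertools import product


def bin_collapse(values, digitized, bin_counts, collapse):
    """Collapse `values` into N-d bins (hypothetical helper, not cf-python API).

    `digitized` is a list of index sequences, one per output dimension;
    `bin_counts` gives the number of 1-d bins for each of them.
    """
    groups = {}
    for i, value in enumerate(values):
        if value is None:  # missing data never contributes to a bin
            continue
        # The bin indices at this location, taken together, pick one N-d bin.
        key = tuple(indices[i] for indices in digitized)
        groups.setdefault(key, []).append(value)

    # Exterior product of the 1-d bins: every combination of bin indices.
    all_bins = product(*(range(n) for n in bin_counts))
    # Bins that were never indexed are returned as missing data (None).
    return {key: collapse(groups[key]) if key in groups else None
            for key in all_bins}


values = [7.0, 34.0, 3.0, 14.0, 18.0]
t_idx = [0, 1, 0, 1, 1]  # bin index of each value along the first dimension
s_idx = [0, 0, 0, 1, 1]  # bin index of each value along the second dimension

result = bin_collapse(values, [t_idx, s_idx], [2, 2],
                      lambda v: max(v) - min(v))  # the 'range' method
```

Here two digitized sequences with two bins each give a 2 x 2 set of output bins, one of which, (0, 1), is never indexed and so holds missing data.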

The returned field construct will have a domain axis construct for each dimension of the output bins, with a corresponding dimension coordinate construct that defines the bin boundaries.

New in version 3.0.2.

Parameters:
method: str

The collapse method used to combine values that map to each cell of the output field construct. The following methods are available (see https://ncas-cms.github.io/cf-python/tutorial.html#collapse-methods for precise definitions):

| method | Description | Weighted |
|---|---|---|
| 'maximum' | The maximum of the values. | Never |
| 'minimum' | The minimum of the values. | Never |
| 'maximum_absolute_value' | The maximum of the absolute values. | Never |
| 'minimum_absolute_value' | The minimum of the absolute values. | Never |
| 'mid_range' | The average of the maximum and the minimum of the values. | Never |
| 'range' | The absolute difference between the maximum and the minimum of the values. | Never |
| 'sum' | The sum of the values. | Never |
| 'sum_of_squares' | The sum of the squares of values. | Never |
| 'sample_size' | The sample size, i.e. the number of non-missing values. | Never |
| 'sum_of_weights' | The sum of weights, as would be used for other calculations. | Never |
| 'sum_of_weights2' | The sum of squares of weights, as would be used for other calculations. | Never |
| 'mean' | The weighted or unweighted mean of the values. | May be |
| 'mean_absolute_value' | The mean of the absolute values. | May be |
| 'variance' | The weighted or unweighted variance of the values, with a given number of degrees of freedom. | May be |
| 'standard_deviation' | The square root of the weighted or unweighted variance. | May be |
| 'root_mean_square' | The square root of the weighted or unweighted mean of the squares of the values. | May be |
| 'integral' | The integral of values. | Always |

• Collapse methods that are “Never” weighted ignore the weights parameter, even if it is set.
• Collapse methods that “May be” weighted will only be weighted if the weights parameter is set.
• Collapse methods that are “Always” weighted require the weights parameter to be set.
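
The three weighting categories can be illustrated with a small pure-Python sketch. The function names below are hypothetical and are not part of cf-python; they only mimic how a "Never", a "May be" and an "Always" weighted method might treat a weights argument.

```python
def collapse_maximum(values, weights=None):
    # "Never" weighted: any weights given are simply ignored.
    return max(values)


def collapse_mean(values, weights=None):
    # "May be" weighted: weighted only if weights are supplied.
    if weights is None:
        return sum(values) / len(values)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def collapse_integral(values, weights=None):
    # "Always" weighted: weights are required.
    if weights is None:
        raise ValueError("the 'integral' method requires weights")
    return sum(w * v for w, v in zip(weights, values))


values = [1.0, 2.0, 6.0]
weights = [1.0, 1.0, 2.0]

maximum = collapse_maximum(values, weights)      # weights have no effect
unweighted_mean = collapse_mean(values)          # (1 + 2 + 6) / 3
weighted_mean = collapse_mean(values, weights)   # (1 + 2 + 12) / 4
```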
digitized: (sequence of) Field

One or more field constructs that contain digitized data with corresponding metadata, as would be output by cf.Field.digitize. Each field construct contains indices to the one-dimensional bins to which each value of an original field construct belongs; and there must be bin_count and bin_bounds properties as defined by the digitize method (and any of the extra properties defined by that method are also recommended).

The bins defined by the bin_count and bin_bounds properties are used to create a dimension coordinate construct for the output field construct.
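
As a rough sketch of how a set of bins could give rise to coordinate values, the following assumes equal-width bins spanning the data range, with the coordinate value of each bin taken as its midpoint. That assumption is consistent with the coordinate values 10.15 and 138.85 printed in the first example of the Examples section (10 bins over data ranging from 3 to 146), but the helper equal_width_bins is hypothetical and does not reproduce how cf-python stores the bin_bounds property.

```python
def equal_width_bins(lowest, highest, n):
    """Return (bounds, midpoints) for n equal-width bins (hypothetical helper)."""
    width = (highest - lowest) / n
    bounds = [(lowest + k * width, lowest + (k + 1) * width) for k in range(n)]
    midpoints = [(lower + upper) / 2 for lower, upper in bounds]
    return bounds, midpoints


# Data range and bin count taken from the first example in this page.
bounds, midpoints = equal_width_bins(3.0, 146.0, 10)
```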

Each digitized field construct must be transformable so that it is broadcastable to the input field construct’s data. This is done by using the metadata constructs of the fields to create a mapping of physically compatible dimensions between them, and then manipulating the dimensions of the digitized field construct’s data to ensure that broadcasting can occur.

weights: optional

Specify the weights for the collapse calculations. The weights are those that would be returned by this call of the field construct’s weights method: f.weights(weights, measure=measure, scale=scale, components=True). See the measure and scale parameters and cf.Field.weights for details.

Note

By default weights is None, resulting in unweighted calculations.

Parameter example:

To specify weights based on cell areas use weights='area'.

Parameter example:

To specify weights based on cell areas and linearly in time you could set weights=('area', 'T').

measure: bool, optional

Create weights, as defined by the weights parameter, which are cell measures, i.e. which describe actual cell sizes (e.g. cell areas) with appropriate units (e.g. metres squared). By default the weights are scaled to lie between 0 and 1 and have arbitrary units (see the scale parameter).

Cell measures can be created for any combination of axes. For example, cell measures for a time axis are the time span for each cell with canonical units of seconds; cell measures for the combination of four axes representing time and three dimensional space could have canonical units of metres cubed seconds.

When collapsing with the 'integral' method, measure must be True, and the units of the weights are incorporated into the units of the returned field construct.
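
A small arithmetic sketch may help show why the 'integral' method needs true cell measures. A weighted mean is unchanged when all weights are rescaled by a common factor, whereas an integral is not. The numbers and variable names below are made up for illustration and plain lists stand in for field data.

```python
values = [10.0, 20.0]   # e.g. a flux in W m-2 in two cells
areas = [2.0, 6.0]      # true cell areas in m2 (measure=True style weights)

# The same weights rescaled so that the largest is 1 (arbitrary units).
scaled = [a / max(areas) for a in areas]


def weighted_mean(vals, weights):
    return sum(w * v for w, v in zip(weights, vals)) / sum(weights)


def integral(vals, weights):
    return sum(w * v for w, v in zip(weights, vals))


mean_from_areas = weighted_mean(values, areas)
mean_from_scaled = weighted_mean(values, scaled)

integral_from_areas = integral(values, areas)    # physically meaningful, W
integral_from_scaled = integral(values, scaled)  # depends on the arbitrary scaling
```

Dividing the integral by the sum of the cell measures recovers the weighted mean, which is the relationship demonstrated at the end of the Examples section.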

Note

Specifying cell volume weights via weights=['X', 'Y', 'Z'] or weights=['area', 'Z'] (or other equivalents) will produce an incorrect result if the vertical dimension coordinates do not define the actual height or depth thickness of every cell in the domain. In this case, weights='volume' should be used instead, which requires the field construct to have a “volume” cell measure construct.

scale: number, optional

If set to a positive number then scale the weights, as defined by the weights parameter, so that they are less than or equal to that number. By default the weights are scaled to lie between 0 and 1 (i.e. scale is 1), and have arbitrary units.

Parameter example:

To scale all weights so that they lie between 0 and 0.5: scale=0.5.
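
One simple way to obtain weights that are less than or equal to a given number is to divide by the largest weight and multiply by that number. The sketch below takes that approach purely as an assumption; scale_weights is a hypothetical helper and cf-python may scale weights differently.

```python
def scale_weights(weights, scale=1.0):
    """Rescale weights so that the largest equals `scale` (hypothetical helper)."""
    if scale <= 0:
        raise ValueError("scale must be a positive number")
    largest = max(weights)
    return [w * scale / largest for w in weights]


scaled = scale_weights([2.0, 4.0, 8.0], scale=0.5)
```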

mtol: number, optional

Set the fraction of input array elements which is allowed to contain missing data when contributing to an individual output array element. Where this fraction exceeds mtol, missing data is returned. The default is 1, meaning that a missing datum in the output array occurs when its contributing input array elements are all missing data. A value of 0 means that a missing datum in the output array occurs whenever any of its contributing input array elements are missing data. Any intermediate value is permitted.

Parameter example:

To ensure that an output array element is a missing datum if more than 25% of its input array elements are missing data: mtol=0.25.
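
The mtol rule can be sketched as follows for a single output element. This is an illustration under stated assumptions rather than the library's code: collapse_with_mtol is a hypothetical name and None stands in for missing data.

```python
def collapse_with_mtol(values, mtol=1.0, collapse=sum):
    """Collapse one bin's values, honouring a missing-data tolerance."""
    present = [v for v in values if v is not None]
    if not present:
        # Nothing to collapse: the output is always missing.
        return None
    missing_fraction = (len(values) - len(present)) / len(values)
    if missing_fraction > mtol:
        # Too many of the contributing elements are missing.
        return None
    return collapse(present)


one_of_four_missing = [1.0, None, 3.0, 5.0]   # 25% missing
two_of_four_missing = [1.0, None, None, 5.0]  # 50% missing

kept = collapse_with_mtol(one_of_four_missing, mtol=0.25)
dropped = collapse_with_mtol(two_of_four_missing, mtol=0.25)
strict = collapse_with_mtol(one_of_four_missing, mtol=0)
lenient = collapse_with_mtol(two_of_four_missing)  # default mtol=1
```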

ddof: number, optional

The delta degrees of freedom in the calculation of a standard deviation or variance. The number of degrees of freedom used in the calculation is (N-ddof) where N represents the number of non-missing elements contributing to the calculation. By default ddof is 1, meaning the standard deviation and variance of the population is estimated according to the usual formula with (N-1) in the denominator to avoid the bias caused by the use of the sample mean (Bessel’s correction).
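
The effect of ddof on an unweighted variance can be seen in a short pure-Python sketch. The function below is illustrative only and is not taken from cf-python.

```python
def variance(values, ddof=1):
    """Unweighted variance with (N - ddof) degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

biased = variance(values, ddof=0)    # divides by N
unbiased = variance(values, ddof=1)  # divides by N - 1 (Bessel's correction)
```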

radius: optional

Specify the radius used for calculating the areas of cells defined in spherical polar coordinates. The radius is that which would be returned by this call of the field construct’s radius method: f.radius(radius). See cf.Field.radius for details.

By default radius is 'earth', which means that if, and only if, the radius cannot be found from the datums of any coordinate reference constructs, then the default radius is taken as 6371229 metres.
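
To show where the radius enters an area calculation, the sketch below uses the standard formula for the area of a latitude-longitude cell on a sphere. It is not necessarily how cf-python computes areas; cell_area and EARTH_RADIUS_M are hypothetical names, with the radius value taken from the default quoted above.

```python
import math

EARTH_RADIUS_M = 6371229.0  # default radius quoted in this page, in metres


def cell_area(lat_south, lat_north, lon_west, lon_east, radius=EARTH_RADIUS_M):
    """Area in m2 of a cell bounded by lines of latitude and longitude (degrees)."""
    delta_lon = math.radians(lon_east - lon_west)
    sin_north = math.sin(math.radians(lat_north))
    sin_south = math.sin(math.radians(lat_south))
    return radius ** 2 * delta_lon * (sin_north - sin_south)


# Sanity check: a single cell covering the whole sphere has area 4 * pi * R**2.
whole_sphere = cell_area(-90.0, 90.0, 0.0, 360.0)
expected = 4.0 * math.pi * EARTH_RADIUS_M ** 2
```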

verbose: bool, optional

If True then print a description of the binned field construct creation process.

Returns:
Field

The field construct containing the binned values.

Examples:

Find the range of values that lie in each bin:

>>> print(q)
Field: specific_humidity (ncvar%q)
----------------------------------
Data            : specific_humidity(latitude(5), longitude(8)) 0.001 1
Cell methods    : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
                : longitude(8) = [22.5, ..., 337.5] degrees_east
                : time(1) = [2019-01-01 00:00:00]
>>> print(q.array)
[[  7.  34.   3.  14.  18.  37.  24.  29.]
 [ 23.  36.  45.  62.  46.  73.   6.  66.]
 [110. 131. 124. 146.  87. 103.  57.  11.]
 [ 29.  59.  39.  70.  58.  72.   9.  17.]
 [  6.  36.  19.  35.  18.  37.  34.  13.]]
>>> indices = q.digitize(10)
>>> b = q.bin('range', digitized=indices)
>>> print(b)
Field: specific_humidity
------------------------
Data            : specific_humidity(specific_humidity(10)) 0.001 1
Cell methods    : latitude: longitude: range
Dimension coords: specific_humidity(10) = [10.15, ..., 138.85000000000002] 0.001 1
>>> print(b.array)
[14. 11. 11. 13. 11.  0.  0.  0.  7.  0.]


Find various metrics describing how tendency_of_sea_water_potential_temperature_expressed_as_heat_content data varies with sea_water_potential_temperature and sea_water_salinity:

>>> t
Field: sea_water_potential_temperature (ncvar%sea_water_potential_temperature)
------------------------------------------------------------------------------
Data            : sea_water_potential_temperature(time(1), depth(1), latitude(5), longitude(8)) K
Cell methods    : area: mean time(1): mean
Dimension coords: time(1) = [2290-06-01 00:00:00] 360_day
                : depth(1) = [3961.89990234375] m
                : latitude(5) = [-1.875, ..., 3.125] degrees_north
                : longitude(8) = [75.0, ..., 83.75] degrees_east
Auxiliary coords: model_level_number(depth(1)) = [18]
>>> s
Field: sea_water_salinity (ncvar%sea_water_salinity)
----------------------------------------------------
Data            : sea_water_salinity(time(1), depth(1), latitude(5), longitude(8)) psu
Cell methods    : area: mean time(1): mean
Dimension coords: time(1) = [2290-06-01 00:00:00] 360_day
                : depth(1) = [3961.89990234375] m
                : latitude(5) = [-1.875, ..., 3.125] degrees_north
                : longitude(8) = [75.0, ..., 83.75] degrees_east
Auxiliary coords: model_level_number(depth(1)) = [18]
>>> x
Field: tendency_of_sea_water_potential_temperature_expressed_as_heat_content (ncvar%tend)
-----------------------------------------------------------------------------------------
Data            : tendency_of_sea_water_potential_temperature_expressed_as_heat_content(time(1), depth(1), latitude(5), longitude(8)) W m-2
Cell methods    : area: mean time(1): mean
Dimension coords: time(1) = [2290-06-01 00:00:00] 360_day
                : depth(1) = [3961.89990234375] m
                : latitude(5) = [-1.875, ..., 3.125] degrees_north
                : longitude(8) = [75.0, ..., 83.75] degrees_east
Auxiliary coords: model_level_number(depth(1)) = [18]
>>> print(x.array)
[[[[-209.72  340.86   94.75  154.21   38.54 -262.75  158.22  154.58]
   [ 311.67  245.91 -168.16   47.61 -219.66 -270.33  226.1    52.0 ]
   [     -- -112.34  271.67  189.22    9.92  232.39  221.17  206.0 ]
   [     --      --  -92.31 -285.57  161.55  195.89 -258.29    8.35]
   [     --      --   -7.82 -299.79  342.32 -169.38  254.5   -75.4 ]]]]

>>> t_indices = t.digitize(6)
>>> s_indices = s.digitize(4)

>>> n = x.bin('sample_size', [t_indices, s_indices])
>>> print(n)
Field: number_of_observations
-----------------------------
Data            : number_of_observations(sea_water_salinity(4), sea_water_potential_temperature(6)) 1
Cell methods    : latitude: longitude: point
Dimension coords: sea_water_salinity(4) = [6.3054151982069016, ..., 39.09366758167744] psu
                : sea_water_potential_temperature(6) = [278.1569468180338, ..., 303.18466695149743] K
>>> print(n.array)
[[ 1  2  2  2 --  2]
 [ 2  1  3  3  3  2]
 [-- --  3 --  1 --]
 [ 1 --  1  3  2  1]]

>>> m = x.bin('mean', [t_indices, s_indices], weights=['X', 'Y', 'Z', 'T'])
>>> print(m)
Field: tendency_of_sea_water_potential_temperature_expressed_as_heat_content
----------------------------------------------------------------------------
Data            : tendency_of_sea_water_potential_temperature_expressed_as_heat_content(sea_water_salinity(4), sea_water_potential_temperature(6)) W m-2
Cell methods    : latitude: longitude: mean
Dimension coords: sea_water_salinity(4) = [6.3054151982069016, ..., 39.09366758167744] psu
                : sea_water_potential_temperature(6) = [278.1569468180338, ..., 303.18466695149743] K
>>> print(m.array)
[[ 189.22  131.36    6.75  -41.61      --  100.04]
 [-116.73  232.38   -4.82  180.47  134.25 -189.55]
 [     --      --  180.69      --   47.61      --]
 [ 158.22      -- -262.75   64.12  -51.83 -219.66]]

>>> i = x.bin('integral', [t_indices, s_indices], weights=['X', 'Y', 'Z', 'T'], measure=True)
>>> print(i)
Field: long_name=integral of tendency_of_sea_water_potential_temperature_expressed_as_heat_content
--------------------------------------------------------------------------------------------------
Data            : long_name=integral of tendency_of_sea_water_potential_temperature_expressed_as_heat_content(sea_water_salinity(4), sea_water_potential_temperature(6)) 86400 m3.kg.s-2
Cell methods    : latitude: longitude: sum
Dimension coords: sea_water_salinity(4) = [6.3054151982069016, ..., 39.09366758167744] psu
                : sea_water_potential_temperature(6) = [278.1569468180338, ..., 303.18466695149743] K
>>> print(i.array)
[[ 3655558758400.0 5070927691776.0   260864491520.0 -1605439586304.0               --  3863717609472.0]
 [-4509735059456.0 4489564127232.0  -280126521344.0 10454746267648.0  7777254113280.0 -7317268463616.0]
 [              --              -- 10470463373312.0               --   919782031360.0               --]
 [ 3055211773952.0              -- -5073676009472.0  3715958833152.0 -2000787079168.0 -4243632160768.0]]

>>> w = x.bin('sum_of_weights', [t_indices, s_indices], weights=['X', 'Y', 'Z', 'T'], measure=True)
>>> print(w)
Field: long_name=sum_of_weights of tendency_of_sea_water_potential_temperature_expressed_as_heat_content
--------------------------------------------------------------------------------------------------------
Data            : long_name=sum_of_weights of tendency_of_sea_water_potential_temperature_expressed_as_heat_content(sea_water_salinity(4), sea_water_potential_temperature(6)) 86400 m3.s
Cell methods    : latitude: longitude: sum
Dimension coords: sea_water_salinity(4) = [7.789749830961227, ..., 36.9842486679554] psu
                : sea_water_potential_temperature(6) = [274.50717671712243, ..., 302.0188242594401] K
>>> print(w.array)
[[19319093248.0 38601412608.0 38628990976.0 38583025664.0            -- 38619795456.0]
 [38628990976.0 19319093248.0 57957281792.0 57929699328.0 57929695232.0 38601412608.0]
 [           --            -- 57948086272.0            -- 19319093248.0            --]
 [19309897728.0            -- 19309897728.0 57948086272.0 38601412608.0 19319093248.0]]


Demonstrate that the integral divided by the sum of the cell measures is equal to the mean:

>>> print(i/w)
Field:
-------
Data            : (sea_water_salinity(4), sea_water_potential_temperature(6)) kg.s-3
Cell methods    : latitude: longitude: sum
Dimension coords: sea_water_salinity(4) = [7.789749830961227, ..., 36.9842486679554] psu
                : sea_water_potential_temperature(6) = [274.50717671712243, ..., 302.0188242594401] K
>>> (i/w == m).all()
True