cf.histogram
cf.histogram(*digitized)
Return the distribution of a set of variables in the form of an N-dimensional histogram.
The number of dimensions of the histogram is equal to the number of field constructs provided by the digitized argument. Each such field construct defines a sequence of bins and provides indices to the bins to which each value of one of the variables belongs. There is no upper limit to the number of dimensions of the histogram.
The output histogram bins are defined by the exterior product of the one-dimensional bins of each digitized field construct. For example, if only one digitized field construct is provided then the histogram bins simply comprise its one-dimensional bins; if there are two digitized field constructs then the histogram bins comprise the two-dimensional matrix formed by all possible combinations of the two sets of one-dimensional bins; etc.
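The way the output bins are built can be pictured with a small pure-Python sketch. The bin edges below are made-up illustrative values, not output from cf, and `itertools.product` only stands in for the exterior product described above; it is not a claim about how cf constructs the bins internally.

```python
from itertools import product

# Hypothetical one-dimensional bins, given as (lower, upper) bounds.
humidity_bins = [(0.0, 0.05), (0.05, 0.10), (0.10, 0.15)]
temperature_bins = [(270.0, 290.0), (290.0, 310.0)]

# Every pairing of one temperature bin with one humidity bin is a
# two-dimensional histogram bin.
bins_2d = list(product(temperature_bins, humidity_bins))

print(len(bins_2d))  # number of 2-d bins: 2 * 3
print(bins_2d[0])    # the first 2-d bin
```

With a third digitized variable, a third argument to `product` would give the three-dimensional bins in the same way.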
An output value for a histogram bin is formed by counting the number of cells for which the digitized field constructs, taken together, index that bin. Not all output bins need be indexed by the digitized field constructs; missing data is returned for any bin that is not.
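The counting rule can be sketched in plain Python as follows. The index lists are hypothetical, and `None` is used here only as a stand-in for missing data; cf itself returns masked values in a field construct.

```python
from collections import Counter

# Hypothetical bin indices for six cells of two digitized variables.
humidity_idx = [0, 0, 1, 2, 2, 2]
temperature_idx = [1, 1, 0, 1, 1, 0]
n_humidity_bins, n_temperature_bins = 3, 2

# Count how many cells index each (temperature bin, humidity bin) pair.
counts = Counter(zip(temperature_idx, humidity_idx))

# Bins that no cell indexes are left as None (standing in for missing data).
histogram = [
    [counts.get((t, q)) for q in range(n_humidity_bins)]
    for t in range(n_temperature_bins)
]

print(histogram)
```

Here two of the six bins are never indexed, so they stay empty rather than being set to zero.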
The returned field construct will have a domain axis construct for each dimension of the histogram, with a corresponding dimension coordinate construct that defines the bin boundaries.
New in version 3.0.2.
- Parameters
- digitized: one or more Field
One or more field constructs that contain digitized data with corresponding metadata, as would be output by cf.Field.digitize. Each field construct contains indices to the one-dimensional bins to which each value of an original field construct belongs, and there must be bin_count and bin_bounds properties as defined by the cf.Field.digitize method (any of the extra properties defined by that method are also recommended).
The bins defined by the bin_count and bin_bounds properties are used to create a dimension coordinate construct for the output field construct.
Each digitized field construct must be transformable so that its data is broadcastable to any other digitized field construct's data. This is done by using the metadata constructs of the field constructs to create a mapping of physically compatible dimensions between the fields, and then manipulating the dimensions of the digitized field construct's data to ensure that broadcasting can occur.
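The broadcasting requirement can be illustrated with a rough sketch that matches axes by the physical dimension they represent. The axis names, sizes, and the helper functions `aligned_shape` and `broadcast_shape` are hypothetical and exist only for this illustration; they are not part of cf, which works from the metadata constructs of the fields.

```python
# Hypothetical axis sizes, keyed by the physical dimension of each axis.
humidity_axes = {"latitude": 5, "longitude": 8}
temperature_axes = {"longitude": 8, "time": 1}  # different axes and order

def aligned_shape(axes, order):
    """Shape after reordering to `order`, inserting size-1 axes where missing."""
    return tuple(axes.get(name, 1) for name in order)

def broadcast_shape(*shapes):
    """Common shape of equal-length shapes; ValueError if they are incompatible."""
    result = []
    for sizes in zip(*shapes):
        distinct = {size for size in sizes if size != 1}
        if len(distinct) > 1:
            raise ValueError(f"incompatible axis sizes: {sizes}")
        result.append(distinct.pop() if distinct else 1)
    return tuple(result)

# A common axis order derived from the mapping of compatible dimensions.
order = ("time", "latitude", "longitude")
shape_q = aligned_shape(humidity_axes, order)
shape_t = aligned_shape(temperature_axes, order)

print(shape_q, shape_t, broadcast_shape(shape_q, shape_t))
```

Once both shapes follow the same axis order, size-1 axes can be stretched to match, which is the usual broadcasting rule.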
- Returns
Field
The field construct containing the histogram.
Examples
Create a one-dimensional histogram based on 10 equally-sized bins that exactly span the data range:
>>> f = cf.example_field(0)
>>> print(f)
Field: specific_humidity (ncvar%q)
----------------------------------
Data            : specific_humidity(latitude(5), longitude(8)) 1
Cell methods    : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
                : longitude(8) = [22.5, ..., 337.5] degrees_east
                : time(1) = [2019-01-01 00:00:00]
>>> print(f.array)
[[0.007 0.034 0.003 0.014 0.018 0.037 0.024 0.029]
 [0.023 0.036 0.045 0.062 0.046 0.073 0.006 0.066]
 [0.11  0.131 0.124 0.146 0.087 0.103 0.057 0.011]
 [0.029 0.059 0.039 0.07  0.058 0.072 0.009 0.017]
 [0.006 0.036 0.019 0.035 0.018 0.037 0.034 0.013]]
>>> indices, bins = f.digitize(10, return_bins=True)
>>> print(indices)
Field: long_name=Bin index to which each 'specific_humidity' value belongs (ncvar%q)
------------------------------------------------------------------------------------
Data            : long_name=Bin index to which each 'specific_humidity' value belongs(latitude(5), longitude(8))
Cell methods    : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
                : longitude(8) = [22.5, ..., 337.5] degrees_east
                : time(1) = [2019-01-01 00:00:00]
>>> print(bins.array)
[[0.003  0.0173]
 [0.0173 0.0316]
 [0.0316 0.0459]
 [0.0459 0.0602]
 [0.0602 0.0745]
 [0.0745 0.0888]
 [0.0888 0.1031]
 [0.1031 0.1174]
 [0.1174 0.1317]
 [0.1317 0.146 ]]
>>> h = cf.histogram(indices)
>>> print(h)
Field: number_of_observations
-----------------------------
Data            : number_of_observations(specific_humidity(10)) 1
Cell methods    : latitude: longitude: point
Dimension coords: specific_humidity(10) = [0.01015, ..., 0.13885] 1
>>> print(h.array)
[9 7 9 4 5 1 1 1 2 1]
>>> print(h.coordinate('specific_humidity').bounds.array)
[[0.003  0.0173]
 [0.0173 0.0316]
 [0.0316 0.0459]
 [0.0459 0.0602]
 [0.0602 0.0745]
 [0.0745 0.0888]
 [0.0888 0.1031]
 [0.1031 0.1174]
 [0.1174 0.1317]
 [0.1317 0.146 ]]
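As a cross-check that does not need cf, the counts in this example can be recomputed in plain Python from the printed data values. This is only a sketch: it assumes ten equal-width bins spanning the data range, with the maximum value placed in the last bin, and the helper `bin_index` is a hypothetical name introduced for illustration.

```python
# Data values copied from the print(f.array) output in the example.
values = [
    0.007, 0.034, 0.003, 0.014, 0.018, 0.037, 0.024, 0.029,
    0.023, 0.036, 0.045, 0.062, 0.046, 0.073, 0.006, 0.066,
    0.110, 0.131, 0.124, 0.146, 0.087, 0.103, 0.057, 0.011,
    0.029, 0.059, 0.039, 0.070, 0.058, 0.072, 0.009, 0.017,
    0.006, 0.036, 0.019, 0.035, 0.018, 0.037, 0.034, 0.013,
]

n_bins = 10
lo, hi = min(values), max(values)
width = (hi - lo) / n_bins

def bin_index(value):
    """Index of the equal-width bin holding `value` (assumed: max goes in the last bin)."""
    return min(int((value - lo) / width), n_bins - 1)

counts = [0] * n_bins
for value in values:
    counts[bin_index(value)] += 1

# Bin midpoints, for comparison with the specific_humidity coordinate values.
midpoints = [lo + (i + 0.5) * width for i in range(n_bins)]

print(counts)
print(round(midpoints[0], 5), round(midpoints[-1], 5))
```

Under these assumptions the counts agree with the `h.array` values shown in the example, and the first and last midpoints agree with the printed `specific_humidity` coordinate values.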
Create a two-dimensional histogram based on specific humidity and temperature bins. The temperature bins in this example are derived from a dummy temperature field construct with the same shape as the specific humidity field construct already in use:
>>> g = f.copy()
>>> g.standard_name = 'air_temperature'
>>> import numpy
>>> g[...] = numpy.random.normal(loc=290, scale=10, size=40).reshape(5, 8)
>>> g.override_units('K', inplace=True)
>>> print(g)
Field: air_temperature (ncvar%q)
--------------------------------
Data            : air_temperature(latitude(5), longitude(8)) K
Cell methods    : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
                : longitude(8) = [22.5, ..., 337.5] degrees_east
                : time(1) = [2019-01-01 00:00:00]
>>> indices_t = g.digitize(5)
>>> h = cf.histogram(indices, indices_t)
>>> print(h)
Field: number_of_observations
-----------------------------
Data            : number_of_observations(air_temperature(5), specific_humidity(10)) 1
Cell methods    : latitude: longitude: point
Dimension coords: air_temperature(5) = [281.1054839143287, ..., 313.9741786365939] K
                : specific_humidity(10) = [0.01015, ..., 0.13885] 1
>>> print(h.array)
[[2 1 5 3 2 -- -- -- -- --]
 [1 1 2 -- 1 -- 1 1 -- --]
 [4 4 2 1 1 1 -- -- 1 1]
 [1 1 -- -- 1 -- -- -- 1 --]
 [1 -- -- -- -- -- -- -- -- --]]
>>> h.sum()
<CF Data(): 40 1>
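One way to sanity-check a two-dimensional histogram is to sum it over one axis and compare the result with the corresponding one-dimensional histogram. The plain-Python sketch below does this for the counts printed above, with `None` standing in for the masked entries shown as `--`; the variable names are illustrative only.

```python
# Counts copied from the print(h.array) output of the two-dimensional
# histogram; NA stands in for the masked entries printed as "--".
NA = None
hist_2d = [
    [2, 1, 5, 3, 2, NA, NA, NA, NA, NA],
    [1, 1, 2, NA, 1, NA, 1, 1, NA, NA],
    [4, 4, 2, 1, 1, 1, NA, NA, 1, 1],
    [1, 1, NA, NA, 1, NA, NA, NA, 1, NA],
    [1, NA, NA, NA, NA, NA, NA, NA, NA, NA],
]

# Sum over the air_temperature axis (rows), treating missing entries as zero.
humidity_marginal = [
    sum(count or 0 for count in column) for column in zip(*hist_2d)
]
total = sum(humidity_marginal)

print(humidity_marginal, total)
```

The column sums match the one-dimensional specific humidity counts from the first example, and the total matches the value returned by `h.sum()`.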