Analysis¶
Version 3.16.1 for version 1.11 of the CF conventions.
All of the Python code in this tutorial is available in an executable
script (download
, 8kB).
Note that this page is duplicated in the tutorial.
Sample datasets¶
This tutorial uses a number of small sample datasets, all of which can
be found in the zip file cf_tutorial_files.zip
(download
,
164kB):
$ unzip -q cf_tutorial_files.zip
$ ls -1
air_temperature.nc
cf_tutorial_files.zip
contiguous.nc
external.nc
file2.nc
file.nc
gathered.nc
parent.nc
precipitation_flux.nc
timeseries.nc
umfile.pp
vertical.nc
wind_components.nc
The tutorial examples assume that the Python session is being run from the directory that contains the zip file and its unpacked contents, and no other files.
The tutorial files may be also found in the downloads directory of the on-line code repository.
Statistical collapses¶
Collapsing one or more dimensions reduces their size and replaces the
data along those axes with representative statistical values. The
result is a new field construct with consistent metadata for the
collapsed values. Collapses are carried with the collapse
method of the field construct.
By default all axes with size greater than 1 are collapsed completely (i.e. to size 1) with a given collapse method.
>>> import cf
>>> a = cf.read('timeseries.nc')[0]
>>> print(a)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(120), latitude(5), longitude(8)) K
Cell methods : area: mean
Dimension coords: time(120) = [1959-12-16 12:00:00, ..., 1969-11-16 00:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> b = a.collapse('minimum')
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(1), latitude(1), longitude(1)) K
Cell methods : area: mean time(1): latitude(1): longitude(1): minimum
Dimension coords: time(1) = [1964-11-30 12:00:00]
: latitude(1) = [0.0] degrees_north
: longitude(1) = [180.0] degrees_east
: air_pressure(1) = [850.0] hPa
>>> print(b.array)
[[[198.9]]]
In the above example, note that the operation has been recorded in a
new cell method construct (time(1): latitude(1): longitude(1):
minimum
) in the output field construct, and the dimension coordinate
constructs each now have a single cell. The air pressure time
dimension was not included in the collapse because it already had size
1 in the original field construct.
The collapse can also be applied to any subset of the field construct’s dimensions. In this case, the domain axis and coordinate constructs for the non-collapsed dimensions remain the same. This is implemented either with the axes keyword, or with a CF-netCDF cell methods-like syntax for describing both the collapse dimensions and the collapse method in a single string. The latter syntax uses construct identities instead of netCDF dimension names to identify the collapse axes.
Statistics may be created to represent variation over one dimension or a combination of dimensions.
>>> b = a.collapse('maximum', axes='T')
>>> b = a.collapse('T: maximum')
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(1), latitude(5), longitude(8)) K
Cell methods : area: mean time(1): maximum
Dimension coords: time(1) = [1964-11-30 12:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> print(b.array)
[[[310.6 309.1 309.9 311.2 310.4 310.1 310.7 309.6]
[310. 310.7 311.1 311.3 310.9 311.2 310.6 310. ]
[308.9 309.8 311.2 311.2 311.2 309.3 311.1 310.7]
[310.1 310.3 308.8 311.1 310. 311.3 311.2 309.7]
[310.9 307.9 310.3 310.4 310.8 310.9 311.3 309.3]]]
>>> b = a.collapse('maximum', axes=['X', 'Y'])
>>> b = a.collapse('X: Y: maximum')
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(120), latitude(1), longitude(1)) K
Cell methods : area: mean latitude(1): longitude(1): maximum
Dimension coords: time(120) = [1959-12-16 12:00:00, ..., 1969-11-16 00:00:00]
: latitude(1) = [0.0] degrees_north
: longitude(1) = [180.0] degrees_east
: air_pressure(1) = [850.0] hPa
Variation over horizontal area may also be specified by the special
identity 'area'
. This may be used for any horizontal coordinate
reference system.
>>> b = a.collapse('area: maximum')
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(120), latitude(1), longitude(1)) K
Cell methods : area: mean area: maximum
Dimension coords: time(120) = [1959-12-16 12:00:00, ..., 1969-11-16 00:00:00]
: latitude(1) = [0.0] degrees_north
: longitude(1) = [180.0] degrees_east
: air_pressure(1) = [850.0] hPa
Collapse methods¶
The following collapse methods are available, over any subset of the domain axes. The “Cell method” column in the table gives the method name, defined by the CF conventions, of the new cell method construct (if one is created).
Method |
Description |
Cell method |
---|---|---|
|
The sample size, \(N\), equal to the number of non-missing values. |
|
|
The maximum of the non-missing values.
\[m_x=\max\{x_0, \ldots, x_N\}\]
|
|
|
The minimum of the non-missing values.
\[m_n=\min\{x_0, \ldots, x_N\}\]
|
|
|
The maximum of the non-missing absolute values.
\[\max\{|x_0|, \ldots, |x_N|\}\]
|
|
|
The minimum of the non-missing absolute values.
\[\min\{|x_0|, \ldots, |x_N|\}\]
|
|
|
The average of the maximum and the minimum of the non-missing values.
\[\frac{m_x + m_n}{2}\]
|
|
|
The absolute difference between the maximum and the minimum of the non-missing values.
\[m_x - m_n\]
|
|
|
The median of the \(N\) non-missing values. |
|
|
The sum of the \(N\) weights \(w_i\) that correspond to non-missing values.
\[V_{1}=\sum_{i=1}^{N} w_i\]
|
|
|
The sum of the squares of the \(N\) weights \(w_i\) that correspond to non-missing values.
\[V_{2}=\sum_{i=1}^{N} w_i^{2}\]
|
|
|
The unweighted sum of the \(N\) non-missing values \(x_i\) is
\[t=\sum_{i=1}^{N} x_i\]
The weighted sum of the \(N\) non-missing values \(x_i\) with corresponding weights \(w_i\) is
\[\hat{t}=\sum_{i=1}^{N} w_i x_i\]
|
|
|
The unweighted sum of the squares of the \(N\) non-missing values \(x_i\) is
\[t_2=\sum_{i=1}^{N} x_{i}^{2}\]
The weighted sum of the squares of the \(N\) non-missing values \(x_i\) with corresponding weights \(w_i\) is
\[\hat{t}_2=\sum_{i=1}^{N}
w_i x_{i}^{2}\]
|
|
|
The weighted sum of the \(N\) non-missing values \(x_i\) with corresponding cell measures \(m_i\).
\[i=\sum_{i=1}^{N} m_i x_i\]
Note The integral differs from a weighted sum in that the units of the measures are incorporated into the result. |
|
|
The unweighted mean of the \(N\) non-missing values \(x_i\) is
\[\mu=\frac{1}{N}\sum_{i=1}^{N}
x_i\]
The weighted mean of the \(N\) non-missing values \(x_i\) with corresponding weights \(w_i\). is
\[\hat{\mu}=\frac{1}{V_{1}}
\sum_{i=1}^{N}
w_i x_i\]
|
|
|
The unweighted mean of the \(N\) non-missing absolute values \(x_i\) is
\[\mu_{abs}=\frac{1}{N}
\sum_{i=1}^{N}|x_i|\]
The weighted mean of the \(N\) non-missing absolute values \(x_i\) with corresponding weights \(w_i\) is
\[\hat{\mu}_{abs}=
\frac{1}{V_{1}}
\sum_{i=1}^{N} w_i |x_i|\]
|
|
|
The weighted or unweighted mean of the upper group of data values defined by the upper tenth of their distribution |
|
|
The unweighted variance of the \(N\) non-missing values \(x_i\) and with \(N-ddof\) degrees of freedom (\(ddof\ge0\)) is
\[s_{N-ddof}^{2}=
\frac{1}{N-ddof}
\sum_{i=1}^{N} (x_i - \mu)^2\]
The unweighted biased estimate of the variance (\(s_{N}^{2}\)) is given by \(ddof=0\) and the unweighted unbiased estimate of the variance using Bessel’s correction (\(s^{2}=s_{N-1}^{2}\)) is given by \(ddof=1\). The weighted biased estimate of the variance of the \(N\) non-missing values \(x_i\) with corresponding weights \(w_i\) is
\[\hat{s}_{N}^{2}=
\frac{1}{V_{1}}
\sum_{i=1}^{N}
w_i(x_i -
\hat{\mu})^{2}\]
The weighted unbiased estimate of the variance of the \(N\) non-missing values \(x_i\) with corresponding weights \(w_i\) is
\[\hat{s}^{2}=\frac{1}{V_{1} -
(V_{1}/V_{2})}
\sum_{i=1}^{N}
w_i(x_i -
\hat{\mu})^{2}\]
Note The weights used in the variance calculations are assumed to be non-random reliability weights, as opposed to frequency weights. |
|
|
The standard deviation is the square root of the unweighted or weighted variance. |
|
|
The unweighted root mean square of the \(N\) non-missing values \(x_i\) is
\[r=\sqrt{\frac{1}{N}
\sum_{i=1}^{N}
x_{i}^2}\]
The weighted root mean square of the \(N\) non-missing values \(x_i\) with corresponding weights \(w_i\) is
\[\hat{r}=\sqrt{
\frac{1}{V_{1}}
\sum_{i=1}^{N} w_i
x_{i}^2}\]
|
|
Data type and missing data¶
In all collapses, missing data array elements are accounted for in the calculation.
Any collapse method that involves a calculation (such as calculating a
mean), as opposed to only selecting a value (such as finding a
maximum), will return a field containing double precision floating
point numbers. If this is not desired then the data type can be reset
after the collapse with the dtype
attribute of the field
construct.
Collapse weights¶
For weights to be incorporated in the collapse, the axes to be weighted must be identified with the weights keyword. A collapse by a particular method is either never weighted, or may be weighted, or is always weighted, as described in the following table:
Method |
Description |
Weighted |
---|---|---|
|
The maximum of the values. |
Never |
|
The minimum of the values. |
Never |
|
The maximum of the absolute. |
Never |
|
The minimum of the absolute. |
Never |
|
The average of the maximum and the minimum of the values. |
Never |
|
The absolute difference between the maximum and the minimum of the values. |
Never |
|
The median of the values. |
Never |
|
The sum of the values. |
Never |
|
The sum of the squares of values. |
Never |
|
The sample size, i.e. the number of non-missing values. |
Never |
|
The sum of weights, as would be used for other calculations. |
Never |
|
The sum of squares of weights, as would be used for other calculations. |
Never |
|
The weighted or unweighted mean of the values. |
May be |
|
The mean of the absolute values. |
May be |
|
The mean of the upper group of data values defined by the upper tenth of their distribution. |
May be |
|
The weighted or unweighted variance of the values, with a given number of degrees of freedom. |
May be |
|
The weighted or unweighted standard deviation of the values with a given number of degrees of freedom. |
May be |
|
The square root of the weighted or unweighted mean of the squares of the values. |
May be |
|
The integral of values. |
Always |
Collapse methods that are “Never” weighted ignore the weights parameter, even if it is set.
Collapse methods that “May be” weighted will only be weighted if the weights parameter is set.
Collapse methods that are “Always” weighted require the weights parameter to be set.
Weights are either derived from the field construct’s metadata (such
as cell sizes), or may be provided explicitly in the form of other
field constructs containing data of weights values. In either case,
the weights actually used are those derived by the weights
method of the field construct called with the same weights keyword
value. Collapsed axes that are not identified by the weights keyword
are unweighted during the collapse operation.
>>> b = a.collapse('T: mean', weights=True)
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(1), latitude(5), longitude(8)) K
Cell methods : area: mean time(1): mean
Dimension coords: time(1) = [1964-11-30 12:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> print (b.array)
[[[254.03120723 255.89723515 253.06490556 254.17815494 255.18458801 253.3684369 253.26624692 253.63818779]
[248.92058582 253.99597591 259.67957843 257.08967972 252.57333698 252.5746236 258.90938954 253.86939502]
[255.94716671 254.77330961 254.35929373 257.91478237 251.87670408 252.72723789 257.26038872 258.19698878]
[258.08639474 254.8087873 254.9881741 250.98064604 255.3513003 256.66337257 257.86895702 259.49299206]
[263.80016425 253.35825349 257.8026006 254.3173556 252.2061867 251.74150014 255.60930742 255.06260608]]]
To inspect the weights, call the weights
method directly.
>>> w = a.weights(True)
>>> print(w)
Field: long_name=weights (ncvar%air_potential_temperature)
----------------------------------------------------------
Data : long_name=weights(time(120)) d
Dimension coords: time(120) = [1959-12-16 12:00:00, ..., 1969-11-16 00:00:00]
>>> print(w.array)
[31. 31. 29. 31. 30. 31. 30. 31. 31. 30. 31. 30. 31. 31. 28. 31. 30. 31.
30. 31. 31. 30. 31. 30. 31. 31. 28. 31. 30. 31. 30. 31. 31. 30. 31. 30.
31. 31. 28. 31. 30. 31. 30. 31. 31. 30. 31. 30. 31. 31. 29. 31. 30. 31.
30. 31. 31. 30. 31. 30. 31. 31. 28. 31. 30. 31. 30. 31. 31. 30. 31. 30.
31. 31. 28. 31. 30. 31. 30. 31. 31. 30. 31. 30. 31. 31. 28. 31. 30. 31.
30. 31. 31. 30. 31. 30. 31. 31. 29. 31. 30. 31. 30. 31. 31. 30. 31. 30.
31. 31. 28. 31. 30. 31. 30. 31. 31. 30. 31. 30.]
>>> b = a.collapse('T: Y: mean', weights='Y')
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(1), latitude(1), longitude(8)) K
Cell methods : area: mean time(1): latitude(1): mean
Dimension coords: time(1) = [1964-11-30 12:00:00]
: latitude(1) = [0.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> print (b.array)
[[[256.15819444 254.625 255.73666667 255.43041667 253.19444444 253.31277778 256.68236111 256.42055556]]]
Specifying weighting by horizontal cell area may also use the special
'area'
syntax.
>>> b = a.collapse('area: mean', weights=True)
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(120), latitude(1), longitude(1)) K
Cell methods : area: mean area: mean
Dimension coords: time(120) = [1959-12-16 12:00:00, ..., 1969-11-16 00:00:00]
: latitude(1) = [0.0] degrees_north
: longitude(1) = [180.0] degrees_east
: air_pressure(1) = [850.0] hPa
An alternative technique for specifying weights is to set the
weights keyword to the output of a call to the weights
method.
>>> b = a.collapse('area: mean', weights=a.weights('area'))
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(120), latitude(1), longitude(1)) K
Cell methods : area: mean area: mean
Dimension coords: time(120) = [1959-12-16 12:00:00, ..., 1969-11-16 00:00:00]
: latitude(1) = [0.0] degrees_north
: longitude(1) = [180.0] degrees_east
: air_pressure(1) = [850.0] hPa
See the weights
method for full details on how weights may be
specified.
Multiple collapses¶
Multiple collapses normally require multiple calls to
collapse
: one on the original field construct and then one on
each interim field construct.
>>> b = a.collapse('area: mean', weights=True).collapse('T: maximum')
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(1), latitude(1), longitude(1)) K
Cell methods : area: mean latitude(1): longitude(1): mean time(1): maximum
Dimension coords: time(1) = [1964-11-30 12:00:00]
: latitude(1) = [0.0] degrees_north
: longitude(1) = [180.0] degrees_east
: air_pressure(1) = [850.0] hPa
>>> print(b.array)
[[[271.77199724]]]
If preferred, multiple collapses may be carried out in a single call
to collapse
by using the CF-netCDF cell methods-like
syntax (note that the colon (:
) is only used after the construct
identity that specifies each axis, and a space delimits the separate
collapses).
>>> b = a.collapse('area: mean T: maximum', weights=True)
>>> print(b.array)
[[[271.77199724]]]
Grouped collapses¶
A grouped collapse is one for which an axis is not collapsed completely to size 1. Instead the collapse axis is partitioned into non-overlapping groups and each group is collapsed to size 1. The resulting axis will generally have more than one element. For example, creating 12 annual means from a timeseries of 120 months would be a grouped collapse. The groups do not need to be created from adjacent cells, as would be the case when creating 12 multi-annual monthly means from a timeseries of 120 months.
Selected statistics for overlapping groups can be calculated with the
moving_window
method of the field construct.
The group keyword of collapse
defines the size of the
groups. Groups can be defined in a variety of ways, including with
cf.Query
, cf.TimeDuration
(see the Time duration section)
and cf.Data
instances.
An element of the collapse axis can not be a member of more than one group, and may be a member of no groups. Elements that are not selected by the group keyword are excluded from the result.
>>> y = cf.Y(month=12)
>>> y
<CF TimeDuration: P1Y (Y-12-01 00:00:00)>
>>> b = a.collapse('T: maximum', group=y)
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(10), latitude(5), longitude(8)) K
Cell methods : area: mean time(10): maximum
Dimension coords: time(10) = [1960-06-01 00:00:00, ..., 1969-06-01 12:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> b = a.collapse('T: maximum', group=6)
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(20), latitude(5), longitude(8)) K
Cell methods : area: mean time(20): maximum
Dimension coords: time(20) = [1960-03-01 12:00:00, ..., 1969-08-31 12:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> b = a.collapse('T: maximum', group=cf.djf())
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(10), latitude(5), longitude(8)) K
Cell methods : area: mean time(10): maximum time(10): maximum
Dimension coords: time(10) = [1960-01-15 12:00:00, ..., 1969-01-15 00:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> c = cf.seasons()
>>> c
[<CF Query: month[(ge 12) | (le 2)]>
<CF Query: month(wi (3, 5))>,
<CF Query: month(wi (6, 8))>,
<CF Query: month(wi (9, 11))>]
>>> b = a.collapse('T: maximum', group=c)
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(40), latitude(5), longitude(8)) K
Cell methods : area: mean time(40): maximum time(40): maximum
Dimension coords: time(40) = [1960-01-15 12:00:00, ..., 1969-10-16 12:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> b = a.collapse('X: mean', group=cf.Data(180, 'degrees'))
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(120), latitude(5), longitude(2)) K
Cell methods : area: mean longitude(2): mean longitude(2): mean
Dimension coords: time(120) = [1959-12-16 12:00:00, ..., 1969-11-16 00:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(2) = [90.0, 270.0] degrees_east
: air_pressure(1) = [850.0] hPa
Groups can be further described with the group_span (to include
groups whose actual span is not equal to a given value) and the
group_contiguous (to include non-contiguous groups, or any
contiguous group containing overlapping cells) keywords of
collapse
.
Climatological statistics¶
Climatological statistics may be derived from corresponding portions of the annual cycle in a set of years (e.g. the average January temperatures in the climatology of 1961-1990, where the values are derived by averaging the 30 Januarys from the separate years); or from corresponding portions of the diurnal cycle in a set of days (e.g. the average temperatures for each hour in the day for May 1997). A diurnal climatology may also be combined with a multiannual climatology (e.g. the minimum temperature for each hour of the average day in May from a 1961-1990 climatology).
Calculation requires two or three collapses, depending on the quantity being created, all of which are grouped collapses. Each collapse method needs to indicate its climatological nature with one of the following qualifiers,
Method qualifier |
Associated keyword |
---|---|
|
within_years |
|
within_days |
|
over_years (optional) |
|
over_days (optional) |
and the associated keyword to collapse
specifies how the
method is to be applied.
>>> b = a.collapse('T: mean within years T: mean over years',
... within_years=cf.seasons(), weights=True)
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(4), latitude(5), longitude(8)) K
Cell methods : area: mean time(4): mean within years time(4): mean over years
Dimension coords: time(4) = [1960-01-15 12:00:00, ..., 1960-10-16 12:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> print(b.coordinate('T').bounds.datetime_array)
[[cftime.DatetimeGregorian(1959-12-01 00:00:00) cftime.DatetimeGregorian(1969-03-01 00:00:00)]
[cftime.DatetimeGregorian(1960-03-01 00:00:00) cftime.DatetimeGregorian(1969-06-01 00:00:00)]
[cftime.DatetimeGregorian(1960-06-01 00:00:00) cftime.DatetimeGregorian(1969-09-01 00:00:00)]
[cftime.DatetimeGregorian(1960-09-01 00:00:00) cftime.DatetimeGregorian(1969-12-01 00:00:00)]]
>>> b = a.collapse('T: minimum within years T: variance over years',
... within_years=cf.seasons(), weights=True)
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(4), latitude(5), longitude(8)) K2
Cell methods : area: mean time(4): minimum within years time(4): variance over years
Dimension coords: time(4) = [1960-01-15 12:00:00, ..., 1960-10-16 12:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> print(b.coordinate('T').bounds.datetime_array)
[[cftime.DatetimeGregorian(1959-12-01 00:00:00) cftime.DatetimeGregorian(1969-03-01 00:00:00)]
[cftime.DatetimeGregorian(1960-03-01 00:00:00) cftime.DatetimeGregorian(1969-06-01 00:00:00)]
[cftime.DatetimeGregorian(1960-06-01 00:00:00) cftime.DatetimeGregorian(1969-09-01 00:00:00)]
[cftime.DatetimeGregorian(1960-09-01 00:00:00) cftime.DatetimeGregorian(1969-12-01 00:00:00)]]
When collapsing over years, it is assumed by default that the each
portion of the annual cycle is collapsed over all years that are
present. This is the case in the above two examples. It is possible,
however, to restrict the years to be included, or group them into
chunks, with the over_years keyword to collapse
.
>>> b = a.collapse('T: mean within years T: mean over years', weights=True,
... within_years=cf.seasons(), over_years=cf.Y(5))
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(8), latitude(5), longitude(8)) K
Cell methods : area: mean time(8): mean within years time(8): mean over years
Dimension coords: time(8) = [1960-01-15 12:00:00, ..., 1965-10-16 12:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> print(b.coordinate('T').bounds.datetime_array)
[[cftime.DatetimeGregorian(1959-12-01 00:00:00) cftime.DatetimeGregorian(1964-03-01 00:00:00)]
[cftime.DatetimeGregorian(1960-03-01 00:00:00) cftime.DatetimeGregorian(1964-06-01 00:00:00)]
[cftime.DatetimeGregorian(1960-06-01 00:00:00) cftime.DatetimeGregorian(1964-09-01 00:00:00)]
[cftime.DatetimeGregorian(1960-09-01 00:00:00) cftime.DatetimeGregorian(1964-12-01 00:00:00)]
[cftime.DatetimeGregorian(1964-12-01 00:00:00) cftime.DatetimeGregorian(1969-03-01 00:00:00)]
[cftime.DatetimeGregorian(1965-03-01 00:00:00) cftime.DatetimeGregorian(1969-06-01 00:00:00)]
[cftime.DatetimeGregorian(1965-06-01 00:00:00) cftime.DatetimeGregorian(1969-09-01 00:00:00)]
[cftime.DatetimeGregorian(1965-09-01 00:00:00) cftime.DatetimeGregorian(1969-12-01 00:00:00)]]
>>> b = a.collapse('T: mean within years T: mean over years', weights=True,
... within_years=cf.seasons(), over_years=cf.year(cf.wi(1963, 1968)))
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(4), latitude(5), longitude(8)) K
Cell methods : area: mean time(4): mean within years time(4): mean over years
Dimension coords: time(4) = [1963-01-15 00:00:00, ..., 1963-10-16 12:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> print(b.coordinate('T').bounds.datetime_array)
[[cftime.DatetimeGregorian(1962-12-01 00:00:00) cftime.DatetimeGregorian(1968-03-01 00:00:00)]
[cftime.DatetimeGregorian(1963-03-01 00:00:00) cftime.DatetimeGregorian(1968-06-01 00:00:00)]
[cftime.DatetimeGregorian(1963-06-01 00:00:00) cftime.DatetimeGregorian(1968-09-01 00:00:00)]
[cftime.DatetimeGregorian(1963-09-01 00:00:00) cftime.DatetimeGregorian(1968-12-01 00:00:00)]]
Similarly for collapses over days, it is assumed by default that the
each portion of the diurnal cycle is collapsed over all days that are
present, But it is possible to restrict the days to be included, or
group them into chunks, with the over_days keyword to
collapse
.
The calculation can be done with multiple collapse calls, which can be useful if the interim stages are needed independently, but be aware that the interim field constructs will have non-CF-compliant cell method constructs.
>>> b = a.collapse('T: standard_deviation within years',
... within_years=cf.seasons(), weights=True)
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(40), latitude(5), longitude(8)) K
Cell methods : area: mean time(40): standard_deviation within years
Dimension coords: time(40) = [1960-01-15 12:00:00, ..., 1969-10-16 12:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> c = b.collapse('T: maximum over years')
>>> print(c)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(4), latitude(5), longitude(8)) K
Cell methods : area: mean time(4): standard_deviation within years time(4): maximum over years
Dimension coords: time(4) = [1960-01-15 12:00:00, ..., 1960-10-16 12:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
Other statistical operations¶
Cumulative sums¶
The cumsum
method of the field construct calculates the
cumulative sum of elements along a given axis. The cell bounds of the
axis are updated to describe the ranges over which the sums apply, and
a new sum
cell method construct is added to the resulting field
construct.
>>> a = cf.read('timeseries.nc')[0]
>>> print(a)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(120), latitude(5), longitude(8)) K
Cell methods : area: mean
Dimension coords: time(120) = [1959-12-16 12:00:00, ..., 1969-11-16 00:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> b = a.cumsum('T')
>>> print(b)
Field: air_potential_temperature (ncvar%air_potential_temperature)
------------------------------------------------------------------
Data : air_potential_temperature(time(120), latitude(5), longitude(8)) K
Cell methods : area: mean time(120): sum
Dimension coords: time(120) = [1959-12-16 12:00:00, ..., 1969-11-16 00:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: air_pressure(1) = [850.0] hPa
>>> print(a.coordinate('T').bounds[-1].dtarray)
[[cftime.DatetimeGregorian(1969-11-01 00:00:00)
cftime.DatetimeGregorian(1969-12-01 00:00:00))]]
>>> print(b.coordinate('T').bounds[-1].dtarray)
[[cftime.DatetimeGregorian(1959-11-01 00:00:00)
cftime.DatetimeGregorian(1969-12-01 00:00:00))]]
The treatment of missing values can be specified, as well as the positioning of coordinate values in the summed axis of the returned field construct.
Histograms¶
The cf.histogram
function is used to record the distribution of a
set of variables in the form of an N-dimensional histogram.
Each dimension of the histogram is defined by a field construct
returned by the digitize
method of a field construct. This
“digitized” field construct defines a sequence of bins and provides
indices to the bins that each value of one of the variables belongs.
>>> q, t = cf.read('file.nc')
Field: specific_humidity (ncvar%q)
----------------------------------
Data : specific_humidity(latitude(5), longitude(8)) 1
Cell methods : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: time(1) = [2019-01-01 00:00:00]
>>> print(q.array)
[[0.007 0.034 0.003 0.014 0.018 0.037 0.024 0.029]
[0.023 0.036 0.045 0.062 0.046 0.073 0.006 0.066]
[0.11 0.131 0.124 0.146 0.087 0.103 0.057 0.011]
[0.029 0.059 0.039 0.07 0.058 0.072 0.009 0.017]
[0.006 0.036 0.019 0.035 0.018 0.037 0.034 0.013]]
>>> indices, bins = q.digitize(10, return_bins=True)
>>> print(indices)
Field: long_name=Bin index to which each 'specific_humidity' value belongs (ncvar%q)
------------------------------------------------------------------------------------
Data : long_name=Bin index to which each 'specific_humidity' value belongs(latitude(5), longitude(8))
Cell methods : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_eastg
: time(1) = [2019-01-01 00:00:00]
>>> print(indices.array)
[[0 2 0 0 1 2 1 1]
[1 2 2 4 3 4 0 4]
[7 8 8 9 5 6 3 0]
[1 3 2 4 3 4 0 0]
[0 2 1 2 1 2 2 0]]
>>> print(bins.array)
[[0.003 0.0173]
[0.0173 0.0316]
[0.0316 0.0459]
[0.0459 0.0602]
[0.0602 0.0745]
[0.0745 0.0888]
[0.0888 0.1031]
[0.1031 0.1174]
[0.1174 0.1317]
[0.1317 0.146 ]]
>>> h = cf.histogram(indices)
>>> print(h)
Field: number_of_observations
-----------------------------
Data : number_of_observations(specific_humidity(10)) 1
Cell methods : latitude: longitude: point
Dimension coords: specific_humidity(10) = [10.15, ..., 138.85000000000002] 1
>>> print(h.array)
[9 7 9 4 5 1 1 1 2 1]
>>> print(h.coordinate('specific_humidity').bounds.array)
[[0.003 0.0173]
[0.0173 0.0316]
[0.0316 0.0459]
[0.0459 0.0602]
[0.0602 0.0745]
[0.0745 0.0888]
[0.0888 0.1031]
[0.1031 0.1174]
[0.1174 0.1317]
[0.1317 0.146 ]]
>>> g = q.copy()
>>> g.standard_name = 'air_temperature'
>>> import numpy
>>> g[...] = numpy.random.normal(loc=290, scale=10, size=40).reshape(5, 8)
>>> g.override_units('K', inplace=True)
>>> print(g)
Field: air_temperature (ncvar%q)
--------------------------------
Data : air_temperature(latitude(5), longitude(8)) K
Cell methods : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: time(1) = [2019-01-01 00:00:00]
>>> indices_t = g.digitize(5)
>>> h = cf.histogram(indices, indices_t)
>>> print(h)
Field: number_of_observations
-----------------------------
Data : number_of_observations(air_temperature(5), specific_humidity(10)) 1
Cell methods : latitude: longitude: point
Dimension coords: air_temperature(5) = [281.1054839143287, ..., 313.9741786365939] K
: specific_humidity(10) = [0.01015, ..., 0.13885] 1
>>> print(h.array)
[[2 1 5 3 2 -- -- -- -- --]
[1 1 2 -- 1 -- 1 1 -- --]
[4 4 2 1 1 1 -- -- 1 1]
[1 1 -- -- 1 -- -- -- 1 --]
[1 -- -- -- -- -- -- -- -- --]]
>>> h.sum()
<CF Data(): 40 1>
Binning operations¶
The bin
method of the field construct groups its data into
bins, where each group is defined by the elements that correspond to
an N-dimensional histogram bin of another set of variables, and collapses the elements in each group to a single
representative value. The same collapse methods and weighting options as
the collapse
method are available.
The result of the binning operation is a field construct whose domain axis and dimension coordinate constructs describe the sizes of the N-dimensional bins of the other set of variables.
>>> q, t = cf.read('file.nc')
Field: specific_humidity (ncvar%q)
----------------------------------
Data : specific_humidity(latitude(5), longitude(8)) 0.001 1
Cell methods : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: time(1) = [2019-01-01 00:00:00]
>>> print(q.array)
[[0.007 0.034 0.003 0.014 0.018 0.037 0.024 0.029]
[0.023 0.036 0.045 0.062 0.046 0.073 0.006 0.066]
[0.11 0.131 0.124 0.146 0.087 0.103 0.057 0.011]
[0.029 0.059 0.039 0.07 0.058 0.072 0.009 0.017]
[0.006 0.036 0.019 0.035 0.018 0.037 0.034 0.013]]
>>> indices = q.digitize(5)
>>> b = q.bin('range', digitized=indices)
>>> print(b)
Field: specific_humidity
------------------------
Data : specific_humidity(specific_humidity(5)) 1
Cell methods : latitude: longitude: range
Dimension coords: specific_humidity(5) = [0.0173, ..., 0.1317] 1
>>> print(b.array)
[0.026 0.025 0.025 0.007 0.022]
>>> print(b.coordinate('specific_humidity').bounds.array)
[[0.003 0.0316]
[0.0316 0.0602]
[0.0602 0.0888]
[0.0888 0.1174]
[0.1174 0.146 ]]
>>> p, t = cf.read('file2.nc')
>>> print(t)
Field: air_temperature (ncvar%t)
--------------------------------
Data : air_temperature(latitude(5), longitude(8)) degreesC
Cell methods : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: time(1) = [2019-01-01 00:00:00]
>>> print(p)
Field: air_pressure (ncvar%p)
-----------------------------
Data : air_pressure(latitude(5), longitude(8)) hPa
Cell methods : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: time(1) = [2019-01-01 00:00:00]
>>> t_indices = t.digitize(4)
>>> p_indices = p.digitize(6)
>>> b = q.bin('mean', digitized=[t_indices, p_indices], weights='area')
>>> print(b)
Field: specific_humidity
------------------------
Data : specific_humidity(air_pressure(6), air_temperature(4)) 1
Cell methods : latitude: longitude: mean
Dimension coords: air_pressure(6) = [966.6225003326126, ..., 1033.6456080043665] hPa
: air_temperature(4) = [-12.735821567738295, ..., 9.9702610462581] degreesC
>>> print(b.array)
[[ -- -- -- 0.011 ]
[0.131 0.0145 0.0345 0.05052]
[0.05742 0.01727 0.06392 0.0105 ]
[ -- 0.04516 0.05272 0.10194]
[0.124 0.024 0.059 0.006 ]
[ -- 0.08971 -- --]]
Percentiles¶
Percentiles of the data can be computed along any subset of the axes
with the percentile
method of the field construct.
>>> q, t = cf.read('file.nc')
>>> print(q)
Field: specific_humidity
------------------------
Data : specific_humidity(latitude(5), longitude(8)) 1
Cell methods : area: mean
Dimension coords: time(1) = [2019-01-01 00:00:00]
: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
>>> print(q.array)
[[0.007 0.034 0.003 0.014 0.018 0.037 0.024 0.029]
[0.023 0.036 0.045 0.062 0.046 0.073 0.006 0.066]
[0.11 0.131 0.124 0.146 0.087 0.103 0.057 0.011]
[0.029 0.059 0.039 0.07 0.058 0.072 0.009 0.017]
[0.006 0.036 0.019 0.035 0.018 0.037 0.034 0.013]]
>>> p = q.percentile([20, 40, 50, 60, 80])
>>> print(p)
Field: specific_humidity
------------------------
Data : specific_humidity(long_name=Percentile ranks for latitude, longitude dimensions(5), latitude(1), longitude(1)) 1
Dimension coords: time(1) = [2019-01-01 00:00:00]
: latitude(1) = [0.0] degrees_north
: longitude(1) = [180.0] degrees_east
: long_name=Percentile ranks for latitude, longitude dimensions(5) = [20, ..., 80]
>>> print(p.array)
[[[0.0164]]
[[0.032 ]]
[[0.036 ]]
[[0.0414]]
[[0.0704]]]
>>> p80 = q.percentile(80)
>>> print(p80)
Field: specific_humidity
------------------------
Data : specific_humidity(latitude(1), longitude(1)) 1
Dimension coords: time(1) = [2019-01-01 00:00:00]
: latitude(1) = [0.0] degrees_north
: longitude(1) = [180.0] degrees_east
: long_name=Percentile ranks for latitude, longitude dimensions(1) = [80]
>>> g = q.where(q<=p80, cf.masked)
>>> print(g.array)
[[ -- -- -- -- -- -- -- --]
[ -- -- -- -- -- 0.073 -- --]
[0.11 0.131 0.124 0.146 0.087 0.103 -- --]
[ -- -- -- -- -- 0.072 -- --]
[ -- -- -- -- -- -- -- --]]
>>> g.collapse('standard_deviation', weights=True).data
<CF Data(1, 1): [[0.024609938742357642]] 1>
>>> p45 = q.percentile(45, axes='X')
>>> print(p45.array)
[[0.0189 ]
[0.04515]
[0.10405]
[0.04185]
[0.02125]]
>>> g = q.where(q<=p45, cf.masked)
>>> print(g.array)
[[ -- 0.034 -- -- -- 0.037 0.024 0.029]
[ -- -- -- 0.062 0.046 0.073 -- 0.066]
[0.11 0.131 0.124 0.146 -- -- -- --]
[ -- 0.059 -- 0.07 0.058 0.072 -- --]
[ -- 0.036 -- 0.035 -- 0.037 0.034 --]]
>>> print(g.collapse('X: mean', weights=True).array)
[[0.031 ]
[0.06175]
[0.12775]
[0.06475]
[0.0355 ]]
>>> bins = q.percentile([0, 10, 50, 90, 100], squeeze=True)
>>> print(bins.array)
[0.003 0.0088 0.036 0.1037 0.146 ]
>>> i = q.digitize(bins, closed_ends=True)
>>> print(i.array)
[[0 1 0 1 1 2 1 1]
[1 2 2 2 2 2 0 2]
[3 3 3 3 2 2 2 1]
[1 2 2 2 2 2 1 1]
[0 2 1 1 1 2 1 1]]
Regridding¶
Regridding, also called remapping or interpolation, is the process of changing the domain of a field construct whilst preserving the qualities of the original data.
The field construct has two regridding methods: regrids
for
regridding data between domains with spherical coordinate systems; and
regridc
for regridding data between domains with Cartesian
coordinate systems. The interpolation is carried by out using the
esmpy package (formerly
called ESMF
), a Python interface to the Earth System Modeling
Framework regridding utility.
As with statistical collapses, regridding may be applied over a subset of the domain axes, and the domain axis constructs and coordinate constructs for the non-regridded dimensions remain the same.
Domain ancillary constructs whose data spans the regridding dimensions are also regridded, but field ancillary constructs whose data spans the regridding dimensions are removed from the regridded field construct.
Regridding methods¶
The following regridding methods are available (in this table, “source” and “destination” refer to the domain of the field construct being regridded, and the domain that it is being regridded to, respectively):
Method |
Description |
---|---|
Linear |
Linear interpolation in the number of dimensions being regridded. For two dimensional regridding this is bilinear interpolation, and for three dimensional regridding this is trilinear interpolation. |
First-order conservative |
First order conservative interpolation. Preserve the area integral of the data across the interpolation from source to destination. It uses the proportion of the area of the overlapping source and destination cells to determine appropriate weights. In particular, the weight of a source cell is the ratio of the area of intersection of the source and destination cells to the area of the whole destination cell. It does not account for the field gradient across the source cell, unlike the second-order conservative method. |
Second-order conservative |
Second-order conservative interpolation. As with first order (see above), preserves the area integral of the field between source and destination using a weighted sum, with weights based on the proportionate area of intersection. Unlike first-order, the second-order method incorporates further terms to take into consideration the gradient of the field across the source cell, thereby typically producing a smoother result of higher accuracy. |
Higher-order patch recovery |
Higher-order patch recovery interpolation. A second degree polynomial regridding method, which uses a least squares algorithm to calculate the polynomial. This method gives better derivatives in the resulting destination data than the linear method. |
Nearest neighbour |
Nearest neighbour interpolation for which each destination point is mapped to the closest source point, or vice versa. Useful for extrapolation of categorical data. When mapping destination to source points, a given destination point may receive input from multiple source points, but no source point will map to more than one destination point. When mapping source to destination points, a given destination receives input at most one source point. |
Spherical regridding¶
Regridding from and to spherical coordinate systems using the
regrids
method is only available for the ‘X’ and ‘Y’ axes
simultaneously. All other axes are unchanged. The calculation of the
regridding weights is based on areas and distances on the surface of
the sphere, rather in Euclidean space.
The following combinations of spherical source and destination domain
coordinate systems are available to the regrids
method:
Spherical source domain |
Spherical destination domain |
---|---|
The most convenient usage is when the destination domain exists in another field construct. In this case, all you need to specify is the field construct having the desired destination domain and the regridding method to use:
>>> a = cf.read('air_temperature.nc')[0]
>>> b = cf.read('precipitation_flux.nc')[0]
>>> print(a)
Field: air_temperature (ncvar%tas)
----------------------------------
Data : air_temperature(time(2), latitude(73), longitude(96)) K
Cell methods : time(2): mean
Dimension coords: time(2) = [1860-01-16 00:00:00, 1860-02-16 00:00:00] 360_day
: latitude(73) = [-90.0, ..., 90.0] degrees_north
: longitude(96) = [0.0, ..., 356.25] degrees_east
: height(1) = [2.0] m
>>> print(b)
Field: precipitation_flux (ncvar%tas)
-------------------------------------
Data : precipitation_flux(time(1), latitude(64), longitude(128)) kg m-2 day-1
Cell methods : time(1): mean (interval: 1.0 month)
Dimension coords: time(1) = [0450-11-16 00:00:00] noleap
: latitude(64) = [-87.86380004882812, ..., 87.86380004882812] degrees_north
: longitude(128) = [0.0, ..., 357.1875] degrees_east
: height(1) = [2.0] m
>>> c = a.regrids(b, method='conservative')
>>> print(c)
Field: air_temperature (ncvar%tas)
----------------------------------
Data : air_temperature(time(2), latitude(64), longitude(128)) K
Cell methods : time(2): mean
Dimension coords: time(2) = [1860-01-16 00:00:00, 1860-02-16 00:00:00] 360_day
: latitude(64) = [-87.86380004882812, ..., 87.86380004882812] degrees_north
: longitude(128) = [0.0, ..., 357.1875] degrees_east
: height(1) = [2.0] m
It is generally not necessary to specify which are the ‘X’ and ‘Y’
axes in the domains of both the source and destination field
constructs, since they will be automatically identified by their
metadata. However, in cases when this is not possible (such as for
tripolar domains) the src_axes or dst_axes keywords of the
regrids
method can be used.
It may be that the required destination domain does not exist in a field construct. In this case, the latitude and longitudes of the destination domain may be defined solely by dimension or auxiliary coordinate constructs.
>>> import numpy
>>> domain = cf.Domain.create_regular((0, 360, 5.0), (-90, 90, 2.5))
>>> c = a.regrids(domain, method='linear')
Field: air_temperature (ncvar%tas)
----------------------------------
Data : air_temperature(time(2), latitude(72), longitude(72)) K
Cell methods : time(2): mean
Dimension coords: time(2) = [1860-01-16 00:00:00, 1860-02-16 00:00:00] 360_day
: latitude(72) = [-88.75, ..., 88.75] degrees_north
: longitude(72) = [2.5, ..., 357.5] degrees_east
: height(1) = [2.0] m
A destination domain defined by two-dimensional (curvilinear) latitude and longitude auxiliary coordinate constructs can also be specified in a similar manner.
An axis is cyclic if cells at both of its ends are actually geographically adjacent. In spherical regridding, only the ‘X’ axis has the potential for being cyclic. For example, a longitude cell spanning 359 to 360 degrees east is proximate to the cell spanning 0 to 1 degrees east.
When a cyclic dimension can not be automatically detected, such as
when its dimension coordinate construct does not have bounds,
cyclicity may be set with the src_cyclic or dst_cyclic keywords of
the regrids
method.
To find out whether a dimension is cyclic use the iscyclic
method of the field construct, or to manually set its cyclicity use
the cyclic
method. If the destination domain has been defined by a
dictionary of dimension coordinate constructs, then cyclicity can be
registered by setting a period of cyclicity with the
period
method of the dimension coordinate
construct.
Cartesian regridding¶
Cartesian regridding with the regridc
method is very
similar to spherical regridding, except
regridding dimensions are not restricted to the horizontal plane, the
source and destination domains are assumed to be Euclidian spaces for the purpose of
calculating regridding weights, and all dimensions are assumed to be
non-cyclic by default.
Cartesian regridding can be done in up to three dimensions. It is often used for regridding along the time dimension. A plane projection coordinate system can be regridded with Cartesian regridding, which will produce similar results to using using spherical regridding.
>>> time = cf.DimensionCoordinate.create_regular(
... (0.5, 60.5, 1),
... units=cf.Units("days since 1860-01-01", calendar="360_day"),
... standard_name="time",
... )
>>> time
<CF DimensionCoordinate: time(60) days since 1860-01-01 360_day>
>>> c = a.regridc([time], axes='T', method='linear')
Field: air_temperature (ncvar%tas)
----------------------------------
Data : air_temperature(time(60), latitude(73), longitude(96)) K
Cell methods : time(60): mean
Dimension coords: time(60) = [1860-01-02 00:00:00, ..., 1860-03-01 00:00:00] 360_day
: latitude(73) = [-90.0, ..., 90.0] degrees_north
: longitude(96) = [0.0, ..., 356.25] degrees_east
: height(1) = [2.0] m
Cartesian regridding to the dimension of another field construct is also possible, similarly to spherical regridding.
Regridding masked data¶
The data mask of the source field construct is taken into account,
such that the regridded data will be masked in regions where the
source data is masked. By default the mask of the destination field
construct is not used, but can be taken into account by setting
use_dst_mask keyword to the regrids
or regridc
methods. For example, this is useful when part of the destination
domain is not being used (such as the land portion of an ocean grid).
For conservative regridding, masking is done on cells. Masking a destination cell means that the cell won’t participate in the regridding. For all other regridding methods, masking is done on points. For these methods, masking a destination point means that the point will not participate in the regridding.
Vertical regridding¶
The only option for regridding along a vertical axis is to use Cartesian regridding. However, care must be taken to ensure that the vertical axis is transformed so that it’s coordinate values vary linearly. For example, to regrid data on one set of vertical pressure coordinates to another set, the pressure coordinates may first be transformed into the logarithm of pressure, and then changed back to pressure coordinates after the regridding operation.
>>> v = cf.read('vertical.nc')[0]
>>> print(v)
Field: eastward_wind (ncvar%ua)
-------------------------------
Data : eastward_wind(time(3), air_pressure(5), grid_latitude(11), grid_longitude(10)) m s-1
Cell methods : time(3): mean
Dimension coords: time(3) = [1979-05-01 12:00:00, 1979-05-02 12:00:00, 1979-05-03 12:00:00] gregorian
: air_pressure(5) = [850.0, ..., 50.0] hPa
: grid_latitude(11) = [23.32, ..., 18.92] degrees
: grid_longitude(10) = [-20.54, ..., -16.58] degrees
Auxiliary coords: latitude(grid_latitude(11), grid_longitude(10)) = [[67.12, ..., 66.07]] degrees_north
: longitude(grid_latitude(11), grid_longitude(10)) = [[-45.98, ..., -31.73]] degrees_east
Coord references: grid_mapping_name:rotated_latitude_longitude
>>> z_p = v.construct('Z')
>>> print(z_p.array)
[850. 700. 500. 250. 50.]
>>> z_ln_p = z_p.log()
>>> z_ln_p.axis = 'Z'
>>> print(z_ln_p.array)
[6.74523635 6.55108034 6.2146081 5.52146092 3.91202301]
>>> _ = v.replace_construct('Z', new=z_ln_p)
>>> new_z_p = cf.DimensionCoordinate(data=cf.Data([800, 705, 632, 510, 320.], 'hPa'))
>>> new_z_ln_p = new_z_p.log()
>>> new_z_ln_p.axis = 'Z'
>>> new_v = v.regridc([new_z_ln_p], axes='Z', method='linear')
>>> new_v.replace_construct('Z', new=new_z_p)
>>> print(new_v)
Field: eastward_wind (ncvar%ua)
-------------------------------
Data : eastward_wind(time(3), Z(5), grid_latitude(11), grid_longitude(10)) m s-1
Cell methods : time(3): mean
Dimension coords: time(3) = [1979-05-01 12:00:00, 1979-05-02 12:00:00, 1979-05-03 12:00:00] gregorian
: Z(5) = [800.0, ..., 320.0] hPa
: grid_latitude(11) = [23.32, ..., 18.92] degrees
: grid_longitude(10) = [-20.54, ..., -16.58] degrees
Auxiliary coords: latitude(grid_latitude(11), grid_longitude(10)) = [[67.12, ..., 66.07]] degrees_north
: longitude(grid_latitude(11), grid_longitude(10)) = [[-45.98, ..., -31.73]] degrees_east
Coord references: grid_mapping_name:rotated_latitude_longitude
Note that the replace_construct
method of the field construct
is used to directly replace the vertical dimension coordinate construct,
without having to manually match up the corresponding domain axis
construct and construct key.
Mathematical operations¶
Binary arithmetical operations¶
A field construct may be arithmetically combined with another field construct, or any other object that is broadcastable to its data. See the comprehensive list of available binary operations.
When combining with another field construct, its data is actually combined, but only after being transformed so that it is broadcastable to the first field construct’s data. This is done by using the metadata constructs of the two field constructs to create a mapping of physically compatible dimensions between the fields, and then manipulating the dimensions of the other field construct’s data to ensure that they are broadcastable.
In any case, a field construct may appear as the left or right operand, and augmented assignments are possible.
Automatic units conversions are also carried out between operands during operations, and if one operand has no units then the units of the other are assumed.
>>> q, t = cf.read('file.nc')
>>> t.data.stats()
{'min': <CF Data(): 260.0 K>,
'mean': <CF Data(): 269.9244444444445 K>,
'max': <CF Data(): 280.0 K>,
'range': <CF Data(): 20.0 K>,
'mid_range': <CF Data(): 270.0 K>,
'standard_deviation': <CF Data(): 5.942452002538104 K>,
'sample_size': 90}
>>> x = t + t
>>> x
<CF Field: air_temperature(atmosphere_hybrid_height_coordinate(1), grid_latitude(10), grid_longitude(9)) K>
>>> x.min()
<CF Data(): 520.0 K>
>>> (t - 2).min()
<CF Data(): 258.0 K>
>>> (2 + t).min()
<CF Data(): 262.0 K>
>>> (t * list(range(9))).min()
<CF Data(): 0.0 K>
>>> (t + cf.Data(numpy.arange(20, 29), '0.1 K')).min()
<CF Data(): 262.6 K>
>>> u = t.copy()
>>> u.transpose(inplace=True)
>>> u.Units -= 273.15
>>> u[0]
<CF Field: air_temperature(grid_longitude(1), grid_latitude(10), atmosphere_hybrid_height_coordinate(1)) K @ 273.15>
>>> t + u[0]
<CF Field: air_temperature(atmosphere_hybrid_height_coordinate(1), grid_latitude(10), grid_longitude(9)) K>
If the physical nature of the result differs from both operands, then the “standard_name” and “long_name” properties are removed. This is the case if the units of the result differ from both operands, or if they have different standard names.
>>> t.identities()
['air_temperature',
'Conventions=CF-1.7',
'project=research',
'units=K',
'standard_name=air_temperature',
'ncvar%ta']
>>> u = t * cf.Data(10, 'm s-1')
>>> u.identities()
['Conventions=CF-1.7',
'project=research',
'units=m.s-1.K',
'ncvar%ta']
The domain metadata constructs of the result of a successful arithmetical operation between two field constructs are unambiguously well defined: The domain metadata constructs of the result of a successful operation are copied from the left hand side (LHS) operand, except when a coordinate construct in the LHS operand has size 1 and the corresponding coordinate construct in right hand side (RHS) field construct operand has size greater than 1. In this case the coordinate construct from the RHS operand is used in the result, to match up with the data broadcasting that will have occurred during the operation.
In circumstances when domain metadata constructs in the result can not be inferred unambiguously then an exception will be raised. For example, this will be the case if both operands are field constructs with corresponding coordinate constructs of size greater than 1 and with different coordinate values. In such circumstances, the field constructs’ data instances may be operated on directly, bypassing any checks on the metadata. See Operating on the field constructs’ data for more details. (This will be made easier in a future release with a new function for combining such field constructs.)
Bounds
For binary operations involving constructs that have bounds, the
result of binary operation will, by default, only have bounds if both
operands have bounds; and the bounds of the result will be the result
of the same binary operation on bounds objects. This behaviour may
modified by the cf.bounds_combination_mode
function.
>>> x = q.dimension_coordinate('X')
>>> x.dump()
Dimension coordinate: longitude
standard_name = 'longitude'
units = 'degrees_east'
Data(longitude(8)) = [22.5, ..., 337.5] degrees_east
Bounds:units = 'degrees_east'
Bounds:Data(longitude(8), 2) = [[0.0, ..., 360.0]]
>>> (x + x).dump()
Dimension coordinate: longitude
standard_name = 'longitude'
units = 'degrees_east'
Data(8) = [45.0, ..., 675.0] degrees_east
Bounds:units = 'degrees_east'
Bounds:Data(8, 2) = [[0.0, ..., 720.0]] degrees_east
>>> (x + 50).dump()
Dimension coordinate: longitude
standard_name = 'longitude'
units = 'degrees_east'
Data(8) = [72.5, ..., 387.5] degrees_east
>>> old = cf.bounds_combination_mode('OR')
>>> (x + 50).dump()
Dimension coordinate: longitude
standard_name = 'longitude'
units = 'degrees_east'
Data(8) = [72.5, ..., 387.5] degrees_east
Bounds:units = 'degrees_east'
Bounds:Data(8, 2) = [[50.0, ..., 410.0]] degrees_east
>>> x2 = x.copy()
>>> x2.del_bounds()
<CF Bounds: longitude(8, 2) degrees_east>
>>> (x2 + x).dump()
Dimension coordinate: longitude
standard_name = 'longitude'
units = 'degrees_east'
Data(8) = [45.0, ..., 675.0] degrees_east
Bounds:units = 'degrees_east'
Bounds:Data(8, 2) = [[22.5, ..., 697.5]] degrees_east
>>> cf.bounds_combination_mode(old)
'OR'
The setting of the bounds combination mode may also be set in a context manager:
>>> with cf.bounds_combination_mode('OR'):
... (x2 + x).dump()
...
Dimension coordinate: longitude
standard_name = 'longitude'
units = 'degrees_east'
Data(8) = [45.0, ..., 675.0] degrees_east
Bounds:units = 'degrees_east'
Bounds:Data(8, 2) = [[22.5, ..., 697.5]] degrees_east
Warning
Care must be taken when combining a construct with a
numpy
array or a Data
instance, due to the ways in
which both of these objects allow themselves to be
combined with other types:
If the construct is on the left hand side (LHS) of the operation then, as expected, a construct is returned whose data is the combination of the original construct’s data and the
numpy
array orData
instance on the right hand side (RHS).If, however, the construct is on the RHS then a
numpy
array orData
instance (which ever type is on the LHS) is returned, containing the same data as in the first case.
>>> import numpy
>>> t = cf.example_field(0)
>>> a = numpy.array(1000)
>>> type(t * a)
cf.field.Field
>>> type(a + t)
numpy.ndarray
>>> b = numpy.random.randn(t.size).reshape(t.shape)
>>> type(t * b)
cf.field.Field
>>> type(b * t)
numpy.ndarray
>>> type(t - cf.Data(b))
cf.field.Field
>>> type(cf.Data(b) * t)
cf.data.data.Data
Unary operations¶
Python unary operators also work on the field construct’s data, returning a new field construct with modified data values. See the comprehensive list of available unary operations.
>>> q, t = cf.read('file.nc')
>>> print(q.array)
[[0.007 0.034 0.003 0.014 0.018 0.037 0.024 0.029]
[0.023 0.036 0.045 0.062 0.046 0.073 0.006 0.066]
[0.11 0.131 0.124 0.146 0.087 0.103 0.057 0.011]
[0.029 0.059 0.039 0.07 0.058 0.072 0.009 0.017]
[0.006 0.036 0.019 0.035 0.018 0.037 0.034 0.013]]
>>> print(-q.array)
[[-0.007 -0.034 -0.003 -0.014 -0.018 -0.037 -0.024 -0.029]
[-0.023 -0.036 -0.045 -0.062 -0.046 -0.073 -0.006 -0.066]
[-0.11 -0.131 -0.124 -0.146 -0.087 -0.103 -0.057 -0.011]
[-0.029 -0.059 -0.039 -0.07 -0.058 -0.072 -0.009 -0.017]
[-0.006 -0.036 -0.019 -0.035 -0.018 -0.037 -0.034 -0.013]]
>>> print(abs(-q).array)
[[0.007 0.034 0.003 0.014 0.018 0.037 0.024 0.029]
[0.023 0.036 0.045 0.062 0.046 0.073 0.006 0.066]
[0.11 0.131 0.124 0.146 0.087 0.103 0.057 0.011]
[0.029 0.059 0.039 0.07 0.058 0.072 0.009 0.017]
[0.006 0.036 0.019 0.035 0.018 0.037 0.034 0.013]]
Relational operations¶
A field construct may compared with another field construct, or any other object that is broadcastable to its data. See the comprehensive list of available relational operations. The result is a field construct with Boolean data values.
When comparing with another field construct, its data is actually combined, but only after being transformed so that it is broadcastable to the first field construct’s data. This is done by using the metadata constructs of the two field constructs to create a mapping of physically compatible dimensions between the fields, and then manipulating the dimensions of the other field construct’s data to ensure that they are broadcastable.
In any case, a field construct may appear as the left or right operand.
Automatic units conversions are also carried out between operands during operations, and if one operand has no units then the units of the other are assumed.
>>> q, t = cf.read('file.nc')
>>> print(q.array)
[[0.007 0.034 0.003 0.014 0.018 0.037 0.024 0.029]
[0.023 0.036 0.045 0.062 0.046 0.073 0.006 0.066]
[0.11 0.131 0.124 0.146 0.087 0.103 0.057 0.011]
[0.029 0.059 0.039 0.07 0.058 0.072 0.009 0.017]
[0.006 0.036 0.019 0.035 0.018 0.037 0.034 0.013]]
>>> print((q == q).array)
[[ True True True True True True True True]
[ True True True True True True True True]
[ True True True True True True True True]
[ True True True True True True True True]
[ True True True True True True True True]]
>>> print((q < 0.05).array)
[[ True True True True True True True True]
[ True True True False True False True False]
[False False False False False False False True]
[ True False True False False False True True]
[ True True True True True True True True]]
>>> print((q >= q[0]).array)
[[ True True True True True True True True]
[ True True True True True True False True]
[ True True True True True True True False]
[ True True True True True True False False]
[False True True True True True True False]]
The “standard_name” and “long_name” properties are removed from the result, which also has no units.
>>> q.identities()
['specific_humidity',
'Conventions=CF-1.7',
'project=research',
'units=1',
'standard_name=specific_humidity',
'ncvar%q']
>>> r = q > q.mean()
>>> r.identities()
['Conventions=CF-1.7',
'project=research',
'units=',
'ncvar%q']
The domain metadata constructs of the result of a successful relational operation between two field constructs are unambiguously well defined: The domain metadata constructs of the result of a successful operation are copied from the left hand side (LHS) operand, except when a coordinate construct in the LHS operand has size 1 and the corresponding coordinate construct in right hand side (RHS) field construct operand has size greater than 1. In this case the coordinate construct from the RHS operand is used in the result, to match up with the data broadcasting that will have occurred during the operation.
In circumstances when domain metadata constructs in the result can not be inferred unambiguously then an exception will be raised. For example, this will be the case if both operands are field constructs with corresponding coordinate constructs of size greater than 1 and with different coordinate values. In such circumstances, the field constructs’ data instances may be operated on directly, bypassing any checks on the metadata. See Operating on the field constructs’ data for more details. (This will be made easier in a future release with a new function for combining such field constructs.)
Arithmetical and relational operations with insufficient metadata¶
When both operands of a binary arithmetical or relational operation are field constructs then the
creation of the mapping of physically compatible dimensions relies on
there being sufficient metadata. By default, the mapping relies on
their being “strict” identities for the metadata constructs with
multi-valued data. The strict identity is restricted standard_name
property (or id
attribute), and may be returned by the identity
method of a construct:
>>> y = q.coordinate('Y')
>>> y.identity(strict=True)
'latitude'
>>> del y.standard_name
>>> y.identity(strict=True)
''
If there is insufficient metadata to create a mapping of physically compatible dimensions, then there are various techniques that allow the operation to proceed:
Option 1: The operation may applied to the field constructs’ data instances instead. See Operating on the field constructs’ data for more details.
Option 2: If the mapping is not possible due to the absence of
standard_name
properties (orid
attributes) on metadata constructs that are known to correspond, then setting “relaxed identities” with thecf.relaxed_identities
function may help. Setting relaxed identities toTrue
allows thelong_name
property and netCDF variable name (see the netCDF interface), to also be considered when identifying constructs.Option 3: Add more metadata to the field and metadata constructs.
Operating on the field constructs’ data¶
Binary arithmetical and relational operations between may also be carried out on their data instances, thereby bypassing any reference to, or checks on, the metadata constructs. This can be useful if there insufficient metadata for determining if the two field constructs are compatible; or if the domain metadata constructs of the result can not be unambiguously defined.
In such cases the data instances may be operated on instead and the
result then inserted into one of the field constructs, either with the
set_data
method of the field construct, or with
indexed assignment. The former technique
is faster and more memory efficient, but the latter technique allows
broadcasting. Alternatively, for augmented assignments, the field
construct data may be changed in-place.
It is up to the user to ensure that the data instances are consistent in terms of size 1 dimensions (to satisfy the numpy broadcasting rules), dimension order and dimension direction, and that the resulting data is compatible with the metadata of the field construct which will contain it. Automatic units conversions are, however, still accounted for when combining the data instances.
>>> t.min()
<CF Data(): 260.0 K>
>>> u = t.copy()
>>> new_data = t.data - t.data
>>> u.set_data(new_data)
>>> u
<CF Field: air_temperature(atmosphere_hybrid_height_coordinate(1), grid_latitude(10), grid_longitude(9)) K>
>>> u.min()
<CF Data(): 0.0 K>
>>> u[...] = new_data
>>> u.min()
<CF Data(): 0.0 K>
>>> t.data -= t.data
>>> t.min()
<CF Data(): 0.0 K>
Trigonometrical and hyperbolic functions¶
The field construct and metadata constructs have methods to apply trigonometric and hyperbolic functions, and their inverses, element-wise to the data. These preserve the metadata but change the construct’s units.
The field construct and metadata constructs support the following trigonometrical methods:
Method |
Description |
---|---|
Take the inverse trigonometric cosine of the data element-wise. |
|
Take the inverse trigonometric sine of the data element-wise. |
|
Take the inverse trigonometric tangent of the data element-wise. |
|
Take the trigonometric cosine of the data element-wise. |
|
Take the trigonometric sine of the data element-wise. |
|
Take the trigonometric tangent of the data element-wise. |
The field construct and metadata constructs also support the following hyperbolic methods:
Method |
Description |
---|---|
Take the inverse hyperbolic cosine of the data element-wise. |
|
Take the inverse hyperbolic sine of the data element-wise. |
|
Take the inverse hyperbolic tangent of the data element-wise. |
|
Take the hyperbolic cosine of the data element-wise. |
|
Take the hyperbolic sine of the data element-wise. |
|
Take the hyperbolic tangent of the data element-wise. |
>>> q, t = cf.read('file.nc')
>>> lat = q.dimension_coordinate('latitude')
>>> lat.data
<CF Data(5): [-75.0, ..., 75.0] degrees_north>
>>> sin_lat = lat.sin()
>>> sin_lat.data
<CF Data(5): [-0.9659258262890683, ..., 0.9659258262890683] 1>
The “standard_name” and “long_name” properties are removed from the result.
Note that a number of the inverse methods have
mathematically restricted domains (see also
here)
and therefore may return
“invalid” values
(nan
or inf
). When applying these methods to constructs with masked
data, you may prefer to output masked values instead of invalid ones. In
this case, you can use masked_invalid
to do the conversion afterwards:
>>> d = cf.Data([2, 1.5, 1, 0.5, 0], mask=[1, 0, 0, 0, 1])
>>> e = d.arctanh()
>>> print(e.array)
[-- nan inf 0.5493061443340548 --]
>>> e.masked_invalid(inplace=True)
>>> print(e.array)
[-- -- -- 0.5493061443340548 --]
Exponential and logarithmic functions¶
The field construct and metadata constructs have exp
and
log
methods for applying exponential and logarithmic
functions respectively element-wise to the data, preserving the
metadata but changing the construct’s units where required.
>>> q
<CF Field: specific_humidity(latitude(5), longitude(8)) 1>
>>> q.log()
<CF Field: specific_humidity(latitude(5), longitude(8)) ln(re 1)>
>>> q.exp()
<CF Field: specific_humidity(latitude(5), longitude(8)) 1>
>>> t
<CF Field: air_temperature(atmosphere_hybrid_height_coordinate(1), grid_latitude(10), grid_longitude(9)) K>
>>> t.log(base=10)
<CF Field: air_temperature(atmosphere_hybrid_height_coordinate(1), grid_latitude(10), grid_longitude(9)) lg(re 1 K)>
>>> t.exp() # Raises Exception
ValueError: Can't take exponential of dimensional quantities: <Units: K>
The “standard_name” and “long_name” properties are removed from the result.
Rounding and truncation¶
The field construct and metadata constructs the following methods to round and truncate their data:
Method |
Description |
---|---|
The ceiling of the data, element-wise. |
|
Limit the values in the data. |
|
Floor the data array, element-wise. |
|
Round the data to the nearest integer, element-wise. |
|
Round the data to the given number of decimals. |
|
Truncate the data, element-wise. |
Moving windows¶
Moving window calculations along an axis may be created with the
moving_window
method of the field construct.
Moving mean, sum, and integral calculations are possible.
By default moving means are unweighted, but weights based on the axis cell sizes (or custom weights) may applied to the calculation.
>>> q, t = cf.read('file.nc')
>>> print(q)
Field: specific_humidity (ncvar%q)
----------------------------------
Data : specific_humidity(latitude(5), longitude(8)) 1
Cell methods : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: time(1) = [2019-01-01 00:00:00]
>>> print(q.array)
[[0.007 0.034 0.003 0.014 0.018 0.037 0.024 0.029]
[0.023 0.036 0.045 0.062 0.046 0.073 0.006 0.066]
[0.11 0.131 0.124 0.146 0.087 0.103 0.057 0.011]
[0.029 0.059 0.039 0.07 0.058 0.072 0.009 0.017]
[0.006 0.036 0.019 0.035 0.018 0.037 0.034 0.013]]
>>> print(q.coordinate('X').bounds.array)
[[ 0. 45.]
[ 45. 90.]
[ 90. 135.]
[135. 180.]
[180. 225.]
[225. 270.]
[270. 315.]
[315. 360.]]
>>> q.iscyclic('X')
True
>>> g = q.moving_window('mean', 3, axis='X', weights=True)
>>> print(g)
Field: specific_humidity (ncvar%q)
----------------------------------
Data : specific_humidity(latitude(5), longitude(8)) 1
Cell methods : area: mean longitude(8): mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: time(1) = [2019-01-01 00:00:00]
>>> print(g.array)
[[0.02333 0.01467 0.017 0.01167 0.023 0.02633 0.03 0.02 ]
[0.04167 0.03467 0.04767 0.051 0.06033 0.04167 0.04833 0.03167]
[0.084 0.12167 0.13367 0.119 0.112 0.08233 0.057 0.05933]
[0.035 0.04233 0.056 0.05567 0.06667 0.04633 0.03267 0.01833]
[0.01833 0.02033 0.03 0.024 0.03 0.02967 0.028 0.01767]]
>>> print(g.coordinate('X').bounds.array)
[[-45. 90.]
[ 0. 135.]
[ 45. 180.]
[ 90. 225.]
[135. 270.]
[180. 315.]
[225. 360.]
[270. 360.]]
Note
The moving_window
method can not, in general, be
emulated by the convolution_filter
method, as the
latter i) can not change the window weights as the filter
passes through the axis; and ii) does not update the cell
method constructs.
Convolution filters¶
A convolution of the
field construct data with a filter along a single domain axis can be
calculated, which also updates the bounds of a relevant dimension
coordinate construct to account for the width of the
filter. Convolution filters are carried with the
convolution_filter
method of the field construct.
>>> print(q)
Field: specific_humidity (ncvar%q)
----------------------------------
Data : specific_humidity(latitude(5), longitude(8)) 1
Cell methods : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: time(1) = [2019-01-01 00:00:00]
>>> q.iscyclic('X')
True
>>> r = q.convolution_filter([0.1, 0.15, 0.5, 0.15, 0.1], axis='X')
>>> print(r)
Field: specific_humidity (ncvar%q)
----------------------------------
Data : specific_humidity(latitude(5), longitude(8)) 1
Cell methods : area: mean
Dimension coords: latitude(5) = [-75.0, ..., 75.0] degrees_north
: longitude(8) = [22.5, ..., 337.5] degrees_east
: time(1) = [2019-01-01 00:00:00]
>>> print(q.dimension_coordinate('X').bounds.array)
[[ 0. 45.]
[ 45. 90.]
[ 90. 135.]
[135. 180.]
[180. 225.]
[225. 270.]
[270. 315.]
[315. 360.]]
>>> print(r.dimension_coordinate('X').bounds.array)
[[-90. 135.]
[-45. 180.]
[ 0. 225.]
[ 45. 270.]
[ 90. 315.]
[135. 360.]
[180. 405.]
[225. 450.]]
The convolution_filter
method of the field construct also has
options to
Specify how the input array is extended when the filter overlaps a border, and
Control the placement position of the filter window.
Note that the scipy.signal.windows
package has suite of window
functions for creating window weights for filtering:
>>> from scipy.signal import windows
>>> exponential_window = windows.exponential(3)
>>> print(exponential_window)
[0.36787944 1. 0.36787944]
>>> r = q.convolution_filter(exponential_window, axis='Y')
>>> print(r.array)
[[-- -- -- -- -- -- -- -- ]
[0.06604 0.0967 0.09172 0.12086 0.08463 0.1245 0.0358 0.08072]
[0.12913 0.16595 0.1549 0.19456 0.12526 0.15634 0.06252 0.04153]
[0.07167 0.12044 0.09161 0.13659 0.09663 0.1235 0.04248 0.02583]
[-- -- -- -- -- -- -- -- ]]
The magnitude of the integral of the filter (i.e. the sum of the
window weights defined by the window parameter) affects the
convolved values. For example, window weights of [0.2, 0.2 0.2, 0.2,
0.2]
will produce a non-weighted 5-point running mean; and window
weights of [1, 1, 1, 1, 1]
will produce a 5-point running
sum. Note that the window weights returned by functions of the
scipy.signal.windows
package do not necessarily sum to 1.
Note
The moving_window
method can not, in general, be
emulated by the convolution_filter
method, as the
latter i) can not change the window weights as the filter
passes through the axis; and ii) does not update the cell
method constructs.
General first order derivative¶
The derivative along a dimension of the field construct’s data can be
calculated as a centred finite difference with the derivative
method. If the axis is cyclic then the
derivative wraps around by default, otherwise it may be forced to wrap
around; a one-sided difference is calculated at the edges; or missing
data is inserted.
>>> r = q.derivative('X')
>>> r = q.derivative('Y', one_sided_at_boundary=True)
Gradient vector¶
The horizontal gradient vector of a scalar field may calculated with
the grad_xy
method when the field has dimension coordinates
of X and Y, in either Cartesian (e.g. plane projection) or spherical
polar coordinate systems.
The horizontal gradient vector in Cartesian coordinates is given by:
The horizontal gradient vector in spherical polar coordinates is given by:
where r is radial distance to the origin, \(\theta\) is the polar angle with respect to polar axis, and \(\phi\) is the azimuthal angle.
See cf.Field.grad_xy
for details and examples.
Laplacian¶
The horizontal Laplacian of a scalar field may be calculated with the
laplacian_xy
method when the field has dimension coordinates
of X and Y, in either Cartesian (e.g. plane projection) or spherical
polar coordinate systems.
The horizontal Laplacian in Cartesian coordinates is given by:
The horizontal Laplacian in spherical polar coordinates is given by:
where r is radial distance to the origin, \(\theta\) is the polar angle with respect to polar axis, and \(\phi\) is the azimuthal angle.
See cf.Field.laplacian_xy
for details and examples.
Divergence¶
The horizontal divergence may be calculated with the cf.div_xy
function from orthogonal vector component fields which have dimension
coordinates of X and Y, in either Cartesian (e.g. plane projection) or
spherical polar coordinate systems.
The horizontal divergence of the \((f_x, f_y)\) vector in Cartesian coordinates is given by:
The horizontal divergence of the \((f_\theta, f_\phi)\) vector in spherical polar coordinates is given by:
where r is radial distance to the origin, \(\theta\) is the polar angle with respect to polar axis, and \(\phi\) is the azimuthal angle.
See cf.div_xy
for details and examples.
Curl¶
The horizontal curl may be calculated with the cf.curl_xy
function
from orthogonal vector component fields which have dimension
coordinates of X and Y, in either Cartesian (e.g. plane projection) or
spherical polar coordinate systems.
Note that the curl of the horizontal wind field is the relative vorticity.
The horizontal curl of the \((f_x, f_y)\) vector in Cartesian coordinates is given by:
The horizontal curl of the \((f_\theta, f_\phi)\) vector in spherical polar coordinates is given by:
where r is radial distance to the origin, \(\theta\) is the polar angle with respect to polar axis, and \(\phi\) is the azimuthal angle.
See cf.curl_xy
for details and examples.