cf.Data.var

Data.var(axes=None, weights=None, squeeze=False, mtol=1, ddof=0, split_every=None, inplace=False, i=False)

Calculate variances.

Calculates the variance of the whole array, or the variances along the specified axes.

See https://ncas-cms.github.io/cf-python/analysis.html#collapse-methods for mathematical definitions.

See also: sample_size, mean, sd, sum

Parameters:

axes: (sequence of) int, optional

The axes to be collapsed. By default all axes are collapsed, resulting in output with size 1. Each axis is identified by its positive or negative integer position. If axes is an empty sequence then the collapse is applied to each scalar element and the result has the same shape as the input data.

weights: data_like, dict, or None, optional

Weights associated with values of the data. By default weights is None, meaning that all non-missing elements of the data have a weight of 1 and all missing elements have a weight of 0.

If weights is a data_like object then it must be broadcastable to the array.

If weights is a dictionary then each key specifies axes of the data (an int or tuple of int), with a corresponding value of data_like weights for those axes. The dimensions of a weights value must correspond to its key axes in the same order. Not all of the axes need weights assigned to them. The weights used are the outer product of the dictionary’s values.

However they are specified, the weights are internally broadcast to the shape of the data, and those weights that are missing data, or that correspond to the missing elements of the data, are assigned a weight of 0.
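The dictionary form of weights can be pictured with a small pure-Python sketch. It is an illustration of the outer-product and zero-weight rules described above, not cf's implementation; the data, the weight values, and the use of None for a missing element are all assumptions made for the example.

```python
# Hypothetical 4x3 data; None stands in for a missing (masked) element.
data = [
    [0, 1, 2],
    [3, None, 5],
    [6, 7, 8],
    [9, 10, 11],
]

# Weights keyed by axis position, mirroring the dictionary form of `weights`.
weights = {0: [1.0, 2.0, 3.0, 4.0], 1: [1.0, 1.5, 2.0]}

# The outer product of the per-axis weights gives one weight per element.
full_weights = [[wr * wc for wc in weights[1]] for wr in weights[0]]

# Elements that are missing in the data are assigned a weight of 0.
full_weights = [
    [0.0 if x is None else w for x, w in zip(data_row, w_row)]
    for data_row, w_row in zip(data, full_weights)
]

for row in full_weights:
    print(row)
```

The resulting weights have the same shape as the data, and the entry for the missing element is 0.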

squeeze: bool, optional

By default, the axes which are collapsed are left in the result as dimensions with size one, so that the result will broadcast correctly against the input array. If set to True then collapsed axes are removed from the data.
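The effect of axes and squeeze on the shape of the result can be summarised with a short sketch. The helper collapsed_shape is hypothetical (it is not part of cf) and only encodes the shape rules stated above.

```python
def collapsed_shape(shape, axes=None, squeeze=False):
    """Return the shape a collapse would produce (illustrative helper only)."""
    ndim = len(shape)
    if axes is None:
        # By default all axes are collapsed.
        axes = range(ndim)
    elif isinstance(axes, int):
        axes = [axes]
    # Negative positions count from the end, as with ordinary indexing.
    collapsed = {ax % ndim for ax in axes}
    if squeeze:
        # Collapsed axes are removed entirely.
        return tuple(n for i, n in enumerate(shape) if i not in collapsed)
    # Collapsed axes are kept as size-one dimensions.
    return tuple(1 if i in collapsed else n for i, n in enumerate(shape))


print(collapsed_shape((4, 3)))                         # all axes collapsed
print(collapsed_shape((4, 3), axes=0))                 # first axis only
print(collapsed_shape((4, 3), axes=-1, squeeze=True))  # last axis, squeezed
print(collapsed_shape((4, 3), axes=[]))                # empty sequence
```

Keeping the size-one dimensions is what lets the result broadcast against the original array, for example when subtracting a per-column statistic from every row.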

mtol: number, optional

The sample size threshold below which collapsed values are set to missing data. It is defined as a fraction (between 0 and 1 inclusive) of the contributing input data values. A missing value in the output array occurs whenever more than 100*mtol% of its contributing input array elements are missing data.

The default of mtol is 1, meaning that a missing value in the output array occurs whenever all of its contributing input array elements are missing data.

Note that for non-zero values of mtol, different collapsed elements may have different sample sizes, depending on the distribution of missing data in the input data.
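The mtol rule can be expressed as a small predicate. This is a sketch of the rule as worded above, not cf's internal code; the function name is_output_missing is hypothetical and None stands in for missing data.

```python
def is_output_missing(values, mtol=1):
    """Apply the mtol rule to the inputs of one collapsed element (sketch)."""
    n_total = len(values)
    n_missing = sum(v is None for v in values)
    if n_missing == n_total:
        # With no valid inputs there is nothing to compute, whatever mtol is.
        return True
    # Missing when more than 100*mtol% of the contributing inputs are missing.
    return n_missing > mtol * n_total


row = [3, None, 5]  # one of three contributing values is missing

print(is_output_missing(row, mtol=1))     # tolerate any number of missing values
print(is_output_missing(row, mtol=0.5))   # tolerate up to half missing
print(is_output_missing(row, mtol=0))     # tolerate no missing values
```

With mtol=0 a single missing input is enough to make the output missing, while the default of 1 only yields a missing output when every input is missing.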

ddof: number, optional

The delta degrees of freedom, a non-negative number. The number of degrees of freedom used in the calculation is N-ddof where N is the number of non-missing elements. A value of 1 applies Bessel’s correction. If the calculation is weighted then ddof can only be 0 or 1.

By default ddof is 0.
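As a worked illustration of N-ddof, the unweighted case can be computed by hand in plain Python. The values below are the twelve integers used in the Examples section with one element treated as missing (None is an assumption standing in for masked data); the variance helper is hypothetical.

```python
# The integers 0..11 with the element at flat position 4 missing.
values = [0, 1, 2, 3, None, 5, 6, 7, 8, 9, 10, 11]

valid = [v for v in values if v is not None]
n = len(valid)  # N counts only the non-missing elements
mean = sum(valid) / n
sum_sq_dev = sum((v - mean) ** 2 for v in valid)


def variance(ddof=0):
    # Divide by the number of degrees of freedom, N - ddof.
    return sum_sq_dev / (n - ddof)


print(variance(ddof=0))  # population variance, approximately 12.7769
print(variance(ddof=1))  # Bessel-corrected variance, approximately 14.0545
```

These agree with the d.var() and d.var(ddof=1) results shown in the Examples section.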

split_every: int or dict, optional

Determines the depth of the Dask recursive aggregation. If set to a value greater than or equal to the number of input Dask chunks, the aggregation is performed in two steps: one partial collapse per input chunk, then a single aggregation at the end. If set to a smaller value, intermediate aggregation steps are used, so that no intermediate or final aggregation step operates on more than split_every inputs. The depth of the aggregation graph is then the logarithm to the base split_every of N, the number of input chunks along the reduced axes. A low value can reduce cache size and network transfers, at the cost of more CPU and a larger Dask graph. See dask.array.reduction for details.

By default, dask heuristically decides on a good value. A default can also be set globally with the split_every key in dask.config.

Added in version 3.14.0.
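The relationship between split_every and the depth of the aggregation can be seen by counting combine rounds. The sketch below is a counting exercise only and does not call Dask; aggregation_depth is a hypothetical helper, and it assumes each round merges groups of at most split_every partial results.

```python
def aggregation_depth(n_chunks, split_every):
    """Count the combine rounds needed to reduce n_chunks partials to one."""
    if split_every < 2:
        raise ValueError("split_every must be at least 2")
    depth = 0
    remaining = n_chunks
    while remaining > 1:
        # Each round merges groups of up to split_every partial results.
        remaining = -(-remaining // split_every)  # ceiling division
        depth += 1
    return depth


for split_every in (2, 4, 16):
    print(split_every, aggregation_depth(16, split_every))
```

For 16 chunks the count falls from 4 rounds at split_every=2 to a single final aggregation at split_every=16, which is the trade-off between graph size and per-step input count described above.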

inplace: bool, optional

If True then do the operation in-place and return None.

i: deprecated at version 3.0.0

Use the inplace parameter instead.

Returns:

Data or None: The collapsed data, or None if the operation was in-place.

Examples

>>> import numpy as np
>>> import cf
>>> a = np.ma.arange(12).reshape(4, 3)
>>> d = cf.Data(a, 'K')
>>> d[1, 1] = cf.masked
>>> print(d.array)
[[0 1 2]
 [3 -- 5]
 [6 7 8]
 [9 10 11]]
>>> d.var()
<CF Data(1, 1): [[12.776859504132233]] K2>
>>> d.var(ddof=1)
<CF Data(1, 1): [[14.054545454545456]] K2>

>>> w = np.linspace(1, 2, 3)
>>> print(w)
[1.  1.5 2. ]
>>> d.var(ddof=1, weights=w)
<CF Data(1, 1): [[14.030549898167004]] K2>
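The weighted ddof=1 result above can be reproduced by hand. The sketch below assumes the "reliability weights" form of Bessel's correction, in which the biased weighted variance is scaled by V1**2 / (V1**2 - V2), where V1 is the sum of the weights and V2 is the sum of their squares. That formula is an assumption here, chosen because it reproduces the printed value; the collapse-methods page linked above gives the authoritative definitions. None stands in for the masked element.

```python
# Same 4x3 data as in the example, with None for the masked element.
data = [
    [0, 1, 2],
    [3, None, 5],
    [6, 7, 8],
    [9, 10, 11],
]
w = [1.0, 1.5, 2.0]  # one weight per column, broadcast along the rows

# Pair every non-missing value with its broadcast weight.
pairs = [(x, wj) for row in data for x, wj in zip(row, w) if x is not None]

v1 = sum(wj for _, wj in pairs)       # sum of weights
v2 = sum(wj ** 2 for _, wj in pairs)  # sum of squared weights

mean = sum(wj * x for x, wj in pairs) / v1
biased = sum(wj * (x - mean) ** 2 for x, wj in pairs) / v1

# Assumed correction for reliability weights (see the lead-in).
unbiased = biased * v1 ** 2 / (v1 ** 2 - v2)

print(unbiased)  # approximately 14.0305
```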

cf 3.20.1
