cf.Data.mean

Data.mean(axes=None, weights=None, squeeze=False, mtol=1, split_every=None, inplace=False, i=False)

Calculate mean values.

Calculates the mean value or the mean values along axes.

See https://ncas-cms.github.io/cf-python/analysis.html#collapse-methods for mathematical definitions.

See also: sample_size, mean_absolute_value, sd, sum

Parameters

axes: (sequence of) int, optional

The axes to be collapsed. By default all axes are collapsed, resulting in output with size 1. Each axis is identified by its integer position. If axes is an empty sequence then the collapse is applied to each scalar element and the result has the same shape as the input data.

weights: data_like, dict, or None, optional

Weights associated with values of the data. By default weights is None, meaning that all non-missing elements of the data have a weight of 1 and all missing elements have a weight of 0.

If weights is a data_like object then it must be broadcastable to the array.

If weights is a dictionary then each key specifies axes of the data (an int or tuple of int), with a corresponding value of data_like weights for those axes. The dimensions of a weights value must correspond to its key axes in the same order. Not all of the axes need weights assigned to them. The weights that will be used will be an outer product of the dictionary’s values.
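As a rough illustration of the outer product described above, the following pure-Python sketch combines one-dimensional weight sequences, keyed by axis position, into a full weights array for 2-D data. It is not cf's implementation: `outer_weights` is a hypothetical helper, and it only handles integer keys with one 1-D sequence per axis.

```python
def outer_weights(weights, shape):
    """Build 2-D weights as the outer product of per-axis 1-D weights.

    `weights` maps an axis position (0 or 1) to a sequence of weights.
    An axis with no entry gets a weight of 1 everywhere.
    Hypothetical helper for illustration; not part of cf.
    """
    rows = list(weights.get(0, [1.0] * shape[0]))
    cols = list(weights.get(1, [1.0] * shape[1]))
    if (len(rows), len(cols)) != tuple(shape):
        raise ValueError("weights do not match the data shape")
    # Element [i][j] is the product of the axis-0 and axis-1 weights
    return [[r * c for c in cols] for r in rows]


w = outer_weights({0: [1, 2, 3, 4], 1: [1.0, 1.5, 2.0]}, shape=(4, 3))
for row in w:
    print(row)
```

Leaving out the key for an axis in this sketch is equivalent to giving every position along that axis the same weight.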

However they are specified, the weights are internally broadcast to the shape of the data, and those weights that are missing data, or that correspond to the missing elements of the data, are assigned a weight of 0.
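The effect of zero weights for missing elements can be sketched in plain Python, with `None` standing in for missing data. This is only an illustration of the arithmetic, not cf's implementation; `weighted_mean` is a hypothetical helper, and the data are the flattened values from the Examples section of this page.

```python
def weighted_mean(values, weights=None):
    """Mean of `values`, skipping missing elements marked as None.

    Missing data values and missing weights both contribute a weight of 0.
    Returns None (missing) if nothing contributes.
    Illustrative only; not cf's implementation.
    """
    if weights is None:
        weights = [1.0] * len(values)
    total = 0.0
    weight_sum = 0.0
    for v, w in zip(values, weights):
        if v is None or w is None:
            continue  # effective weight of 0
        total += w * v
        weight_sum += w
    return total / weight_sum if weight_sum else None


# 0..11 in a 4x3 array (flattened row by row), with element [1, 1] missing
data = [0, 1, 2, 3, None, 5, 6, 7, 8, 9, 10, 11]
# Weights [1, 1.5, 2] broadcast along each row
weights = [1.0, 1.5, 2.0] * 4

print(weighted_mean(data))           # unweighted: 62 / 11
print(weighted_mean(data, weights))  # weighted: 97 / 16.5
```

The missing element drops out of both the numerator and the denominator, so the unweighted mean divides by 11 contributing values rather than 12.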

squeeze: bool, optional

By default, the axes which are collapsed are left in the result as dimensions with size one, so that the result will broadcast correctly against the input array. If set to True then collapsed axes are removed from the data.
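How axes and squeeze together determine the shape of the result can be summarised with a small shape-only sketch. `collapsed_shape` is a hypothetical helper written for this page, not a cf function, and it assumes non-negative integer axis positions.

```python
def collapsed_shape(shape, axes=None, squeeze=False):
    """Shape of the result of collapsing `axes` of data with `shape`.

    axes=None collapses every axis; an empty sequence collapses none.
    Hypothetical helper describing shapes only; not part of cf.
    """
    if axes is None:
        axes = range(len(shape))
    elif isinstance(axes, int):
        axes = (axes,)
    axes = set(axes)
    if squeeze:
        # Collapsed axes are removed entirely
        return tuple(n for i, n in enumerate(shape) if i not in axes)
    # Collapsed axes are kept as size-one dimensions
    return tuple(1 if i in axes else n for i, n in enumerate(shape))


print(collapsed_shape((4, 3)))                        # (1, 1)
print(collapsed_shape((4, 3), axes=0))                # (1, 3)
print(collapsed_shape((4, 3), axes=0, squeeze=True))  # (3,)
print(collapsed_shape((4, 3), axes=()))               # (4, 3)
```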

mtol: number, optional

The sample size threshold below which collapsed values are set to missing data. It is defined as a fraction (between 0 and 1 inclusive) of the contributing input data values.

The default of mtol is 1, meaning that a missing datum in the output array occurs whenever all of its contributing input array elements are missing data.

For other values, a missing datum in the output array occurs whenever more than 100*mtol% of its contributing input array elements are missing data.

Note that for non-zero values of mtol, different collapsed elements may have different sample sizes, depending on the distribution of missing data in the input data.
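The threshold rule can be sketched for a single collapsed element as follows, using `None` for missing data. This is a minimal illustration of the rule as stated above, not cf's implementation, and `mean_with_mtol` is a hypothetical name.

```python
def mean_with_mtol(values, mtol=1):
    """Mean of the non-missing `values` (None marks missing data).

    The result is missing (None) when more than a fraction `mtol` of the
    contributing values are missing, or when there is nothing to average.
    Illustrative only; not cf's implementation.
    """
    if not 0 <= mtol <= 1:
        raise ValueError("mtol must be between 0 and 1 inclusive")
    valid = [v for v in values if v is not None]
    n_missing = len(values) - len(valid)
    if not valid or n_missing > mtol * len(values):
        return None
    return sum(valid) / len(valid)


values = [1.0, None, 3.0, None]       # half of the inputs are missing
print(mean_with_mtol(values))         # 2.0 (default mtol=1)
print(mean_with_mtol(values, 0.5))    # 2.0 (50% missing is not more than 50%)
print(mean_with_mtol(values, 0.25))   # None
print(mean_with_mtol(values, 0))      # None (any missing input masks the result)
```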

split_every: int or dict, optional

Determines the depth of the recursive aggregation. If set to a value equal to or greater than the number of input chunks, the aggregation will be performed in two steps: one partial collapse per input chunk and a single aggregation at the end. If set to less than that, intermediate aggregation steps will be used, so that each intermediate or final aggregation step operates on no more than split_every inputs. The depth of the aggregation graph will be \(\log_{\text{split\_every}}(\text{number of input chunks along reduced axes})\). Setting split_every to a low value can reduce cache size and network transfers, at the cost of more CPU and a larger dask graph.

By default, dask heuristically decides on a good value. A default can also be set globally with the split_every key in dask.config. See dask.array.reduction for details.

New in version 3.14.0.
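To make the relationship between split_every and the aggregation depth concrete, the sketch below counts aggregation levels for chunks along a single reduced axis. It is a simplified counting model under that one-axis assumption, not dask's scheduling code, and `aggregation_depth` is a hypothetical helper. The initial partial collapse of each chunk is not counted.

```python
import math


def aggregation_depth(n_chunks, split_every):
    """Count aggregation levels when each step combines at most
    `split_every` partial results into one.

    Simplified one-axis model for illustration; not dask's implementation.
    """
    if split_every < 2:
        raise ValueError("split_every must be at least 2 in this sketch")
    depth = 0
    while n_chunks > 1:
        # Each level groups the remaining partial results into batches
        n_chunks = math.ceil(n_chunks / split_every)
        depth += 1
    return depth


print(aggregation_depth(16, 16))  # 1: a single final aggregation
print(aggregation_depth(16, 4))   # 2: log base 4 of 16
print(aggregation_depth(16, 2))   # 4: log base 2 of 16
```

In this model a smaller split_every gives a deeper graph with more, smaller aggregation steps.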

inplace: bool, optional

If True then do the operation in-place and return None.

i: deprecated at version 3.0.0

Use the inplace parameter instead.

Returns

Data or None: The collapsed data, or None if the operation was in-place.

Examples

>>> import numpy as np
>>> import cf
>>> a = np.ma.arange(12).reshape(4, 3)
>>> d = cf.Data(a, 'K')
>>> d[1, 1] = cf.masked
>>> print(d.array)
[[0 1 2]
 [3 -- 5]
 [6 7 8]
 [9 10 11]]
>>> d.mean()
<CF Data(1, 1): [[5.636363636363637]] K>

>>> w = np.linspace(1, 2, 3)
>>> print(w)
[1.  1.5 2. ]
>>> d.mean(weights=w)
<CF Data(1, 1): [[5.878787878787879]] K>

cf 3.15.4
