cf.Data.sum_of_squares
Data.sum_of_squares(axes=None, weights=None, squeeze=False, mtol=1, split_every=None, inplace=False)
Calculate sums of squares.
Calculates the sum of the squares of all values, or the sums of the squares of the values along the specified axes.
See https://ncas-cms.github.io/cf-python/analysis.html#collapse-methods for mathematical definitions.
- ..seealso:: sample_size, sum, sum_of_squares
- Parameters
- axes: (sequence of) int, optional
The axes to be collapsed. By default all axes are collapsed, resulting in output with size 1. Each axis is identified by its integer position. If axes is an empty sequence then the collapse is applied to each scalar element and the result has the same shape as the input data.
- weights: data_like, dict, or None, optional
Weights associated with values of the data. By default weights is None, meaning that all non-missing elements of the data have a weight of 1 and all missing elements have a weight of 0.
If weights is a data_like object then it must be broadcastable to the array.
If weights is a dictionary then each key specifies axes of the data (an int or tuple of int), with a corresponding value of data_like weights for those axes. The dimensions of a weights value must correspond to its key axes in the same order. Not all of the axes need weights assigned to them. The weights that will be used will be an outer product of the dictionary’s values.
However they are specified, the weights are internally broadcast to the shape of the data, and those weights that are missing data, or that correspond to the missing elements of the data, are assigned a weight of 0.
- squeeze: bool, optional
By default, the axes which are collapsed are left in the result as dimensions with size one, so that the result will broadcast correctly against the input array. If set to True then collapsed axes are removed from the data.
- mtol: number, optional
The sample size threshold below which collapsed values are set to missing data. It is defined as a fraction (between 0 and 1 inclusive) of the contributing input data values.
The default of mtol is 1, meaning that a missing datum in the output array occurs whenever all of its contributing input array elements are missing data.
For other values, a missing datum in the output array occurs whenever more than 100*mtol% of its contributing input array elements are missing data.
Note that for non-zero values of mtol, different collapsed elements may have different sample sizes, depending on the distribution of missing data in the input data.
- split_every: int or dict, optional
Determines the depth of the recursive aggregation. If set to the number of input chunks or more, the aggregation will be performed in two steps: one partial collapse per input chunk and a single aggregation at the end. If set to less than that, an intermediate aggregation step will be used, so that any of the intermediate or final aggregation steps operates on no more than split_every inputs. The depth of the aggregation graph will be \(\log_{\text{split\_every}}(\text{input chunks along reduced axes})\). Setting to a low value can reduce cache size and network transfers, at the cost of more CPU and a larger dask graph.
By default, dask heuristically decides on a good value. A default can also be set globally with the split_every key in dask.config. See dask.array.reduction for details.
New in version 3.14.0.
- inplace: bool, optional
If True then do the operation in-place and return None.
- Returns
Data or None
The collapsed data, or None if the operation was in-place.
Examples
>>> import numpy as np
>>> import cf
>>> a = np.ma.arange(12).reshape(4, 3)
>>> d = cf.Data(a, 'K')
>>> d[1, 1] = cf.masked
>>> print(d.array)
[[0 1 2]
 [3 -- 5]
 [6 7 8]
 [9 10 11]]
>>> d.sum_of_squares()
<CF Data(1, 1): [[490]] K2>
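The unweighted result above, together with the axes, squeeze and mtol rules described under Parameters, can be cross-checked with plain NumPy masked arrays. The helper `sum_of_squares_np` below is a hypothetical name introduced only for illustration; it is a minimal sketch of the documented behaviour, not the cf implementation, and it ignores units, weights and dask chunking.

```python
import numpy as np


def sum_of_squares_np(a, axis=None, squeeze=False, mtol=1.0):
    """Hypothetical NumPy-only sketch; not part of the cf API."""
    a = np.ma.asarray(a)

    # Masked (missing) elements are skipped by the masked-array sum.
    # Collapsed axes are kept with size one, as in the default squeeze=False.
    result = (a ** 2).sum(axis=axis, keepdims=True)

    # Count the missing inputs that contribute to each output element.
    mask = np.ma.getmaskarray(a)
    n_missing = mask.sum(axis=axis, keepdims=True)
    n_total = mask.size // n_missing.size

    # An output element is missing when more than 100*mtol% of its inputs
    # are missing, or (the mtol=1 case) when all of them are missing.
    too_sparse = (n_missing > mtol * n_total) | (n_missing == n_total)
    result = np.ma.masked_where(too_sparse, result)

    if squeeze:
        # Remove the collapsed size-one axes.
        result = result.squeeze(axis=axis)
    return result


a = np.ma.arange(12).reshape(4, 3)
a[1, 1] = np.ma.masked

total = sum_of_squares_np(a)                            # shape (1, 1), value 490
per_column = sum_of_squares_np(a, axis=0, squeeze=True)  # values 126, 150, 214
strict = sum_of_squares_np(a, axis=0, mtol=0)            # middle column is masked
```

With mtol=0 any missing input masks the corresponding output, so only the column containing the masked element is lost; with the default mtol=1 that column is still summed over its three remaining values.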
>>> w = np.linspace(1, 2, 3)
>>> print(w)
[1.  1.5 2. ]
>>> d.sum_of_squares(weights=w)
<CF Data(1, 1): [[779.0]] K2>
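The weighted result can be reproduced in the same way. The sketch below assumes the weighted sum of squares is the sum of each weight times the squared value, which is consistent with the output shown above, and it illustrates the outer-product rule for dictionary weights using NumPy only. The per-row weights `w0` are made-up illustrative values, and the equivalent cf call (a dictionary such as `{0: w0, 1: w1}`) is not executed here.

```python
import numpy as np

a = np.ma.arange(12).reshape(4, 3)
a[1, 1] = np.ma.masked

# Array weights: w has shape (3,), so it broadcasts along the last axis.
w = np.linspace(1, 2, 3)
weighted = (a ** 2 * w).sum()  # 779.0, matching the cf example

# Dictionary-style weights: one 1-d weights array per axis, combined as an
# outer product so that the result has the shape of the data.
w0 = np.array([1.0, 2.0, 1.0, 2.0])  # illustrative weights for axis 0
w1 = w                               # weights for axis 1
outer = np.multiply.outer(w0, w1)    # shape (4, 3)
weighted_2d = (a ** 2 * outer).sum()

print(weighted, outer.shape, weighted_2d)
```

The masked element contributes nothing to either sum, which corresponds to missing data being assigned a weight of 0.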