cf.Data.stats
Data.stats(all=False, compute=True, minimum=True, mean=True, median=True, maximum=True, range=True, mid_range=True, standard_deviation=True, root_mean_square=True, sample_size=True, minimum_absolute_value=False, maximum_absolute_value=False, mean_absolute_value=False, mean_of_upper_decile=False, sum=False, sum_of_squares=False, variance=False, weights=None)
Calculate statistics of the data.
By default the minimum, mean, median, maximum, range, mid-range, standard deviation, root mean square, and sample size are calculated. This selection can be changed through the individual metric parameters, which also make further metrics available.
See also
minimum, mean, median, maximum, range, mid_range, standard_deviation, root_mean_square, sample_size, minimum_absolute_value, maximum_absolute_value, mean_absolute_value, mean_of_upper_decile, sum, sum_of_squares, variance
- Parameters
  - all: bool, optional
    Calculate all possible statistics, regardless of the value of individual metric parameters.
  - compute: bool, optional
    If True (the default), returned values for the statistical calculations in the output dictionary are computed, else each is given in the form of a delayed Data operation.
  - minimum: bool, optional
    Calculate the minimum of the values.
  - maximum: bool, optional
    Calculate the maximum of the values.
  - maximum_absolute_value: bool, optional
    Calculate the maximum of the absolute values.
  - minimum_absolute_value: bool, optional
    Calculate the minimum of the absolute values.
  - mid_range: bool, optional
    Calculate the average of the maximum and the minimum of the values.
  - median: bool, optional
    Calculate the median of the values.
  - range: bool, optional
    Calculate the absolute difference between the maximum and the minimum of the values.
  - sum: bool, optional
    Calculate the sum of the values.
  - sum_of_squares: bool, optional
    Calculate the sum of the squares of values.
  - sample_size: bool, optional
    Calculate the sample size, i.e. the number of non-missing values.
  - mean: bool, optional
    Calculate the weighted or unweighted mean of the values.
  - mean_absolute_value: bool, optional
    Calculate the mean of the absolute values.
  - mean_of_upper_decile: bool, optional
    Calculate the mean of the upper group of data values defined by the upper tenth of their distribution.
  - variance: bool, optional
    Calculate the weighted or unweighted variance of the values, with a given number of degrees of freedom.
  - standard_deviation: bool, optional
    Calculate the square root of the weighted or unweighted variance.
  - root_mean_square: bool, optional
    Calculate the square root of the weighted or unweighted mean of the squares of the values.
  - weights: data_like, dict, or None, optional
    Weights associated with values of the data. By default weights is None, meaning that all non-missing elements of the data have a weight of 1 and all missing elements have a weight of 0.
    If weights is a data_like object then it must be broadcastable to the array.
    If weights is a dictionary then each key specifies axes of the data (an int or tuple of int), with a corresponding value of data_like weights for those axes. The dimensions of a weights value must correspond to its key axes in the same order. Not all of the axes need weights assigned to them. The weights that will be used will be an outer product of the dictionary’s values.
    However they are specified, the weights are internally broadcast to the shape of the data, and those weights that are missing data, or that correspond to the missing elements of the data, are assigned a weight of 0.
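As a rough illustration of the weights rules described above, the following plain-Python sketch builds the outer product of per-axis weights, gives the missing element a weight of 0, and forms a weighted mean. It does not call cf. The names `axis_weights`, `full` and `weighted_mean` are hypothetical, the weight values are made up, and the normalisation `sum(w * x) / sum(w)` is the usual textbook definition, assumed here rather than taken from the library's implementation.

```python
# Illustrative only: plain-Python stand-ins, not cf-python internals.
data = [[0, 1, 2], [3, -99, 5]]
mask = [[False, False, False], [False, True, False]]  # True marks a missing element

# Hypothetical per-axis weights, as they might be supplied in a dictionary
# keyed by axis position (axis 0 has length 2, axis 1 has length 3).
axis_weights = {0: [1.0, 2.0], 1: [1.0, 1.0, 2.0]}

# The outer product of the per-axis weights gives one weight per element.
full = [[w0 * w1 for w1 in axis_weights[1]] for w0 in axis_weights[0]]

# Weights that correspond to missing elements of the data become 0.
full = [
    [0.0 if missing else w for w, missing in zip(w_row, m_row)]
    for w_row, m_row in zip(full, mask)
]

# Textbook weighted mean (an assumption about the normalisation).
total_weight = sum(w for row in full for w in row)
weighted_sum = sum(
    w * x for w_row, x_row in zip(full, data) for w, x in zip(w_row, x_row)
)
weighted_mean = weighted_sum / total_weight
```

Because the masked element has zero weight, its placeholder value of -99 contributes nothing to either the weighted sum or the total weight.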
- Returns
  - dict
    The calculated statistics, keyed by metric name.
Examples
>>> d = cf.Data([[0, 1, 2], [3, -99, 5]], mask=[[0, 0, 0], [0, 1, 0]])
>>> print(d.array)
[[0 1 2]
 [3 -- 5]]
>>> d.stats()
{'minimum': 0,
 'mean': 2.2,
 'median': 2.0,
 'maximum': 5,
 'range': 5,
 'mid_range': 2.5,
 'standard_deviation': 1.7204650534085255,
 'root_mean_square': 2.792848008753788,
 'sample_size': 5}
>>> d.stats(all=True)
{'minimum': 0,
 'mean': 2.2,
 'median': 2.0,
 'maximum': 5,
 'range': 5,
 'mid_range': 2.5,
 'standard_deviation': 1.7204650534085255,
 'root_mean_square': 2.792848008753788,
 'minimum_absolute_value': 0,
 'maximum_absolute_value': 5,
 'mean_absolute_value': 2.2,
 'mean_of_upper_decile': 5.0,
 'sum': 11,
 'sum_of_squares': 39,
 'variance': 2.9600000000000004,
 'sample_size': 5}
>>> d.stats(mean_of_upper_decile=True, range=False)
{'minimum': 0,
 'mean': 2.2,
 'median': 2.0,
 'maximum': 5,
 'mid_range': 2.5,
 'standard_deviation': 1.7204650534085255,
 'root_mean_square': 2.792848008753788,
 'mean_of_upper_decile': 5.0,
 'sample_size': 5}
To ask for delayed operations instead of computed values:
>>> d.stats(compute=False)
{'minimum': <CF Data(): 0>,
 'mean': <CF Data(): 2.2>,
 'median': <CF Data(): 2.0>,
 'maximum': <CF Data(): 5>,
 'range': <CF Data(): 5>,
 'mid_range': <CF Data(): 2.5>,
 'standard_deviation': <CF Data(): 1.7204650534085255>,
 'root_mean_square': <CF Data(): 2.792848008753788>,
 'sample_size': <CF Data(1, 1): [[5]]>}
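The default metrics reported by `d.stats()` in the examples above can be cross-checked by hand. The sketch below recomputes them in plain Python from the five non-missing values, using the definitions given in the parameter descriptions. It does not call cf, and the variable names are hypothetical. Dividing the squared deviations by the sample size (rather than by the sample size minus one) is an assumption made here because it reproduces the documented standard deviation of about 1.7205.

```python
import math
import statistics

# The five non-missing elements of d from the examples (the masked -99 is excluded).
values = [0, 1, 2, 3, 5]

n = len(values)
mean = sum(values) / n
sum_of_squares = sum(v * v for v in values)

# Unweighted variance with the squared deviations divided by n
# (an assumption chosen to match the documented output).
variance = sum((v - mean) ** 2 for v in values) / n

stats = {
    "minimum": min(values),
    "mean": mean,
    "median": statistics.median(values),
    "maximum": max(values),
    "range": abs(max(values) - min(values)),
    "mid_range": (max(values) + min(values)) / 2,
    "standard_deviation": math.sqrt(variance),
    "root_mean_square": math.sqrt(sum_of_squares / n),
    "sample_size": n,
}
```

Dividing by `n - 1` instead would give a standard deviation of about 1.92, which does not match the example output.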