cf.Data.percentile

Data.percentile(ranks, axes=None, interpolation='linear', squeeze=False, mtol=1, inplace=False, _preserve_partitions=False)

Compute percentiles of the data along the specified axes.
The default is to compute the percentiles along a flattened version of the data.
If the input data are integers, or floats smaller than float64, or contain missing values, then the output datatype is float64. Otherwise, the output datatype is the same as that of the input.
If multiple percentile ranks are given then a new, leading data dimension is created so that percentiles can be stored for each percentile rank.
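The shape rule above can be illustrated with a NumPy analogue. This is only a sketch: it uses `numpy.percentile` with `keepdims=True` to mimic the default (`squeeze=False`) behaviour described here, and it is an assumption that `cf.Data.percentile` lays out its result the same way in every case.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
ranks = [20, 40, 50, 60, 80]

# All axes flattened, reduced axes dropped (comparable to squeeze=True)
flat = np.percentile(a, ranks)

# All axes flattened, reduced axes kept with size 1 (comparable to squeeze=False)
kept = np.percentile(a, ranks, keepdims=True)

# Percentiles along the second axis only, reduced axis kept with size 1
per_row = np.percentile(a, ranks, axis=1, keepdims=True)

# The leading dimension has one entry per percentile rank
print(flat.shape, kept.shape, per_row.shape)
print(flat.dtype)
```

In each case the new leading dimension has size 5, one entry per rank, and the integer input yields a float64 result.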
New in version 3.0.4.
Parameters:
ranks: (sequence of) number
    Percentile rank, or sequence of percentile ranks, to compute, which must be between 0 and 100 inclusive.
axes: (sequence of) int, optional
    Select the axes. The axes argument may be one, or a sequence, of integers that select the axis corresponding to the given position in the list of axes of the data array.
    By default, or if axes is None, all axes are selected.
interpolation: str, optional
    Specify the interpolation method to use when the desired percentile lies between two data values i < j:

    interpolation  Description
    -------------  -----------------------------------------------
    'linear'       i + (j - i) * fraction, where fraction is the
                   fractional part of the index surrounded by i
                   and j
    'lower'        i
    'higher'       j
    'nearest'      i or j, whichever is nearest
    'midpoint'     (i + j) / 2

    By default 'linear' interpolation is used.
squeeze: bool, optional
    If True then all axes over which percentiles are calculated are removed from the returned data. By default, axes over which percentiles have been calculated are left in the result as axes with size 1, meaning that the result is guaranteed to broadcast correctly against the original data.
mtol: number, optional
    Set the fraction of input data elements which is allowed to contain missing data when contributing to an individual output data element. Where this fraction exceeds mtol, missing data is returned. The default is 1, meaning that a missing datum in the output array occurs when its contributing input array elements are all missing data. A value of 0 means that a missing datum in the output array occurs whenever any of its contributing input array elements are missing data. Any intermediate value is permitted.
    Parameter example:
        To ensure that an output array element is a missing datum if more than 25% of its input array elements are missing data: mtol=0.25.
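The mtol rule can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `percentile_with_mtol` is a hypothetical helper, `None` stands in for a missing datum, the input is assumed to be non-empty, and only 'linear' interpolation is shown.

```python
def percentile_with_mtol(values, rank, mtol=1.0):
    """Percentile of `values` (None marks missing data), or None if masked.

    Hypothetical helper: assumes a non-empty sequence and 0 <= rank <= 100.
    """
    present = sorted(v for v in values if v is not None)
    missing_fraction = (len(values) - len(present)) / len(values)

    # The output is missing when the missing fraction exceeds mtol,
    # or when there are no valid inputs left to compute from.
    if missing_fraction > mtol or not present:
        return None

    # Linear interpolation between the two surrounding data values
    index = rank / 100 * (len(present) - 1)
    lo = int(index)
    hi = min(lo + 1, len(present) - 1)
    return present[lo] + (present[hi] - present[lo]) * (index - lo)


one_missing = [0, None, 2, 3]     # 25% missing
two_missing = [None, None, 2, 3]  # 50% missing

print(percentile_with_mtol(one_missing, 50, mtol=0.25))  # 25% does not exceed 0.25
print(percentile_with_mtol(one_missing, 50, mtol=0))     # any missing input masks the output
print(percentile_with_mtol(two_missing, 50, mtol=0.25))  # 50% exceeds 0.25
print(percentile_with_mtol(two_missing, 50))             # default mtol=1 tolerates this
```

With the default mtol=1 the result is masked only when every contributing input is missing, which is why the helper also checks for an empty `present` list.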
inplace: bool, optional
    If True then do the operation in-place and return None.
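The interpolation methods listed for the interpolation parameter can be sketched in plain Python for a one-dimensional sequence. This is an illustration under stated assumptions, not the library's code: `percentile_1d` is a hypothetical helper, the input is assumed to be non-empty with no missing values, and the handling of exact ties for 'nearest' is an arbitrary choice made here.

```python
import math


def percentile_1d(values, rank, interpolation="linear"):
    """Percentile of a non-empty sequence for a rank between 0 and 100."""
    data = sorted(values)

    # Fractional index of the requested rank within the sorted data
    index = rank / 100 * (len(data) - 1)
    lo = math.floor(index)
    hi = math.ceil(index)
    i, j = data[lo], data[hi]   # the two surrounding data values, i <= j
    fraction = index - lo       # fractional part of the index

    if interpolation == "linear":
        return i + (j - i) * fraction
    if interpolation == "lower":
        return i
    if interpolation == "higher":
        return j
    if interpolation == "nearest":
        # Tie-breaking at fraction == 0.5 is an assumption of this sketch
        return i if fraction < 0.5 else j
    if interpolation == "midpoint":
        return (i + j) / 2
    raise ValueError(f"Unknown interpolation method: {interpolation!r}")


data = range(12)
for method in ("linear", "lower", "higher", "nearest", "midpoint"):
    print(method, percentile_1d(data, 20, method))
```

For the values 0 to 11, the 20th percentile falls at index 2.2, between the data values 2 and 3, so the five methods give visibly different results.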
Returns:
Examples:
>>> import cf
>>> import numpy
>>> d = cf.Data(numpy.arange(12).reshape(3, 4), 'm')
>>> print(d.array)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
>>> p = d.percentile([20, 40, 50, 60, 80])
>>> p
<CF Data(5, 1, 1): [[[2.2, ..., 8.8]]] m>

>>> p = d.percentile([20, 40, 50, 60, 80], squeeze=True)
>>> print(p.array)
[2.2 4.4 5.5 6.6 8.8]
Find the standard deviation of the values above the 80th percentile:
>>> p80 = d.percentile(80)
>>> p80
<CF Data(1, 1): [[8.8]] m>
>>> e = d.where(d<=p80, cf.masked)
>>> print(e.array)
[[-- -- -- --]
 [-- -- -- --]
 [-- 9 10 11]]
>>> e.sd()
<CF Data(1, 1): [[0.816496580927726]] m>
Find the mean of the values above the 45th percentile along the second axis:
>>> p45 = d.percentile(45, axes=1)
>>> print(p45.array)
[[1.35]
 [5.35]
 [9.35]]
>>> e = d.where(d<=p45, cf.masked)
>>> print(e.array)
[[-- -- 2 3]
 [-- -- 6 7]
 [-- -- 10 11]]
>>> f = e.mean(axes=1)
>>> f
<CF Data(3, 1): [[2.5, ..., 10.5]] m>
>>> print(f.array)
[[ 2.5]
 [ 6.5]
 [10.5]]
Find the histogram bin boundaries associated with given percentiles, and digitize the data based on these bins:
>>> bins = d.percentile([0, 10, 50, 90, 100], squeeze=True)
>>> print(bins.array)
[ 0.   1.1  5.5  9.9 11. ]
>>> e = d.digitize(bins, closed_ends=True)
>>> print(e.array)
[[0 0 1 1]
 [1 1 2 2]
 [2 2 3 3]]