cf.Data.digitize

Data.digitize(bins, upper=False, open_ends=False, return_bins=False)
Return the indices of the bins to which each value belongs.
Values (including masked values) that do not belong to any bin result in masked values in the output data.
New in version 3.0.2.
Parameters:
bins: array_like
The bin boundaries. One of:
An integer.
Create this many equally sized, contiguous bins spanning the range of the data, i.e. the smallest bin boundary is the minimum of the data and the largest bin boundary is the maximum of the data. To guarantee that each data value lies inside a bin, the most extreme open boundary is extended by multiplying it by 1.0 - epsilon or 1.0 + epsilon, whichever extends the boundary in the appropriate direction, where epsilon is the smallest positive 64-bit float such that 1.0 + epsilon != 1.0. I.e. if upper is False then the largest upper bin boundary is made slightly larger, and if upper is True then the lowest lower bin boundary is made slightly lower.
A 1d array of numbers.
When sorted into a monotonically increasing sequence, each boundary, with the exception of the two end boundaries, counts as the upper boundary of one bin and the lower boundary of the next. If the open_ends parameter is True then the lowest lower bin boundary also defines a left-open (i.e. not bounded below) bin, and the largest upper bin boundary also defines a right-open (i.e. not bounded above) bin.
A 2d array of numbers.
The second dimension, which must have size 2, contains the lower and upper bin boundaries. Different bins may share a boundary, but may not overlap. If the open_ends parameter is True then the lowest lower bin boundary also defines a left-open (i.e. not bounded below) bin, and the largest upper bin boundary also defines a right-open (i.e. not bounded above) bin.
upper: bool, optional
If True then each bin includes its upper bound but not its lower bound. By default the opposite applies, i.e. each bin includes its lower bound but not its upper bound.
open_ends: bool, optional
If True then create left-open (i.e. not bounded below) and right-open (i.e. not bounded above) bins from the lowest lower bin boundary and largest upper bin boundary respectively. By default these bins are not created.
return_bins: bool, optional
If True then also return the bins in their 2d form.
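The integer form of bins described above can be illustrated with a small plain-Python sketch. It is not the library's implementation: integer_bins_to_boundaries and boundaries_to_2d are hypothetical helpers, sys.float_info.epsilon is assumed to be the epsilon the text refers to, and a boundary of exactly zero is left unchanged because multiplication cannot move it.

```python
import sys


def integer_bins_to_boundaries(data_min, data_max, n, upper=False):
    """Hypothetical helper: boundaries of n equal, contiguous bins."""
    # Machine epsilon for 64-bit floats (assumed to match the text's epsilon)
    eps = sys.float_info.epsilon
    width = (data_max - data_min) / n
    edges = [data_min + i * width for i in range(n)] + [data_max]

    if upper:
        # Bins are (lower, upper], so the lowest boundary is open: lower it
        factor = 1.0 - eps if edges[0] > 0 else 1.0 + eps
        edges[0] *= factor
    else:
        # Bins are [lower, upper), so the largest boundary is open: raise it
        factor = 1.0 + eps if edges[-1] > 0 else 1.0 - eps
        edges[-1] *= factor
    return edges


def boundaries_to_2d(edges):
    """Hypothetical helper: pair neighbouring boundaries into [lower, upper] bins."""
    return [[lo, hi] for lo, hi in zip(edges[:-1], edges[1:])]


# Four bins spanning data whose minimum is 0 and maximum is 11
edges = integer_bins_to_boundaries(0.0, 11.0, 4)
bins_2d = boundaries_to_2d(edges)
print(bins_2d)
```

With upper=False the last boundary ends up marginally above 11, so the maximum data value falls inside the final bin instead of on its excluded upper bound.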
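Returns:
The indices of the bins to which each value belongs. If return_bins is True then the bins are also returned, in their 2d form.
Examples: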
>>> d = cf.Data(numpy.arange(12).reshape(3, 4))
>>> print(d.array)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
Equivalent ways to create indices for the four bins [-inf, 2), [2, 6), [6, 10), [10, inf)
>>> e = d.digitize([2, 6, 10], open_ends=True)
>>> e = d.digitize([[2, 6], [6, 10]], open_ends=True)
>>> print(e.array)
[[0 0 1 1]
 [1 1 2 2]
 [2 2 3 3]]
Equivalent ways to create indices for the two bins (2, 6], (6, 10]
>>> e = d.digitize([2, 6, 10], upper=True, open_ends=False)
>>> e = d.digitize([[2, 6], [6, 10]], upper=True, open_ends=False)
>>> print(e.array)
[[-- -- --  0]
 [ 0  0  0  1]
 [ 1  1  1 --]]
Create indices for the two non-contiguous bins [2, 6), [8, 10), together with the open-ended bins [-inf, 2) and [10, inf)
>>> e = d.digitize([[2, 6], [8, 10]], open_ends=True)
>>> print(e.array)
[[ 0  0  1  1]
 [ 1  1 -- --]
 [ 2  2  3  3]]
Masked values result in masked indices in the output array.
>>> d[1, 1] = cf.masked
>>> print(d.array)
[[ 0  1  2  3]
 [ 4 --  6  7]
 [ 8  9 10 11]]
>>> print(d.digitize([2, 6, 10], open_ends=True).array)
[[ 0  0  1  1]
 [ 1 --  2  2]
 [ 2  2  3  3]]
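The bin membership rules used in these examples can be checked with a plain-Python reference sketch. It only mirrors the documented semantics of upper and open_ends and makes no claim about how cf computes its result: digitize_reference is a hypothetical helper, it works on a flat list, and None stands in for a masked element.

```python
from math import inf


def digitize_reference(values, bins, upper=False, open_ends=False):
    """Hypothetical helper: bin index per value, or None if no bin applies."""
    # Accept either 1d boundaries or 2d [lower, upper] pairs
    if bins and not isinstance(bins[0], (list, tuple)):
        edges = sorted(bins)
        bins = list(zip(edges[:-1], edges[1:]))
    bins = sorted(tuple(b) for b in bins)

    if open_ends:
        # Add a left-open bin below the lowest boundary and a
        # right-open bin above the largest boundary
        bins = [(-inf, bins[0][0])] + bins + [(inf if False else bins[-1][1], inf)]

    def index_of(value):
        if value is None:  # masked input gives masked output
            return None
        for i, (lo, hi) in enumerate(bins):
            # upper=True: (lo, hi]; upper=False: [lo, hi)
            inside = lo < value <= hi if upper else lo <= value < hi
            if inside:
                return i
        return None  # value lies outside every bin

    return [index_of(v) for v in values]


values = list(range(12))  # the flattened example data 0..11

four_bins = digitize_reference(values, [2, 6, 10], open_ends=True)
two_bins = digitize_reference(values, [2, 6, 10], upper=True)
gapped = digitize_reference(values, [[2, 6], [8, 10]], open_ends=True)
print(four_bins, two_bins, gapped, sep="\n")
```

Read row by row in groups of four, the three results correspond to the example arrays above, with None where the examples show a masked element.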