cf.Data

class cf.Data(array=None, units=None, calendar=None, fill_value=None, hardmask=True, chunk=True, loadd=None, loads=None, dt=False, source=None, copy=True, dtype=None, mask=None, _use_array=True)

Bases: cfdm.data.data.Data

An N-dimensional data array with units and masked values.

  • Contains an N-dimensional, indexable and broadcastable array with many similarities to a numpy array.

  • Contains the units of the array elements.

  • Supports masked arrays, regardless of whether or not it was initialised with a masked array.

  • Stores and operates on data arrays which are larger than the available memory.

Indexing

A data array is indexable in a similar way to a numpy array:

>>> d.shape
(12, 19, 73, 96)
>>> d[...].shape
(12, 19, 73, 96)
>>> d[slice(0, 9), 10:0:-2, :, :].shape
(9, 5, 73, 96)

There are three extensions to the numpy indexing functionality:

  • Size 1 dimensions are never removed by indexing.

    An integer index i takes the i-th element but does not reduce the rank of the output array by one:

    >>> d.shape
    (12, 19, 73, 96)
    >>> d[0, ...].shape
    (1, 19, 73, 96)
    >>> d[:, 3, slice(10, 0, -2), 95].shape
    (12, 1, 5, 1)
    

    Size 1 dimensions may be removed with the squeeze method.

  • The indices for each axis work independently.

    When more than one dimension’s slice is a 1-d boolean sequence or 1-d sequence of integers, then these indices work independently along each dimension (similar to the way vector subscripts work in Fortran), rather than by their elements:

    >>> d.shape
    (12, 19, 73, 96)
    >>> d[0, :, [0, 1], [0, 13, 27]].shape
    (1, 19, 2, 3)
    
  • Boolean indices may be any object which exposes the numpy array interface.

    >>> d.shape
    (12, 19, 73, 96)
    >>> d[..., d[0, 0, 0]>d[0, 0, 0].min()]
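
The first two extensions can be approximated in plain numpy, which may help when predicting the shape of a subspace. The following is only a sketch of equivalent numpy operations, not the cf implementation; the helper name orthogonal_keepdims and the array a are hypothetical.

```python
import numpy as np


def orthogonal_keepdims(a, *indices):
    """Index `a` so that each axis is indexed independently and
    integer indices keep their axis with size 1.

    Hypothetical helper: a numpy approximation of the behaviour
    described above, not part of cf.
    """
    if len(indices) != a.ndim:
        raise ValueError("expected one index per axis")
    per_axis = []
    for index, size in zip(indices, a.shape):
        if isinstance(index, slice):
            # Expand a slice to the explicit positions it selects
            positions = np.arange(size)[index]
        else:
            positions = np.atleast_1d(np.asarray(index))
            if positions.dtype == bool:
                # A 1-d boolean index selects the True positions
                positions = np.flatnonzero(positions)
        per_axis.append(positions)
    # np.ix_ builds an open mesh so that the indices act per axis
    return a[np.ix_(*per_axis)]


a = np.zeros((12, 19, 73, 96))

# Integer indices keep their axes; list indices act independently
print(orthogonal_keepdims(a, 0, slice(None), [0, 1], [0, 13, 27]).shape)

# Plain numpy pairs the two lists element by element instead, so
# lists of different lengths cannot be combined this way
try:
    a[0, :, [0, 1], [0, 13, 27]]
except IndexError as error:
    print("numpy:", error)
```

The helper reproduces the shapes shown in the examples above, whereas the unmodified numpy expression fails because the two index lists cannot be broadcast against each other.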
    

Cyclic axes

Miscellaneous

A Data object is picklable.

A Data object is hashable, but note that, since it is mutable, its hash value is only valid whilst the data array is not changed in place.
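
The caveat about hashing is the usual one for any mutable object whose hash depends on its contents. The sketch below uses a hypothetical Box class, not cf.Data, to show the hash changing after an in-place modification.

```python
class Box:
    """Hypothetical mutable container whose hash depends on its values."""

    def __init__(self, values):
        self.values = list(values)

    def __hash__(self):
        return hash(tuple(self.values))

    def __eq__(self, other):
        return isinstance(other, Box) and self.values == other.values


box = Box([1, 2, 3])
hash_before = hash(box)

# Changing the contents in place changes the hash of the same object
box.values[0] = 99
hash_after = hash(box)

# Dictionaries and sets record the hash at insertion time, so a key
# that is modified after insertion may no longer be found
print(hash_before == hash_after)
```

In practice this means a hash should be taken, and relied upon, only while the data are not being modified in place.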

Initialization

Parameters
array: optional

The array of values. May be any scalar or array-like object, including another Data instance. Ignored if the source parameter is set.

Parameter example:

array=[34.6]

Parameter example:

array=[[1, 2], [3, 4]]

Parameter example:

array=numpy.ma.arange(10).reshape(2, 1, 5)

units: str or Units, optional

The physical units of the data. If a Units object is provided then this can also set the calendar. Ignored if the source parameter is set.

The units (without the calendar) may also be set after initialisation with the set_units method.

Parameter example:

units='km hr-1'

Parameter example:

units='days since 2018-12-01'

calendar: str, optional

The calendar for reference time units. Ignored if the source parameter is set.

The calendar may also be set after initialisation with the set_calendar method.

Parameter example:

calendar='360_day'

fill_value: optional

The fill value of the data. By default, or if set to None, the numpy fill value appropriate to the array’s data-type will be used (see numpy.ma.default_fill_value). Ignored if the source parameter is set.

The fill value may also be set after initialisation with the set_fill_value method.

Parameter example:

fill_value=-999.

dtype: data-type, optional

The desired data-type for the data. By default the data-type will be inferred from the array parameter.

The data-type may also be set after initialisation with the dtype attribute.

Parameter example:

dtype=float

Parameter example:

dtype='float32'

Parameter example:

dtype=numpy.dtype('i2')

mask: optional

Apply this mask to the data given by the array parameter. By default, or if mask is None, no mask is applied. May be any scalar or array-like object (such as a numpy array or Data instance) that is broadcastable to the shape of array. Masking will be carried out where the mask elements evaluate to True.

This mask will be applied in addition to any mask already defined by the array parameter.

source: optional

Initialize the array, units, calendar and fill value from those of source.

hardmask: bool, optional

If False then the mask is soft. By default the mask is hard.

dt: bool, optional

If True then strings (such as '1990-12-01 12:00') given by the array parameter are re-interpreted as date-time objects. By default they are not.

loadd: dict, optional

Initialise the data from a dictionary serialization of a cf.Data object. All other arguments are ignored. See the dumpd and loadd methods.

loads: str, optional

Initialise the data array from a string serialization of a Data object. All other arguments are ignored. See the dumps and loads methods.

copy: bool, optional

If False then do not deep copy input parameters prior to initialization. By default arguments are deep copied.

chunk: bool, optional

If False then the data array will be stored in a single partition. By default the data array will be partitioned if it is larger than the chunk size, as returned by the cf.CHUNKSIZE function.
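
The mask, hardmask and fill_value parameters have close analogues in numpy.ma, which can be used to preview their effect. The sketch below is written against numpy.ma only; that cf.Data matches it in every detail is an assumption.

```python
import numpy as np

# An input that already carries a mask on its first element
array = np.ma.array([1.0, 2.0, 3.0, 4.0], mask=[True, False, False, False])

# An extra mask is combined with the existing one, not substituted for it
extra_mask = np.array([False, False, True, False])
combined = np.ma.array(array, mask=extra_mask, keep_mask=True, copy=True)
print(combined.mask.tolist())

# With no explicit fill value, numpy picks a default from the data-type
print(np.ma.default_fill_value(combined.dtype))

# Hard mask: assigning to a masked element leaves it masked
hard = combined.copy()
hard.harden_mask()
hard[0] = 10.0
print(hard.mask.tolist())

# Soft mask: assigning to a masked element unmasks it
soft = combined.copy()
soft.soften_mask()
soft[0] = 10.0
print(soft.mask.tolist())
```

A hard mask therefore protects missing values from being overwritten by assignment, while a soft mask allows them to be filled in.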

Examples:

>>> d = cf.Data(5)
>>> d = cf.Data([1,2,3], units='K')
>>> import numpy
>>> d = cf.Data(numpy.arange(10).reshape(2,5), units=cf.Units('m/s'), fill_value=-999)
>>> d = cf.Data(tuple('fly'))

Data attributes

array

A numpy array copy of the data array.

binary_mask

A binary (0 and 1) mask of the data array.

data

The data as an object identity.

day

The day of each data array element.

datetime_array

An independent numpy array of date-time objects.

dtype

The numpy data-type of the data array.

fill_value

The data array missing data value.

hardmask

Whether the mask is hard (True) or soft (False).

hour

The hour of each data array element.

ismasked

True if the data array has any masked values.

isscalar

True if the data array is a 0-d scalar array.

mask

The boolean missing data mask of the data array.

minute

The minute of each data array element.

month

The month of each data array element.

nbytes

Total number of bytes consumed by the elements of the array.

ndim

Number of dimensions in the data array.

second

The second of each data array element.

shape

Tuple of the data array’s dimension sizes.

size

Number of elements in the data array.

Units

The cf.Units object containing the units of the data array.

varray

A numpy array view of the data array.

year

The year of each data array element.

Data methods

add_partitions

Add partition boundaries.

all

Test whether all data array elements evaluate to True.

allclose

Returns True if two broadcastable arrays have equal values, False otherwise.

any

Test whether any data array elements evaluate to True.

arcsinh

Take the inverse hyperbolic sine of the data element-wise.

arctan

Take the trigonometric inverse tangent of the data element-wise.

argmax

Return the indices of the maximum values along an axis.

asdata

Convert the input to a Data object.

ceil

The ceiling of the data, element-wise.

change_calendar

Change the calendar of the data array elements.

chunk

Partition the data array.

clip

Clip (limit) the values in the data array in place.

close

Close all files referenced by the data array.

concatenate

Join a sequence of data arrays together.

concatenate_data

Concatenates a list of Data objects into a single Data object along the specified axis (see cf.Data.concatenate for details).

copy

Return a deep copy.

cos

Take the trigonometric cosine of the data array in place.

cosh

Take the hyperbolic cosine of the data array in place.

count

Count the non-masked elements of the data.

count_masked

Count the masked elements of the data.

creation_commands

Return the commands that would create the data object.

cumsum

Return the data cumulatively summed along the given axis.

cyclic

TODO

datum

Return an element of the data array as a standard Python scalar.

del_calendar

Delete the calendar.

del_fill_value

Delete the fill value.

del_units

Delete the units.

dump

Return a string containing a full description of the instance.

digitize

Return the indices of the bins to which each value belongs.

dumpd

Return a serialization of the data array.

dumps

Return a JSON string serialization of the data array.

empty

Create a new data array without initializing the elements.

equals

True if two data arrays are logically equal, False otherwise.

exp

Take the exponential of the data array.

expand_dims

Expand the shape of the data array in place.

files

Return the names of files containing parts of the data array.

filled

TODO

first_element

Return the first element of the data as a scalar.

fits_in_memory

Return True if the master array is small enough to be retained in memory.

fits_in_one_chunk_in_memory

Return True if the master array is small enough to be retained in memory.

flat

Return a flat iterator over elements of the data array.

flatten

Flatten axes of the data.

flip

Reverse the direction of axes of the data array.

floor

Return the floor of the data array.

full

Return a new data array of given shape and type, filled with the given value.

func

Apply an element-wise array operation to the data array.

get_calendar

Return the calendar.

get_compressed_axes

Return the dimensions that have been compressed in the underlying array.

get_compressed_dimension

Return the position of the compressed dimension in the compressed array.

get_compression_type

Return the type of compression applied to the underlying array.

get_count

Return the count variable for a compressed array.

get_data

TODO

get_fill_value

Return the missing data value.

get_index

Return the index variable for a compressed array.

get_list

Return the list variable for a compressed array.

get_units

Return the units.

has_calendar

Whether a calendar has been set.

has_fill_value

Whether a fill value has been set.

has_units

Whether units have been set.

insert_dimension

Expand the shape of the data array in place.

inspect

Inspect the object for debugging.

integral

TODO

isclose

Return where data are element-wise equal to other, broadcastable data.

last_element

Return the last element of the data as a scalar.

loadd

Reset the data array in place from a data array serialization.

loads

TODO

log

TODO

mask_fpe

Masking of floating-point errors in the results of arithmetic operations.

mask_invalid

Mask the array where invalid values occur (NaN or inf).

max

Alias for maximum

maximum

Collapse axes with their maximum.

maximum_absolute_value

Collapse axes with their maximum absolute value.

mean

Collapse axes with their mean.

mean_absolute_value

Collapse axes with their mean absolute value.

mean_of_upper_decile

TODO

median

TODO

mid_range

Collapse axes with the unweighted average of their maximum and minimum values.

min

Alias for minimum

minimum

Collapse axes with their minimum.

minimum_absolute_value

Collapse axes with their minimum absolute value.

nc_clear_hdf5_chunksizes

TODO

nc_hdf5_chunksizes

TODO

nc_set_hdf5_chunksizes

TODO

ndindex

Return an iterator over the N-dimensional indices of the data array.

ones

TODO

outerproduct

Compute the outer product with another data array.

override_calendar

Override the calendar of the data array elements.

override_units

Override the data array units.

partition_boundaries

Return the partition boundaries for each partition matrix dimension.

partition_configuration

Return parameters for opening and closing array partitions.

percentile

Compute percentiles of the data along the specified axes.

range

Collapse axes with the absolute difference between their maximum and minimum values.

reconstruct_sectioned_data

Expects a dictionary of Data objects with ordering information as keys, as output by the section method when called with a Data object.

rint

Round the data to the nearest integer, element-wise.

roll

Roll the data array along an axis, much like numpy.roll.

root_mean_square

Collapse axes with their root mean square.

round

Evenly round elements of the data array to the given number of decimals.

sample_size

TODO

save_to_disk

sd

Alias for standard_deviation

standard_deviation

Collapse axes by calculating their standard deviation.

second_element

Return the second element of the data as a scalar.

section

Return a dictionary of Data objects, which are the m dimensional sections of this n dimensional Data object, where m <= n.

set_calendar

Set the calendar.

set_fill_value

Set the missing data value.

set_units

Set the units.

seterr

Set how floating-point errors in the results of arithmetic operations are handled.

sin

Take the trigonometric sine of the data array in place.

sinh

Take the hyperbolic sine of the data array in place.

source

Return the underlying array object.

squeeze

Remove size 1 axes from the data array.

stats

Calculate statistics of the data.

sum

Collapse axes with their sum.

sum_of_squares

Collapse axes with the sum of the squares of the values.

sum_of_weights

TODO

sum_of_weights2

TODO

swapaxes

Interchange two axes of an array.

tan

Take the trigonometric tangent of the data array element-wise.

tanh

Take the hyperbolic tangent of the data array.

to_disk

Store the data array on disk.

to_memory

Store each partition’s data in memory in place if the master array is smaller than the chunk size.

tolist

Return the array as a (possibly nested) list.

transpose

Permute the axes of the data array.

trunc

Return the truncated values of the data array.

uncompress

Uncompress the underlying data.

unique

The unique elements of the array.

var

Alias of variance

variance

Collapse axes with their weighted variance.

where

Assign to data elements depending on a condition.

zeros

TODO
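
Several of the collapse methods listed above have short arithmetic definitions. As a rough guide, they can be written in numpy as follows; this sketch covers only the unweighted, unmasked case, and the array a and the variable names are illustrative rather than part of cf.

```python
import numpy as np

a = np.array([[1.0, -4.0, 2.0],
              [3.0, 0.0, -6.0]])
axis = 1  # collapse along the second axis

maximum = a.max(axis=axis)
minimum = a.min(axis=axis)

# range: absolute difference between the maximum and the minimum
value_range = maximum - minimum

# mid_range: unweighted average of the maximum and the minimum
mid_range = (maximum + minimum) / 2

# maximum_absolute_value: maximum of the absolute values
max_abs = np.abs(a).max(axis=axis)

# sum_of_squares: sum of the squares of the values
sum_sq = (a ** 2).sum(axis=axis)

print(value_range, mid_range, max_abs, sum_sq)
```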

Data static methods

mask_fpe

Masking of floating-point errors in the results of arithmetic operations.

seterr

Set how floating-point errors in the results of arithmetic operations are handled.

Data arithmetic and comparison operations

Arithmetic, bitwise and comparison operations are defined as element-wise data array operations which yield a new cf.Data object or, for augmented assignments, modify the data in-place.
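
The difference between operators that yield a new object and augmented assignments that modify in place follows Python's standard operator protocol. As an illustration only, the hypothetical Values class below (not cf.Data) implements `__add__`, `__radd__` and `__iadd__` for element-wise addition.

```python
class Values:
    """Hypothetical element-wise container used only for illustration."""

    def __init__(self, items):
        self.items = list(items)

    def _operand(self, other):
        # Repeat a scalar so that it pairs with every element
        if isinstance(other, Values):
            if len(other.items) != len(self.items):
                raise ValueError("operands must have the same length")
            return other.items
        return [other] * len(self.items)

    def __add__(self, other):
        # x + y: build and return a new object
        return Values(a + b for a, b in zip(self.items, self._operand(other)))

    def __radd__(self, other):
        # y + x: tried when y does not know how to add a Values
        return Values(b + a for a, b in zip(self.items, self._operand(other)))

    def __iadd__(self, other):
        # x += y: update the existing object and return it
        self.items = [a + b for a, b in zip(self.items, self._operand(other))]
        return self


x = Values([1, 2, 3])
alias = x

y = x + 10  # new object; x is untouched
z = 10 + x  # int.__add__ returns NotImplemented, so __radd__ runs
x += 1      # in place; alias refers to the same, updated object

print(y.items, z.items, alias.items)
```

Because the augmented assignment returns the same object, any other reference to it observes the change, which is the practical meaning of "in-place" here.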

Comparison operators

__lt__

The rich comparison operator <

__le__

The rich comparison operator <=

__eq__

The rich comparison operator ==

__ne__

The rich comparison operator !=

__gt__

The rich comparison operator >

__ge__

The rich comparison operator >=

Truth value of an array

__bool__

Truth value testing and the built-in operation bool

Binary arithmetic operators

__add__

The binary arithmetic operation +

__sub__

The binary arithmetic operation -

__mul__

The binary arithmetic operation *

__div__

The binary arithmetic operation /

__truediv__

The binary arithmetic operation / (true division)

__floordiv__

The binary arithmetic operation //

__pow__

The binary arithmetic operations ** and pow

__mod__

The binary arithmetic operation %

Binary arithmetic operators with reflected (swapped) operands

__radd__

The binary arithmetic operation + with reflected operands

__rsub__

The binary arithmetic operation - with reflected operands

__rmul__

The binary arithmetic operation * with reflected operands

__rdiv__

The binary arithmetic operation / with reflected operands

__rtruediv__

The binary arithmetic operation / (true division) with reflected operands

__rfloordiv__

The binary arithmetic operation // with reflected operands

__rpow__

The binary arithmetic operations ** and pow with reflected operands

__rmod__

The binary arithmetic operation % with reflected operands

Augmented arithmetic assignments

__iadd__

The augmented arithmetic assignment +=

__isub__

The augmented arithmetic assignment -=

__imul__

The augmented arithmetic assignment *=

__idiv__

The augmented arithmetic assignment /=

__itruediv__

The augmented arithmetic assignment /= (true division)

__ifloordiv__

The augmented arithmetic assignment //=

__ipow__

The augmented arithmetic assignment **=

__imod__

The augmented arithmetic assignment %=

Unary arithmetic operators

__neg__

The unary arithmetic operation -

__pos__

The unary arithmetic operation +

__abs__

The unary arithmetic operation abs

Binary bitwise operators

__and__

The binary bitwise operation &

__or__

The binary bitwise operation |

__xor__

The binary bitwise operation ^

__lshift__

The binary bitwise operation <<

__rshift__

The binary bitwise operation >>

Binary bitwise operators with reflected (swapped) operands

__rand__

The binary bitwise operation & with reflected operands

__ror__

The binary bitwise operation | with reflected operands

__rxor__

The binary bitwise operation ^ with reflected operands

__rlshift__

The binary bitwise operation << with reflected operands

__rrshift__

The binary bitwise operation >> with reflected operands

Augmented bitwise assignments

__iand__

The augmented bitwise assignment &=

__ior__

The augmented bitwise assignment |=

__ixor__

The augmented bitwise assignment ^=

__ilshift__

The augmented bitwise assignment <<=

__irshift__

The augmented bitwise assignment >>=

Unary bitwise operators

__invert__

The unary bitwise operation ~

Special

__array__

The numpy array interface.

__contains__

Membership test operator in

__data__

Returns a new reference to self.

__deepcopy__

Called by the copy.deepcopy function.

__getitem__

Return a subspace of the data defined by indices.

__hash__

The built-in function hash

__iter__

Called when an iterator is required.

__len__

The built-in function len

__query_set__

TODO

__query_wi__

TODO

__query_wo__

TODO

__repr__

Called by the repr built-in function.

__setitem__

Implement indexed assignment.

__str__

Called by the str built-in function.
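
The `__array__` entry above is what allows an object to be passed wherever numpy expects an array, which is also what the indexing section means by a boolean index that "exposes the numpy array interface". A minimal sketch, using a hypothetical Threshold class rather than cf.Data:

```python
import numpy as np


class Threshold:
    """Hypothetical object exposing the numpy array interface."""

    def __init__(self, values, limit):
        self.values = list(values)
        self.limit = limit

    def __array__(self, dtype=None, copy=None):
        # numpy calls this to obtain an array representation
        result = np.array([v > self.limit for v in self.values])
        return result if dtype is None else result.astype(dtype)


condition = Threshold([5, 1, 7, 3], limit=2)

# np.asarray triggers __array__ and returns a boolean array
mask = np.asarray(condition)

data = np.array([10.0, 20.0, 30.0, 40.0])
print(data[mask])
```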