Performance


Version 1.8.7.0 for version 1.8 of the CF conventions.

Memory

When a dataset is read using cfdm.read, lazy loading is employed for all data arrays, which means that no data are read into memory until they are required for inspection or to modify the array contents. This maximises the number of field constructs that may be read within a session, and makes the read operation fast. If a subspace of data still in the file is requested, then only that subspace is read into memory. These behaviours are inherited from the netCDF4 Python package.
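The idea behind lazy loading can be sketched with a minimal, purely illustrative example (this is not cfdm's or netCDF4's implementation): the array object only records where its data live on disk, and reads just the requested subspace when indexed.

```python
import os
import struct
import tempfile

class LazyArray:
    """Toy file-backed array: values are read only on demand."""

    def __init__(self, path, size):
        self.path = path  # location of the data on disk
        self.size = size  # number of 8-byte floats stored

    def __getitem__(self, index):
        # Only the requested subspace is read into memory
        start, stop, _ = index.indices(self.size)
        with open(self.path, 'rb') as f:
            f.seek(start * 8)
            raw = f.read((stop - start) * 8)
        return list(struct.unpack(f'{stop - start}d', raw))

# Write ten doubles to a scratch file, then read back a subspace
path = os.path.join(tempfile.mkdtemp(), 'data.bin')
with open(path, 'wb') as f:
    f.write(struct.pack('10d', *range(10)))

a = LazyArray(path, 10)  # no data read yet
print(a[2:5])            # reads only elements 2-4: [2.0, 3.0, 4.0]
```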

When an instance is copied with its copy method, all data are copied with a copy-on-write technique. This means that a copy takes up very little memory, even when the original data comprises a very large array in memory, and the copy operation is fast.
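Copy-on-write can be illustrated with a small sketch (again hypothetical, not cfdm's implementation): the copy shares the original's underlying data, and a private duplicate is made only when one of the two is first modified.

```python
class CowData:
    """Toy copy-on-write container: copies share the underlying list
    until one of them is modified."""

    def __init__(self, values):
        self._values = values

    def copy(self):
        # Share the data rather than duplicating it, so the copy is
        # fast and takes up very little memory
        new = CowData.__new__(CowData)
        new._values = self._values
        return new

    def __setitem__(self, i, v):
        # Duplicate the data only on the first write
        self._values = list(self._values)
        self._values[i] = v

d = CowData([1, 2, 3])
e = d.copy()
assert e._values is d._values      # copy shares memory with the original
e[0] = 99
assert e._values is not d._values  # modified copy now owns its own data
assert d._values == [1, 2, 3]      # original is unchanged
```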


In-place operations

Some methods that create a new instance have an option to perform the operation in-place instead, rather than creating a new independent object. The in-place operation can be considerably faster. These methods take the inplace keyword parameter; examples include the squeeze, transpose, insert_dimension, compress, and uncompress methods of a field construct.
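The inplace pattern these methods follow can be sketched with a hypothetical class (the Field class and its transpose body below are stand-ins, not cfdm's code): when inplace=True the object modifies itself and returns None, otherwise a new independent object is created and returned.

```python
class Field:
    """Toy illustration of the `inplace` keyword pattern."""

    def __init__(self, data):
        self.data = data

    def transpose(self, inplace=False):
        # Copy the data only when a new independent object is wanted
        f = self if inplace else Field(list(self.data))
        f.data = f.data[::-1]  # stand-in for the real transpose
        return None if inplace else f

f = Field([1, 2, 3])
g = f.transpose()                    # new independent object
assert f.transpose(inplace=True) is None  # modifies f itself
```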

For example, in one particular test using a dataset from the tutorial, transposing the data dimensions of the field construct was ~11 times faster when done in-place, compared with creating a new independent field construct:

Calculate the speed-up of performing the "transpose" operation in-place. The data are brought into memory prior to the tests, so that the time taken to read the dataset from disk is excluded from the results.
>>> import timeit
>>> import cfdm
>>> q, t = cfdm.read('file.nc')
>>> t.data.to_memory()
>>> min(timeit.repeat('t.transpose()',
...                   globals=globals(), number=1000))
2.1561493259978306
>>> min(timeit.repeat('t.transpose(inplace=True)',
...                   globals=globals(), number=1000))
0.18915946600100142