Extensions
Version 1.11.1.0 for version 1.11 of the CF conventions.
The cfdm package has been designed to be subclassed, so that the creation of a new implementation of the CF data model, based on cfdm, is straightforward. For example:
import cfdm

class my_Field(cfdm.Field):
    def info(self):
        return 'I am a {!r} instance'.format(
            self.__class__.__name__)
The interpretation of CF-netCDF files that is encoded within the
cfdm.read and cfdm.write functions is also inheritable, so that an
extended data model implementation need not recreate the complicated
mapping of CF data model constructs to, and from, CF-netCDF
elements. This is made possible by the bridge design pattern, which
decouples the implementation of the CF data model from the CF-netCDF
encoding so that the two can vary independently.
>>> my_implementation = cfdm.implementation()
>>> my_implementation.set_class('Field', my_Field)
>>> import functools
>>> my_read = functools.partial(cfdm.read,
...                             _implementation=my_implementation)
>>> my_write = functools.partial(cfdm.write,
...                              _implementation=my_implementation)
>>> q, t = my_read('file.nc')
>>> print(type(q))
<class '__main__.my_Field'>
>>> print(q.info())
I am a 'my_Field' instance
>>> print(repr(q))
<my_Field: specific_humidity(latitude(5), longitude(8)) 1>
>>> print(q.data.array)
[[0.007 0.034 0.003 0.014 0.018 0.037 0.024 0.029]
 [0.023 0.036 0.045 0.062 0.046 0.073 0.006 0.066]
 [0.11  0.131 0.124 0.146 0.087 0.103 0.057 0.011]
 [0.029 0.059 0.039 0.07  0.058 0.072 0.009 0.017]
 [0.006 0.036 0.019 0.035 0.018 0.037 0.034 0.013]]
>>> my_write([q, t], 'new_file.nc')
Note that, so far, we have only replaced the field construct class in the new implementation, and not any of the metadata constructs or other component classes:
>>> print(type(q))
<class '__main__.my_Field'>
>>> print(type(q.construct('latitude')))
<class 'cfdm.dimensioncoordinate.DimensionCoordinate'>
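The decoupling described above can be illustrated without cfdm itself. The following is a minimal sketch of the bridge idea only, not cfdm's code: every class and function in it (Field, Coordinate, Implementation, read, MyField) is a hypothetical stand-in. The toy reader never names a concrete class; it asks the implementation object which class to instantiate, so replacing one class leaves the others, and the reader, untouched.

```python
class Field:
    """Stand-in for a data model class (hypothetical, not cfdm.Field)."""

    def __init__(self, name):
        self.name = name


class Coordinate:
    """Stand-in for a metadata construct class (hypothetical)."""

    def __init__(self, name):
        self.name = name


class Implementation:
    """Holds the classes that the toy reader should instantiate."""

    def __init__(self, **classes):
        self._classes = dict(classes)

    def set_class(self, name, cls):
        self._classes[name] = cls

    def get_class(self, name):
        return self._classes[name]


def read(variables, implementation):
    """Toy decoder: builds constructs only via the implementation."""
    field_cls = implementation.get_class("Field")
    coord_cls = implementation.get_class("Coordinate")
    field = field_cls(variables["data"])
    coords = [coord_cls(name) for name in variables["coordinates"]]
    return field, coords


class MyField(Field):
    def info(self):
        return "I am a {!r} instance".format(self.__class__.__name__)


impl = Implementation(Field=Field, Coordinate=Coordinate)
impl.set_class("Field", MyField)  # replace only the field class

field, coords = read(
    {"data": "specific_humidity", "coordinates": ["latitude", "longitude"]},
    impl,
)
print(type(field).__name__)      # MyField
print(type(coords[0]).__name__)  # Coordinate
```

As in the cfdm session above, only the class that was explicitly replaced changes; the coordinate objects are still built from the default class.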
If the new implementation exposes a given cfdm functionality through a
different API, then the new read-from-disk and write-to-disk functions
defined above can still be used, provided that the new implementation
is created from a subclass of cfdm.CFDMImplementation, with the new
API being applied in overridden methods.
class my_Field_2(cfdm.Field):
    def my_coordinates(self):
        """Get coordinate constructs with a different API."""
        c = self.coordinates
        if not c:
            return {}
        return c

class my_CFDMImplementation(cfdm.CFDMImplementation):
    def get_coordinates(self, field):
        """Get coordinate constructs from a my_Field_2 instance,
        using its different API.
        """
        return field.my_coordinates()
my_implementation_2 = my_CFDMImplementation(
    cf_version=cfdm.CF(),
    Field=my_Field_2,
    AuxiliaryCoordinate=cfdm.AuxiliaryCoordinate,
    CellMeasure=cfdm.CellMeasure,
    CellMethod=cfdm.CellMethod,
    CoordinateReference=cfdm.CoordinateReference,
    DimensionCoordinate=cfdm.DimensionCoordinate,
    DomainAncillary=cfdm.DomainAncillary,
    DomainAxis=cfdm.DomainAxis,
    FieldAncillary=cfdm.FieldAncillary,
    Bounds=cfdm.Bounds,
    InteriorRing=cfdm.InteriorRing,
    CoordinateConversion=cfdm.CoordinateConversion,
    Datum=cfdm.Datum,
    List=cfdm.List,
    Index=cfdm.Index,
    Count=cfdm.Count,
    NodeCountProperties=cfdm.NodeCountProperties,
    PartNodeCountProperties=cfdm.PartNodeCountProperties,
    Data=cfdm.Data,
    GatheredArray=cfdm.GatheredArray,
    NetCDFArray=cfdm.NetCDFArray,
    RaggedContiguousArray=cfdm.RaggedContiguousArray,
    RaggedIndexedArray=cfdm.RaggedIndexedArray,
    RaggedIndexedContiguousArray=cfdm.RaggedIndexedContiguousArray,
)
As all classes are required for the initialisation of the new implementation class, this demonstrates explicitly that, in the absence of subclasses of the other classes, the cfdm classes may be used.
>>> my_read_2 = functools.partial(cfdm.read,
...                               _implementation=my_implementation_2)
>>> q, t = my_read_2('file.nc')
>>> print(repr(q))
<my_Field_2: specific_humidity(latitude(5), longitude(8)) 1>
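The role of the overridden method can be sketched in plain Python. This is an illustration under stated assumptions, not cfdm's code: Field, FieldNewAPI, Implementation, MyImplementation and coordinate_names are all hypothetical stand-ins. The toy encoder reaches a field only through the implementation's access method, so a field with a different API needs nothing more than an implementation subclass that translates the call.

```python
class Field:
    """Stand-in field whose coordinates are reached via `coordinates`."""

    def __init__(self, coords):
        self._coords = dict(coords)

    def coordinates(self):
        return self._coords


class Implementation:
    """Default access methods used by the toy encoder."""

    def get_coordinates(self, field):
        return field.coordinates()


def coordinate_names(field, implementation):
    """Toy encoder: reaches the field only through the implementation."""
    return sorted(implementation.get_coordinates(field))


class FieldNewAPI:
    """A field with a different API: it has no `coordinates` method."""

    def __init__(self, coords):
        self._coords = dict(coords)

    def my_coordinates(self):
        return self._coords


class MyImplementation(Implementation):
    def get_coordinates(self, field):
        # Translate the encoder's request to the new API
        return field.my_coordinates()


old = Field({"latitude": 5, "longitude": 8})
new = FieldNewAPI({"latitude": 5, "longitude": 8})

print(coordinate_names(old, Implementation()))    # ['latitude', 'longitude']
print(coordinate_names(new, MyImplementation()))  # ['latitude', 'longitude']
```

The encoder function is unchanged between the two calls; only the implementation object passed to it differs.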
Finally, the mapping of CF data model constructs from CF-netCDF elements, and vice versa, may be modified where desired, leaving all other aspects unchanged:
class my_NetCDFRead(cfdm.read_write.netcdf.NetCDFRead):
    def read(self, filename):
        """Read my fields from a netCDF file on disk or from
        an OPeNDAP server location, using my modified mapping
        from netCDF to the CF data model.
        """
        print("Reading dataset using my modified mapping")
        return super().read(filename)

my_netcdf = my_NetCDFRead(my_implementation_2)

def my_read_3(filename):
    """Read my field constructs from a dataset."""
    return my_netcdf.read(filename)
>>> q, t = my_read_3('~/cfdm/docs/source/sample_files/file.nc')
Reading dataset using my modified mapping
>>> print(repr(q))
<my_Field_2: specific_humidity(latitude(5), longitude(8)) 1>
In the same manner, cfdm.read_write.netcdf.NetCDFWrite may be
subclassed, and a new write-to-disk function defined, to override
aspects of the mapping from CF data model constructs to netCDF
elements in a dataset.
The _custom dictionary
All cfdm classes have a _custom attribute that contains a dictionary
meant for use in external subclasses.
It is intended for the storage of extra objects that are required by
an external subclass, yet can be transferred to copied instances using
the inherited cfdm infrastructure. The _custom dictionary is shallow
copied, rather than deep copied, when using the standard cfdm deep
copy techniques (i.e. the copy method, initialisation with the source
parameter, or applying the copy.deepcopy function), so that subclasses
of cfdm classes are not committed to potentially expensive deep copies
of the dictionary values, of which cfdm has no knowledge. Note that
calling copy.deepcopy on an instance of a cfdm (sub)class only invokes
its copy method. The cfdm library itself does not use the _custom
dictionary, other than to pass on a shallow copy of it to copied
instances.
The consequence of this shallow-copy behaviour is that if an external subclass stores a mutable object within its _custom dictionary then, by default, a deep copy will contain the identical mutable object, so that in-place changes to it will affect both the original and copied instances.
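The effect can be seen with an ordinary dictionary and the standard library alone. This sketch does not use cfdm; the dictionary and its keys 'x' and 'label' are illustrative assumptions standing in for the contents of a _custom dictionary.

```python
import copy

original_custom = {"x": [1, 2, 3], "label": "a"}

# A shallow copy is a new dictionary whose values are the very same
# objects as those in the original
copied_custom = copy.copy(original_custom)

copied_custom["x"].append(4)  # in-place change to the shared list
copied_custom["label"] = "b"  # rebinding a key affects only the copy

print(original_custom)  # {'x': [1, 2, 3, 4], 'label': 'a'}
print(copied_custom)    # {'x': [1, 2, 3, 4], 'label': 'b'}
```

Only the in-place modification of the mutable value leaks back to the original; assigning a new value to a key does not.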
To account for this, the external subclass can either commit to
never updating such mutables in-place (which can be acceptable for
private quantities which are tightly controlled); or else include
extra code that does deep copy such mutables whenever a deep copy (or
equivalent) operation is performed. The latter approach should be
implemented in the subclass’s __init__ method, similarly to this:
from copy import deepcopy

class my_Field_3(cfdm.Field):
    def __init__(self, properties=None, source=None, copy=True,
                 _use_data=True):
        super().__init__(properties=properties, source=source,
                         copy=copy, _use_data=_use_data)
        if source and copy:
            # Deep copy the custom 'x' value. The function is imported
            # by name because the 'copy' parameter hides the copy
            # module within this method.
            try:
                self._custom['x'] = deepcopy(source._custom['x'])
            except (AttributeError, KeyError):
                pass
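The pattern can be exercised without cfdm by using a stand-in parent class. In this sketch Base is a hypothetical class that only mimics the shallow-copy behaviour described above (it is not a cfdm class), and Child, 'x' and 'y' are likewise illustrative names. The deepcopy function is imported by name because, inside __init__, the copy parameter hides the copy module.

```python
from copy import deepcopy


class Base:
    """Hypothetical stand-in for a cfdm class: shallow copies `_custom`."""

    def __init__(self, source=None, copy=True):
        if source is not None:
            # New dictionary, same value objects
            self._custom = dict(source._custom)
        else:
            self._custom = {}

    def copy(self):
        return type(self)(source=self, copy=True)


class Child(Base):
    def __init__(self, source=None, copy=True):
        super().__init__(source=source, copy=copy)
        if source is not None and copy:
            # Deep copy only the value this subclass knows to be mutable
            try:
                self._custom["x"] = deepcopy(source._custom["x"])
            except (AttributeError, KeyError):
                pass


a = Child()
a._custom["x"] = [1, 2]
a._custom["y"] = [10]

b = a.copy()
b._custom["x"].append(3)   # 'x' was deep copied, so `a` is unaffected
b._custom["y"].append(20)  # 'y' is still shared with `a`

print(a._custom)  # {'x': [1, 2], 'y': [10, 20]}
print(b._custom)  # {'x': [1, 2, 3], 'y': [10, 20]}
```

Values that the subclass does not explicitly deep copy, such as 'y' here, remain shared between the original and the copy.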
Documentation
The cfdm package uses a “docstring rewriter” that allows commonly used parts of class and class method docstrings to be written once in a central location, and then inserted into each class at import time. In addition, parts of a docstring are modified to reflect the appropriate package and class names. This functionality extends to subclasses of cfdm classes. New docstring substitutions may also be defined for the subclasses.
See cfdm.core.meta.DocstringRewriteMeta for details on how to create
new docstring substitutions for extensions, and how to modify the
substitutions defined in the cfdm package.
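The general idea of rewriting docstrings at class creation time can be sketched with a small metaclass. This is not cfdm's implementation, and it does not reproduce the interface of cfdm.core.meta.DocstringRewriteMeta: the names DocRewriteMeta, _subs, _templates and _clone_with_doc, and the placeholder keys used here, are assumptions made for illustration only.

```python
import types


def _clone_with_doc(func, doc):
    """Return a copy of `func` that carries its own docstring."""
    new = types.FunctionType(
        func.__code__, func.__globals__, func.__name__,
        func.__defaults__, func.__closure__,
    )
    new.__kwdefaults__ = func.__kwdefaults__
    new.__qualname__ = func.__qualname__
    new.__doc__ = doc
    return new


class DocRewriteMeta(type):
    """Toy metaclass that applies docstring substitutions."""

    def __new__(mcls, name, bases, namespace):
        # Merge substitutions: parent classes first, then this class
        subs = {}
        for base in reversed(bases):
            subs.update(getattr(base, "_subs", {}))
        subs.update(namespace.get("_subs", {}))
        namespace["_subs"] = subs

        # Collect unsubstituted docstring templates, inherited and new
        templates = {}
        for base in reversed(bases):
            templates.update(getattr(base, "_templates", {}))
        for attr, value in namespace.items():
            templates.pop(attr, None)  # an override replaces the template
            if isinstance(value, types.FunctionType) and value.__doc__:
                templates[attr] = (value, value.__doc__)
        namespace["_templates"] = templates

        # Give this class its own copies with rewritten docstrings
        for attr, (func, template) in templates.items():
            doc = template.replace("{{class}}", name)
            for key, text in subs.items():
                doc = doc.replace(key, text)
            namespace[attr] = _clone_with_doc(func, doc)

        return super().__new__(mcls, name, bases, namespace)


class Base(metaclass=DocRewriteMeta):
    _subs = {"{{package}}": "basepkg"}

    def copy(self):
        """Return a deep copy of this {{package}}.{{class}} instance."""
        return self


class Child(Base):
    _subs = {"{{package}}": "mypkg"}


print(Base.copy.__doc__)   # Return a deep copy of this basepkg.Base instance.
print(Child.copy.__doc__)  # Return a deep copy of this mypkg.Child instance.
```

In this sketch the subclass inherits the method unchanged, yet its docstring reflects the subclass's own package and class names, which is the behaviour the paragraph above describes for subclasses of cfdm classes.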
A complete example
See cf-python for a complete example of extending the cfdm package in the manner described above.
cf-python adds more flexible inspection, reading and writing; and provides metadata-aware analytical processing capabilities such as regridding and statistical calculations.
It also has a more sophisticated data class that subclasses cfdm.Data,
but allows for larger-than-memory manipulations and parallel
processing.
cf-python strictly extends the cfdm API, so that a cfdm command will always work on its cf-python counterpart.