.. currentmodule:: cf
.. default-role:: obj

.. _Tutorial:

**Tutorial**
============

----

Version |release| for version |version| of the CF conventions.

All of the Python code in this tutorial is available in an executable
script (:download:`download <../source/tutorial.py>`, 36kB).

.. https://stackoverflow.com/questions/24129481/how-to-include-a-local-table-of-contents-into-sphinx-doc

.. http://docutils.sourceforge.net/docs/ref/rst/directives.html#table-of-contents

.. http://docutils.sourceforge.net/docs/ref/rst/directives.html#list-table

.. note:: **This version of cf is for Python 3 only** and there are
          :ref:`incompatible differences between versions 2.x and 3.x `
          of cf.

          Scripts written for version 2.x but running under version
          3.x should either work as expected, or provide informative
          error messages on the new API usage. However, it is advised
          that the outputs of older scripts be checked when running
          with Python 3 versions of the cf library.

          For version 2.x documentation, see the :ref:`older releases `
          page.

.. contents::
   :local:
   :backlinks: entry

.. _Sample-datasets:

**Sample datasets**
-------------------

This tutorial uses a number of small sample datasets, all of which can
be found in the zip file ``cf_tutorial_files.zip``
(:download:`download <../source/sample_files/cf_tutorial_files.zip>`,
164kB):

.. code-block:: shell
   :caption: *Unpack the sample datasets.*

   $ unzip -q cf_tutorial_files.zip
   $ ls -1
   air_temperature.nc
   cf_tutorial_files.zip
   contiguous.nc
   external.nc
   file2.nc
   file.nc
   gathered.nc
   parent.nc
   precipitation_flux.nc
   timeseries.nc
   umfile.pp
   vertical.nc
   wind_components.nc

The tutorial examples assume that the Python session is being run from
the directory that also contains the sample files.

The tutorial files may also be found in the `downloads directory `_ of
the on-line code repository.

----

.. _Import:

**Import**
----------

The cf package is imported as follows:

.. code-block:: python
   :caption: *Import the cf package.*

   >>> import cf

.. _CF-version:

CF version
^^^^^^^^^^

The version of the `CF conventions `_ and the :ref:`CF data model `
being used may be found with the `cf.CF` function:

.. code-block:: python
   :caption: *Retrieve the version of the CF conventions.*

   >>> cf.CF()
   '1.7'

This indicates which version of the CF conventions is represented by
this release of the cf package, and therefore the version cannot be
changed.

Note, however, that datasets of different CF versions may be
:ref:`read ` from, or :ref:`written ` to, netCDF.

----

**Field construct**
-------------------

The construct (i.e. element) that is central to CF is the field
construct. The field construct, which corresponds to a CF-netCDF data
variable, includes all of the metadata needed to describe it:

* descriptive properties that apply to the field construct as a whole
  (e.g. the standard name),
* a data array, and
* "metadata constructs" that describe the locations of each cell of
  the data array, and the physical nature of each cell's datum.

A field construct is stored in a `cf.Field` instance, and henceforth
the phrase "field construct" will be assumed to mean "`cf.Field`
instance".

----

.. _Reading-datasets:

**Reading field constructs from datasets**
------------------------------------------

The `cf.read` function reads files from disk, or from `OPeNDAP `_ URLs
[#dap]_, and returns the contents in a `cf.FieldList` instance that
contains zero or more `cf.Field` instances, each of which represents a
field construct. Henceforth, the phrase "field list" will be assumed
to mean a `cf.FieldList` instance.

A :ref:`field list ` is very much like a Python `list`, with the
addition of extra methods that operate on its field construct
elements.

The following file types can be read:

* All formats of netCDF3 and netCDF4 files (including `CFA-netCDF `_
  files) can be read, containing datasets for any version of CF up to
  and including CF-|version|.

..

* Files in `CDL format `_, with or without the data array values.

..

* :ref:`PP and UM fields files `, whose contents are mapped into field
  constructs.

For example, to read the file ``file.nc``, which contains two field
constructs:

.. code-block:: python
   :caption: *Read file.nc and show that the result is a two-element
             field list.*

   >>> x = cf.read('file.nc')
   >>> type(x)
   <class 'cf.fieldlist.FieldList'>
   >>> len(x)
   2

Descriptive properties are always read into memory, but `lazy loading
`_ is employed for all data arrays, which means that no data is read
into memory until it is needed, either for inspection or to modify the
array contents. This maximises the number of field constructs that may
be read within a session, and makes the read operation fast.

Multiple files may be read in one command using `UNIX wild card
characters `_, or a sequence of file names (each element of which may
also contain wild cards). Shell environment variables are also
permitted.

.. code-block:: python
   :caption: *Read all of the sample netCDF files, noting that they
             contain more field constructs than there are files.*

   >>> y = cf.read('*.nc')
   >>> len(y)
   14

.. code-block:: python
   :caption: *Read two particular files, noting that they contain more
             than two field constructs.*

   >>> z = cf.read(['file.nc', 'precipitation_flux.nc'])
   >>> len(z)
   3

All of the datasets in one or more directories may also be read by
replacing any file name with a directory name. An attempt will be made
to read all files in the directory, which will result in an error if
any have an unsupported format. Unsupported files may be ignored with
the *ignore_read_error* keyword.

.. code-block:: python
   :caption: *Read all of the files in the current working directory.*

   >>> y = cf.read('$PWD')                                    # Raises Exception
   Exception: Can't determine format of file cf_tutorial_files.zip
   >>> y = cf.read('$PWD', ignore_read_error=True)
   >>> len(y)
   15

In all cases, the default behaviour is to aggregate the contents of
all input datasets into as few field constructs as possible, and it is
these aggregated field constructs that are returned by `cf.read`.
See the section on :ref:`aggregation ` for full details.

The `cf.read` function has optional parameters to

* allow the user to provide files that contain :ref:`external
  variables `;

* request :ref:`extra field constructs to be created from "metadata"
  netCDF variables `, i.e. those that are referenced from CF-netCDF
  data variables, but which are not regarded by default as data
  variables in their own right;

* display information and warnings about the mapping of the netCDF
  file contents to CF data model constructs;

* remove from, or include, size one dimensions on the field
  constructs' data;

* configure the :ref:`field construct aggregation process `;

* configure the reading of directories to allow sub-directories to be
  read recursively, and to allow directories which resolve to symbolic
  links; and

* configure parameters for :ref:`reading PP and UM fields files `.

.. _CF-compliance:

CF-compliance
^^^^^^^^^^^^^

If the dataset is partially CF-compliant to the extent that it is not
possible to unambiguously map an element of the netCDF dataset to an
element of the CF data model, then a field construct is still
returned, but may be incomplete. This is so that datasets which are
partially conformant may nonetheless be modified in memory and written
to new datasets.

Such "structural" non-compliance would occur, for example, if the
"coordinates" attribute of a CF-netCDF data variable refers to another
variable that does not exist, or refers to a variable that spans a
netCDF dimension that does not apply to the data variable. Other types
of non-compliance are not checked, such as whether or not controlled
vocabularies have been adhered to. The structural compliance of the
dataset may be checked with the `~cf.Field.dataset_compliance` method
of the field construct, and may optionally be displayed when the
dataset is read.

----

.. _Inspection:

**Inspection**
--------------

The contents of a field construct may be inspected at three different
levels of detail.

.. _Minimal-detail:

Minimal detail
^^^^^^^^^^^^^^

The built-in `repr` function returns a short, one-line description:

.. code-block:: python
   :caption: *Inspect the contents of the two field constructs from
             the dataset and create a Python variable for each of
             them.*

   >>> x = cf.read('file.nc')
   >>> x
   [<CF Field: specific_humidity(latitude(5), longitude(8)) 1>,
    <CF Field: air_temperature(atmosphere_hybrid_height_coordinate(1), grid_latitude(10), grid_longitude(9)) K>]
   >>> q = x[0]
   >>> t = x[1]
   >>> q
   <CF Field: specific_humidity(latitude(5), longitude(8)) 1>

This gives the identity of the field construct
(e.g. "specific_humidity"), the identities and sizes of the dimensions
spanned by the data array ("latitude" and "longitude" with sizes 5 and
8 respectively) and the units of the data ("1", i.e. dimensionless).

.. _Medium-detail:

Medium detail
^^^^^^^^^^^^^

The built-in `str` function returns similar information to the
one-line output, along with short descriptions of the metadata
constructs, which include the first and last values of their data
arrays:

.. code-block:: python
   :caption: *Inspect the contents of the two field constructs with
             medium detail.*

   >>> print(q)
   Field: specific_humidity (ncvar%q)
   ----------------------------------
   Data            : specific_humidity(latitude(5), longitude(8)) 1
   Cell methods    : area: mean
   Dimension coords: time(1) = [2019-01-01 00:00:00]
                   : latitude(5) = [-75.0, ..., 75.0] degrees_north
                   : longitude(8) = [22.5, ..., 337.5] degrees_east
   >>> print(t)
   Field: air_temperature (ncvar%ta)
   ---------------------------------
   Data            : air_temperature(atmosphere_hybrid_height_coordinate(1), grid_latitude(10), grid_longitude(9)) K
   Cell methods    : grid_latitude(10): grid_longitude(9): mean where land (interval: 0.1 degrees) time(1): maximum
   Field ancils    : air_temperature standard_error(grid_latitude(10), grid_longitude(9)) = [[0.81, ..., 0.78]] K
   Dimension coords: time(1) = [2019-01-01 00:00:00]
                   : atmosphere_hybrid_height_coordinate(1) = [1.5]
                   : grid_latitude(10) = [2.2, ..., -1.76] degrees
                   : grid_longitude(9) = [-4.7, ..., -1.18] degrees
   Auxiliary coords: latitude(grid_latitude(10), grid_longitude(9)) = [[53.941, ..., 50.225]] degrees_N
                   : longitude(grid_longitude(9), grid_latitude(10)) = [[2.004, ..., 8.156]] degrees_E
                   : long_name=Grid latitude name(grid_latitude(10)) = [--, ..., 'kappa']
   Cell measures   : measure:area(grid_longitude(9), grid_latitude(10)) = [[2391.9657, ..., 2392.6009]] km2
   Coord references: atmosphere_hybrid_height_coordinate
                   : rotated_latitude_longitude
   Domain ancils   : ncvar%a(atmosphere_hybrid_height_coordinate(1)) = [10.0] m
                   : ncvar%b(atmosphere_hybrid_height_coordinate(1)) = [20.0]
                   : surface_altitude(grid_latitude(10), grid_longitude(9)) = [[0.0, ..., 270.0]] m

Note that :ref:`time values