9. Post processing
This is a basic introduction to some of the more widely used tools for viewing, checking, and converting UM input and output data. All of the tools described below run on the ARCHER2 login nodes.
9.1. xconv
View data
On ARCHER2, go to the output directory of the global job that you ran previously (the one copied from u-cc654). Run xconv on the file ending with, for example, da19880901_04. This file is an atmosphere start file - this type of file is used to restart the model from the time specified in the file header data.
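For example, a sketch assuming the dump carries the cc654a prefix used elsewhere in this section (your exact filename may differ; the trailing & keeps your terminal free while the GUI runs):
archer2$ xconv cc654a.da19880901_04 &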
In the directory above is a file whose name ends in .astart; run a second instance of xconv on this file. This is the file used by the model to start its run - created in this case by the reconfiguration program.
The xconv window lists the fields in the file and the dimensions of those fields (upper left panel), the coordinates of the grid underlying the data and the time(s) of the data (upper right panel), some information about the type of file (lower left panel), and general data about the field (lower right panel).
Both files have the same fields. Double click on a field to reveal its coordinate data. Check the time for this field (select the “t” checkbox in the upper right panel).
Plot both sets of data - click the Plot Data button.
View the data - this shows numerical data values and their coordinates and can be helpful for finding spurious data values.
Convert UM fields data to netCDF
Select a single-level field (one for which nz=1), choose the Output format to be Netcdf, enter an “Output file name”, and select Convert. Information relevant to the file conversion will appear in the lower left panel.
Use xconv to view the netCDF file just created.
9.2. uminfo
You can view the header information for the fields in a UM file by using the uminfo utility - redirect the output to a file or pipe it to less:
archer2$ uminfo <one-of-your-fields-files> | less
The output from this command is best viewed in conjunction with Unified Model Documentation Paper F3, which explains the various header fields in depth.
9.3. Mule
Mule consists of a Python API for reading and writing UM files and a set of UM utilities. This section introduces you to some of the most useful UM utilities. Full details of Mule can be found on the MOSRS: https://code.metoffice.gov.uk/doc/um/index.html
Before running the mule commands you will need to load the python environment on ARCHER2 by running:
archer2$ module load cray-python
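Mule's Python API can also be used directly. Here is a minimal sketch, assuming the cc654.astart file used in the examples below (any UM fieldsfile or dump will do); start python and try:
>>> import mule
>>> ff = mule.load_umfile('cc654.astart')  # picks the appropriate file class automatically
>>> len(ff.fields)                         # number of fields in the file
>>> ff.fields[0].lbuser4                   # STASH code from the first lookup header
>>> ff.fields[0].get_data()                # the field data as a numpy array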
mule-pumf
This provides another way of seeing header information, but also gives some information about the fields themselves. Its intended use is to aid in quick inspections of files for diagnostic purposes.
Run mule-pumf on the start file - here are a couple of examples using one of Ros's files:
archer2$ mule-pumf --print-columns 2 --headers-only \
cc654.astart > ~/mule-pumf-header.out
archer2$ mule-pumf --print-columns 2 cc654.astart > ~/mule-pumf.out
Can you see what the difference is in the output of these two commands?
Take a look at the help page (mule-pumf -h) and experiment with some of the other options.
mule-summary
This utility is used to print out a summary of the lookup headers which describe the fields from a UM file. Its intended use is to aid in quick inspections of files for diagnostic purposes.
Run mule-summary on the start file again.
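For example, using the same file as in the mule-pumf examples:
archer2$ mule-summary cc654.astart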
mule-cumf
This utility is used to compare two UM files and report on any differences found in either the headers or the field data. Its intended use is to test results from different UM runs against each other to investigate possible changes. Note that differences in header information can arise even when field data is identical. Try out the following (example invocations are sketched after this list):
- Run mule-cumf on the two start files referred to above (in the “View data” section). You may wish to direct the output to a file.
- Run the same command but with the --summary option. This, as the name suggests, prints a much shorter report of the differences.
- Run mule-cumf on a file and itself.
- View the help page with mule-cumf -h to see all the available options.
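A minimal sketch of these comparisons, assuming the start-file names used earlier in this section (your dump filename may differ):
archer2$ mule-cumf cc654.astart cc654a.da19880901_04 > ~/mule-cumf.out
archer2$ mule-cumf --summary cc654.astart cc654a.da19880901_04
archer2$ mule-cumf cc654.astart cc654.astart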
9.4. um-convpp
We have mentioned the PP file format in the presentations - it is a sequential format (a fieldsfile is random access) still much used in the community. PP data is stored as 32-bit, which provides a significant saving of space, but means that a conversion step is required from the 64-bit fieldsfile format. The utility to do this is called um-convpp; it converts directly from 64-bit files produced by the UM to 32-bit PP files. You must, however, make sure you are using version 10.4 or later - you can check that you are using the right one by typing which um-convpp.
Set the stack size limit to unlimited, and add the path to um-convpp to your environment - you can also add these commands to your ~/.profile so they are available every time you log in.
archer2$ ulimit -s unlimited
archer2$ export PATH=$UMDIR/vn11.2/cce/utilities:$PATH
Run um-convpp on a fieldsfile (e.g. cc654a.pc19880901_00):
archer2$ cd /home/n02/n02/ros/cylc-run/u-cc654/share/data/History_Data
archer2$ um-convpp cc654a.pc19880901_00 cc654a.pc19880901_00.pp
archer2$ ls -l cc654a.pc19880901*
-rw-r--r-- 1 ros n02 26447872 Nov 2 10:36 cc654a.pc19880901_00
-rw-r--r-- 1 ros n02 20372768 Nov 2 10:47 cc654a.pc19880901_00.pp
-rw-r--r-- 1 ros n02 26476544 Nov 2 10:36 cc654a.pc19880901_06
Note the reduction in file size. Now use xconv to examine the contents of the PP file.
9.5. cfa
Python is increasingly used in the community, and we have developed, and continue to develop, Python tools to do much of the data processing previously done using IDL or MATLAB. cfa is a Python utility which offers a host of features - here we'll use it to convert UM fieldsfile or PP data to CF-compliant data in netCDF format. You first need to set the environment to run cfa:
archer2$ export PATH=/home/n02/n02/dch/cf-analysis/bin:$PATH
archer2$ cfa -i -o cc654a.pc19880901_00.nc cc654a.pc19880901_00.pp
Try viewing the NetCDF file with xconv.
cfa
can also view CF fields. It can be run on PP or NetCDF
files, to provide a text representation of the CF fields contained in
the input files. Try it on a PP file and its NetCDF equivalent,
e.g.
archer2$ cfa -vm cc654a.pc19880901_00.pp | less
Field: long_name:HEAVYSIDE FN ON P LEV/UV GRID (ncvar%UM_m01s30i301_vn1100)
---------------------------------------------------------------------------
Data : long_name:HEAVYSIDE FN ON P LEV/UV GRID(time(5), air_pressure(17), latitude(145), longitude(192))
Cell methods : time: point
Axes : time(5) = [1988-09-01T00:00:00Z, ..., 1988-09-01T03:59:59Z] 360_day
: air_pressure(17) = [1000.0, ..., 10.0] hPa
: latitude(145) = [-90.0, ..., 90.0] degrees_north
: longitude(192) = [0.0, ..., 358.125] degrees_east
Field: long_name:VORTICITY 850 (ncvar%UM_m01s30i455_vn1100)
-----------------------------------------------------------
Data : long_name:VORTICITY 850(time(5), latitude(145), longitude(192))
Cell methods : time: point
Axes : air_pressure(1) = [-1.0] hPa
: time(5) = [1988-09-01T00:00:00Z, ..., 1988-09-01T03:59:59Z] 360_day
: latitude(145) = [-90.0, ..., 90.0] degrees_north
: longitude(192) = [0.0, ..., 358.125] degrees_east
9.6. CF-Python & CF-Plot
Many tools exist for analysing data from NWP and climate models. There are many reasons for this proliferation of analysis utilities, for example the disparity of data formats used by the authors of the models and the availability of the underlying software. There is a strong push towards developing and using Python as the underlying language and CF-netCDF as the data format. CMS is home to tools in the CF-netCDF stable - here is an example of using these tools to perform some quite complex data manipulations. The user is insulated from virtually all of the details of the methods, allowing them to concentrate on scientific analysis rather than programming intricacies.
Set up the environment and start python.
archer2$ export PATH=/home/n02/n02/dch/cf-analysis/bin:$PATH
archer2$ python
>>>
We’ll be looking at some AIRS satellite-retrieved temperature data over sea.
Import the cf-python library
>>> import cf
Read in the AIRS data files
>>> f = cf.read('~dch/UM_Training/ta_mon_AIRS-1-0_BE_gn_*.nc')[0]
Inspect the file contents with different amounts of detail
>>> f
>>> print(f)
>>> f.dump()
Note that the ten AIRS files (one for each year) are automatically aggregated into one field.
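To see the effect of aggregation, re-read the files with aggregation switched off and compare the number of fields returned (a sketch using cf.read's aggregate keyword):
>>> len(cf.read('~dch/UM_Training/ta_mon_AIRS-1-0_BE_gn_*.nc', aggregate=False))  # expect 10, one per file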
Read in another field, produced by a GCM, which has a different latitude/longitude grid from the AIRS data
>>> g = cf.read('~dch/UM_Training/tas_Amon_HadGEM3-GC3-1_hist-1p0_r3i1p1f2_gn_185001-201412.nc')[0]
>>> print(g)
Regrid the observed AIRS data (f) to the grid of the model field (g) over Europe, using the “conservative” regridding method that preserves area integrals
>>> f = f.regrids(g.subspace(X=cf.wi(-10, 40), Y=cf.wi(35, 70)), method='conservative')
>>> print(f)
Note that the latitude and longitude dimensions are now shorter in length.
Average the regridded field over June-August for each year
>>> f = f.collapse('T: mean within years T: mean over years', within_years=cf.jja())
>>> print(f)
Note that the time axis is now of length 1.
Import the cfplot visualisation library
>>> import cfplot
Make a default contour plot of the regridded observed data, f, at the 1000 hPa level
>>> f = f.subspace(Z=cf.eq(1000, 'hPa'))
>>> cfplot.con(f)
Make a “blockfill” plot of the regridded observed data, f
>>> cfplot.con(f, blockfill=True)
Make a default contour plot of the model data, g, for its first time
>>> g = g.subspace(T=[0])
>>> cfplot.con(g)
Make a “blockfill” plot of the model data, g, restricting the view to over Europe
>>> cfplot.mapset(lonmin=-10, lonmax=40, latmin=25, latmax=70)
>>> cfplot.con(g, blockfill=True)
Write out the regridded data to disk
>>> cf.write(f, 'obs_temperature_Europe_JJA_2003-2010.regridded.nc')
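As a quick sanity check (a sketch), read the file back in and compare it with the field you wrote
>>> h = cf.read('obs_temperature_Europe_JJA_2003-2010.regridded.nc')[0]
>>> f.equals(h)  # should be True if the round trip preserved the field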
This has just given you a taster of CF-Python & CF-Plot. If you would like to try out some more exercises, please take a look at https://github.com/NCAS-CMS/cf-tools-training.