3. Running a UM suite on ARCHER2
3.1. ARCHER2 architecture
In common with many HPC systems, ARCHER2 consists of different types of processor nodes:
Login nodes: This is where you land when you ssh into ARCHER2. Typically these processors are used for file management tasks.
Compute / batch nodes: These make up most of the ARCHER2 system, and this is where the model runs.
Serial / post-processing nodes: This is where less intensive tasks such as compilation and archiving take place.
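For reference, logging in is a normal ssh session to the ARCHER2 login address (shown here with a placeholder username; check the ARCHER2 documentation if your site uses a different hostname):
ssh <archer2-username>@login.archer2.ac.uk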
ARCHER2 has two file systems:
/home: This is relatively small and is only backed up for disaster recovery.
/work: This is much larger, but is not backed up. Note that the batch nodes can only see the work file system. It is optimised for parallel I/O and large files.
Consult the ARCHER2 website for more information: http://www.archer2.ac.uk
3.2. Running a Standard Suite
To demonstrate how to run the UM through Rose we will start by running a standard N48 suite at UM13.0.
Copy the suite
In rosie go, locate the suite with idx u-cc519 owned by rosalynhatcher. Right-click on the suite and select Copy Suite.
This copies an existing suite to a new suite id. The new suite will be owned by you. During the copy process a wizard will launch for you to edit the suite discovery information, if you wish.
A new suite will be created in the MOSRS rosie-u suite repository and will be checked out into your ~/roses directory.
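As an aside, the same copy can be made without the GUI; a rough command-line equivalent, assuming the rosie client is configured on your account, is:
rosie copy u-cc519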
Edit the suite
Open your new suite in the Rose config editor GUI.
Before you can run the suite you need to change the userid, queue, account code and reservation:
Click on suite conf -> jinja2 in the left-hand panel
Set HPC_USER to your ARCHER2 username
If following the tutorial as part of an organised training event:
Set HPC_ACCOUNT to 'n02-training'
Set HPC_QUEUE to 'standard'
Ensure RESERVATION is set to True
Set HPC_RESERVATION to be the reservation code for today (e.g. 'n02-training_1055426')
If following the tutorial as self-study:
Set HPC_ACCOUNT to the budget code for your project (e.g. 'n02-cms')
Set HPC_QUEUE to 'short'
Set RESERVATION to False
Notes
Quotes around the variable values are essential; otherwise the suite will not run.
In normal practice you submit your suites to the parallel queue (either short or standard) on ARCHER2.
For organised training events, we use processor reservations, whereby we have exclusive access to a prearranged amount of ARCHER2 resource. Reservations are specified by adding an additional setting called the reservation code; e.g. n02-training_226.
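For reference, after saving, the corresponding entries in the suite's rose-suite.conf file (under ~/roses/<suitename>) should look roughly like the sketch below for a training-event setup. The section name [jinja2:suite.rc] is the usual Rose convention, but the exact layout and variable values come from the suite itself, so treat this as illustrative only:
[jinja2:suite.rc]
HPC_ACCOUNT='n02-training'
HPC_QUEUE='standard'
HPC_RESERVATION='n02-training_1055426'
HPC_USER='<archer2-username>'
RESERVATION=True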
Save the suite (File > Save or click the down arrow icon)
Run the suite
The standard suite will build, reconfigure and run the UM.
Click on the triangle symbol on the right end of the menu bar to run the suite.
Doing this will execute the rose suite-run command (more on this later) and start the Cylc GUI, through which you can monitor the progress of your suite graphically. The Cylc GUI will update as the job progresses.
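Behind the scenes the run button simply invokes rose suite-run from the suite directory, so the same thing can be done from the command line if you prefer (assuming your suite is checked out under ~/roses):
cd ~/roses/<suitename>
rose suite-run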
Looking at the queues on ARCHER2
While you’re waiting for the suite to run, let’s log into ARCHER2 and learn how to look at the queues.
Run the following command:
squeue -u <archer2-user-name>
This will show the status of jobs you are running. You will see output similar to the following:
ARCHER2> squeue -u ros
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
148599 standard u-cc519. ros R 0:11 1 nid001001
At this stage you will probably only have a job running or waiting to run in the serial queue. Running squeue with no arguments will show all jobs currently on ARCHER2, most of which will be in the parallel queues.
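If you would like the listing to refresh automatically while you wait, one option is to wrap squeue in the standard watch utility; for example, to update every 30 seconds:
watch -n 30 squeue -u <archer2-user-name>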
Once your suite has finished running the Cylc GUI will go blank and you should get a message in the bottom left-hand corner saying Stopped with succeeded.
Tip
Cylc is set up to poll ARCHER2 every 5 minutes to check the status of each task. This means there could be a delay of up to 5 minutes between a task finishing on ARCHER2 and the Cylc GUI being updated. If you can see that the task has finished running but Cylc hasn't updated, you can manually poll the task by right-clicking on it and selecting Poll from the pop-up menu.
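The same poll can also be triggered from the command line on PUMA2; a minimal sketch, assuming Cylc 7 (as used by rose suite-run), where atmos.1 means the atmos task at cycle point 1:
cylc poll <suitename> atmos.1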
3.3. Standard Suite Output
The output from a standard suite goes to a variety of places, depending on the type of the file. On ARCHER2 you will find all the output from your run under the directory ~/cylc-run/<suitename>, where <suitename> is the name of the suite. This is actually a symbolic link to the equivalent location in your /work directory (e.g. /work/n02/n02/<username>/cylc-run/<suitename>).
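If you want to see the resolved /work location for yourself, you can ask the shell to follow the link; for example:
readlink -f ~/cylc-run/<suitename>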
Task output
Note
Rose Bush is a web-based tool for viewing the standard output and errors from suites. Unfortunately this does not work on the current puma server, so we need to browse the log files directly.
Note
To run Rose Bush on Monsoon run: firefox http://localhost/rose-bush
On PUMA2, navigate to the cylc-run directory:
cd ~/cylc-run
ls
You should see directories for each of the suites that you have run. Go to the suite you have just run and into the log directory:
cd <suitename>/log/job/1
ls
You will see directories for each of the tasks in the suite. For this suite there are 4 tasks: fcm_make (code extraction), fcm_make2 (compilation), recon & atmos. Try looking in one of the task directories:
cd recon/NN
ls
Here NN is a symbolic link created by Rose pointing to the output of the most recent run. You will see several files in this directory. The job.out and job.err files are the first places you should look for information when tasks fail.
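When a task fails and you are not sure where to start, a quick case-insensitive search across all of the job.err files for this cycle can help narrow things down; a rough sketch (the path assumes cycle 1, as in this suite):
grep -il error ~/cylc-run/<suitename>/log/job/1/*/NN/job.err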
Compilation output
The output from the compilation is stored on the host upon which the compilation was performed. The output from fcm_make is inside the directory containing the build, which is inside the share subdirectory:
~/cylc-run/<suitename>/share/fcm_make/fcm-make2.log
If you come across the word “failed”, chances are your model didn’t build correctly and this file is where you’d search for reasons why.
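Rather than reading the whole log, you can search it directly; for example:
grep -in failed ~/cylc-run/<suitename>/share/fcm_make/fcm-make2.log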
UM standard output
The output from the UM scripts and the output from PE0 of the model are written to the job.out and job.err files for that task. Take a look at the job.out for the atmos task by opening the following file:
~/cylc-run/<suitename>/log/job/1/atmos/NN/job.out
Did the linear solver for the Helmholtz problem converge in the final timestep?
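One way to check without scrolling through the whole file is to search for the solver messages; the exact wording varies between UM versions, so a broad case-insensitive search is a reasonable starting point:
grep -inE 'solver|converg' ~/cylc-run/<suitename>/log/job/1/atmos/NN/job.out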
Job Accounting
The sacct command displays accounting data for all jobs that are run on ARCHER2. sacct can be used to find out about the resources used by a job; for example, the nodes used, the length of time the job ran for, etc. This information is useful for working out how much resource your runs are using. You should have some idea of the resource requirements for your runs and how that relates to the annual CU budget for your project. Information on resource requirements is also needed when applying for time on the HPC.
Let's take a look at the resources used by your copy of the u-cc519 run.
Locate the SLURM Job Id for your run. This is a 6 digit number and can be found in the job.status file in the cylc task log directory. Look for the line CYLC_BATCH_SYS_JOB_ID= and take note of the number after the = sign.
Run the following command:
sacct --job=<slurm-job-id> --format="JobID,JobName,Elapsed,Timelimit,NNodes"
Where <slurm-job-id> is the number you just noted above. You should get output similar to the following:
ARCHER2-ex> sacct --job=204175 --format="JobID,JobName,Elapsed,Timelimit,NNodes"
JobID JobName Elapsed Timelimit NNodes
------------ ---------- ---------- ---------- --------
204175 u-cc519.a+ 00:00:23 00:20:00 1
204175.batch batch 00:00:23 1
204175.exte+ extern 00:00:23 1
204175.0 um-atmos.+ 00:00:14 1
The important line is the first line.
How much walltime did the run consume?
How much time did you request for the task?
How many CUs (Accounting Units) did the job cost?
Hint
1 node hour currently = 1 CU. See the ARCHER2 website for information about the CU.
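As a rough illustration of the arithmetic, using the example output above (your numbers will differ): the job used 1 node for 23 seconds, which is 1 × 23/3600 ≈ 0.006 node hours, i.e. roughly 0.006 CU.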
There are many other fields that can be output for a job. For more information see the man page (man sacct). You can see a list of all the fields that can be specified in the --format option by running sacct --helpformat.
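For example, a slightly extended query that also reports the number of CPUs, the final job state and the exit code (all standard sacct fields) might look like:
sacct --job=<slurm-job-id> --format="JobID,JobName,Elapsed,Timelimit,NNodes,NCPUS,State,ExitCode"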