Submitting jobs
Once you have logged in to a front-end server, there are two ways to submit a job: as an interactive job, or in batch mode.
Interactive Job
An interactive job is the preferred method for prototyping, debugging, and testing code. Be aware that any disruption of the connection (due to inactivity, timeouts, etc.) will terminate the job and free its allocated resources. For this reason, any job expected to last more than an hour should probably use the batch mode described in the next section instead.
To submit an interactive job, use qsub with the -I flag:
qsub -I -l nodes={# of nodes}:{node features/properties}:ppn={# of processors per node} -l walltime={HH:MM:SS}
If you need X11 forwarding (for graphical applications), add -X.
The interactive job starts a command-line shell on the requested node where the job-related commands/scripts can be executed. For example:
l@muse ~> ssh lsh@vortex.sciclone.wm.edu
Password:
...
11 [vortex] qsub -I -l nodes=1:vortex:ppn=1 -l walltime=00:05:00
qsub: waiting for job 5207204 to start
qsub: job 5207204 ready
11 [vx01] sleep 5; echo This is technically a job!
This is technically a job!
12 [vx01] exit
logout
qsub: job 5207204 completed
12 [vortex]
Batch Job
Because of the aforementioned fragility of interactive jobs, and because you may have to wait a significant amount of time for resources to become available when the cluster is busy, most jobs on the cluster are run in batch mode, that is, completely in the background with no user interaction. Since you will not be present to issue commands, this requires submitting a job script to the batch system that contains all the commands you need to run.
The basic format for job submission is
qsub [OPTIONS] SCRIPT
where SCRIPT is the name of a file containing the commands you would issue to run your application, and optionally #PBS directives that you could also specify as OPTIONS to qsub. The most commonly used options are:
-l nodes={# of nodes}:{node type}:ppn={# of processors per node}
the resources required for the job;
-l walltime={HH:MM:SS}
the maximum length of time the job will run;
-N {job name}
the job name;
-j oe
join output and error output, instead of splitting them into separate files;
-m abe
when the user is sent mail about the job: a if the job is aborted by the batch system, b when the job begins, and/or e when the job ends; and
-M user1,user2,...
a comma-separated list of email addresses to be notified.
See man qsub for more options and information. In a script, all #PBS directives must appear before the execution commands: any directives after the first command are ignored.
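For example, the same submission can be written either way; the script name run.sh and the resource values below are illustrative placeholders, not site requirements:

```shell
# All options given on the qsub command line:
qsub -l nodes=1:vortex:ppn=4 -l walltime=1:00:00 -N my_job -j oe run.sh

# Or the equivalent settings embedded in run.sh as #PBS directives,
# so the script can simply be submitted as:
#   qsub run.sh
# where run.sh begins:
#   #!/bin/tcsh
#   #PBS -l nodes=1:vortex:ppn=4
#   #PBS -l walltime=1:00:00
#   #PBS -N my_job
#   #PBS -j oe
```

If an option appears both on the command line and as a directive in the script, the command-line value takes precedence.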
When batch jobs end, output that would have been written to the screen in an interactive job is instead saved to files in $PBS_O_WORKDIR (the directory you were in when you submitted the job, or the directory specified with -w) named Jobname.oJobID (and Jobname.eJobID, if you did not use -j oe). Under tcsh, such output will always contain
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
which is simply tcsh warning you that (because it is running in a batch job) it has no access to a terminal, so you will not be able to use Ctrl-C, Ctrl-Z, etc.
Note that both interactive and batch jobs start in your home directory, regardless of where they were submitted. If you want to run a command in a different directory, first change directory, e.g.
cd $PBS_O_WORKDIR
./my_command
Examples
A serial job requires one node and runs on a single core. In this example, by not specifying nodes=, we allow the job scheduler to assign any processor on any machine in the cluster. If this were not acceptable, we could request a particular node feature/property by adding another #PBS -l line specifying nodes=1:property.
#!/bin/tcsh
#PBS -l walltime=4:00:00
#PBS -N my_serial_job
#PBS -j oe
cd $PBS_O_WORKDIR
/path/to/serial_job
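Assuming the script above is saved as my_serial_job.sh, submission and the resulting output file look like this (the job ID shown is illustrative):

```shell
qsub my_serial_job.sh    # prints the assigned job ID, e.g. 5207210
# After the job completes, because -j oe was used, all screen output
# lands in a single file in the submission directory:
#   my_serial_job.o5207210
```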
An SMP/shared-memory job runs on a single node using several cores, and uses OpenMP or multithreading.
#!/bin/tcsh
#PBS -l nodes=1:x5672:ppn=8
#PBS -l walltime=12:00:00
#PBS -N my_smp_job
#PBS -j oe
cd $PBS_O_WORKDIR
./omp_matrices_addition
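OpenMP programs do not necessarily pick up the ppn request on their own; a common addition (a sketch, assuming the program honors the standard OMP_NUM_THREADS environment variable) is to set the thread count in the script to match ppn before launching:

```shell
# tcsh syntax, inserted just before the program runs:
setenv OMP_NUM_THREADS 8    # match the ppn=8 requested above
./omp_matrices_addition
```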
A parallel/distributed memory job runs on multiple nodes with multiple cores using, in most cases, a parallel communication library such as MVAPICH2/OpenMPI. The parallel job script is executed on the first allocated node after the job begins.
On our systems, all MVAPICH2/OpenMPI jobs should be initiated with mvp2run, a wrapper interface between mpirun_rsh/mpiexec and our batch system. It provides functionality for selecting the desired physical network, checking processor loads on the destination nodes, managing the execution environment, and controlling process mapping and affinity (for MVAPICH2). See mvp2run -h for more information.
#!/bin/tcsh
#PBS -l nodes=7:vortex:ppn=12
#PBS -l walltime=48:00:00
#PBS -N parallel_fem
#PBS -j oe
cd $PBS_O_WORKDIR
mvp2run -D -c 12 -C 0.2 -e GRIDX=500 -e GRIDY=400 /path/to/code/parallel_fem {args}