Job Handling

Submitting the Job

Submit a SLURM job with the sbatch command. SLURM reads the submit file and schedules the job according to the description in that file.

To submit the job described above:

$ sbatch example.sh
Submitted batch job 62
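
In scripts it is often handy to capture the job ID that sbatch prints, so that later squeue or scancel commands can refer to it. The following is a minimal sketch and not part of the original guide: the helper name job_id_from_sbatch is hypothetical, and it assumes the confirmation line has exactly the form shown above.

```shell
# Hypothetical helper: print the job ID from an sbatch confirmation line
# such as "Submitted batch job 62". Prints nothing and returns non-zero
# if the line does not have that form.
job_id_from_sbatch() {
    printf '%s\n' "$1" | awk '
        $1 == "Submitted" && $2 == "batch" && $3 == "job" && $4 ~ /^[0-9]+$/ {
            print $4
            found = 1
        }
        END { exit (found ? 0 : 1) }'
}

# Intended use on a system with SLURM (not executed here):
#   jobid=$(job_id_from_sbatch "$(sbatch example.sh)")
#   squeue --job "$jobid"
```

sbatch also has a --parsable option that prints the job ID without the surrounding text (followed by a semicolon and the cluster name on multi-cluster setups), which may make the parsing step unnecessary.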

Checking Job Status

To check the status of your job, use the squeue command. It will provide information such as:

  • The State (ST) of the job:
    • R - Running
    • PD - Pending - Job is awaiting resource allocation.
    • Additional codes are available on the squeue page.
  • Job Name
  • Run Time
  • Nodes running the job

To check the status of jobs owned by a specific user, use the -u option:

$ squeue -u <username>
  JOBID PARTITION     NAME       USER  ST       TIME  NODES NODELIST(REASON)
     65   sixhour hello-wo <username>   R       0:56      1 g004

Additionally, to see the status of jobs in a specific partition (for example, one you have access to), use the -p option to squeue:

$ squeue -p sixhour
  JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
  73435  sixhour  MyRandom  jayhawk   R   10:35:20      1 r10r29n1
  73436  sixhour  MyRandom  jayhawk   R   10:35:20      1 r10r29n1
  73735  sixhour  SW2_driv   bigjay   R   10:14:11      1 r31r29n1
  73736  sixhour  SW2_driv   bigjay   R   10:14:11      1 r31r29n1
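
When a partition is busy, a per-user summary can be easier to read than the full listing. The sketch below is an illustration, not part of the original guide: count_jobs_by_state is a hypothetical helper, and it assumes its input is one "user state" pair per line, which squeue can produce with the -h (no header) and -o (output format) options using the %u (user) and %t (compact state) specifiers.

```shell
# Hypothetical helper: read "<user> <state>" pairs on stdin and print
# "<user> <state> <count>" for each distinct pair, in sorted order.
count_jobs_by_state() {
    awk 'NF == 2 { n[$1 " " $2]++ }
         END { for (k in n) print k, n[k] }' | LC_ALL=C sort
}

# Intended use on a system with SLURM (not executed here):
#   squeue -h -p sixhour -o '%u %t' | count_jobs_by_state
```

For the four jobs in the listing above, this would print "bigjay R 2" and "jayhawk R 2".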

Checking Job Start

You can view the expected start time of your pending jobs with the command squeue --start.

$ squeue --start --user bigjay
  JOBID  PARTITION     NAME     USER  ST           START_TIME  NODES NODELIST(REASON)
   5822    sixhour  Jobname   bigjay  PD  2018-08-24T00:05:09      3 (Priority)
   5823    sixhour  Jobname   bigjay  PD  2018-08-24T00:07:39      3 (Priority)
   5824    sixhour  Jobname   bigjay  PD  2018-08-24T00:09:09      3 (Priority)
   5825    sixhour  Jobname   bigjay  PD  2018-08-24T00:12:09      3 (Priority)
   5826    sixhour  Jobname   bigjay  PD  2018-08-24T00:12:39      3 (Priority)
   5827    sixhour  Jobname   bigjay  PD  2018-08-24T00:12:39      3 (Priority)
   5828    sixhour  Jobname   bigjay  PD  2018-08-24T00:12:39      3 (Priority)
   5829    sixhour  Jobname   bigjay  PD  2018-08-24T00:13:09      3 (Priority)
   5830    sixhour  Jobname   bigjay  PD  2018-08-24T00:13:09      3 (Priority)
   5831    sixhour  Jobname   bigjay  PD  2018-08-24T00:14:09      3 (Priority)
   5832    sixhour  Jobname   bigjay  PD                  N/A      3 (Priority)

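The output shows the expected start time of each job, as well as the reason the job is currently pending (in this case Priority: the user's priority is low because they are already running numerous jobs). A START_TIME of N/A means the scheduler has not yet computed an estimate for that job.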
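
To pick out the jobs that do not have an estimate yet, the listing can be filtered. This is a minimal sketch under stated assumptions, not part of the original guide: jobs_without_estimate is a hypothetical helper, and it expects one "jobid start_time" pair per line, which squeue can produce with -h and -o using the %i (job ID) and %S (start time) specifiers.

```shell
# Hypothetical helper: read "<jobid> <start_time>" pairs on stdin and print
# the IDs of jobs whose start time is still N/A.
jobs_without_estimate() {
    awk '$2 == "N/A" { print $1 }'
}

# Intended use on a system with SLURM (not executed here):
#   squeue --start -h -u <username> -o '%i %S' | jobs_without_estimate
```

For the listing above, this would print only 5832.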

Cancel the Job

Cancel a job with the scancel command. In its simplest form, it takes the job ID as its only argument:

$ scancel 2234

Job History

sacct displays accounting and usage information for currently running jobs as well as previously run jobs. Its output can be customized with options. For example, to list the jobs of a specific user:

$ sacct -u <user>

170          parallel_+    sixhour        crc          4  COMPLETED      0:0
170.batch         batch                   crc          4  COMPLETED      0:0
171          parallel_+    sixhour        crc          4 CANCELLED+      0:0
171.batch         batch                   crc          4  CANCELLED     0:15

Show all job information starting from a specific date:

$ sacct --starttime 2014-07-01

Show job accounting information for a specific job (add -l for the long format with more fields):

$ sacct -j <jobid>
$ sacct -j <jobid> -l
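
When reviewing job history in a script, it can help to list only the jobs that did not complete. The following is a sketch under stated assumptions rather than a recommended workflow: not_completed_jobs is a hypothetical helper, and it expects pipe-delimited "JobID|State|ExitCode" lines, which sacct can produce with -X (allocations only, no job steps), -n (no header), -P (pipe-delimited output) and --format.

```shell
# Hypothetical helper: read "JobID|State|ExitCode" lines on stdin and print
# every job whose state is anything other than COMPLETED. Note that this
# also reports jobs that are still RUNNING or PENDING.
not_completed_jobs() {
    awk -F'|' 'NF >= 3 && $2 != "COMPLETED" {
        print $1 ": " $2 " (exit " $3 ")"
    }'
}

# Intended use on a system with SLURM (not executed here):
#   sacct -u <user> -X -n -P --format=JobID,State,ExitCode | not_completed_jobs
```

Splitting on the pipe character rather than on whitespace matters here because a state such as "CANCELLED by <uid>" contains spaces.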

SLURM Commands

Below are some common, useful SLURM commands:

SLURM Command - Function

  • sacct - Reports job or job step accounting information about active or completed jobs.
  • sinfo - Reports the state of partitions and nodes managed by SLURM. It has a wide variety of filtering, sorting, and formatting options.
  • srun - Submits a job for execution or initiates job steps in real time. srun has a wide variety of options to specify resource requirements, including: minimum and maximum node count, processor count, specific nodes to use or not use, and specific node characteristics (amount of memory, disk space, certain required features, etc.). A job can contain multiple job steps executing sequentially or in parallel on independent or shared nodes within the job's node allocation.
  • squeue - Reports the state of jobs or job steps. It has a wide variety of filtering, sorting, and formatting options. By default, it reports the running jobs in priority order and then the pending jobs in priority order.
  • squeue -u <username> - Displays the jobs submitted by the specified <username>.
  • squeue -p <partition> - Displays the jobs in the specified <partition>. Jobs submitted to the sixhour partition are not listed, even if they are running on nodes of an owner partition.
  • scontrol show job <jobid> - Checks the status of a job (<jobid>).
  • squeue --start --job <jobid> - Shows an estimate of when your job (<jobid>) might start.
  • scontrol show nodes <node_name> - Checks the status of a node (<node_name>).
  • scancel <jobid> - Cancels a job.