
User commands

User commands           SLURM
Job submission          sbatch [script_file]
Queue list              squeue
Queue list (by user)    squeue -u [user_name]
Job deletion            scancel [job_id]
Job information         scontrol show job [job_id]
Job hold                scontrol hold [job_id]
Job release             scontrol release [job_id]
Node list               sinfo --Node --long
Cluster status          sinfo OR squeue
GUI                     sview (graphical user interface to view and modify SLURM state)
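
A typical submit-and-monitor session combining the commands above might look like the following sketch; the script name job.sh, the user name jsmith, and the job ID 12345 are placeholder values, not part of this system's configuration:

    # Submit the batch script; sbatch prints the new job ID
    sbatch job.sh
    # List only your own jobs in the queue
    squeue -u jsmith
    # Inspect the job, put it on hold, release it, or cancel it
    scontrol show job 12345
    scontrol hold 12345
    scontrol release 12345
    scancel 12345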

Job environment

Environment                     Description
$SLURM_ARRAY_TASK_ID            Job array ID (index) number
$SLURM_ARRAY_JOB_ID             Job array's master job ID number
$SLURM_JOB_ID                   ID of the job allocation
$SLURM_JOB_DEPENDENCY           Set to the value of the --dependency option
$SLURM_JOB_NAME                 Name of the job
$SLURM_JOB_NODELIST             List of nodes allocated to the job
$SLURM_JOB_NUM_NODES            Total number of nodes in the job's resource allocation
$SLURM_JOB_PARTITION            Name of the partition in which the job is running
$SLURM_MEM_PER_NODE             Memory requested per node
$SLURM_NODEID                   Relative ID of the current node within the job allocation
$SLURM_NTASKS                   Number of tasks requested. Same as -n, --ntasks. Use with mpirun, e.g. mpirun -np $SLURM_NTASKS binary
$SLURM_NTASKS_PER_NODE          Number of tasks requested per node. Only set if the --ntasks-per-node option is specified
$SLURM_PROCID                   MPI rank (or relative process ID) of the current process
$SLURM_RESTART_COUNT            If the job has been restarted due to system failure or has been explicitly requeued, set to the number of times the job has been restarted
$SLURM_SUBMIT_DIR               Directory from which sbatch was invoked
$SLURM_SUBMIT_HOST              Hostname of the computer from which sbatch was invoked
$SLURM_TASKS_PER_NODE           Number of tasks to be initiated on each node. Values are comma separated and in the same order as $SLURM_JOB_NODELIST
$SLURM_TOPOLOGY_ADDR            Names of the network switches that may be involved in the job's communications, from the system's top-level switch down to the leaf switch, ending with the node name
$SLURM_TOPOLOGY_ADDR_PATTERN    Component types listed in $SLURM_TOPOLOGY_ADDR; each component is identified as either "switch" or "node", with a period separating each component type
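
These variables are defined only inside a running job. The sketch below shows a batch script that uses a few of them; the application ./my_app and its input file naming are hypothetical examples, not part of this documentation:

    #!/bin/bash
    #SBATCH --job-name=env_demo
    #SBATCH --ntasks=4
    #SBATCH --array=1-10
    #SBATCH --time=00:10:00

    # Run from the directory the job was submitted from
    cd "$SLURM_SUBMIT_DIR"

    echo "Job $SLURM_JOB_ID, array task $SLURM_ARRAY_TASK_ID of array $SLURM_ARRAY_JOB_ID"
    echo "Allocated nodes: $SLURM_JOB_NODELIST"

    # Start one MPI process per requested task, as described for $SLURM_NTASKS above
    mpirun -np $SLURM_NTASKS ./my_app input.$SLURM_ARRAY_TASK_ID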

Job specification

Job specification       SLURM
Script directive        #SBATCH
Account to charge       --account=[account]
Begin Time              --begin=YYYY-MM-DD[THH:MM[:SS]]
Combine stdout/stderr   (use --output without --error)
Copy Environment        --export=[ALL|NONE|variable]
CPU Count               --ntasks=[count]
CPUs Per Task           --cpus-per-task=[count]
Email Address           --mail-user=[address]
Event Notification      --mail-type=[events], e.g. BEGIN, END, FAIL, REQUEUE, ALL (any state change)
Generic Resources       --gres=[resource_spec], e.g. gpu:4 or mic:4
Node Features           --constraint=[feature], e.g. k20x, s10k or 5110p
Job Arrays              --array=[array_spec]
Job Dependency          --depend=[state:job_id]
Job Host Preference     --nodelist=[nodes] and/or --exclude=[nodes]
Job Name                --job-name=[name]
Job Restart             --requeue OR --no-requeue
Licenses                --licenses=[license_spec]
Memory Size             --mem=[size][M|G|T]
Node Count              --nodes=[min[-max]]
Quality of Service      --qos=[name]
Queue                   --partition=[queue]
Resource Sharing        --exclusive OR --shared
Standard Error File     --error=[file_name]
Standard Output File    --output=[file_name]
Tasks Per Node          --ntasks-per-node=[count]
Wall Clock Limit        --time=[min] OR [days-hh:mm:ss]
Working Directory       --workdir=[dir_name]
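
Putting several directives together, a complete script header might look like the sketch below; the account, partition, constraint, and e-mail values are placeholders to be replaced with ones valid on your cluster:

    #!/bin/bash
    #SBATCH --job-name=example
    #SBATCH --account=myproject
    #SBATCH --partition=compute
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=16
    #SBATCH --mem=32G
    #SBATCH --time=1-00:00:00
    #SBATCH --constraint=k20x
    #SBATCH --mail-type=END,FAIL
    #SBATCH --mail-user=user@example.com
    #SBATCH --output=job_%j.out
    #SBATCH --error=job_%j.err

    # Application commands follow the directives; srun launches across the allocation
    srun ./my_app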