Commands

List of useful Slurm commands

  • Check Slurm cluster state:

    • sinfo will display general cluster information.
    • sinfo -s will display the summary of cluster information.
    • sinfo -N -l will display the status of each node
    • scontrol show nodes will display detailed information about the state of each node, helpful for debugging purposes
    • scontrol show partitions will display all available partitions
    • scontrol show partition <partition_name> will display detailed information about the given partition
  • Check Slurm user information:

    • sacctmgr show user will display general user information
    • sacctmgr show association will display user associations (to quotas, resource limitations, etc)
  • Check Slurm job states:

    • watch squeue -u $USER will display your pending and running jobs, refreshing every two seconds (press Ctrl+C to exit)
    • squeue -u $USER -o "%.18i %.20P %.15j %.8u %.8T %.10M %.20R" will also display information about the scheduled jobs with more details
    • watch sacct will display the current state of each job (press Ctrl+C to exit)
    • sacct -N slurm-worker-cpu-1 will show the list of executed jobs in which the given node was involved
    • sacct -u konrad --format=JobID,JobName,Partition,State,Elapsed -S now-1hour will display the jobs of the user konrad from the last hour
    • scontrol show job <job_id> will display detailed information about the job
    • sstat <job_id> will display status information (such as CPU and memory usage) about a running job
    • scontrol getaddrs $(scontrol show job <job_id> | grep "NodeList=slurm" | cut -d '=' -f 2) | awk '{print $2}' | cut -d ':' -f 1 will display the IP address of the worker node on which the given interactive job is running (the grep pattern assumes that node names start with "slurm")
  • Schedule Slurm jobs:

    • use sbatch test_job.batch to schedule a job
    • to schedule a job on a particular partition, for example a GPU partition, use sbatch -p gpu_large test_job.batch (options must come before the script name; anything after it is passed to the script as an argument)
    • use scancel <job_id> to cancel a job
  • Check resource limitations:

    • use the quota -u $USER -s command to check the storage space available
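    • sacctmgr show qos format=name%30,MaxJobsPerUser%30,MaxSubmitJobsPerUser%30,MaxTRESPerJob%30 will display the resource limits defined for each QOS (which may be attached to partitions): the maximum number of running jobs per user, the maximum number of submitted jobs per user, and the maximum resources per job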
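
The list above refers to test_job.batch without showing its contents. As a minimal sketch, such a file might look like the following. The job name, the resource values, and the describe_job helper are illustrative assumptions and are not taken from this cluster's configuration; adjust them to the partitions and limits reported by the commands above.

```shell
#!/bin/bash
# Hypothetical test_job.batch: all values below are placeholders.
#SBATCH --job-name=test_job
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --output=slurm-%j.out

# Slurm exports SLURM_JOB_ID and SLURM_JOB_NODELIST inside a job.
# The fallbacks let the script also run outside Slurm for a quick check.
describe_job() {
    echo "job=${SLURM_JOB_ID:-none} nodes=${SLURM_JOB_NODELIST:-unknown}"
}

describe_job
```

When submitted with sbatch test_job.batch, the line printed by describe_job should end up in slurm-<job_id>.out in the directory the job was submitted from. A partition can still be chosen at submission time, since command-line options take precedence over the #SBATCH directives in the file.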