The Slurm srun command can run a job step using the resources already allocated to a running batch (sbatch) job.

For example, this approach can be used to open a shell directly on the node where the sbatch job is running, or to run short commands against the resources allocated to that job. The shell or commands started via srun are constrained to the resources available to the sbatch job.

The syntax to use is srun --jobid=<jobid_number> ...

First, submit your sbatch job, in this example called my_sbatch_job.sh:

#!/bin/bash

#SBATCH -c 1            # Number of cores requested
#SBATCH -t 4:00:00      # Wall-time
#SBATCH -p priority     # Partition
#SBATCH --mem=2G        # memory per node

# your sbatch job commands here
python3 python_sleep.py

@login03:SLURM sbatch my_sbatch_job.sh
Submitted batch job 10610731
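If the job ID is needed in a script rather than copied by hand, it can be pulled out of the confirmation line shown above. The following is a minimal sketch; the helper name parse_jobid is hypothetical and only assumes that sbatch prints a line of the form "Submitted batch job <number>".

```shell
# Extract the numeric job ID from sbatch's confirmation line,
# e.g. "Submitted batch job 10610731" -> "10610731".
# parse_jobid is a hypothetical helper name, not part of Slurm.
parse_jobid() {
    # Print the 4th field only when the line has the expected shape;
    # print nothing for error messages or other output.
    printf '%s\n' "$1" | awk '/^Submitted batch job [0-9]+/ { print $4 }'
}

# Typical use (commented out because it needs a Slurm cluster):
# jobid=$(parse_jobid "$(sbatch my_sbatch_job.sh)")
# srun --jobid="$jobid" hostname
```

As an alternative, sbatch also has a --parsable option that prints the job ID without the surrounding text (followed by a semicolon and the cluster name on multi-cluster setups), which may remove the need for parsing.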

Once the sbatch job is running, you can start a shell as a Slurm job step that uses the resources already allocated to that job (10610731 in this example).

@login03:~ srun --jobid=10610731 --pty bash
@compute-e-16-233:~

Everything executed within that srun shell runs on the same resources already allocated to the sbatch job.
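Because the shell can only be attached once the job has left the queue and started, a script may want to wait for that moment first. The sketch below polls the job state with squeue; the function names job_state and wait_until_running, the retry count, and the pause length are all illustrative assumptions rather than anything prescribed by this page.

```shell
# Print the job's current state (e.g. PENDING, RUNNING).
# Assumes squeue is on PATH; -h drops the header, -o %T selects the state.
job_state() {
    squeue -h -j "$1" -o %T | awk 'NR == 1 { print $1 }'
}

# Poll until the job reports RUNNING, or give up after MAX_TRIES polls.
# Usage: wait_until_running JOBID [MAX_TRIES] [SLEEP_SECONDS]
# Returns 0 if the job was seen running, 1 otherwise.
wait_until_running() {
    wait_jobid="$1"
    wait_max="${2:-60}"      # hypothetical default: 60 polls
    wait_pause="${3:-10}"    # hypothetical default: 10 seconds between polls
    wait_try=0
    while [ "$wait_try" -lt "$wait_max" ]; do
        if [ "$(job_state "$wait_jobid")" = "RUNNING" ]; then
            return 0
        fi
        wait_try=$((wait_try + 1))
        sleep "$wait_pause"
    done
    return 1
}

# Typical use (commented out because it needs a Slurm cluster):
# wait_until_running 10610731 && srun --jobid=10610731 --pty bash
```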

It is also possible to run non-interactive commands as an srun job step, for example:

rp189@login03:~ srun --jobid=10610731 hostname
compute-e-16-233.o2.rc.hms.harvard.edu

Here the Linux command hostname was executed on the compute node where job 10610731 was dispatched, using the resources allocated to that job.
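When several short commands are sent to the same job, a small wrapper can cut down on typing and make it easy to preview what will be run. This is only a sketch: run_in_job and the DRY_RUN variable are hypothetical names introduced for illustration.

```shell
# Run a command as a job step inside an existing sbatch allocation.
# Usage: run_in_job JOBID COMMAND [ARGS...]
# With DRY_RUN=1 the srun command line is printed instead of executed.
# run_in_job and DRY_RUN are hypothetical names, not part of Slurm.
run_in_job() {
    run_jobid="$1"
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "srun --jobid=${run_jobid} $*"
    else
        srun --jobid="${run_jobid}" "$@"
    fi
}

# Typical use (commented out because it needs a running job):
# run_in_job 10610731 hostname
# run_in_job 10610731 ps -u "$USER" -o pid,pcpu,pmem,comm
```

Depending on the Slurm version and site configuration, a job step started this way may wait until the batch script's own resources are free unless srun is also given the --overlap option; whether that applies on a given cluster is an assumption to check locally.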

What you can do with srun --jobid=<jobid_number> ... :

  • Monitor your running sbatch jobs in real time.

  • Use resources allocated to a running sbatch job that you know are idle at a given time, for example GPU or CPU computing power that is temporarily unused.

  • Get immediate access to computational resources when you need them urgently. For example, you might be running a GPU job that still has enough free VRAM (GPU memory) on its card to run a separate process.

Note:

Any command you run with srun --jobid=<jobid_number> uses the resources allocated to the running sbatch job, so it competes with that job's own processes for the same CPU, RAM and GPU resources.
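Since an extra process shares the job's GPU memory, it can be worth checking how much VRAM is free before launching it. The sketch below assumes nvidia-smi is installed on the compute node and that the job step can see the job's GPU; has_free_vram and the 4000 MiB threshold are hypothetical choices for illustration.

```shell
# Succeeds if any GPU listed on stdin has at least MIN_MIB of free memory.
# Usage: ... | has_free_vram MIN_MIB
# Expects one integer per line (free memory in MiB), as printed by
#   nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits
# has_free_vram is a hypothetical helper name.
has_free_vram() {
    awk -v min="$1" '
        $1 + 0 >= min + 0 { found = 1 }   # at least one GPU meets the threshold
        END { exit(found ? 0 : 1) }
    '
}

# Typical use (commented out because it needs a running GPU job):
# srun --jobid=10610731 nvidia-smi --query-gpu=memory.free \
#     --format=csv,noheader,nounits | has_free_vram 4000 \
#     && echo "enough free VRAM for a second process"
```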