This page shows you how to run a regular bash script as a pipeline. The runAsPipeline script, accessible through the rcbio/1.2 module, converts an input bash script into a pipeline that submits its steps to the Slurm scheduler as jobs for you.
...
Code Block |
---|
srun --pty -p interactive -t 0-12:0:0 --mem 2000MB -c 1 /bin/bash |
mkdir /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline |
cd /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline |
...
This command generates a new bash script named slurmPipeLine.201801100946.sh in the flag folder (201801100946 is the timestamp at which runAsPipeline was invoked). It then test-runs the script: no jobs are actually submitted, and each step is given a fake job ID, 123. If you append run to the end of the command, the pipeline is actually submitted to the Slurm scheduler.
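The difference between a test run and a real run can be pictured with a small wrapper. This is only a sketch of the idea, not how runAsPipeline is implemented: the function submit_step, the variable PIPELINE_MODE, and the script names step1.sh and step2.sh are all hypothetical. The only real Slurm feature it relies on is sbatch --parsable, which prints the job ID of the submitted job.

```shell
#!/bin/bash
# Sketch only: submit_step and PIPELINE_MODE are hypothetical names,
# not part of runAsPipeline.
mode=${PIPELINE_MODE:-test}

# Print a job ID on stdout: a real one from sbatch in "run" mode,
# or the placeholder 123 in test mode (nothing is submitted).
submit_step() {
    if [ "$mode" = "run" ]; then
        sbatch --parsable "$@"
    else
        echo "would run: sbatch $*" >&2
        echo 123
    fi
}

# step1.sh and step2.sh are hypothetical per-step scripts.
first_id=$(submit_step -p short -t 10:0 -c 1 step1.sh)
second_id=$(submit_step -p short -t 10:0 -c 1 --dependency=afterok:"$first_id" step2.sh)
echo "step 2 depends on job $first_id and has job ID $second_id"
```

In test mode every step reports the same placeholder ID, which is why a test run shows 123 for each step and in each dependency.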
Ideally, with useTmp, the software runs faster because it reads the database/reference data from local /tmp disk space rather than from network storage. For this small query the difference is small, and the run may even be slower with local /tmp. If you don't need /tmp, use noTmp instead. With useTmp, the pipeline runner copies the related data to /tmp and automatically updates all file paths to point to the files' locations in /tmp.
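The staging idea behind useTmp can be illustrated by hand. The sketch below is an assumption about the general pattern (copy the reference file to local disk, then repoint the variable), not a description of what runAsPipeline does internally; the file name, its contents, and the variable names are made up, and two temporary directories stand in for network storage and node-local /tmp.

```shell
#!/bin/bash
# Sketch only: file name, contents, and variable names are hypothetical.
workdir=$(mktemp -d)                  # stands in for network storage
echo "John Doe" > "$workdir/universityA.txt"
u="$workdir/universityA.txt"          # path the script originally uses

local_tmp=$(mktemp -d)                # stands in for node-local /tmp
cp "$u" "$local_tmp/"                 # stage the reference data locally
u="$local_tmp/$(basename "$u")"       # repoint the variable at the local copy

grep -H John "$u"                     # the job now reads the local copy
```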
...
Note that only step 2 used -t 50:0; all other steps used the default -t 10:0. The default walltime limit was set in the runAsPipeline command, and the walltime for step 2 was set in the bashScriptV2.sh script.
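To make the default-versus-override rule concrete, here is a minimal sketch that reads step marker lines of the form echoed in the runAsPipeline output on this page (step, dependencies, job name, reference variable, optional sbatch options) and picks the sbatch command for each. It is not the real runAsPipeline parser; the function name sbatch_for_marker and the field interpretation are assumptions based on that output.

```shell
#!/bin/bash
# Sketch only: not the real runAsPipeline parser.
# Default taken from the runAsPipeline command line shown on this page.
default_sbatch="sbatch -p short -t 10:0 -c 1"

# Print the sbatch command for one marker line: the marker's own options
# if a fifth comma-separated field is present, otherwise the default.
sbatch_for_marker() {
    local marker=${1#\#@}     # drop the leading "#@"
    marker=${marker%:}        # drop a trailing ":" if present
    local step deps name ref opts
    IFS=',' read -r step deps name ref opts <<< "$marker"
    echo "step $step ($name): ${opts:-$default_sbatch}"
}

sbatch_for_marker '#@1,0,find1,u:'
sbatch_for_marker '#@2,0,find2,u,sbatch -p short -c 1 -t 50:0'
sbatch_for_marker '#@3,1.2,merge:'
```

Only the second marker carries its own options, so only step 2 ends up with -t 50:0.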
Code Block |
---|
runAsPipeline bashScriptV2.sh "sbatch -p short -t 10:0 -c 1" useTmp # Below is the output: |
converting bashScriptV2.sh to flag/slurmPipeLine.201801161424.sh |
find loopStart: for i in A B; do |
find job marker: #@1,0,find1,u: |
find job: grep -H John $u >> John.txt; grep -H Mike $u >> Mike.txt |
find job marker: #@2,0,find2,u,sbatch -p short -c 1 -t 50:0 |
sbatch options: sbatch -p short -c 1 -t 50:0 |
find job: grep -H Nick $u >> Nick.txt; grep -H Julia $u >> Julia.txt |
find loopend: done |
find job marker: #@3,1.2,merge: |
find job: cat John.txt Mike.txt Nick.txt Julia.txt > all.txt |
flag/slurmPipeLine.201801161424.sh .sh is ready to run. Starting to run ... |
Running flag/slurmPipeLine.201801161424.sh bashScriptV2.sh |
--------------------------------------------------------- |
step: 1, depends on: 0, job name: find1, flag: find1.A reference: .u |
depend on no job |
sbatch -p short -t 10:0 -c 1 --nodes=1 -J 1.0.find1.A -o /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.out -e /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.out /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.sh # Submitted batch job 123 |
step: 2, depends on: 0, job name: find2, flag: find2.A reference: .u |
depend on no job |
sbatch -p short -c 1 -t 50:0 --nodes=1 -J 2.0.find2.A -o /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.out -e /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.out /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.sh # Submitted batch job 123 |
step: 1, depends on: 0, job name: find1, flag: find1.B reference: .u |
depend on no job |
sbatch -p short -t 10:0 -c 1 --nodes=1 -J 1.0.find1.B -o /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.out -e /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.out /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.sh # Submitted batch job 123 |
step: 2, depends on: 0, job name: find2, flag: find2.B reference: .u |
depend on no job |
sbatch -p short -c 1 -t 50:0 --nodes=1 -J 2.0.find2.B -o /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.out -e /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.out /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.sh # Submitted batch job 123 |
step: 3, depends on: 1.2, job name: merge, flag: merge reference: |
depend on multiple jobs |
sbatch -p short -t 10:0 -c 1 --nodes=1 --dependency=afterok:123:123:123:123 -J 3.1.2.merge -o /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.out -e /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.out /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.sh # Submitted batch job 123 |
all submitted jobs: |
job_id depend_on job_flag |
123 null 1.0.find1.A |
123 null 2.0.find2.A |
123 null 1.0.find1.B |
123 null 2.0.find2.B |
123 ..123.123..123.123 3.1.2.merge |
--------------------------------------------------------- |
...
Code Block |
---|
runAsPipeline bashScriptV2.sh "sbatch -p short -t 10:0 -c 1" useTmp run # Below is the output |
converting bashScriptV2.sh to flag/slurmPipeLine.201801101002.run.sh |
find loopStart: #loopStart,i |
find job marker: for i in A B; do |
find job: grep -H John $u >> John.txt; grep -H Mike $u >> Mike.txt |
find job marker: #@2,0,find2,u,sbatch -p short -c 1 -t 50:0 |
sbatch options: sbatch -p short -c 1 -t 50:0 |
find job: grep -H Nick $u >> Nick.txt; grep -H Julia $u >> Julia.txt |
find loopend: done |
find job marker: #@3,1.2,merge: |
find job: cat John.txt Mike.txt Nick.txt Julia.txt > all.txt |
flag/slurmPipeLine.201801101002.run.sh is ready to run. Starting to run ... |
Running flag/slurmPipeLine.201801101002.run.sh |
--------------------------------------------------------- |
step: 1, depends on: 0, job name: find1, flag: find1.A reference: .u |
depend on no job |
sbatch -p short -t 10:0 -c 1 --kill-on-invalid-dep=yes --nodes=1 -J 1.0.find1.A -o /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.out -e /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.out /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.sh # Submitted batch job 8091045 |
step: 2, depends on: 0, job name: find2, flag: find2.A reference: .u |
depend on no job |
sbatch -p short -c 1 -t 50:0 --kill-on-invalid-dep=yes --nodes=1 -J 2.0.find2.A -o /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.out -e /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.out /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.sh # Submitted batch job 8091046 |
step: 1, depends on: 0, job name: find1, flag: find1.B reference: .u |
depend on no job |
sbatch -p short -t 10:0 -c 1 --kill-on-invalid-dep=yes --nodes=1 -J 1.0.find1.B -o /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.out -e /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.out /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.sh # Submitted batch job 8091047 |
step: 2, depends on: 0, job name: find2, flag: find2.B reference: .u |
depend on no job |
sbatch -p short -c 1 -t 50:0 --kill-on-invalid-dep=yes --nodes=1 -J 2.0.find2.B -o /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.out -e /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.out /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.sh # Submitted batch job 8091048 |
step: 3, depends on: 1.2, job name: merge, flag: merge reference: |
depend on multiple jobs |
sbatch -p short -t 10:0 -c 1 --kill-on-invalid-dep=yes --nodes=1 --dependency=afterok:8091045:8091047:8091046:8091048 -J 3.1.2.merge -o /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.out -e /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.out /n/scratch3scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.sh # Submitted batch job 8091049 |
all submitted jobs: |
job_id depend_on job_flag |
8091045 null 1.0.find1.A |
8091046 null 2.0.find2.A |
8091047 null 1.0.find1.B |
8091048 null 2.0.find2.B |
8091049 ..8091045.8091047..8091046.8091048 3.1.2.merge |
--------------------------------------------------------- |
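In the output above, the merge step depends on steps 1 and 2 (written 1.2) and is submitted with --dependency=afterok followed by the job IDs of both loop iterations of each step. The sketch below shows one way such a flag could be assembled. The job IDs are copied from the output above; the array ids_by_step and the function dependency_flag are hypothetical names, and this is not runAsPipeline's own code.

```shell
#!/bin/bash
# Sketch only: job IDs are copied from the log above; the array and
# function names are hypothetical.
ids_by_step=(
    [1]="8091045 8091047"    # find1.A, find1.B
    [2]="8091046 8091048"    # find2.A, find2.B
)

# Turn a "depends on" field such as "1.2" into an sbatch dependency flag.
dependency_flag() {
    local deps=$1 step id flag="--dependency=afterok"
    for step in ${deps//./ }; do            # "1.2" -> steps 1 and 2
        for id in ${ids_by_step[$step]}; do
            flag+=":$id"
        done
    done
    echo "$flag"
}

dependency_flag 1.2
```

With afterok, Slurm starts the dependent job only after all listed jobs have finished successfully.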
...
If you have a bash script with multiple steps and you wish to run it as a Slurm pipeline, modify your script by adding the notation that marks the start and end of any loops and the start of each step that you want to submit as an sbatch job. Then you can use runAsPipeline with your modified bash script, as detailed above.
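As an illustration, the sketch below reconstructs an annotated script from the step markers and job lines echoed in the runAsPipeline output on this page. The assignment u=university$i.txt and the sample input files are assumptions made so the example runs on its own; explicit loop markers are left out because their exact syntax is not fully shown here. Since every marker starts with #, bash treats it as a comment, so the annotated script still runs as an ordinary script.

```shell
#!/bin/bash
# Sketch only: the input files, their contents, and the assignment to u
# are assumptions for illustration.
cd "$(mktemp -d)" || exit 1
for i in A B; do
    printf 'John %s\nMike %s\nNick %s\nJulia %s\n' "$i" "$i" "$i" "$i" > "university$i.txt"
done

# Annotated script body. Marker fields, as read from the output on this
# page: step, steps it depends on, job name, reference variable, and
# optional sbatch options for that step.
for i in A B; do
    u=university$i.txt
    #@1,0,find1,u:
    grep -H John $u >> John.txt; grep -H Mike $u >> Mike.txt
    #@2,0,find2,u,sbatch -p short -c 1 -t 50:0
    grep -H Nick $u >> Nick.txt; grep -H Julia $u >> Julia.txt
done
#@3,1.2,merge:
cat John.txt Mike.txt Nick.txt Julia.txt > all.txt
```

Run directly with bash, the steps execute one after another; passed to runAsPipeline, each marked step would instead become its own sbatch job.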
...
Code Block |
---|
sbatch -p short -t 10:0 -o flag.out -e flag.out flag.sh |
sendJobFinishEmail.sh
is in /n/app/rcbio/1.2/bin/
Let us know if you have any questions by emailing rchelp@hms.harvard.edu. Please include your working folder and the commands used in your email. Any comments and suggestions are welcome!
...