This page shows you how to run a regular bash script as a pipeline. The runAsPipeline script, accessible through the rcbio/1.2 module, converts an input bash script to a pipeline that easily submits jobs to the Slurm scheduler for you.

...

Code Block
srun --pty -p interactive -t 0-12:0:0 --mem 2000MB -c 1 /bin/bash
mkdir /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline
cd /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline

...


This command generates a new bash script named slurmPipeLine.201801100946.sh in the flag folder (201801100946 is the timestamp at which runAsPipeline was invoked). It then test-runs the script: no jobs are actually submitted, and a fake job ID, 123, is reported for each step. If you append run to the end of the command, the pipeline is actually submitted to the Slurm scheduler.

Ideally, with useTmp, the software runs faster because it reads the database/reference from local /tmp disk space rather than from network storage. For this small query the difference is small, and using local /tmp may even be slower. If you don't need /tmp, you can use noTmp.

With useTmp, the pipeline runner copies the related data to /tmp and automatically updates all file paths to point to the files' locations in /tmp.
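
The copy-then-repoint idea can be sketched in plain bash. This is only an illustration of the pattern, not runAsPipeline's actual implementation; the helper name stage_to_tmp and the demo file are hypothetical.

```shell
#!/bin/bash
# Illustration only: copy a reference file to local temporary storage and
# use the local copy's path from then on. stage_to_tmp is a hypothetical
# helper, not part of rcbio.

stage_to_tmp() {
    local src=$1
    local tmp_root=${TMPDIR:-/tmp}
    local dest_dir
    dest_dir=$(mktemp -d "$tmp_root/stage.XXXXXX") || return 1
    cp "$src" "$dest_dir/" || return 1
    # Print the new location so the caller can update its variable.
    echo "$dest_dir/$(basename "$src")"
}

# Demo with a small made-up reference file.
workdir=$(mktemp -d)
echo "John Smith" > "$workdir/universityA.txt"

u="$workdir/universityA.txt"   # original location (network storage in practice)
u=$(stage_to_tmp "$u")         # $u now points at the local copy
grep -H John "$u"
```

Because the variable is reassigned to the staged path, every later command that reads $u uses the local copy without further changes.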

...

Note that only step 2 uses -t 50:0; all other steps use the default -t 10:0. The default walltime limit is set in the runAsPipeline command, and the walltime for step 2 is set in the bashScriptV2.sh script.
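
The override rule can be illustrated with a short bash sketch. It assumes the marker layout #@step,dependsOn,jobName,reference[,sbatch options], which is inferred from the example output on this page; the function options_for_step is hypothetical and is not how runAsPipeline itself is implemented.

```shell
#!/bin/bash
# Illustration only: choose the sbatch options for a step from its marker.
# Assumed marker layout (inferred from the example output):
#   #@step,dependsOn,jobName,reference[,sbatch options]

# Default options, as passed in the second argument of runAsPipeline.
default_opts="sbatch -p short -t 10:0 -c 1"

options_for_step() {
    local body=${1#'#@'}   # drop the leading "#@"
    body=${body%:}         # drop a trailing ":" if present
    local step deps name ref opts
    # The last variable receives everything after the fourth comma.
    IFS=, read -r step deps name ref opts <<< "$body"
    # Fall back to the default when the marker carries no options.
    echo "${opts:-$default_opts}"
}

options_for_step '#@1,0,find1,u:'
options_for_step '#@2,0,find2,u,sbatch -p short -c 1 -t 50:0'
options_for_step '#@3,1.2,merge:'
```

Only the second marker carries its own options, so only step 2 gets -t 50:0; the other two calls print the default.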

Code Block
runAsPipeline bashScriptV2.sh "sbatch -p short -t 10:0 -c 1" useTmp

# Below is the output: 
converting bashScriptV2.sh to flag/slurmPipeLine.201801161424.sh

find loopStart: for i in A B; do 	

find job marker:
#@1,0,find1,u:     

find job:
grep -H John $u >>  John.txt; grep -H Mike $u >>  Mike.txt        

find job marker:
#@2,0,find2,u,sbatch -p short -c 1 -t 50:0
sbatch options: sbatch -p short -c 1 -t 50:0

find job:
grep -H Nick $u >>  Nick.txt; grep -H Julia $u >>  Julia.txt
find loopend: done                    

find job marker:
#@3,1.2,merge:           

find job:
cat John.txt Mike.txt Nick.txt Julia.txt > all.txt
flag/slurmPipeLine.201801161424.sh is ready to run. Starting to run ...
Running flag/slurmPipeLine.201801161424.sh bashScriptV2.sh
---------------------------------------------------------

step: 1, depends on: 0, job name: find1, flag: find1.A reference: .u
depend on no job
sbatch -p short -t 10:0 -c 1 --nodes=1  -J 1.0.find1.A -o /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.out -e /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.out /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.sh
# Submitted batch job 123

step: 2, depends on: 0, job name: find2, flag: find2.A reference: .u
depend on no job
sbatch -p short -c 1 -t 50:0 --nodes=1  -J 2.0.find2.A -o /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.out -e /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.out /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.sh
# Submitted batch job 123

step: 1, depends on: 0, job name: find1, flag: find1.B reference: .u
depend on no job
sbatch -p short -t 10:0 -c 1 --nodes=1  -J 1.0.find1.B -o /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.out -e /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.out /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.sh
# Submitted batch job 123

step: 2, depends on: 0, job name: find2, flag: find2.B reference: .u
depend on no job
sbatch -p short -c 1 -t 50:0 --nodes=1  -J 2.0.find2.B -o /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.out -e /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.out /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.sh
# Submitted batch job 123

step: 3, depends on: 1.2, job name: merge, flag: merge reference:
depend on multiple jobs
sbatch -p short -t 10:0 -c 1 --nodes=1 --dependency=afterok:123:123:123:123 -J 3.1.2.merge -o /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.out -e /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.out /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.sh
# Submitted batch job 123

all submitted jobs:
job_id       depend_on              job_flag  
123         null                  1.0.find1.A
123         null                  2.0.find2.A
123         null                  1.0.find1.B
123         null                  2.0.find2.B
123         ..123.123..123.123    3.1.2.merge
---------------------------------------------------------

...

Code Block
runAsPipeline bashScriptV2.sh "sbatch -p short -t 10:0 -c 1" useTmp run

# Below is the output
converting bashScriptV2.sh to flag/slurmPipeLine.201801101002.run.sh

find loopStart: #loopStart,i

find job marker: for i in A B; do

find job:
grep -H John $u >> John.txt; grep -H Mike $u >> Mike.txt

find job marker:
#@2,0,find2,u,sbatch -p short -c 1 -t 50:0
sbatch options: sbatch -p short -c 1 -t 50:0

find job:
grep -H Nick $u >> Nick.txt; grep -H Julia $u >> Julia.txt
find loopend: done

find job marker:
#@3,1.2,merge:

find job:
cat John.txt Mike.txt Nick.txt Julia.txt > all.txt
flag/slurmPipeLine.201801101002.run.sh is ready to run. Starting to run ...
Running flag/slurmPipeLine.201801101002.run.sh
---------------------------------------------------------

step: 1, depends on: 0, job name: find1, flag: find1.A reference: .u
depend on no job
sbatch -p short -t 10:0 -c 1 --kill-on-invalid-dep=yes --nodes=1 -J 1.0.find1.A -o /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.out -e /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.out /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.A.sh
# Submitted batch job 8091045

step: 2, depends on: 0, job name: find2, flag: find2.A reference: .u
depend on no job
sbatch -p short -c 1 -t 50:0 --kill-on-invalid-dep=yes --nodes=1 -J 2.0.find2.A -o /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.out -e /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.out /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.A.sh
# Submitted batch job 8091046

step: 1, depends on: 0, job name: find1, flag: find1.B reference: .u
depend on no job
sbatch -p short -t 10:0 -c 1 --kill-on-invalid-dep=yes --nodes=1 -J 1.0.find1.B -o /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.out -e /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.out /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/1.0.find1.B.sh
# Submitted batch job 8091047

step: 2, depends on: 0, job name: find2, flag: find2.B reference: .u
depend on no job
sbatch -p short -c 1 -t 50:0 --kill-on-invalid-dep=yes --nodes=1 -J 2.0.find2.B -o /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.out -e /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.out /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/2.0.find2.B.sh
# Submitted batch job 8091048

step: 3, depends on: 1.2, job name: merge, flag: merge reference:
depend on multiple jobs
sbatch -p short -t 10:0 -c 1 --kill-on-invalid-dep=yes --nodes=1 --dependency=afterok:8091045:8091047:8091046:8091048 -J 3.1.2.merge -o /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.out -e /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.out /n/scratch/users/a/abc123/testRunBashScriptAsSlurmPipeline/flag/3.1.2.merge.sh
# Submitted batch job 8091049
all submitted jobs:
job_id depend_on job_flag
8091045 null 1.0.find1.A
8091046 null 2.0.find2.A
8091047 null 1.0.find1.B
8091048 null 2.0.find2.B
8091049 ..8091045.8091047..8091046.8091048 3.1.2.merge
---------------------------------------------------------
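
In the output above, the merge step is submitted with --dependency=afterok followed by the IDs of the four find jobs, so Slurm starts it only after all of them have finished successfully. A minimal sketch of how such a flag can be assembled from a list of job IDs is shown below; dependency_flag is a hypothetical helper for illustration, not a function provided by rcbio.

```shell
#!/bin/bash
# Illustration only: build an sbatch dependency flag from upstream job IDs.
# dependency_flag is a hypothetical helper, not part of rcbio.

dependency_flag() {
    # No upstream jobs means no dependency flag at all.
    [ "$#" -eq 0 ] && return 0
    # With IFS set to ":", "$*" joins all arguments with colons.
    local IFS=:
    echo "--dependency=afterok:$*"
}

# Job IDs taken from the example run above.
dependency_flag 8091045 8091047 8091046 8091048
```

When submitting by hand, the job IDs would come from the "Submitted batch job" lines that sbatch prints, or from sbatch --parsable, which prints the job ID without the surrounding text.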

...

If you have a bash script with multiple steps and you wish to run it as a Slurm pipeline, modify your script by adding the notation that marks the start and end of any loops and the start of each step you want to submit as an sbatch job. Then you can use runAsPipeline with your modified bash script, as detailed above.
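
As a rough guide, an annotated script consistent with the example output on this page might look like the sketch below. The marker layout #@step,dependsOn,jobName,reference[,sbatch options] is inferred from that output. The input file names (universityA.txt, universityB.txt) and the setup lines that create them are assumptions added so the sketch can run on its own.

```shell
#!/bin/bash
# Sketch of an annotated script, reconstructed from the example output.
# The markers are ordinary comments, so the script still runs as plain bash.
# Assumed marker layout: #@step,dependsOn,jobName,reference[,sbatch options]

# Demo setup only (hypothetical input files in a scratch directory).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf 'John\nNick\n'  > universityA.txt
printf 'Mike\nJulia\n' > universityB.txt

for i in A B; do
    u=university$i.txt    # assumed naming; the real input files are not shown

    #@1,0,find1,u:
    grep -H John $u >> John.txt; grep -H Mike $u >> Mike.txt

    #@2,0,find2,u,sbatch -p short -c 1 -t 50:0
    grep -H Nick $u >> Nick.txt; grep -H Julia $u >> Julia.txt
done

#@3,1.2,merge:
cat John.txt Mike.txt Nick.txt Julia.txt > all.txt
```

Steps 1 and 2 depend on nothing (0) and run once per loop iteration, while step 3 depends on steps 1 and 2 (1.2) and runs once after the loop. Step 2 carries its own sbatch options; the other steps use the defaults given on the runAsPipeline command line.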

...

Code Block
sbatch -p short -t 10:0 -o flag.out -e flag.out flag.sh

sendJobFinishEmail.sh is in /n/app/rcbio/1.2/bin/


Let us know if you have any questions by emailing rchelp@hms.harvard.edu. Please include your working folder and the commands used in your email. Any comments and suggestions are welcome!

...