Features of the new pipeline:

- Submits each step as a cluster job using sbatch.
- Automatically arranges dependencies among jobs.
- Sends an email notification when each job fails or succeeds.
- If a job fails, all of its downstream jobs are automatically killed.
- When re-running the pipeline on the same data folder, if there are unfinished jobs, the user is asked whether to kill them.
- When re-running the pipeline on the same data folder, if a step finished successfully earlier, the user is asked to confirm whether to re-run it.
- You can directly copy and paste the commands to test-run the pipeline.
...
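The dependency handling listed above can be sketched in plain bash: capture the job ID that sbatch prints and pass it to the next submission via --dependency=afterok, so a failed step blocks its downstream jobs. This is a simplified illustration, not the actual runAsPipeline implementation; step1.sh and step2.sh are hypothetical step scripts, and the submit wrapper falls back to a fake job ID so the sketch runs on machines without Slurm.

```shell
#!/bin/bash
# Sketch: chain one Slurm job after another with a success dependency.

# submit: wrapper around 'sbatch --parsable' (prints only the job ID).
# On machines without Slurm it echoes a fake ID, for illustration only.
submit() {
  if command -v sbatch >/dev/null 2>&1; then
    sbatch --parsable "$@"
  else
    echo "1234"   # fake job ID so the sketch runs anywhere
  fi
}

# Submit step 1 and capture its job ID.
jid1=$(submit -p short -t 2:0:0 -n 1 --mem 4G step1.sh)

# Submit step 2 so it starts only if step 1 exits successfully (afterok).
# If step 1 fails, Slurm marks step 2 as DependencyNeverSatisfied,
# so it never runs -- this is how downstream jobs get killed.
jid2=$(submit -p short -t 2:0:0 -n 1 --mem 4G --dependency=afterok:"$jid1" step2.sh)

echo "step2 (job $jid2) waits for step1 (job $jid1)"
```

In the real pipeline, runAsPipeline generates these sbatch calls and the dependency chain for you from the steps in the bash script.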
```
cp /n/app/rcbio/1.1/bin/skewerFastQCHisat2HtseqCount.sh .
```
Now you can modify the options as needed. For example, if you have single-end data, you should add the read length. Please refer to the HISAT2 user manual if you have any questions.
To edit the workflow's bash script:
```
nano skewerFastQCHisat2HtseqCount.sh
```
To test the pipeline, run the following command. In this mode, jobs are not actually submitted to the scheduler.
```
runAsPipeline "skewerFastQCHisat2HtseqCount.sh -s no -r mm10 -a /n/groups/shared_databases/rcbio/skewer_adapters.fa" "sbatch -p short -t 2:0:0 -n 1 --mem 4G" noTmp
# this is a test run
```
To run the pipeline:
```
runAsPipeline "skewerFastQCHisat2HtseqCount.sh -s no -r mm10 -a /n/groups/shared_databases/rcbio/skewer_adapters.fa" "sbatch -p short -t 2:0:0 -n 1 --mem 4G" noTmp run 2>&1 | tee output.log
# notice that 'run 2>&1 | tee output.log' is added to the command
```
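Once the jobs are submitted, you can watch their progress with standard Slurm commands. The squeue call below is generic Slurm usage, not part of runAsPipeline; the block prints a note instead on machines without Slurm.

```shell
#!/bin/bash
# Check on the pipeline's jobs after submission (generic Slurm command).
status=$(
  if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER"   # pending and running jobs for your account
  else
    echo "Slurm is not available on this machine"
  fi
)
echo "$status"

# To follow the pipeline's merged log captured by tee:
# tail -f output.log
```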
To understand how runAsPipeline works, how to check the output, and how to re-run the pipeline, please visit: Run Bash Script As Slurm Pipeline
Now you are ready to run an rcbio workflow.
To run the workflow on your own data instead, transfer the sample sheet to your local machine following this wiki page and modify it there. Then transfer it back to O2 under your account and continue from the build-folder-structure step.