This page shows how to run GATK4 using our recently installed Singularity GATK4 container. The runAsPipeline
script, available through the rcbio/1.0
module, converts a bash script into a pipeline that submits jobs to the Slurm scheduler for you.
...
The workflows were downloaded from https://github.com/gatk-workflows/gatk4-rnaseq-germline-snps-indels and modified to work on the O2 Slurm cluster.
Note that the original workflow uses the reference and annotation files listed in this file:
We downloaded the genome reference and all annotation files from
https://console.cloud.google.com/storage/browser/genomics-public-data/references/Homo_sapiens_assembly19_1000genomes_decoy/ except for the GTF file, which was downloaded from https://console.cloud.google.com/storage/browser/gatk-test-data/intervals?project=broad-dsde-outreach
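
The reference files are already in place on the cluster, so the following is only relevant if you want to reproduce the download yourself. The two links above are Cloud Console browser URLs, which map to `gs://bucket/path` URIs. This is a minimal sketch, not part of the pipeline: the helper name `console_url_to_gs` is hypothetical, and the commented-out `gsutil` line assumes the Google Cloud SDK is installed and that the buckets are publicly readable.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of rcbio or the GATK workflow):
# convert a Cloud Console storage-browser URL into a gs:// URI.
console_url_to_gs() {
    local prefix="https://console.cloud.google.com/storage/browser/"
    local url=${1%%\?*}          # drop any "?project=..." query string
    case $url in
        "$prefix"*)
            printf 'gs://%s\n' "${url#"$prefix"}"
            ;;
        *)
            echo "not a Cloud Storage browser URL: $1" >&2
            return 1
            ;;
    esac
}

# The two locations mentioned on this page.
for url in \
    "https://console.cloud.google.com/storage/browser/genomics-public-data/references/Homo_sapiens_assembly19_1000genomes_decoy/" \
    "https://console.cloud.google.com/storage/browser/gatk-test-data/intervals?project=broad-dsde-outreach"
do
    console_url_to_gs "$url"
done

# Assumption: gsutil is available. Uncomment to copy a whole directory.
# gsutil -m cp -r "$(console_url_to_gs "$url")" ./references/
```

The conversion is kept separate from the download so you can check the resulting URIs before transferring anything.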
We then modified the JSON inputs file, producing this one:
Code Block |
---|
/n/shared_db/singularity/hmsrc-gatk/scripts/gatk4-rna-germline-variant-calling.inputs.template.json |
...
Code Block |
---|
# Set up the path and environment variables for the pipeline. |
module load gcc/6.2.0 python/2.7.12 java/jdk-1.8u112 star/2.5.4a rcbio/1.3.3 |
export PATH=/n/shared_db/singularity/hmsrc-gatk/bin:/home/ld32/rcbioDev/bin:$PATH |
# Set up the database. This only needs to be run once. It creates the database in your home directory, so make sure you have at least 5 GB of free space there. |
setupDB.sh |
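
Because the database setup needs about 5 GB in your home directory, it can help to check free space first. The sketch below is an illustration only: `has_enough_kb` and `free_kb` are hypothetical helper names, not part of the rcbio module. It also assumes that `df` reflects the space actually available to you. If your home directory is under a per-user quota, `df` may overstate what you can use, so check the quota separately.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of rcbio): check free space before setupDB.sh.

# Succeed if the available space ($1, in KB) is at least $2 GB.
# Kept as a pure comparison so it can be checked without touching the disk.
has_enough_kb() {
    local avail_kb=$1 need_gb=$2
    [ "$avail_kb" -ge $(( need_gb * 1024 * 1024 )) ]
}

# Print the free space, in KB, of the file system that holds directory $1.
# "df -P" forces one line per file system; column 4 is the available space.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

target_dir=${HOME:-.}   # fall back to the current directory if HOME is unset

if has_enough_kb "$(free_kb "$target_dir")" 5; then
    echo "OK: at least 5 GB free in $target_dir"
else
    echo "WARNING: less than 5 GB free in $target_dir; free up space before running setupDB.sh"
fi
```

Run it after the `module load` step and before `setupDB.sh`. It only reports the free space and changes nothing.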
...