Cutadapt, Bowtie2 and MACS2 for ChIP-seq
Start an interactive job with a walltime of 2 hours and 2000 MB of memory:
srun --pty -p interactive -t 0-02:0:0 --mem 2000MB -n 1 /bin/bash
Create a working directory on scratch and change into the newly created directory. For example, for user abc123:
# -p also creates the parent chipSeq directory if it does not exist yet
mkdir -p /n/scratch/users/a/abc123/chipSeq/lib
cd /n/scratch/users/a/abc123/chipSeq
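The path above is specific to user abc123. As a minimal sketch, the same path can be derived from a user name, assuming every account follows the /n/scratch/users/<first letter>/<user name> layout seen in the abc123 example. The function name scratch_workdir is hypothetical and not part of rcbio.

```shell
# Hypothetical helper: print the chipSeq working directory for a user name.
# Assumes the layout /n/scratch/users/<first letter>/<user>, as in the abc123 example.
scratch_workdir() {
    user=$1
    initial=$(printf '%s' "$user" | cut -c1)   # first character of the user name
    printf '/n/scratch/users/%s/%s/chipSeq\n' "$initial" "$user"
}

# Fall back to the example user when USER is not set.
workdir=$(scratch_workdir "${USER:-abc123}")
echo "$workdir"

# To create and enter the directory:
# mkdir -p "$workdir/lib" && cd "$workdir"
```

The mkdir and cd line is left commented out so the snippet can be tried without touching the file system.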
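# get some testing data (only one library is allowed)
# FASTQ records are 4 lines each, so 40000000 lines is 10 million reads
head -n 40000000 /n/groups/shared_databases/rcbio/SRR34848368_1.with.barcode.fq > lib/in.fq
Build a barcodes.fa file for demultiplexing the library. In the .fa file, each record uses the name of the sample carrying a barcode as the barcode ID, followed by the barcode sequence: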
nano barcodes.fa
>1Tr
ATCACG
>2Tr
CGATGT
>3Tr
TTAGGC
>4Tr
TGACCA
>5Tr
ACAGTG
>1Co
TAGCTT
>2Co
GGCTAC
>3Co
CTTGTA
>4Co
ATATAGGA
>5Co
Clone rcbio, set up the path, and copy the example Bowtie2 and MACS2 bash script:
# This will set up the path and environment variables for the pipeline
git clone https://github.com/ld32/rcbio.git $HOME/rcbio
export PATH=$HOME/rcbio/bin:$PATH
cp $HOME/rcbio/bin/cutadaptBowtie2Macs2.sh .
Now you can modify the command options as needed. To edit the script:
nano cutadaptBowtie2Macs2.sh
To test the pipeline, run the following command. Jobs will not be submitted to the scheduler.
runAsPipeline "cutadaptBowtie2Macs2.sh -r hg38" "sbatch -p short --mem 6G -t 2:0:0 -n 1" noTmp
# Or, if you want to use your own bowtie2 index:
runAsPipeline "cutadaptBowtie2Macs2.sh -b /n/scratch/users/a/abc123/index/hg38GenomeWithChr11Report" "sbatch -p short --mem 6G -t 2:0:0 -n 1" noTmp
# this is a test run
To run the cutadaptBowtie2Macs2.sh pipeline:
runAsPipeline "cutadaptBowtie2Macs2.sh -r hg38" "sbatch -p short --mem 6G -t 2:0:0 -n 1" noTmp run 2>&1 | tee output.log
# Or, if you want to use your own bowtie2 index:
runAsPipeline "cutadaptBowtie2Macs2.sh -b /n/scratch/users/a/abc123/index/hg38GenomeWithChr11Report" "sbatch -p short --mem 6G -t 2:0:0 -n 1" noTmp run 2>&1 | tee output.log
# note that 'run 2>&1 | tee output.log' has been appended to the command
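The `2>&1 | tee output.log` suffix is ordinary shell redirection rather than a runAsPipeline option. The sketch below illustrates what it does with a hypothetical stand-in command, demo_cmd, and a temporary file in place of output.log. It does not reproduce the actual messages printed by runAsPipeline.

```shell
# Hypothetical stand-in for a command that writes to both output streams.
demo_cmd() {
    echo "step submitted"             # written to stdout
    echo "warning: check input" >&2   # written to stderr
}

log=$(mktemp)   # temporary file standing in for output.log

# 2>&1 redirects stderr into stdout, so the pipe carries both streams;
# tee prints them to the terminal and writes the same text to the log file.
demo_cmd 2>&1 | tee "$log"
```

Because both streams end up in the log file, warnings and errors can be searched later with a tool such as grep without re-running the command.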
To understand how runAsPipeline works, how to check its output, and how to re-run the pipeline, please visit: Run Bash Script As Slurm Pipeline
Now you are ready to run an rcbio workflow.
To run the workflow on your own data instead, transfer the sample sheet to your local machine following this wiki page and modify it. Then transfer it back to O2 under your account and go to the build folder structure step.