Using sratoolkit/2.10.7 to download NCBI SRA data

Log on to O2

If you need help connecting to O2, please review the How to login to O2 wiki page.

From Windows, use MobaXterm (preferred) or PuTTY to connect to o2.hms.harvard.edu and make sure the port is set to the default value of 22.

From a Mac Terminal, use the ssh command, inserting your eCommons ID instead of user123:
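A minimal sketch of that command, assuming the standard OpenSSH client. user123 is the placeholder from the text and port 22 is the default mentioned above. The snippet only builds and prints the command so you can check it before connecting.

```shell
# user123 is a placeholder; replace it with your own eCommons ID.
O2_USER="user123"
O2_HOST="o2.hms.harvard.edu"

# Build the ssh command; -p 22 is the default port and could be omitted.
SSH_CMD="ssh -p 22 ${O2_USER}@${O2_HOST}"
echo "$SSH_CMD"
```

Run the printed command in your terminal to open the session.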


Start an interactive job and create a working folder
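The following is a sketch of this step, not the exact commands for your account. It assumes Slurm's srun is available on the login node; the partition name, time limit, and memory request are assumptions to adjust, and WORKDIR is a hypothetical folder name.

```shell
# Interactive job request. Partition (-p), time (-t), and memory (--mem)
# are assumed values; change them to match your allocation.
SRUN_CMD="srun --pty -p interactive -t 0-12:00 --mem 4G /bin/bash"
echo "Run this on a login node: $SRUN_CMD"

# Once the interactive shell starts, create a working folder and enter it.
# WORKDIR is hypothetical; a scratch location usually has more space than home.
WORKDIR="${WORKDIR:-$PWD/sra_work}"
mkdir -p "$WORKDIR"
cd "$WORKDIR"
```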




Set the default cache path. You only need to do this once:
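One way to do this is with vdb-config, which ships with sratoolkit. This is a sketch under assumptions: CACHE_DIR is a hypothetical directory, and the repository key shown is the one commonly used for the user cache root, so verify it against the toolkit documentation for your version.

```shell
# CACHE_DIR is hypothetical; choose a directory with enough free space.
CACHE_DIR="${CACHE_DIR:-$PWD/ncbi_cache}"
mkdir -p "$CACHE_DIR"

# Only call vdb-config if it is on PATH, so the script does not fail
# when the sratoolkit module has not been loaded yet.
if command -v vdb-config >/dev/null 2>&1; then
    vdb-config -s "/repository/user/main/public/root=$CACHE_DIR"
else
    echo "vdb-config not found; run 'module load sratoolkit/2.10.7' first" >&2
fi
```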



Use the sratoolkit prefetch command to download SRA data, then convert the data from .sra to .fastq format
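A minimal sketch of the two steps, assuming the sratoolkit module is already loaded. The accession SRR6519510 is borrowed from the tips on this page, and fastq-dump with --split-files is one possible conversion choice rather than the only one.

```shell
# Example accession taken from the tips on this page.
ACC="SRR6519510"

if command -v prefetch >/dev/null 2>&1; then
    # Step 1: download the .sra file for the accession.
    prefetch "$ACC"
    # Step 2: convert to FASTQ; --split-files writes one file per read
    # (for example ${ACC}_1.fastq and ${ACC}_2.fastq for paired-end data).
    fastq-dump --split-files "$ACC"
else
    echo "prefetch not found; run 'module load sratoolkit/2.10.7' first" >&2
fi
```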



Additional tips: 

  1. If you need to download a lot of data, run the screen command before starting the interactive job to keep the session alive: 
    screen: Keep Linux Sessions Alive (so you can go back to the same terminal window from anywhere, anytime)
  2. If you have a lot of samples to download, running the prefetch command for each one by hand is a lot of work. To automate the process, find the accession IDs on the website and put them in a loop that downloads them one by one. For example, to download SRR6519510 through SRR6519519:
    for i in {6519510..6519519}; do
        prefetch "SRR$i"
    done
  3. If you have more than a few dozen samples to download, running them one by one takes a long time. You can run them in parallel; for example, submit 5 jobs and let each job work on 100 accession IDs. Because these 5 jobs share the same network connection from O2 to the NCBI cloud, each parallel prefetch command may run slower than it would in serial mode. Please share your experience.
  4. Let us know if you have any questions.
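
The parallel approach in tip 3 can be sketched as follows. This is an illustration under assumptions: the 500-accession range is hypothetical, the chunk file names come from the default behavior of split, and the sbatch partition and time limit in the comment are placeholders. The snippet only prepares the chunk files; it does not submit jobs.

```shell
# Work in a temporary directory so the example leaves no stray files.
DEMO_DIR="$(mktemp -d)"
cd "$DEMO_DIR"

# Hypothetical list of 500 consecutive accession IDs.
for i in {6519510..6520009}; do
    echo "SRR$i"
done > accessions.txt

# Split into files of 100 IDs each: chunk_aa, chunk_ab, ..., chunk_ae.
split -l 100 accessions.txt chunk_

# Each job would then process one chunk serially, for example:
#   sbatch -p short -t 0-12:00 --wrap "while read -r acc; do prefetch \$acc; done < chunk_aa"
# The partition and time limit above are assumptions.
for f in chunk_*; do
    echo "$f: $(wc -l < "$f" | tr -d ' ') accessions"
done
```

Keeping each chunk serial inside its job limits the number of simultaneous connections to the number of jobs, which makes it easier to compare against a purely serial download.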