Public Databases
/n/shared_db
We started using /n/shared_db for public databases with the launch of the O2 cluster. The databases are organized in this folder structure:
Genome/software/Version/database
For example:
mm10/rsem/1.3.0/mm10
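The layout above can be sketched as a small shell helper that assembles a database path from its four components. This is only an illustration: the function name db_path and the SHARED_DB_ROOT variable are hypothetical, not something provided on the cluster, and the helper does not cover the exceptions listed below.

```shell
# Root of the shared databases; SHARED_DB_ROOT is a hypothetical override.
SHARED_DB_ROOT="${SHARED_DB_ROOT:-/n/shared_db}"

# db_path (hypothetical helper): build Genome/software/Version/database
# under the root. Arguments: genome, software, version, database name.
db_path() {
    if [ "$#" -ne 4 ]; then
        echo "usage: db_path GENOME SOFTWARE VERSION DATABASE" >&2
        return 2
    fi
    printf '%s/%s/%s/%s/%s\n' "$SHARED_DB_ROOT" "$1" "$2" "$3" "$4"
}

# The example from the text:
db_path mm10 rsem 1.3.0 mm10
# prints /n/shared_db/mm10/rsem/1.3.0/mm10
```

The helper only formats a string, so it is worth checking the result with ls before passing it to a tool.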
Exceptions:
Some databases were assembled by developers and their original folder structure was kept, as in:
igenome/03032016/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/genome
bcbio databases are maintained by the Harvard Chan Bioinformatics Core, and installed this way:
bcbio/biodata/genomes/Hsapiens
/n/groups/shared_databases
This folder was created for an earlier cluster (Orchestra), but is still in use. The structure is like this:
software/genomeVersion/database
Exceptions:
There are some exceptions to the structure, such as: igenome and blastdb
A quick way to find the desired databases:
We have created a text file containing all the database names and paths. You can search it directly by species and software:
$ grep -i bowtie2 /n/shared_db/allDatabases.no.bcbio.singularity.rcbio.txt | grep -i hg19 | less
In the output, uk means unknown.
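The chained grep search can also be wrapped in a reusable function that accepts any number of keywords. The sketch below is an assumption-laden illustration: find_db and the DB_INDEX variable are hypothetical names, and the function assumes only that the index file is plain text with one database per line.

```shell
# Index file named in the text; DB_INDEX is a hypothetical override.
DB_INDEX="${DB_INDEX:-/n/shared_db/allDatabases.no.bcbio.singularity.rcbio.txt}"

# find_db (hypothetical helper): print the index lines that contain every
# keyword, ignoring case. Returns 1 when nothing matches.
find_db() {
    if [ "$#" -eq 0 ]; then
        echo "usage: find_db KEYWORD [KEYWORD...]" >&2
        return 2
    fi
    # Start from the whole file, then narrow it down one keyword at a time.
    matches=$(cat "$DB_INDEX") || return 1
    for keyword in "$@"; do
        # -F treats the keyword as a literal string; || true keeps going
        # when a keyword matches nothing, leaving matches empty.
        matches=$(printf '%s\n' "$matches" | grep -i -F -- "$keyword" || true)
    done
    if [ -n "$matches" ]; then
        printf '%s\n' "$matches"
    else
        return 1
    fi
}
```

With this defined, find_db bowtie2 hg19 | less performs the same search as the command above, and adding a third keyword narrows the results further.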
Many of our modules also point to the relevant databases. You can use the module spider command to identify a module of interest, and then find the appropriate databases. Here is an example for cellranger: