
How much space do I have?

...

For example, you might run an analysis on data in your home directory using reference data from your lab directory, then put the results into the lab directory for other lab members to use.

Scratch directory (/n/scratch/users/a/ab123)

Each user is entitled to space (25 TiB OR 2.5 million files/directories) under the /n/scratch filesystem. You can create a personal scratch directory for storing temporary data.

...

Note: It is against HMS IT policy to artificially refresh the last modification time of any file located under /n/scratch.

For workflows that allow full control of temporary/intermediate files, you can leave your input data under your home or group (if available) directory, make the first step in the workflow read from the original directory, do all of the temporary/intermediate writes to /n/scratch, and perform the final write back to the home or group location. So in a 5-step pipeline, step 1 reads from /n/groups or /home, steps 2-4 write intermediate files to /n/scratch, and step 5 reads from /n/scratch and writes back to the final output in /n/groups or /home. Here is a suggested workflow (a sketch follows the list):

  • Create a directory under /n/scratch if needed.

  • Set up your workflow so that the input is read from /n/groups or /home, but temporary/intermediate files are written to your scratch directory.

  • Write any needed results back to /n/groups or /home

  • Delete temporary data, or let it be auto-deleted
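As a concrete illustration, here is a minimal sbatch sketch of this pattern. The partition, paths, and the step1/step2/step3 commands are hypothetical placeholders; substitute your own lab directory, user directory, and pipeline commands:

    #!/bin/bash
    #SBATCH -p short                      # partition and run time: adjust for your job
    #SBATCH -t 0-06:00

    # Hypothetical paths for illustration only.
    IN=/n/groups/mylab/project/input.dat
    SCRATCH=/n/scratch/users/a/ab123/project
    OUT=/n/groups/mylab/project/results
    mkdir -p "$SCRATCH" "$OUT"

    step1 --in "$IN" --out "$SCRATCH/step1.tmp"                  # step 1: read from /n/groups
    step2 --in "$SCRATCH/step1.tmp" --out "$SCRATCH/step2.tmp"   # steps 2-4: intermediates on /n/scratch
    step3 --in "$SCRATCH/step2.tmp" --out "$OUT/final.dat"       # final step: write back to /n/groups

    rm -rf "$SCRATCH"                     # delete temporary data (or let it be auto-deleted)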

For workflows that write temporary/intermediate files to the current directory, you can create a directory under /n/scratch and cd to it. Run the workflow from your scratch directory, specifying full paths to input files in /n/groups or /home and full output paths back to /n/groups or /home. Here is a suggested workflow using example ID "ab123" (a sketch follows the list):

  • Create a directory under /n/scratch if needed.

  • Set up your workflow so that full paths are used to refer to input files in /n/groups or /home.

  • Change directories (cd) to your /n/scratch directory, and run the analysis from there:

    • cd /n/scratch/users/a/ab123

  • Write or copy any needed results back to /n/groups, /home, or your desktop, with copies submitted as an sbatch job or from an interactive session:

    • srun --pty -p interactive -t 0-12:00 /bin/bash

  • Delete temporary data, or let it be auto-deleted
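For example, from a login node (the paths and the my_pipeline command are hypothetical placeholders for your own data and tools):

    # Start an interactive session, then run from your scratch directory.
    srun --pty -p interactive -t 0-12:00 /bin/bash
    mkdir -p /n/scratch/users/a/ab123/run1
    cd /n/scratch/users/a/ab123/run1

    # Full paths for input and final output; intermediate files land in the
    # current (scratch) directory and can be deleted afterwards.
    my_pipeline --input /n/groups/mylab/project/sample1.fastq \
                --output /n/groups/mylab/project/results/sample1.out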

For workflows that allow little flexibility in the location of temporary/intermediate files, data can be copied over to /n/scratch, processed there, and the results copied back to /n/groups or /home. This creates a redundant copy of the input, takes up storage space, and requires time to transfer the data to and from /n/scratch. Here is a suggested workflow (a sketch follows the list):

  • Create a directory under /n/scratch if needed.

  • Copy data from /n/groups, /home, or your desktop to your scratch directory. We recommend submitting this as an sbatch job or copying from an interactive session (e.g. srun --pty -p interactive -t 0-12:00 /bin/bash)

  • Run the analysis in your scratch directory, writing all temporary/intermediate files to this space

  • Copy any needed results back to your home or group directory on O2 via a cluster job or from an interactive session, or download to your desktop via the O2 file transfer servers (transfer.rc.hms.harvard.edu)

  • Delete temporary data, or let it be auto-deleted
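Putting the copy-in/copy-out steps together (the lab paths are hypothetical placeholders; the scp line is run from your desktop, not from O2):

    # From an interactive session: copy input to scratch. Note that plain cp
    # does not preserve timestamps, so copies get fresh times on /n/scratch.
    srun --pty -p interactive -t 0-12:00 /bin/bash
    mkdir -p /n/scratch/users/a/ab123/project
    cp -r /n/groups/mylab/project/input /n/scratch/users/a/ab123/project/

    # ... run the analysis in /n/scratch/users/a/ab123/project ...

    # Copy results back to group storage, then clean up the scratch copy.
    cp -r /n/scratch/users/a/ab123/project/results /n/groups/mylab/project/
    rm -rf /n/scratch/users/a/ab123/project

    # Or, from your desktop, pull results down via the transfer servers:
    scp -r ab123@transfer.rc.hms.harvard.edu:/n/scratch/users/a/ab123/project/results .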

IMPORTANT NOTE: If you are transferring files to /n/scratch using a tool and flags that preserve timestamps (e.g. rsync -a or -t), those files will also be subject to the deletion policy based on the original timestamp. If the preserved timestamp on a file is more than 30 days old, the file will be deleted the next day, even if it was just moved. The same can happen if you install software on /n/scratch for personal use: if a step inside the installation process simply copies files with timestamps preserved, your software may appear to stop functioning at random as those files are purged prematurely. This is dangerous because the user rarely has insight into when it occurs. Please be very judicious when moving files to, or generating them on, /n/scratch; as mentioned above, files affected by this behavior are unrecoverable.
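If you need rsync specifically, one way to avoid carrying old timestamps onto /n/scratch is to drop the -t flag: rsync's -a is shorthand for -rlptgoD, so -rlpgoD keeps everything except modification times. This is a sketch (with hypothetical paths) to verify against your own transfers:

    # -a equals -rlptgoD; omitting -t means destination files get the current
    # time, so the 30-day purge clock starts at copy time rather than at the
    # source file's original timestamp.
    rsync -rlpgoD /n/groups/mylab/project/input/ /n/scratch/users/a/ab123/input/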

GPU dedicated scratch space (/n/scratch_gpu/users/a/ab123)

Research Computing is no longer providing the /n/scratch_gpu filesystem. Please use /n/scratch instead.

Accessing folders on "research.files.med.harvard.edu" from O2

...

These filesystems are housed on a central file server and are available from any system within O2.

    filesystem    use
    /n/groups     shared group data storage (contact Research Computing if you need a group space)
    /n/data1      shared group data storage
    /n/data2      shared group data storage
    /home         individual account data storage
    /n/scratch    temporary/intermediate file storage
    /n/standby    longer-term archival storage

Note: The /n/files filesystem, which allowed shared group data storage, is not accessible from O2 compute or login nodes, only from the transfer partition. This partition has restricted access, so you will need to request access to run jobs there. See File Transfer for more details.
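As a hedged example, once you have been granted access to the transfer partition, a copy out of /n/files could be submitted like this (the source and destination paths are hypothetical):

    # Submit a copy job to the restricted transfer partition.
    sbatch -p transfer -t 0-02:00 --wrap="cp -r /n/files/mylab/data /n/groups/mylab/data"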

...

These filesystems are housed on local disks on individual machines. We keep these filesystems synchronized using our deployment management infrastructure.

    filesystem    use
    /             top of the UNIX filesystem
    /usr          most installed software
    /var          variable data such as logs and databases

Synchronized O2 filesystems are never backed up. The source system images from which compute nodes and application servers are built are backed up daily, and these can be used to reinstall a system.