How much space do I have?
...
You might run an analysis on data in your home directory using reference data from your lab directory. You might then put results into the lab directory for other lab members to use.
Scratch directory (/n/scratch/users/a/ab123)
Each user is entitled to space (25 TiB OR 2.5 million files/directories) under the /n/scratch filesystem. You can create a personal scratch user directory for storing temporary data.
...
Note: It is against HMS IT policy to artificially refresh the last modification time of any file located under /n/scratch.
For workflows that allow for full control of temp/intermediate files, you can leave your input data under your home or group (if available) directory, make the first step in the workflow read from the original directory, do all of the temp/intermediate writes to /n/scratch, and perform the final write back to the home or group location. So in a 5-step pipeline, step 1 reads from /n/groups or /home, steps 2-4 write intermediate files to /n/scratch, and step 5 reads from /n/scratch and writes back to the final output /n/groups or /home directory. Here is a suggested workflow:
Create a directory under /n/scratch if needed.
Set up your workflow so that the input is read from /n/groups or /home, but temporary/intermediate files are written to your scratch directory.
Write any needed results back to /n/groups or /home.
Delete temporary data, or let it be auto-deleted.
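The steps above can be sketched in shell. This is an illustrative simulation, not an O2-specific script: local temp directories stand in for /n/groups and /n/scratch/users/a/ab123, and `sort`/`uniq` stand in for your actual pipeline steps. On the cluster you would substitute your real paths and commands.

```shell
GROUP=$(mktemp -d)    # stands in for your /n/groups or /home directory
SCRATCH=$(mktemp -d)  # stands in for your /n/scratch user directory
printf 'banana\napple\ncherry\n' > "$GROUP/input.txt"   # input stays in group storage

# Step 1 reads from group storage; intermediate steps write only to scratch
sort "$GROUP/input.txt"       > "$SCRATCH/sorted.txt"
uniq -c "$SCRATCH/sorted.txt" > "$SCRATCH/counts.txt"

# Final step writes results back to the group/home location
cp "$SCRATCH/counts.txt" "$GROUP/results.txt"

# Delete temporary data (on O2, auto-deletion would also handle this)
rm -rf "$SCRATCH"
cat "$GROUP/results.txt"
```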
For workflows that write temp/intermediate files to the current directory, you can create a directory under /n/scratch and cd to it. Run the workflow from your scratch directory, specifying full paths to input files in /n/groups or /home and full final output paths to /n/groups or /home. Here is a suggested workflow using example ID "ab123":
Set up your workflow so that full paths are used to refer to input files in /n/groups or /home.
Change directories (cd) to your /n/scratch directory, and run the analysis from there: cd /n/scratch/users/a/ab123
Write or copy any needed results back to /n/groups, /home, or your desktop, with copies submitted as an sbatch job or from an interactive session: srun --pty -p interactive -t 0-12:00 /bin/bash
Delete temporary data, or let it be auto-deleted.
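A sketch of this run-from-scratch pattern, again simulated with local temp directories standing in for /n/groups and /n/scratch/users/a/ab123 so it can be traced end to end (the BED file and `sort` step are placeholders for your own data and tools):

```shell
GROUP=$(mktemp -d)    # stands in for /n/groups or /home
SCRATCH=$(mktemp -d)  # stands in for /n/scratch/users/a/ab123
printf 'chr2\t200\nchr1\t100\n' > "$GROUP/input.bed"

cd "$SCRATCH"                                 # run from scratch: tools that write
sort -k2,2n "$GROUP/input.bed" > sorted.bed   # to the current directory land here
cp sorted.bed "$GROUP/"                       # copy the needed result back

cd - >/dev/null
rm -rf "$SCRATCH"                             # delete temporary data
```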
For workflows that allow little flexibility in the location of temporary/intermediate files, data can be copied over to /n/scratch, the computation run there, and results copied back to /n/groups or /home. This creates a redundant copy of the input, takes up storage space, and requires time to transfer the data to and from /n/scratch. Here is a suggested workflow:
Create a directory under /n/scratch if needed.
Copy data from /n/groups, /home, or your desktop to your scratch directory. We recommend submitting this as an sbatch job, or copying from an interactive session (e.g. srun --pty -p interactive -t 0-12:00 /bin/bash).
Run the analysis in your scratch directory, writing all temporary/intermediate files to this space.
Copy any needed results back to your home or group directory on O2 via a cluster job or from an interactive session, or download to your desktop via the O2 file transfer servers (transfer.rc.hms.harvard.edu).
Delete temporary data, or let it be auto-deleted.
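A sketch of this copy-in / compute / copy-out pattern, simulated locally with temp directories standing in for /n/groups and your scratch directory. Note the plain `cp` (no `-p`): the copies get fresh timestamps, which matters for the scratch deletion policy.

```shell
GROUP=$(mktemp -d)    # stands in for /n/groups, /home, or your desktop
SCRATCH=$(mktemp -d)  # stands in for your /n/scratch directory
printf 'sample data\n' > "$GROUP/raw.txt"

# Copy input to scratch; plain cp (no -p) gives the copies current timestamps
mkdir -p "$SCRATCH/work"
cp -r "$GROUP/"* "$SCRATCH/work/"

# Run the "analysis" on scratch (tr is a placeholder for your real tool)
tr 'a-z' 'A-Z' < "$SCRATCH/work/raw.txt" > "$SCRATCH/work/processed.txt"

# Copy needed results back, then clean up the scratch copy
cp "$SCRATCH/work/processed.txt" "$GROUP/"
rm -rf "$SCRATCH"
```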
IMPORTANT NOTE: If you are transferring files to /n/scratch using a tool and flag that preserve timestamps (e.g. rsync -a or -t), those files will also be subject to the deletion policy based on the original timestamp. If the preserved timestamp on a file is more than 30 days old, the file will be deleted the next day, even if it was just moved. This behavior may also occur if you are installing software on /n/scratch for personal use: if a step inside the installation process simply copies files with timestamps preserved, your software may appear to stop functioning randomly as those files are purged prematurely. This is dangerous because the user rarely has insight into when it occurs. Please be very judicious about handling files when moving them to or generating them on /n/scratch; as mentioned above, if you are affected by this behavior, the files are unrecoverable.
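The difference is easy to see locally. This sketch (assuming GNU `touch` and `stat` on Linux) gives a file a 60-day-old modification time, then copies it with and without timestamp preservation; only the preserved copy would look "old" to a purge based on modification time:

```shell
d=$(mktemp -d)
printf 'old file\n' > "$d/src.txt"
touch -d '60 days ago' "$d/src.txt"     # simulate a file last modified 60 days ago

cp -p "$d/src.txt" "$d/preserved.txt"   # -p keeps the old mtime, like rsync -a or -t
cp    "$d/src.txt" "$d/refreshed.txt"   # plain cp stamps the copy with the current time

# Compare the modification times (seconds since the epoch)
stat -c '%Y %n' "$d/preserved.txt" "$d/refreshed.txt"
```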
GPU dedicated scratch space (/n/scratch_gpu/users/a/ab123)
Research Computing is no longer providing the /n/scratch_gpu filesystem. Please use /n/scratch instead.
Accessing folders on "research.files.med.harvard.edu" from O2
...
These filesystems are housed on a central file server and are available from any system within O2.
filesystem | use
---|---
 | shared group data storage (Contact Research Computing if you need a group space)
 | shared group data storage
 | shared group data storage
 | individual account data storage
 | temporary/intermediate file storage
 | longer term archival storage
Note: The /n/files filesystem, which allowed shared group data storage, is not accessible from O2 compute or login nodes, only from the transfer partition. This partition has restricted access, so you will need to request access to run jobs there. See File Transfer for more details.
...
These filesystems are housed on local disks on individual machines. We keep these filesystems synchronized using our deployment management infrastructure.
filesystem | use
---|---
 | top of UNIX filesystem
 | most installed software
 | variable data such as logs and databases
Synchronized O2 filesystems are never backed up. The source system images from which compute nodes and application servers are built are backed up daily, and these can be used to reinstall a system.