File Transfer
- 1 Tools For Copying
- 1.1 Graphical tools
- 1.2 How To Copy Data to O2
- 1.3 Transfers on the O2 File Transfer Servers
- 1.4 Interactive command line copying on O2
- 1.5 Batch Copying on O2
- 1.6 Very big copies
- 1.7 Special considerations for the '/n/files' filesystem, aka research.files.med.harvard.edu
- 1.8 Special considerations for the '/n/standby' filesystem, aka Standby
NOTE: Do not transfer files when connected to the HMS VPN. The HMS VPN system is designed for secure remote access and management. With increased remote work, the HMS VPN is seeing much higher use. It was not designed to support large data transfers and cannot do so; transferring over the VPN hinders access to the HMS network for the HMS Community and will cause us to terminate your data transfer.
NOTE: For Harvard's Accellion Kiteworks file transfer appliance, allowing you to "mail" large attachments from your desktop, please see more details at: https://it.hms.harvard.edu/kiteworks
NOTE: For information on /n/files (aka research.files.med.harvard.edu), see the bottom of this page.
NOTE: For guidelines on transferring data from O2 to NCBI's FTP for GEO submission, please reference this wiki page.
NOTE: When using scp/sftp to connect to a transfer server from outside the HMS network, Duo will simply hang if you do not have a default Duo method set up. Reference this page for instructions on setting up a default Duo authentication method.
Tools For Copying
There are a number of secure ways to copy files to and from O2. The tools listed below encrypt your data and login credentials during the transfer over the internet. Be aware of which file systems you want to copy from and to. You might be copying from your laptop or desktop hard drive, or from some other site on the Internet.
Graphical tools
FileZilla - a Mac/Linux/Windows standalone sftp tool
WinSCP - a Windows scp/sftp app
How To Copy Data to O2
Connection parameters:
host: transfer.rc.hms.harvard.edu
port: 22 (the SFTP port)
username: your HMS ID (formerly known as eCommons ID), the ID you use to login to O2, in lowercase, e.g., ab123 (not your Harvard ID or Harvard Key)
password: your HMS ID password, the password you use when logging in to O2
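For example, an interactive sftp session from a Mac or Linux terminal would look like the following (ab123 and the destination path are placeholders; substitute your own):
me@mydesktop:~$ sftp ab123@transfer.rc.hms.harvard.edu
sftp> put mydata.tar.gz /n/scratch/users/a/ab123/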
Command line tools available on the O2 File Transfer Servers
scp, sftp, rsync - these are automatically installed on Mac and Linux
pscp, psftp - Windows-only. These can be installed with the PuTTY ssh program.
ftp - available on O2 for downloading from external sites which only accept FTP logins. But, O2 does not accept incoming FTP logins.
aspera - a data transport and streaming technology, now owned by IBM
awscli - Amazon AWS command line interface
basemount - an Illumina tool to mount BaseSpace Sequence Hub data
bbcp - a point-to-point network file copy application from NERSC
lftp - can transfer files via FTP, FTPS, HTTP, HTTPS, FISH, SFTP, BitTorrent, and FTP over HTTP proxy
gcloud - Google Cloud command line interface, including the gsutil command
NBIA Data Retriever - a tool for downloading data from the TCIA Data Portal, installed under /opt/NBIADataRetriever
rclone - rsync for cloud storage
Globus - if the other side supports Globus
For graphical tools, see the documentation that came with the program. Also, see our instructions on how to use these tools with two-factor auth. Many tools will by default copy somewhere in your /home directory, which has a small 100GiB storage quota. Make sure to explicitly specify whether you want to copy there or to a different location like: /n/scratch/users/m/mfk8/
If you just have a single file to copy and you're on a Mac, you can also run a command like the following from the Terminal application:
me@mydesktop:~$ scp myfile my_o2_id@transfer.rc.hms.harvard.edu:/n/scratch/users/m/mfk8/
By default, scp will copy to/from your home directory on the remote computer. You need to give the full path, starting with a /, in order to copy to other filesystems.
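For example, to download a results file from scratch to your current local directory (the remote path is a placeholder; substitute your own):
me@mydesktop:~$ scp my_o2_id@transfer.rc.hms.harvard.edu:/n/scratch/users/m/mfk8/results.txt .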
Transfers on the O2 File Transfer Servers
You can connect to the transfer nodes using ssh at the hostname: transfer.rc.hms.harvard.edu . If you're on Linux or Mac, you can use the native terminal application to connect to the transfer nodes. If you're on Windows, you will need to install a program to connect to the transfer servers; we recommend MobaXterm. In either terminal or MobaXterm, type the following command:
ssh yourhmsid@transfer.rc.hms.harvard.edu
where you replace yourhmsid with your actual HMS ID (formerly known as eCommons ID) in lowercase. Once you authenticate, you'll be on one of the transfer servers. From here, you can enter commands like scp, sftp, rsync, etc. See the above section on command line tools available on the O2 File Transfer Servers for more details. You can run transfer commands directly on the transfer servers after logging in. We do not have a job scheduler running on the transfer cluster, so you do not need to submit an sbatch job or request an interactive session with srun to run such transfer processes. Research applications (modules) are also not available on the transfer servers.
If you have a large amount of data to transfer, please keep in mind that your session must stay active for the transfer to complete. For example, if your computer disconnects from your wifi network, the transfer will abort. You can prevent this from happening by modifying your transfer command to "ignore hang ups" with the nohup command like so:
nohup your_transfer_command &
The nohup command tells the system not to cancel the running command, even if the user disconnects from the session. The & means to run the process in the background. Any text which would normally be printed to the screen will be put into a file named nohup.out. You would substitute your_transfer_command with your actual scp, rsync, rclone, etc. command.
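Putting it together, a background copy run on a transfer server might look like the following sketch (both paths are placeholders):
nohup rsync -av /n/scratch/users/m/mfk8/mydata/ /n/files/mylab/mydata/ &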
Interactive command line copying on O2
File transfer processes are too resource intensive to run on the O2 login servers, but you can run these interactively from a compute node as you would any other application. Launch an interactive session with the following srun command, and then run your commands once logged into a compute node:
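As a sketch, a session request along these lines would work (the partition name, time, and memory here are illustrative; adjust them to your needs):
srun --pty -p interactive -t 0-12:00 --mem 8G /bin/bash
Once on the compute node, run your scp, rsync, or other transfer command as usual.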
Batch Copying on O2
Experienced users can set up batch copies using rsync or recursive cp. Please do not run large transfers on the O2 login nodes (login0X). They will be slow and subject to suspension, as they are competing with dozens of simultaneous logins and programs.
If you want to copy a large set of files, it may be best to submit as a job to O2. For example:
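A minimal sketch of such a job (the paths and resource requests are placeholders):
sbatch -p short -t 0-2:00 --wrap="rsync -av /n/scratch/users/m/mfk8/source/ /home/mfk8/destination/"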
This will run in the short partition like any other job.
The main advantage of batch copying is that you can make it part of a workflow. For example, you can use job dependencies to start your analysis only when the job that copies input files to O2 has finished:
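A sketch using Slurm's --dependency flag (the paths and the analysis.sh script are hypothetical):
jobid=$(sbatch --parsable -p short -t 0-2:00 --wrap="rsync -av /n/scratch/users/m/mfk8/input/ /home/mfk8/project/input/")
sbatch -p short --dependency=afterok:$jobid analysis.sh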
Very big copies
Contact Research Computing if you want to copy multiple tebibytes (a tebibyte is 1.0995 terabytes). We may be able to speed up the process.
Special considerations for the '/n/files' filesystem, aka research.files.med.harvard.edu
The O2 login nodes and most compute nodes do not currently mount /n/files. There are two ways to access this filesystem from O2:
Use O2's dedicated file transfer servers: SSH login to the hostname transfer.rc.hms.harvard.edu. You will be connected to a system which has access to /n/files. Once logged in, just run your commands (e.g. rsync, scp, cp) normally without using sbatch. Transfer servers cannot submit jobs to the cluster, and research applications (modules) are not available from those systems.
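For example, once on a transfer server, a copy from /n/files to scratch could look like this (placeholder paths):
rsync -av /n/files/mylab/dataset/ /n/scratch/users/m/mfk8/dataset/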
If you have a batch job workflow that must use /n/files, you can request access to the "transfer" job partition. This partition has access to a few lower-performance compute nodes which mount /n/files. They are only recommended when using the transfer servers is not an option, as these nodes are slower and generally less available.
Using the transfer job partition
Please note that we have restricted use of the transfer partition to ensure that only those who need to access /n/files on O2 will run jobs in this partition. You can contact us to request access to the transfer partition. Here are examples of jobs using this partition:
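For illustration, a batch job and an interactive session in this partition might look like the following sketches (paths and resource requests are placeholders):
sbatch -p transfer -t 0-4:00 --wrap="rsync -av /n/files/mylab/dataset/ /n/scratch/users/m/mfk8/dataset/"
srun -p transfer --pty -t 0-1:00 --mem 4G /bin/bash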
Special considerations for the '/n/standby' filesystem, aka Standby
The O2 login nodes and compute nodes do not currently mount /n/standby. To access this filesystem from O2:
Use O2's dedicated file transfer servers: SSH login to the hostname transfer.rc.hms.harvard.edu. You will be connected to a system which has access to /n/standby. Once logged in, just run your transfer commands (rsync, cp, or mv) normally without using sbatch. Transfer servers cannot submit jobs to the cluster, and research applications (modules) are not available from those systems.
Here are the commands you can run:
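For example, commands along these lines would archive a directory to Standby and copy it back again (the lab directory under /n/standby and the user paths are placeholders; substitute your own):
rsync -av /n/scratch/users/m/mfk8/finished_project /n/standby/mylab/
cp -r /n/standby/mylab/finished_project /n/scratch/users/m/mfk8/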
For more information on the Standby Storage option, please reference the HMS Research Computing Storage page, or the dedicated Standby page.