Longwood is the newest High-Performance Compute Cluster at HMS. It is located at the Massachusetts Green High Performance Computing Center.

...

This provides a heterogeneous environment with both Intel (DGX) and ARM (Grace Hopper) architectures. Modules are managed with Lmod, which makes it easy to load software suites such as the NVIDIA NeMo deep learning toolkit.

How to connect

Note

The cluster is currently accessible only via the secure shell (ssh) command line, and only from an HMS network:

  • HMS wired LAN

  • HMS Secure wireless network

  • HMS VPN

Two-factor authentication (DUO) is not required for logins because all connections must originate from an HMS network. Currently, the login server hostname is: login.dgx.rc.hms.harvard.edu
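
As a minimal sketch of the login step described above, the snippet below wraps the ssh call in a small shell function. The account name abc123 and the function name longwood_ssh are hypothetical placeholders, not part of the cluster setup; only the hostname comes from this page.

```shell
# Hypothetical account name; replace with your own HMS ID.
HMS_USER="abc123"
# Login server hostname as given on this page.
LOGIN_HOST="login.dgx.rc.hms.harvard.edu"

# Convenience wrapper (hypothetical name); equivalent to typing
#   ssh abc123@login.dgx.rc.hms.harvard.edu
# Any extra arguments are passed through to ssh unchanged.
longwood_ssh() {
    ssh "${HMS_USER}@${LOGIN_HOST}" "$@"
}
```

Run longwood_ssh from a machine on the HMS wired LAN, HMS Secure wireless, or HMS VPN; from any other network the connection is expected to fail.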

...

  • Several popular tools are available as modules. Use the module -t spider command for a list of all modules.

  • Modules installed by the RC team are available in two stacks tailored for each architecture:

    • Intel: module load dgx

    • ARM: module load grace

  • Modules automatically loaded: DefaultModules and slurm

  • NVIDIA NeMo™ and BioNeMo™ are available on Longwood

  • Users can also install additional custom tools locally

  • It is possible to load any module directly from login nodes, but the actual software (under /n/app) is only available on compute nodes

  • Singularity Containers are also supported

    • Containers are located at /n/app/containers/
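
The module stacks listed above can be selected by node architecture. The following is a sketch only: the helper name stack_for_arch is hypothetical, and it assumes that `uname -m` reports x86_64 on the Intel DGX nodes and aarch64 on the ARM Grace Hopper nodes, which is the usual Linux behavior but is not stated on this page.

```shell
# Map a machine architecture (as printed by `uname -m`) to the RC module stack.
# stack_for_arch is a hypothetical helper name used for illustration.
stack_for_arch() {
    case "$1" in
        x86_64)  echo "dgx" ;;    # Intel (DGX) nodes
        aarch64) echo "grace" ;;  # ARM (Grace Hopper) nodes
        *)       echo "unknown architecture: $1" >&2; return 1 ;;
    esac
}

# Only attempt the load where the `module` command is actually defined.
if command -v module >/dev/null 2>&1; then
    stack="$(stack_for_arch "$(uname -m)")" && module load "$stack"
    module -t spider    # terse list of all available modules
fi
```

Because the software under /n/app is only present on compute nodes, a snippet like this is most useful inside a job script rather than on a login node.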

Partitions

  • gpu_dgx - the standard GPU partition

  • gpu_grace - targets the Grace Hopper nodes; jobs submitted here must use software compiled for ARM

  • gpu_dia - the DIA-dedicated GPU partition, which takes priority over gpu_dgx

  • cpu - the partition for jobs that do not require a GPU

  • TimeLimit is up to 5 days for both partitions
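
To illustrate how a partition is chosen at submission time, the sketch below writes a small Slurm batch script for gpu_dgx. The file name longwood_gpu_job.sh and the job name are hypothetical, and requesting a GPU with --gres=gpu:1 is an assumption about how GPUs are configured on this cluster; the partition name, the dgx module stack, and the 5-day limit come from this page.

```shell
# Write a minimal batch script to the current directory (hypothetical file name).
cat > longwood_gpu_job.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=gpu_dgx      # standard GPU partition
#SBATCH --gres=gpu:1             # assumption: GPUs are requested via --gres
#SBATCH --time=0-01:00:00        # 1 hour; must stay within the 5-day TimeLimit
#SBATCH --job-name=longwood-test # hypothetical job name

module load dgx                  # Intel stack, matching the gpu_dgx nodes
nvidia-smi                       # report the GPU allocated to the job
EOF

# Submit from a login node when ready:
#   sbatch longwood_gpu_job.sh
```

For the Grace Hopper nodes, the same pattern would use --partition=gpu_grace and module load grace, with ARM-compiled software in the job body.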

...