Available Software
Introduction
Installed software is available via the Lmod environment module system. We maintain two distinct sets of applications, as Intel and ARM installations are generally incompatible with each other. Please keep this in mind when submitting jobs to one architecture or the other.
There are two sets of modules available, which can be identified via module avail. The modules located under /cm are applications provided by the NVIDIA DGX repositories; HMS Research Computing was not responsible for installing these applications. However, users may request additional applications from the NVIDIA repositories; newly added applications will be listed under the /cm path headings (i.e., /cm/local or /cm/shared).
Software installed by Research Computing is organized under the /n/lmod header, with the entry point being /n/lmod/architecture. Depending on the target architecture jobs are being submitted to, users should first load either dgx (DGX/Intel) or grace (Grace Hopper/ARM).
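As a minimal sketch of this workflow (the module names are as documented above):

```
# Load exactly one architecture module before loading anything from /n/lmod:
module load dgx      # targeting DGX (Intel) nodes / gpu_dgx partition
# module load grace  # targeting Grace Hopper (ARM) nodes / gpu_grace partition

# The Research Computing modules for that architecture are now visible:
module avail
```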
Users wishing to combine NVIDIA and Research Computing offerings (or even use NVIDIA offerings alone) should exercise caution, as the NVIDIA-provided modules have no built-in hierarchy or dependency resolution. This means it is possible to load multiple versions of the same application, which may leave the environment in an indeterminate state.
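One defensive habit, sketched below with standard Lmod commands: inspect what is loaded before mixing module sources, and reset if anything looks inconsistent.

```
# Show everything currently loaded:
module list

# If two versions of the same application appear, reset and start over:
module purge
module load grace   # or dgx, depending on your target architecture
```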
Research Computing-installed software cannot actually be run from the login nodes (login0X in your terminal prompt by default). The modules are visible on the login nodes so that users can prepare environment modifications in advance of submitting jobs, but to actually use the installed applications, users must first be on the appropriate compute node (e.g., by requesting an interactive session or submitting a batch job). NVIDIA-provided software can be run from the login nodes, but we strongly recommend accessing applications from a compute node, where far more resources are available.
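For illustration, one common way onto a compute node is an interactive Slurm session. The partition names below come from this page; the time value is purely an example, and your site may require additional options (e.g., an account):

```
# Hypothetical example: request an interactive shell on a Grace Hopper node
srun --partition=gpu_grace --time=1:00:00 --pty /bin/bash

# Once on the compute node, load the matching architecture tree:
module load grace
module avail
```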
One final note: the login nodes present as Intel, so the modules provided by NVIDIA there will be DGX/Intel-based applications. This means that to get an accurate ARM-based module list, users MUST make sure they are on a Grace Hopper node (gh0X) before running module avail.
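A quick way to confirm which architecture you are currently on (uname is standard on Linux; the node hostnames are this cluster's convention):

```
# Prints x86_64 on the login and DGX nodes, aarch64 on Grace Hopper nodes:
uname -m
```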
dgx: Intel-based Modules
This software is only compatible with the DGX compute nodes (dgx0X), reached by submitting to the gpu_dgx partition.
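A minimal batch-script sketch targeting this partition (only the partition name and module names come from this page; the other directives are illustrative and may need site-specific additions):

```
#!/bin/bash
#SBATCH --partition=gpu_dgx   # DGX (Intel) nodes
#SBATCH --time=00:10:00       # illustrative value

module load dgx               # select the Intel module tree
module load gcc/13.2.0        # version from the snapshot below
gcc --version
```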
Recall that modules under /cm headers are NVIDIA-provided, while modules under /n/lmod are provided by Research Computing.
A snapshot of module avail after having loaded the dgx module is as follows (current as of 4/24/2024):
```
--------------------------- /n/lmod/dgx/Core ---------------------------
gcc/13.2.0
-------------------------- /n/lmod/dgx/Linux ---------------------------
R/4.3.3 miniconda3/24.1.2
------------------------ /cm/local/modulefiles -------------------------
apptainer/1.1.9 freeipmi/1.6.10 module-info
boost/1.81.0 gcc/13.1.0 null
cluster-tools/10.0 ipmitool/1.8.19 openldap
cm-bios-tools lua/5.4.6 python3
cmd luajit python39
cmjob mariadb-libs shared (L)
dot module-git slurm/slurm/23.02.7 (L)
------------------------ /usr/share/modulefiles ------------------------
DefaultModules (L)
------------------------ /cm/shared/modulefiles ------------------------
blacs/openmpi/gcc/64/1.1patch03 hdf5_18/1.8.21
blas/gcc/64/3.11.0 hwloc/1.11.13
bonnie++/2.00a hwloc2/2.8.0
cm-pmix3/3.1.7 iozone/3.494
cm-pmix4/4.1.3 jupyter-eg-kernel-wlm-py39/3.0.2
cuda11.8/blas/11.8.0 jupyter/15.1.2
cuda11.8/fft/11.8.0 lapack/gcc/64/3.11.0
cuda11.8/toolkit/11.8.0 mpich/ge/gcc/64/4.1.1
cuda12.1/blas/12.1.1 mvapich2/gcc/64/2.3.7
cuda12.1/fft/12.1.1 netcdf/gcc/64/gcc/64/4.9.2
cuda12.1/toolkit/12.1.1 netperf/2.7.0
cuda12.3/blas/12.3.1 nvhpc-byo-compiler/23.11
cuda12.3/fft/12.3.1 nvhpc-hpcx-cuda11/23.11
cuda12.3/toolkit/12.3.1 nvhpc-hpcx-cuda12/23.11
cudnn8.6-cuda11.8/8.6.0.163 nvhpc-hpcx/23.11
cudnn8.9-cuda12.1/8.9.6.50 nvhpc-nompi/23.11
default-environment nvhpc-openmpi3/23.11
fftw3/openmpi/gcc/64/3.3.10 nvhpc/23.11
gcc12/12.2.0 openblas/dynamic/0.3.18
gdb/13.1 openmpi/gcc/64/4.1.5
git/2.40.0 openmpi4/gcc/4.1.5
globalarrays/openmpi/gcc/64/5.8 ucx/1.10.1
hdf5/1.14.0
------------------------- /n/lmod/architecture -------------------------
dgx (L) grace
Where:
L: Module is loaded
Module defaults are chosen based on Find First Rules due to Name/Version/Version modules found in the module tree.
See https://lmod.readthedocs.io/en/latest/060_locating.html for details.
Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules
   matching any of the "keys".
```
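As a usage sketch based on the snapshot above (versions taken from the listing):

```
module load dgx
module load gcc/13.2.0 R/4.3.3   # Research Computing builds under /n/lmod/dgx
```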
grace: ARM-based Modules
This software is only compatible with the Grace Hopper compute nodes (gh0X), reached by submitting to the gpu_grace partition.
Recall that modules under /cm headers are NVIDIA-provided, while modules under /n/lmod are provided by Research Computing.
A snapshot of module avail after having loaded the grace module is as follows (current as of 4/24/2024):
```
-------------------------- /n/lmod/grace/Core --------------------------
gcc/13.2.0
------------------------- /n/lmod/grace/Linux --------------------------
miniconda3/24.1.2
------------------------- /n/lmod/architecture -------------------------
dgx grace (L)
------------------------ /cm/local/modulefiles -------------------------
apptainer/1.1.9 dot null
boost/1.81.0 freeipmi/1.6.10 openldap
cluster-tools/10.0 gcc/13.1.0 python3
cm-bios-tools ipmitool/1.8.19 python311
cmake/3.26.3 lua/5.4.6 python39
cmd mariadb-libs shared
cmjob module-git slurm/slurm/23.02.7
cuda-dcgm/3.1.8.1 module-info
------------------------ /cm/shared/modulefiles ------------------------
cm-pmix3/3.1.7 hwloc2/2.8.0
cm-pmix4/4.1.3 jupyter-eg-kernel-wlm-py39/3.0.2
cuda11.8/blas/11.8.0 jupyter/15.1.2
cuda11.8/fft/11.8.0 lapack/gcc/64/3.11.0
cuda11.8/toolkit/11.8.0 mvapich2/gcc/64/2.3.7
cuda12.1/blas/12.1.1 nvhpc-byo-compiler/23.11
cuda12.1/fft/12.1.1 nvhpc-hpcx-cuda11/23.11
cuda12.1/toolkit/12.1.1 nvhpc-hpcx-cuda12/23.11
cuda12.3/blas/12.3.1 nvhpc-hpcx/23.11
cuda12.3/fft/12.3.1 nvhpc-nompi/23.11
cuda12.3/toolkit/12.3.1 nvhpc-openmpi3/23.11
gcc12/12.2.0 nvhpc/23.11
hdf5/1.14.0 ucx/1.10.1
hwloc/1.11.13
------------------------ /usr/share/modulefiles ------------------------
DefaultModules (L)
Where:
L: Module is loaded
Module defaults are chosen based on Find First Rules due to Name/Version/Version modules found in the module tree.
See https://lmod.readthedocs.io/en/latest/060_locating.html for details.
Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules
   matching any of the "keys".
```
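And a matching usage sketch for the ARM tree (version taken from the snapshot above; remember to run this on a gh0X node):

```
module load grace
module load miniconda3/24.1.2   # Research Computing build under /n/lmod/grace
```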