O2 Portal - MATLAB Proxy Application

This app starts a MATLAB proxy application on one of the O2 cluster compute nodes. It does not include the complete MATLAB desktop toolbox set, but it is expected to be significantly more responsive and should be the preferred choice for running graphical MATLAB when possible.

After clicking the HMS RC MATLAB-proxy-app application, you should see the page:

where you can select several parameters for your MATLAB job:

Slurm Account:

This is the Slurm Account associated with your Slurm User. You can find your Slurm account by running the command sshare -U -u $USER from a shell within the O2 cluster.
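When a user belongs to several accounts, the default sshare table can be wide. The sketch below assumes the standard sshare options -n (no header), -P (pipe-delimited output), and -o (column selection) are available on O2, and reduces the output to a list of unique account names. The helper name extract_accounts is hypothetical and used only for illustration.

```shell
#!/bin/sh
# Reduce pipe-delimited "Account|User" rows (read from stdin)
# to a sorted list of unique account names.
extract_accounts() {
    awk -F'|' 'NF && $1 != "" { gsub(/^ +| +$/, "", $1); print $1 }' | sort -u
}

# Only query the scheduler where the Slurm client tools are installed.
if command -v sshare >/dev/null 2>&1; then
    # -U: user associations only, -u: restrict to one user,
    # -n: no header, -P: pipe-delimited output, -o: columns to print
    sshare -U -u "$USER" -n -P -o Account,User | extract_accounts
fi
```

Any name printed by this command is a candidate value for the Slurm Account field.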

Partition:

This is the partition you want to use to submit the job. 

Wall Time requested in hours:

This is the time, in hours, you want to allocate for the Open OnDemand (OOD) job. The maximum allowed value depends on the partition you select.
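One way to look up each partition's limit is to ask sinfo for the partition name (%P) and the maximum job time (%l). The sketch below converts the Slurm time format to whole hours so it can be compared with the value entered in the form. The helper limit_to_hours is hypothetical, and the limits reported by sinfo are only the partition-level settings; other site policies may impose a lower ceiling.

```shell
#!/bin/sh
# Convert a Slurm time limit ([D-]HH:MM:SS, MM:SS, or "infinite")
# to whole hours, rounded down.
limit_to_hours() {
    echo "$1" | awk '
        $0 == "infinite" { print "unlimited"; next }
        {
            days = 0; rest = $0
            if (index(rest, "-") > 0) { split(rest, d, "-"); days = d[1]; rest = d[2] }
            n = split(rest, t, ":")
            if (n == 3) { h = t[1]; m = t[2] }   # HH:MM:SS
            else        { h = 0;    m = t[1] }   # MM:SS or MM
            print int(days * 24 + h + m / 60)
        }'
}

# Only query the scheduler where the Slurm client tools are installed.
if command -v sinfo >/dev/null 2>&1; then
    # -h: no header, %P: partition name, %l: maximum time limit per job
    sinfo -h -o "%P %l" | while read -r part limit; do
        echo "$part: $(limit_to_hours "$limit") hours"
    done
fi
```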

Number of cores:

This is the number of CPU cores you want to allocate for this job.

Number of GPU cards:

This is the number of GPU cards you want to allocate for this job. If you want to allocate one or more GPU cards, make sure to select a partition that supports GPU jobs. Leave this field blank if you do not need a GPU card.

GPU card type:

Here you can select a particular type of GPU card. If you request a specific type, make sure to select a partition that includes it.
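To see which partitions offer GPUs, and of which type, one option is to list the generic resources (GRES) that sinfo reports per partition. This is a sketch that assumes GPUs on O2 are exposed as a GRES whose name contains "gpu"; the helper gpu_rows is hypothetical.

```shell
#!/bin/sh
# Keep only "partition gres" rows (read from stdin) whose GRES column
# mentions a GPU, and drop duplicate rows.
gpu_rows() {
    awk '$2 ~ /gpu/ { print $1, $2 }' | sort -u
}

# Only query the scheduler where the Slurm client tools are installed.
if command -v sinfo >/dev/null 2>&1; then
    # -h: no header, %P: partition name, %G: generic resources (GRES)
    sinfo -h -o "%P %G" | gpu_rows
fi
```

Each output row pairs a partition with a GRES string, typically of the form gpu:TYPE:COUNT, which indicates the card types available in that partition.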

Total Memory in GB:

This is the amount of memory (RAM) in GB you want to allocate for your job. 

MATLAB version:

Select the desired version of MATLAB to use.

Slurm Custom Arguments:

This is an optional text field that can be used to pass additional flags to the Slurm scheduler when submitting the job.
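For readers used to submitting jobs from the command line, the form fields can be thought of as roughly equivalent to standard sbatch options. The sketch below builds that option string from the same inputs. How the app actually translates the form into a Slurm submission is not documented here, so the mapping (in particular --cpus-per-task for the number of cores) is an assumption, and build_flags is a hypothetical helper.

```shell
#!/bin/sh
# Build an sbatch-style option string from the form values.
# Arguments: account partition hours cores mem_gb [gpus] [gpu_type]
build_flags() {
    account=$1 partition=$2 hours=$3 cores=$4 mem_gb=$5
    gpus=${6:-0} gpu_type=${7:-}

    flags="--account=$account --partition=$partition --time=${hours}:00:00"
    flags="$flags --cpus-per-task=$cores --mem=${mem_gb}G"

    # Request GPUs only when a positive count was given.
    if [ "$gpus" -gt 0 ]; then
        if [ -n "$gpu_type" ]; then
            flags="$flags --gres=gpu:$gpu_type:$gpus"
        else
            flags="$flags --gres=gpu:$gpus"
        fi
    fi
    echo "$flags"
}

# Example with placeholder values (lab_a and typeA are not real O2 names).
build_flags lab_a gpu 4 2 16 1 typeA
```

Options that have no dedicated field in the form are the kind of flags the Slurm Custom Arguments field is meant to carry.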

After setting the fields above, click the Launch button to submit the job.

While your job is pending in the queue, you should see a page like:

The highlighted Session ID link opens the log files created for the current job in a new OOD browser tab.

When the job is dispatched and ready to run, you should see a screen like:

To open the MATLAB GUI, click the Connect to MATLAB button.

A new tab should open and will temporarily display a screen like:

while the MATLAB instance is starting. This step can take several minutes and is followed by the informational message:

which you can close to arrive at the familiar MATLAB GUI:

How to debug problems

If something does not work properly, record the O2 job ID printed at the top of the interactive app window (54253535 in the example) and click the highlighted Session ID link, which should open the OOD file editor in the folder where the job's log files are written.

To debug the problem, start by checking the output log in the file output.log.
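The same check can be done from a shell on O2. The sketch below scans output.log for lines that commonly indicate a failure and, where the Slurm accounting tools are available, prints the job's final state and memory use with sacct. The SESSION_DIR value is a placeholder for the folder shown on the OOD file page, and the search patterns in the hypothetical helper scan_log are only a starting point, not an exhaustive list.

```shell
#!/bin/sh
# Print log lines (with line numbers) that commonly indicate a failure.
scan_log() {
    grep -inE 'error|killed|out of memory|cancelled' "$1" \
        || echo "no obvious error lines in $1"
}

# Placeholders: substitute your own job ID and the folder
# shown on the OOD file page.
JOBID=54253535
SESSION_DIR=/path/to/session/folder

if [ -f "$SESSION_DIR/output.log" ]; then
    scan_log "$SESSION_DIR/output.log"
fi

# Only query accounting data where the Slurm client tools are installed.
if command -v sacct >/dev/null 2>&1; then
    # State, exit code, elapsed time, and memory use of the job and its steps
    sacct -j "$JOBID" --format=JobID,State,ExitCode,Elapsed,ReqMem,MaxRSS
fi
```

A State of OUT_OF_MEMORY or TIMEOUT in the sacct output usually means the Total Memory or Wall Time request was too small for the session.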

If you need additional help, contact rchelp@hms.harvard.edu. Include the full path listed on the OOD file page along with any content printed in the output.log file.