NOTICE: FULL O2 Cluster Outage, January 3 - January 10th

O2 will be completely offline for a planned HMS IT data center relocation from Friday, Jan 3, 6:00 PM, through Friday, Jan 10.

  • On Jan 3 (5:30-6:00 PM): O2 login access will be turned off.
  • On Jan 3 (6:00 PM): O2 systems will start being powered off.

This project will relocate existing services, consolidate servers, reduce power consumption, and decommission outdated hardware to improve efficiency, enhance resiliency, and lower costs.

Specifically:

  • The O2 Cluster will be completely offline, including O2 Portal.
  • All data on O2 will be inaccessible.
  • Any jobs still pending when the outage begins will need to be resubmitted after O2 is back online.
  • Websites on O2 will be completely offline, including all web content.

More details at: https://harvardmed.atlassian.net/l/cp/1BVpyGqm & https://it.hms.harvard.edu/news/upcoming-data-center-relocation

O2Portal - Jupyter App

This app starts a Jupyter Notebook on one of the O2 cluster compute nodes. After clicking on the HMS RC Jupyter application, you should see a page where you can select several parameters for your Jupyter job:

Modules to be preloaded:

Enter here the O2 modules that should be preloaded when running the Jupyter Notebook. The default setting is to load the gcc/6.2.0 and python/3.7.4 modules.

Account:

This is the Slurm Account associated with your Slurm User. You can find your Slurm account by running the command sshare -U -u $USER from a shell within the O2 cluster.
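As a minimal sketch of the lookup described above, the command can be run from an O2 shell as follows. The `command -v` guard and the `user_name` variable are additions for illustration only; they keep the snippet harmless on a machine where Slurm is not installed.

```shell
# Sketch: show the Slurm account(s) associated with the current user.
# user_name is a helper variable introduced here, not part of the O2 documentation.
user_name="${USER:-$(id -un)}"
if command -v sshare >/dev/null 2>&1; then
    # The Account column of the output holds the value for the Account field.
    sshare -U -u "$user_name"
else
    echo "sshare not found: run this from a shell on O2"
fi
```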

Partition:

This is the partition you want to use to submit the job. 

Wall Time requested in hours:

This is the time, in hours, you want to allocate for the OOD job. The maximum allowed value depends on the partition you select.

Number of cores:

This is the number of CPU cores you want to allocate for this job.

Number of GPU cards:

This is the number of GPU cards you want to allocate for this job. If you want to allocate one or more GPU cards, make sure to select a partition that supports GPU jobs. Leave this field blank if you do not need a GPU card.

GPU card type:

Here you can select a particular type of GPU card. If you request a specific type of GPU card, make sure to select a partition that includes the GPU type you are requesting.
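Both the maximum wall time and the available GPU cards depend on the partition, so it can help to list the partitions before filling in the form. The sketch below uses the standard Slurm `sinfo` command; the choice of output columns is an assumption made for illustration, and the `partition_info_status` variable is a hypothetical helper.

```shell
# Sketch: list each partition with its time limit and generic resources.
# %P = partition name, %l = maximum wall time, %G = generic resources (e.g. GPUs).
if command -v sinfo >/dev/null 2>&1; then
    sinfo -o "%P %l %G"
    partition_info_status="listed"
else
    echo "sinfo not found: run this from a shell on O2"
    partition_info_status="unavailable"
fi
```

Note that `sinfo` reports time limits in Slurm's days-hours:minutes:seconds format, while the form field above expects hours.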

Total Memory in GB:

This is the amount of memory (RAM) in GB you want to allocate for your job. 

Jupyter Environment:

This is where you select your customized Jupyter environment by sourcing (or activating) the desired Python virtualenv or Conda path. The default value is source /n/app/jupyter_python-3.7.4/6.2.0/bin/activate, which activates a basic Jupyter Notebook installed against Python 3.7.4. For more information on how to build your own Jupyter environment, see the Installing Jupyter section of our Jupyter on O2 wiki page.

If you are using the Conda default (base) environment, you need to activate it explicitly in this field; it is not activated automatically.

If your Notebook requires any customization made in your ~/.bashrc (or equivalent) file, source that file by adding the line: source $HOME/.bashrc

Any command in this field is executed after the modules specified above are loaded, and may change which program version is available in the OOD application. For example:
If you load the modules gcc/6.2.0 and python/3.7.4 and then, in this field, activate a conda environment based on Python 3.6.0, the Python version available within the OOD application will be 3.6.0, the last one loaded.
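The "last one loaded wins" behavior can be illustrated with a small self-contained sketch. It assumes that both loading a module and activating an environment work by prepending a directory to PATH, which is the usual mechanism but is not confirmed here for O2 specifically. The two directories and the fake `python` scripts are stand-ins created only for this demonstration.

```shell
# Two stand-in directories, each holding a fake "python" that reports a version.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/module_python/bin" "$demo_dir/conda_env/bin"
printf '#!/bin/sh\necho "Python 3.7.4"\n' > "$demo_dir/module_python/bin/python"
printf '#!/bin/sh\necho "Python 3.6.0"\n' > "$demo_dir/conda_env/bin/python"
chmod +x "$demo_dir/module_python/bin/python" "$demo_dir/conda_env/bin/python"

saved_path="$PATH"
# Step 1: stands in for "module load python/3.7.4" (prepends to PATH).
PATH="$demo_dir/module_python/bin:$PATH"
# Step 2: stands in for activating the environment (prepends again).
PATH="$demo_dir/conda_env/bin:$PATH"

# The directory prepended last is searched first, so its python is the one found.
active_version="$(python --version)"
echo "$active_version"

# Restore the original PATH and clean up the demo files.
PATH="$saved_path"
rm -rf "$demo_dir"
```

In a real session, running `which python` and `python --version` inside the Notebook's terminal is a quick way to confirm which interpreter ended up first.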

Jupyter Extra Arguments:

This is an optional field for passing additional arguments to Jupyter Lab or Jupyter Notebook.

Enable JupyterLab:

Select to enable JupyterLab rather than Jupyter Notebook. JupyterLab is the latest generation of Jupyter, which provides a richer and more modular GUI experience.

Slurm Custom Arguments:

This is an optional text field that can be used to pass additional flags to the Slurm scheduler when submitting the job.

 

After setting the above fields, click the Launch button to submit the job.

While your job is pending in the queue, you should see a page for the queued session.

The highlighted Session ID link can be used to open the log files created for the current job in a new OOD browser tab.

When the job is dispatched and ready to run, you should see a screen with a Connect to Jupyter button.

To open the Jupyter Notebook, click the Connect to Jupyter button. The Jupyter Notebook should open in a new tab.

When done, close the Jupyter Notebook browser tab and click the Delete button in the OpenOnDemand Interactive session.

Note: Closing the OpenOnDemand browser will not terminate active applications. Your OOD job will keep running until it reaches the requested Wall Time limit or the "Delete" button is used.
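Because a session keeps running after the browser is closed, it can be useful to check from an O2 shell whether any of your jobs are still active. The sketch below uses the standard Slurm `squeue` command; the guard and the `user_name` variable are illustrative additions.

```shell
# Sketch: list your pending and running Slurm jobs, which include OOD sessions.
user_name="${USER:-$(id -un)}"
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$user_name"
else
    echo "squeue not found: run this from a shell on O2"
fi
# A Slurm job can also be cancelled by its ID with the standard command:
#   scancel <jobid>
# The Delete button remains the documented way to end an OOD session.
```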

 

How to debug problems

If something does not work properly, record the O2 job ID printed at the top of the interactive app window (26733283 in the example) and click the highlighted Session ID link, which should open the OOD file editor in the folder where the job's log files are written.

To debug your problem, start by checking the output log in the file output.log.

If you need additional help, contact rchelp@hms.harvard.edu. Make sure to include the full path listed on the OOD file page along with any content printed in the output.log file.
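When output.log is long, a keyword search can narrow down where to look first. The following is a minimal sketch, not an official O2 tool: the function name `find_log_problems` and the keyword list are assumptions chosen for illustration, and the sample file only stands in for a real output.log.

```shell
# Print the numbered lines of a log file that mention common failure keywords.
# The keyword list is an illustrative guess, not an exhaustive filter.
find_log_problems() {
    grep -n -i -E 'error|traceback|command not found|no such file' "$1" || true
}

# Example on a throwaway file standing in for output.log.
sample_log="$(mktemp)"
printf 'Starting Jupyter\nModuleNotFoundError: No module named jupyterlab\nDone\n' > "$sample_log"
problems="$(find_log_problems "$sample_log")"
echo "$problems"
rm -f "$sample_log"
```

On O2, the function would be pointed at the output.log file inside the folder shown on the OOD file page. The job ID recorded earlier can also be passed to the standard Slurm accounting command, for example `sacct -j 26733283 --format=JobID,State,Elapsed,ExitCode`, to see how the job ended.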

 

Create a virtual environment

Visit https://harvardmed.atlassian.net/wiki/x/AoQGXw for full details.

First, start an interactive job on O2:

srun -t 30:00 --pty -p interactive -c 1 --mem=8G /bin/bash

Create a virtual environment with Jupyter and JupyterLab:

module purge
module load gcc/9.2.0 python/3.9.14
virtualenv /home/$USER/jupyterenv
source /home/$USER/jupyterenv/bin/activate
pip3 install jupyter jupyterlab
pip3 install --force-reinstall urllib3==1.26.16
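After the environment has been created, a quick check can confirm that it activates and that Jupyter is found inside it. This is a sketch that assumes the environment lives at /home/$USER/jupyterenv as in the commands above; the `env_dir` variable is a helper introduced here.

```shell
# Sketch: activate the environment created above and report the Jupyter it provides.
env_dir="/home/${USER:-$(id -un)}/jupyterenv"
if [ -f "$env_dir/bin/activate" ]; then
    . "$env_dir/bin/activate"
    command -v jupyter    # should point inside $env_dir/bin
    jupyter --version
else
    echo "No environment found at $env_dir"
fi
```

To use this environment from the Jupyter App, a plausible setup (an inference from the fields described earlier, not a confirmed recipe) is to enter gcc/9.2.0 python/3.9.14 under Modules to be preloaded and source /home/$USER/jupyterenv/bin/activate under Jupyter Environment.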
