NOTICE: FULL O2 Cluster Outage, January 3 - January 10th
O2 will be completely offline for a planned HMS IT data center relocation from Friday, Jan 3, 6:00 PM, through Friday, Jan 10.
- On Jan 3 (5:30-6:00 PM): O2 login access will be turned off.
- On Jan 3 (6:00 PM): O2 systems will start being powered off.
This project will relocate existing services, consolidate servers, reduce power consumption, and decommission outdated hardware to improve efficiency, enhance resiliency, and lower costs.
Specifically:
- The O2 Cluster will be completely offline, including O2 Portal.
- All data on O2 will be inaccessible.
- Any jobs still pending when the outage begins will need to be resubmitted after O2 is back online.
- Websites on O2 will be completely offline, including all web content.
More details at: https://harvardmed.atlassian.net/l/cp/1BVpyGqm and https://it.hms.harvard.edu/news/upcoming-data-center-relocation
O2Portal - RELION App
This app will start a RELION GUI on one of the O2 cluster compute nodes. After clicking on the HMS RC RELION application, you should see a page where you can set several parameters for your job:
Slurm Account:
This is the Slurm Account associated with your Slurm User. You can find your Slurm account by running the command sshare -U -u $USER from a shell within the O2 cluster.
Partition:
This is the partition you want to use to submit the job.
Wall Time requested in hours:
This is the amount of time, in hours, you want to allocate for the OOD job. The maximum allowed value depends on the partition you select.
Number of cores:
This is the number of CPU cores you want to allocate for this job.
Number of GPU cards:
This is the number of GPU cards you want to allocate for this job. If you want to allocate one or more GPU cards, make sure to select a partition that supports GPU jobs. Leave this field blank if you do not need a GPU card.
GPU card type:
Here you can select a particular type of GPU card. If you request a specific type of GPU card, make sure to select a partition that includes the GPU type you are requesting.
Total Memory in GB:
This is the amount of memory (RAM) in GB you want to allocate for your job.
RELION version:
Select the desired version to use.
Path to your RELION project folder:
You can enter the full path for a new or preexisting RELION project folder. If left blank, $HOME will be used as the default folder path.
Slurm Custom Arguments:
This is an optional text field that can be used to pass additional flags to the Slurm scheduler when submitting the job.
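The fields above correspond to standard Slurm concepts. As a rough sketch only, the following shows how the Slurm account could be looked up from a shell and how the form values might translate into common Slurm flags. How the RELION app actually builds its submission is not documented here, so the mapping is an assumption, and the helper names list_slurm_accounts and form_to_slurm_flags are hypothetical.

```shell
# Hypothetical helper: read sshare output (one account per line) on stdin
# and print each distinct, non-empty account name once.
list_slurm_accounts() {
    awk -F'|' '{ gsub(/^[ \t]+|[ \t]+$/, "", $1); if ($1 != "" && !seen[$1]++) print $1 }'
}

# Hypothetical helper: print Slurm flags that correspond to the form fields.
# Arguments: account partition hours cores memory_gb [gpu_count] [gpu_type]
# Assumption: the mapping below is illustrative, not the app's actual logic.
form_to_slurm_flags() {
    flags="--account=$1 --partition=$2 --time=${3}:00:00"
    flags="$flags --cpus-per-task=$4 --mem=${5}G"
    if [ -n "${6:-}" ]; then
        if [ -n "${7:-}" ]; then
            flags="$flags --gres=gpu:${7}:${6}"   # specific GPU card type
        else
            flags="$flags --gres=gpu:${6}"        # any GPU card type
        fi
    fi
    printf '%s\n' "$flags"
}

# From a shell on O2 (needs Slurm's sshare on the PATH):
if command -v sshare >/dev/null 2>&1; then
    sshare -U -u "$USER" --noheader --parsable2 --format=Account | list_slurm_accounts
fi
```

For example, with placeholder account, partition, and GPU type names, form_to_slurm_flags my_account my_gpu_partition 4 8 32 1 my_gpu_type prints the flags for a 4-hour, 8-core, 32 GB job with one GPU card of that type.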
After setting the above fields, click on the Launch button to submit the job.
While your job is pending in the queue you should see a page like:
The highlighted Session ID link can be used to view the log files created for the current job in a new OOD browser tab.
When the job is dispatched and ready to run, you should see a screen like:
You can control the compression and quality of the graphics with the two control bars. To open the RELION GUI, click on the Launch RELION button.
A new tab should open with the RELION GUI like:
If the folder used doesn’t contain an existing RELION project, you will be prompted with the option of creating one.
Note: If you click “No”, the job will fail because it won’t have any RELION project to use.
When you are done, close the RELION browser tab and click the Delete button in the OpenOnDemand Interactive session.
Note: Closing the OpenOnDemand browser tab will not terminate active applications. Your RELION job will keep running until it reaches the requested Wall Time limit or the "Delete" button is used.
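Because the session is an ordinary Slurm job, it can also be inspected from a shell on O2 with the standard Slurm tools. The sketch below is an illustration rather than part of the documented workflow: the Delete button remains the intended way to end a session, and the helper name cancel_command_for is hypothetical. It only prints the cancel command so that nothing is terminated by accident.

```shell
# Hypothetical helper: print (do not run) the Slurm command that would
# cancel a job. Anything that is not a plain numeric job ID is rejected.
cancel_command_for() {
    case $1 in
        ''|*[!0-9]*)
            echo "invalid job ID: $1" >&2
            return 1
            ;;
        *)
            printf 'scancel %s\n' "$1"
            ;;
    esac
}

# List your own pending and running jobs (needs Slurm's squeue on the PATH).
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER"
fi
```

Calling cancel_command_for with the jobid shown at the top of the interactive app window prints the scancel command to run if the session cannot be deleted from the portal.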
How to debug problems
If something does not work properly, please make sure to record the actual O2 jobid printed at the top of the interactive app window (63426095 in the example), then click on the highlighted Session ID link, which should open the OOD file editor on the folder where the job’s log files are written.
To debug your problem you can start by checking the output log in the file output.log.
If you need additional help, you can reach out to rchelp@hms.harvard.edu. Make sure to include the full path listed on the OOD file page along with any content printed in the output.log file.
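If you prefer to inspect the log from a shell rather than the OOD file editor, a small helper along these lines may be useful. It is only a sketch: the name summarize_log is hypothetical, and the folder to pass is the full path shown on the OOD file page, which is not reproduced here.

```shell
# Hypothetical helper: print the last lines of a log file, followed by every
# line that mentions "error" (case-insensitive, prefixed with its line number).
# Arguments: log_file [number_of_trailing_lines, default 20]
summarize_log() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 1; }
    echo "== last ${2:-20} lines =="
    tail -n "${2:-20}" "$1"
    echo "== lines mentioning 'error' =="
    grep -i -n 'error' "$1" || echo "(none)"
}
```

For example, summarize_log /full/path/from/OOD/file/page/output.log 50 shows the last 50 lines and any error lines, which can then be pasted into the email to rchelp@hms.harvard.edu.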
What to do if you accidentally minimize the RELION window
If you accidentally minimize the RELION GUI you can bring it back using the VNC Alt option and the tab key on your keyboard.
After minimizing the RELION GUI you might end up with an empty gray screen. However, your session is still active. To bring it back, first click on the VNC menu bar, then select the "A" option and the "Alt" button.
Finally, press the "Tab" key on your local keyboard to show all active windows in the session, and click on the desired GUI with the mouse to bring it back to the screen.
After restoring your RELION GUI, remember to unselect the Alt button before resuming your work. Leaving the VNC Alt button selected is equivalent to adding Alt to anything you type.