Get information about current and past jobs
- 1 O2squeue
- 1.1 State
- 1.2 Nodelist (Reason)
- 1.3 Eligible Time
- 1.4 Start Time
- 1.5 TRES
- 2 O2_jobs_report
- 3 O2 Resource Utilization
RC created two simplified commands, O2squeue and O2_jobs_report, based on the Slurm commands squeue and sacct. They can be used to gather information about your active (pending or running) jobs and your past jobs.
As usual, feel free to contact rchelp@hms.harvard.edu with any questions about the reports from these commands.
O2squeue
This command is based on the Slurm command squeue and returns information about your jobs that are currently pending or running. For example:
login05:~ O2squeue
JOBID PARTITION STATE TIME_LIMIT TIME NODELIST(REASON) ELIGIBLE_TIME START_TIME TRES_ALLOC
21801263 interactive RUNNING 12:00:00 2:09:52 compute-a-16-160 2020-11-09T11:35:49 2020-11-09T11:36:19 cpu=1,mem=2G,node=1,billing=1
O2squeue accepts the string R or PD as an argument to list only running or only pending jobs, respectively.
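If you want to post-process this output in a script, the columns can be split on whitespace. The following is only a minimal sketch, not part of O2squeue: the helper name parse_o2squeue is hypothetical, and it assumes that columns are whitespace-separated and that no field contains spaces, as in the sample output above.

```python
SAMPLE = """\
JOBID PARTITION STATE TIME_LIMIT TIME NODELIST(REASON) ELIGIBLE_TIME START_TIME TRES_ALLOC
21801263 interactive RUNNING 12:00:00 2:09:52 compute-a-16-160 2020-11-09T11:35:49 2020-11-09T11:36:19 cpu=1,mem=2G,node=1,billing=1
"""


def parse_o2squeue(text):
    """Turn O2squeue-style output into a list of dicts keyed by column name."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Assumption: fields never contain spaces, so a plain split
    # lines up one-to-one with the header columns.
    return [dict(zip(header, line.split())) for line in lines[1:]]


jobs = parse_o2squeue(SAMPLE)
running = [job["JOBID"] for job in jobs if job["STATE"] == "RUNNING"]
print(running)
```

A reason field that contains spaces would break the simple split, so treat this as a starting point rather than a robust parser.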
State
The STATE field will normally be either PENDING or RUNNING. For other job codes, see the “JOB STATE CODES” section of the Slurm squeue page.
Nodelist (Reason)
When a job is running, NODELIST(REASON) lists the node (or nodes, for a parallel MPI job) that the job is running on.
When a job is pending, NODELIST(REASON) describes the reason why the job is pending. The reasons are explained below. Some of the most common reasons are in bold. Many of the other reasons will only apply if you submitted the job with a special QOS (Quality of Service) or job dependency or reservation.
BadConstraints: The job's constraints cannot be satisfied.
Dependency: This job is waiting for a dependent job to complete. (See the --dependency option on the Slurm sbatch page.)
InvalidQOS: The job's QOS is invalid.
JobHeldAdmin: The job is held (forced to pend) by a system administrator.
JobHeldUser: The job is held by the user.
None: The job has not been evaluated yet by the scheduler. (This can happen if the scheduler is working through a huge batch of submitted jobs.)
Priority: One or more higher priority jobs exist for this partition or advanced reservation. Over time, your job's priority will gradually increase, so your job should eventually run.
QOSJobLimit: The job's QOS has reached its maximum job count.
QOSResourceLimit: The job's QOS has reached some resource limit.
QOSTimeLimit: The job's QOS has reached its time limit.
ReqNodeNotAvail: Some node explicitly required by the job submission is not currently available.
Reservation: The job is waiting for its advanced reservation to become available.
Resources: The job is waiting for resources to become available.
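The reasons listed above can be kept in a small lookup table when scripting around the queue output. This is a sketch under assumptions: the function name describe_field is hypothetical, the explanations are condensed from the list above, and the pending reason is assumed to appear wrapped in parentheses, for example (Priority), as Slurm's squeue prints it.

```python
# Explanations condensed from the list of pending reasons above.
PENDING_REASONS = {
    "BadConstraints": "The job's constraints cannot be satisfied.",
    "Dependency": "Waiting for a dependent job to complete.",
    "InvalidQOS": "The job's QOS is invalid.",
    "JobHeldAdmin": "Held by a system administrator.",
    "JobHeldUser": "Held by the user.",
    "None": "Not yet evaluated by the scheduler.",
    "Priority": "Higher priority jobs exist for this partition or reservation.",
    "QOSJobLimit": "The job's QOS has reached its maximum job count.",
    "QOSResourceLimit": "The job's QOS has reached some resource limit.",
    "QOSTimeLimit": "The job's QOS has reached its time limit.",
    "ReqNodeNotAvail": "A node explicitly required by the job is not available.",
    "Reservation": "Waiting for its advanced reservation to become available.",
    "Resources": "Waiting for resources to become available.",
}


def describe_field(state, nodelist_reason):
    """Explain the NODELIST(REASON) value for a running or pending job."""
    if state != "PENDING":
        return f"running on {nodelist_reason}"
    # Assumption: pending reasons are printed in parentheses, e.g. "(Priority)".
    reason = nodelist_reason.strip("()")
    return PENDING_REASONS.get(reason, f"unrecognized reason: {reason}")


print(describe_field("PENDING", "(Resources)"))
print(describe_field("RUNNING", "compute-a-16-160"))
```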
Eligible Time
The ELIGIBLE_TIME field indicates when a job becomes eligible to be dispatched. This is usually the submit time, unless the job cannot be dispatched immediately, for example because of job dependencies or because the requested resources are unavailable (such as specific nodes undergoing a planned outage).
Start Time
For running jobs, START_TIME indicates the time when the job was dispatched. For pending jobs, it indicates the expected start time. Note that the expected start time is only calculated for the first few pending jobs of each user, and it is, in general, an upper bound value, assuming that all jobs will run for their maximum time.
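As an illustration, the delay between becoming eligible and being dispatched can be computed from the two timestamps. The sketch below uses the values from the sample O2squeue output; the function name dispatch_delay is hypothetical, and it assumes both fields hold timestamps in the format shown there.

```python
from datetime import datetime


def dispatch_delay(eligible_time, start_time):
    """Return START_TIME minus ELIGIBLE_TIME as a timedelta."""
    eligible = datetime.fromisoformat(eligible_time)
    start = datetime.fromisoformat(start_time)
    return start - eligible


# Values taken from the sample O2squeue output.
delay = dispatch_delay("2020-11-09T11:35:49", "2020-11-09T11:36:19")
print(delay.total_seconds())  # 30.0
```

For a pending job, the same subtraction gives an estimate only, since the expected start time is an upper bound.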
TRES
TRES (trackable resources) lists the resources allocated to the job, such as CPUs, memory, and nodes. These correspond to flags like -c (or --cpus-per-task) and --mem in the job submission command or sbatch script.
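The TRES string is a comma-separated list of name=value pairs, so it is easy to split in a script. A minimal sketch, using the value from the sample O2squeue output; the function name parse_tres is hypothetical, and values are kept as strings because entries such as 2G carry a unit suffix.

```python
def parse_tres(tres):
    """Split a TRES string such as 'cpu=1,mem=2G' into a dict of strings."""
    result = {}
    for item in tres.split(","):
        # partition splits on the first "=" only, leaving the value intact.
        name, _, value = item.partition("=")
        result[name] = value
    return result


tres = parse_tres("cpu=1,mem=2G,node=1,billing=1")
print(tres["cpu"], tres["mem"])  # 1 2G
```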
O2_jobs_report
O2_jobs_report is based on the Slurm command sacct and can be used to query the Slurm database for information on your past jobs.
The command reports information including CPU (compute time), Memory (RAM), and WallTime efficiency. It can print information for every single job or as an overall report. It is also possible to select specific dates, job IDs, job names, partitions, and job states.
The RC team checks job efficiency only for jobs that are marked as COMPLETED by the Slurm scheduler. To query only COMPLETED jobs with O2_jobs_report, add the flag --state=COMPLETED.
By default, the tool will only show jobs starting from midnight of the previous day. So if you run the command at 9 am on Thursday, you’ll get the data for 24 hours on Wednesday PLUS the first 9 hours of Thursday.
Use --start or --lastdays to specify a custom time range if you are looking for older jobs.
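The default window described above can be worked out explicitly. This sketch only reproduces the arithmetic of the Thursday example; it is not how O2_jobs_report is implemented, and the function name default_window_start is hypothetical.

```python
from datetime import datetime, time, timedelta


def default_window_start(now):
    """Midnight at the start of the day before `now`."""
    yesterday = now.date() - timedelta(days=1)
    return datetime.combine(yesterday, time.min)


now = datetime(2020, 11, 12, 9, 0)  # a Thursday at 9 am
start = default_window_start(now)
hours = (now - start) / timedelta(hours=1)
print(start, hours)  # 2020-11-11 00:00:00 33.0
```

The window covers the 24 hours of Wednesday plus the first 9 hours of Thursday, 33 hours in total.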
You can customize your query using the flags described below. To see all the available options from the O2 shell, you can run O2_jobs_report -h
-j JOBID, --jobid JOBID
The specific jobid numbers; can be multiple comma-separated jobids with no spaces, for example --jobid=123,456,78
-s START, --start START
The desired start date for the query; the date must be entered using the format YYYY-MM-DD. This flag is not compatible with --lastdays
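When calling the tool from a script, it can help to assemble and validate the arguments first. The sketch below is an illustration only: the helper build_jobs_report_command is hypothetical, it uses only the --jobid, --start, --state, and --report flags mentioned on this page, and it assumes the tool accepts them in the forms shown here. It leaves out --lastdays, which cannot be combined with --start.

```python
import re
import shlex
from datetime import datetime


def build_jobs_report_command(jobids=None, start=None, state=None, report=False):
    """Assemble an O2_jobs_report argument list from optional filters."""
    args = ["O2_jobs_report"]
    if jobids:
        # Comma-separated job IDs with no spaces, e.g. --jobid=123,456,78
        args.append("--jobid=" + ",".join(str(j) for j in jobids))
    if start:
        # The start date must be given as YYYY-MM-DD.
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", start):
            raise ValueError("start must use the format YYYY-MM-DD")
        datetime.strptime(start, "%Y-%m-%d")  # rejects impossible dates
        args += ["--start", start]
    if state:
        args.append("--state=" + state)
    if report:
        args.append("--report")
    return args


cmd = build_jobs_report_command(jobids=[123, 456, 78], state="COMPLETED", report=True)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run on an O2 login or compute node, where the command is available.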
By default, O2_jobs_report will report information for each job, for example:
However, it is possible to see a summary report by using the flag --report, for example:
O2 Resource Utilization
We created a simplified script called O2usage, which runs sreport in the background and requires minimal arguments.
The script can be executed from anywhere on O2 and requires two inputs: the starting and ending dates of the desired time interval. Both dates must be in the format YYYY-MM-DD.
For example:
The usage is reported in CPU hours, MiB hours, and GPU hours.
If the user executing the query is the PI responsible for the Slurm Account (Lab), then the O2usage script will report the utilization for the entire Lab.
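To make the units concrete, the sketch below computes the three quantities for a single hypothetical job. It assumes usage is counted as the allocated amount of each resource multiplied by the wall-clock time, which is how Slurm accounting generally works; the function name usage_hours and the job values are made up for illustration and are not output from O2usage.

```python
def usage_hours(cpus, mem_gib, gpus, wall_hours):
    """Resource-hours for one job: allocated amount times wall-clock hours."""
    return {
        "cpu_hours": cpus * wall_hours,
        # 1 GiB = 1024 MiB, so memory is converted before multiplying.
        "mib_hours": mem_gib * 1024 * wall_hours,
        "gpu_hours": gpus * wall_hours,
    }


# Hypothetical job: 4 CPUs, 8 GiB of memory, and 1 GPU for 2.5 hours.
usage = usage_hours(cpus=4, mem_gib=8, gpus=1, wall_hours=2.5)
print(usage)  # {'cpu_hours': 10.0, 'mib_hours': 20480.0, 'gpu_hours': 2.5}
```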