RC created two simplified commands, O2squeue and O2_jobs_report, based on the Slurm commands squeue and sacct. They can be used to gather information about your active (pending or running) jobs and your past jobs.

As usual, feel free to contact rchelp@hms.harvard.edu with any questions about the reports from these commands.

O2squeue

This command is based on the Slurm command squeue and will return information about your currently pending and running jobs. For example:

Code Block
login05:~ O2squeue
JOBID     PARTITION     STATE       TIME_LIMIT     TIME           NODELIST(REASON)         ELIGIBLE_TIME         START_TIME            TRES_ALLOC
21801263  interactive   RUNNING     12:00:00       2:09:52        compute-a-16-160         2020-11-09T11:35:49   2020-11-09T11:36:19   cpu=1,mem=2G,node=1,billing=1

O2squeue can take as input the string R or PD to list only running or only pending jobs.


State

The STATE field will normally be either PENDING or RUNNING. For other job codes, see the “JOB STATE CODES” section of the Slurm squeue page.

Nodelist (Reason)

When a job is running, NODELIST(REASON) lists the node (or nodes, for a parallel MPI job) that the job is running on.

When a job is pending, NODELIST(REASON) describes the reason why the job is pending. The possible reasons are explained below. Many of them apply only if you submitted the job with a special QOS (Quality of Service), a job dependency, or a reservation.

BadConstraints: The job's constraints cannot be satisfied.

Dependency: This job is waiting for a dependent job to complete. (See the --dependency option on the Slurm sbatch page.)

InvalidQOS: The job's QOS is invalid.

JobHeldAdmin: The job is held (forced to pend) by a system administrator.

JobHeldUser: The job is held by the user.

None: The job has not been evaluated yet by the scheduler. (This can happen if the scheduler is working through a huge batch of submitted jobs.)

Priority: One or more higher priority jobs exist for this partition or advanced reservation. Over time, your job's priority will gradually increase, so your job should eventually run.

QOSJobLimit: The job's QOS has reached its maximum job count.

QOSResourceLimit: The job's QOS has reached some resource limit.

QOSTimeLimit: The job's QOS has reached its time limit.

ReqNodeNotAvail: Some node explicitly required by the job submission is not currently available.

Reservation: The job is waiting for its advanced reservation to become available.

Resources: The job is waiting for resources to become available.
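The pending reasons listed above can be tallied with standard text tools. The following is a minimal sketch, not an RC-provided utility: it assumes O2squeue prints the columns in the order shown in the earlier example (STATE third, NODELIST(REASON) sixth) and that, as in standard squeue output, a pending job's reason is wrapped in parentheses. The function name pending_reasons and the sample rows are hypothetical.

```shell
# Sketch: count pending jobs per reason from O2squeue-style output on stdin.
# Assumes STATE is column 3 and NODELIST(REASON) is column 6, and that
# reasons are wrapped in parentheses, e.g. "(Priority)".
pending_reasons() {
  awk 'NR > 1 && $3 == "PENDING" {
         reason = $6
         gsub(/[()]/, "", reason)   # "(Priority)" -> "Priority"
         count[reason]++
       }
       END { for (r in count) print r, count[r] }' | sort
}

# Hypothetical sample rows (not real jobs), trimmed to the first six columns.
sample='JOBID PARTITION STATE TIME_LIMIT TIME NODELIST(REASON)
101 short PENDING 1:00:00 0:00 (Priority)
102 short PENDING 1:00:00 0:00 (Resources)
103 short PENDING 1:00:00 0:00 (Priority)
104 short RUNNING 1:00:00 0:10 compute-a-16-160'

printf '%s\n' "$sample" | pending_reasons
```

On O2 the same function could presumably be fed real data with something like `O2squeue PD | pending_reasons`, assuming PD is passed as the first argument.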

Eligible Time

The field ELIGIBLE_TIME indicates the time when a job becomes eligible to be dispatched. This is usually the submit time, unless the job cannot be dispatched immediately, for example because of job dependencies or because requested resources are unavailable (such as specific nodes that are having a planned outage).

Start Time

For running jobs, START_TIME indicates the time when the job was dispatched. For pending jobs, it indicates the expected start time. Note that the expected start time is only calculated for the first few pending jobs of each user, and it is, in general, an upper bound value, assuming that all jobs will run for their maximum time.

TRES

TRES_ALLOC indicates the trackable resources (CPU cores, memory, nodes) requested in the job submission command or sbatch script, for example with flags like -c (or --cpus-per-task) and --mem.
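A TRES string is a comma-separated list of name=value pairs, so a single resource can be pulled out with a short helper. This is only a sketch; tres_value is a hypothetical name, and the input string is the one from the O2squeue example output.

```shell
# Sketch: extract one resource from a TRES string such as
# "cpu=1,mem=2G,node=1,billing=1".
tres_value() {
  # $1 = resource name (e.g. cpu, mem), $2 = TRES string
  printf '%s\n' "$2" | tr ',' '\n' | awk -F= -v key="$1" '$1 == key { print $2 }'
}

tres_value mem "cpu=1,mem=2G,node=1,billing=1"   # prints 2G
```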


O2_jobs_report

O2_jobs_report is based on the Slurm command sacct and can be used to query the Slurm database for information on your past jobs.

The command gets information including CPU (compute time), Memory (RAM), and WallTime efficiency. It prints information for every single job or as an overall report. It is also possible to select specific dates, jobs, job names, partitions, and job states.

The RC team checks jobs' efficiency only for jobs that are marked as COMPLETED by the Slurm scheduler. To query only COMPLETED jobs with O2_jobs_report, add the flag --state=COMPLETED.

By default, the tool will only show jobs starting from midnight of the previous day. So if you run the command at 9 am on Thursday, you’ll get the data for 24 hours on Wednesday PLUS the first 9 hours of Thursday.

Use --start or --lastdays to specify a custom time range if you are looking for older jobs.

You can customize your query using the flags described below. To see all the available options from the O2 shell, you can run O2_jobs_report -h

Code Block
-j JOBID, --jobid JOBID
                          The specific jobid numbers; can be multiple comma-separated jobids with no spaces, for example --jobid=123,456,78
Code Block
-s START, --start START
                          The desired start date for the query; the date must be entered using the format YYYY-MM-DD. This flag is not compatible with --lastdays
Code Block
--lastdays LASTDAYS      
                          Query jobs from the previous LASTDAYS days. This is equivalent to using --start=YYYY-MM-DD with the desired start date. 
                          This flag is not compatible with --start
Code Block
-e END, --end END       
                          Specify an end date for the query; the default end date is tomorrow.
Code Block
--account ACCOUNT       
                          Specify your entire Slurm account. The report will include jobs from every user in your Lab.
Code Block
--jobname JOBNAME
                          Specify a Slurm job name for the query (that you submitted with sbatch -J); this flag can be used with a comma-separated list of jobnames with no spaces
Code Block
--state STATE
                          Specify a list of job states describing how jobs ended.
                          Possible options are CANCELLED, COMPLETED, FAILED, NODE_FAIL, OUT_OF_MEMORY, PREEMPTED, and TIMEOUT.
                          This flag can be used with a comma-separated list of job states with no spaces, for example --state=COMPLETED,FAILED
Code Block
-p PARTITION, --partition PARTITION
                          Jobs submitted to specific partitions; you can specify multiple comma-separated partitions with no spaces.
Code Block
--report
                          Print a summary report instead of detailed information for each job
Code Block
--verbose
                          Use this flag to display the verbose information for each job directly as it is returned from the Slurm sacct command
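Because --start expects a date in YYYY-MM-DD format, it can be convenient to compute it rather than type it. The sketch below assumes GNU date (standard on Linux); days_ago is a hypothetical helper, and the script only prints the resulting command line instead of running it.

```shell
# Sketch: print the date N days ago as YYYY-MM-DD, the format --start expects.
# Assumes GNU date; days_ago is a hypothetical helper name.
days_ago() {
  date -d "$1 days ago" +%Y-%m-%d
}

start=$(days_ago 7)
# Show the command that would be run; the flags are the ones described above.
echo "O2_jobs_report --start=$start --state=COMPLETED --report"
```

Since --start and --lastdays are not compatible, a command built this way should use only one of them.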

By default, O2_jobs_report will report information for each job, for example:

Code Block
login04:~ O2_jobs_report

JOBID        USER     ACCOUNT      PARTITION       STATE           STARTTIME       WALLTIME(hr)   nCPU,RAM(GB),nGPU    PENDINGTIME(hr)    CPU_EFF(%) RAM_EFF(%) WALLTIME_EFF(%)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
4992748      rc       rccg         transfer        COMPLETED       2023-03-16      24.0           1,2.0,0              0.0                57.1       0.0        0.1
5013407      rc       rccg         transfer        COMPLETED       2023-03-16      24.0           1,2.0,0              0.01               0.0        0.0        0.0
5045324      rc       rccg         transfer        COMPLETED       2023-03-16      24.0           1,2.0,0              0.02               0.0        0.0        0.0
5077214      rc       rccg         transfer        COMPLETED       2023-03-17      24.0           1,2.0,0              0.01               33.7       0.0        0.1
5100444      rc       rccg         transfer        COMPLETED       2023-03-17      24.0           1,2.0,0              0.02               0.0        0.0        0.0
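The per-job report is plain whitespace-separated text, so it can be filtered with awk. The following sketch lists jobs whose CPU efficiency falls below a threshold. It assumes CPU_EFF(%) is the tenth column, as in the example report; the function name low_cpu_eff and the 10% threshold are illustrative choices, not RC recommendations.

```shell
# Sketch: list job IDs whose CPU efficiency is below a threshold.
# Assumes CPU_EFF(%) is column 10; header and separator lines are skipped
# because their first field is not a plain number.
low_cpu_eff() {
  awk -v limit="$1" '$1 ~ /^[0-9]+$/ && $10 + 0 < limit + 0 { print $1, $10 }'
}

# Rows copied from the example report.
rows='5013407 rc rccg transfer COMPLETED 2023-03-16 24.0 1,2.0,0 0.01 0.0 0.0 0.0
5045324 rc rccg transfer COMPLETED 2023-03-16 24.0 1,2.0,0 0.02 0.0 0.0 0.0
5077214 rc rccg transfer COMPLETED 2023-03-17 24.0 1,2.0,0 0.01 33.7 0.0 0.1
5100444 rc rccg transfer COMPLETED 2023-03-17 24.0 1,2.0,0 0.02 0.0 0.0 0.0'

printf '%s\n' "$rows" | low_cpu_eff 10
```

On O2 the function could be applied to live output, for example `O2_jobs_report --lastdays 7 | low_cpu_eff 10`.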

However, it is possible to see a summary report by using the flag --report, for example:

Code Block
login04:~ O2_jobs_report --report

JOBS STATES COUNT FROM 2023-03-16 TO 2023-03-18
==========  ===========
  USERNAME    COMPLETED
==========  ===========
        rc            5
==========  ===========

JOBS PARTITIONS COUNT FROM 2023-03-16 TO 2023-03-18
==========  ==========
  USERNAME    transfer
==========  ==========
        rc           5
==========  ==========

JOBS STATISTICS FROM 2023-03-16 TO 2023-03-18
======  ============  =========================  ===========================  ======================  ==================  ============================  ===================
  User    Total Jobs    Median Pending Time(hr)    Average Allocated RAM(GB)    Average Used RAM(GB)    Max Used RAM(GB)    Jobs Using > 1/2 Alloc RAM    RAM Efficiency(%)
======  ============  =========================  ===========================  ======================  ==================  ============================  ===================
    rc             5                       0.01                            2                       0                   0                             0                 0.03
======  ============  =========================  ===========================  ======================  ==================  ============================  ===================

======  =======================  ===================  =====================  ========================  ===========================
  User    Average Allocated CPU    CPU Efficiency(%)    Average Runtime(hr)    WallTime Efficiency(%)    Jobs Using > 1/2 WallTime
======  =======================  ===================  =====================  ========================  ===========================
    rc                        1                 34.4                   0.01                         0                            0
======  =======================  ===================  =====================  ========================  ===========================

O2 Resource Utilization

We created a simplified script called O2usage, which runs sreport in the background and requires minimal arguments.

The script can be executed from anywhere in O2 and requires two inputs: the starting and ending dates for the desired time interval. Both dates should be in the format YYYY-MM-DD.

For example:

Code Block
login05:~ O2usage 2021-03-15 2021-04-15
 
--------------------------------------------------------------------------------
Cluster/Account/User Utilization 2021-03-15 - 2021-04-15
Usage reported in Hours, memory is in MiB hours
--------------------------------------------------------------------------------
 
Cluster|Account|User|Name|Resource|Usage
o2|rccg|rp189|Potami|cpu|315
o2|rccg|rp189|Potami|mem|423467
o2|rccg|rp189|Potami|gres/gpu|12

The usage is reported in CPU hours, MiB hours, and GPU hours. 

If the User executing the query is the PI responsible for the Slurm Account (Lab), then the O2usage script will report the utilization for the entire Lab.
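The pipe-delimited rows printed by O2usage are easy to post-process. The sketch below sums the Usage column for one resource, which is useful when a PI's report contains several users. It assumes the layout shown in the example (Resource in field 5, Usage in field 6); usage_for is a hypothetical helper name.

```shell
# Sketch: total the usage of one resource from O2usage-style output on stdin.
# Assumes pipe-delimited rows with Resource in field 5 and Usage in field 6.
usage_for() {
  awk -F'|' -v res="$1" '$5 == res { total += $6 } END { print total + 0 }'
}

# Rows copied from the example output.
report='Cluster|Account|User|Name|Resource|Usage
o2|rccg|rp189|Potami|cpu|315
o2|rccg|rp189|Potami|mem|423467
o2|rccg|rp189|Potami|gres/gpu|12'

printf '%s\n' "$report" | usage_for cpu        # prints 315
printf '%s\n' "$report" | usage_for gres/gpu   # prints 12
```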