Note

Starting July 1st, 2021, the gpu_requeue partition is available only to users working for a PI with a primary or secondary appointment in a pre-clinical department.

...

This partition currently comprises 28 Nvidia RTX6000 single-precision cards, 2 Nvidia A100 cards, and 2 Nvidia M40 Tesla cards. The RTX6000 and M40 cards are not suitable for GPU double-precision jobs; the A100 cards do support double precision. If you need to run in double precision, add the flag --constraint=gpu_doublep when submitting your jobs.

To see the resources currently available in the gpu_requeue partition, you can use the command below:

...

How to Submit to the gpu_requeue Partition

To submit jobs to gpu_requeue, specify the partition with the flag "-p" and add the flag --requeue. Without the --requeue flag, jobs can still be killed but will not be automatically requeued.
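As an illustration of the two flags described above, a batch script might look like the minimal sketch below. The partition name and --requeue come from this page; the GPU count, time limit, and memory values are placeholder assumptions, and the echo line only stands in for a real workload. SLURM_RESTART_COUNT is a variable Slurm sets on jobs that have been requeued; the script falls back to 0 when it is unset.

```shell
#!/bin/bash
#SBATCH -p gpu_requeue        # partition described on this page
#SBATCH --requeue             # resubmit automatically if the job is killed
#SBATCH --gres=gpu:1          # assumption: one GPU is enough for this job
#SBATCH -t 0-01:00            # assumption: 1 hour wall time (D-HH:MM)
#SBATCH --mem=4G              # assumption: 4 GB of memory

# Slurm sets SLURM_RESTART_COUNT after a requeue; default to 0 on a first run.
restart_count="${SLURM_RESTART_COUNT:-0}"
echo "Running on $(uname -n), restart count: ${restart_count}"
```

A script like this would be submitted with sbatch. For double-precision work, the --constraint=gpu_doublep flag mentioned earlier could be added as one more #SBATCH line.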

...

How to Efficiently Use the gpu_requeue Partition

IMPORTANT: 

To work properly, any job submitted to gpu_requeue that writes intermediate files must be restartable either from the beginning (overwriting partially completed files) or from the last saved checkpoint. Researchers are responsible for choosing jobs that can be run in this way.
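The restart-from-checkpoint requirement can be sketched as follows. This is only one possible pattern, not a prescribed method: the file name checkpoint.txt, the step count, and the run_step function are hypothetical placeholders for whatever your application actually saves and computes.

```shell
#!/bin/bash
# Hypothetical names: checkpoint.txt, total_steps, and run_step are placeholders.
checkpoint_file="checkpoint.txt"
total_steps=5

run_step() {
    # Stand-in for one unit of real work (for example, one training epoch).
    echo "running step $1"
}

# Resume after the last completed step if a checkpoint exists; otherwise start at 1.
last_done=0
if [ -f "$checkpoint_file" ]; then
    last_done=$(cat "$checkpoint_file")
fi

step=$((last_done + 1))
while [ "$step" -le "$total_steps" ]; do
    run_step "$step"
    # Write to a temporary file and rename it, so a job killed mid-write
    # never leaves a truncated checkpoint behind.
    echo "$step" > "${checkpoint_file}.tmp"
    mv "${checkpoint_file}.tmp" "$checkpoint_file"
    step=$((step + 1))
done
```

If the job is killed and requeued, the next run reads the checkpoint and continues from the first unfinished step rather than repeating completed work.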

...