...
profiles describes the methods by which the pipeline can be invoked. This is specified at execution time via nextflow ... -profile profilename1,profilename2,.... At least one profile name must be specified. The profile names in this file are in addition to the default profiles (the singularity profile in this file augments the default singularity profile implemented by Nextflow, etc.).
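For example, to launch a pipeline using both the singularity and cluster profiles defined in this file (the pipeline name below is just a placeholder):

nextflow run <pipeline> -profile singularity,cluster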
The singularity profile sets parameters to allow usage of Singularity containers on O2 to execute pipeline steps. You shouldn't need to modify this profile.
The cluster profile sets parameters to allow submission of pipeline steps via O2's slurm scheduler. The only parameter you may be interested in is the queue parameter, which governs which partition a pipeline step is submitted to: if a pipeline step requires less than 12 hours, it is submitted to short; if less than 5 days, medium; otherwise, long.
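In the configuration file, this tiered routing is typically implemented as a dynamic queue directive. A minimal sketch of that logic, assuming the thresholds described above (the exact closure in your copy of the file may differ):

queue = { task.time <= 12.h ? 'short' : (task.time <= 5.d ? 'medium' : 'long') }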
If you have access to additional partitions (such as mpi, highmem, contributed partitions, etc.), set queue accordingly. Keep in mind that such special partitions do not have the same time governances (other than the 30 day limit), so if you would like to integrate one or more of these partitions into the existing short/medium/long paradigm, you will likely need to modify one or more of the pipeline-specific configuration files as well. Please contact rchelp@hms.harvard.edu with inquiries about this. If you are planning to use a specialized partition exclusively, then simply overwrite the queue specification with that partition name.
The local profile allows the pipeline to be executed within your existing resource allocation (e.g., inside an active interactive session). You need to make sure you have requested the MAXIMUM number of cores and amount of memory desired by any one step of the pipeline in order for this profile to execute successfully.
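For example, if the most demanding step of your pipeline wants 8 cores and 32 GB of memory, you would want an interactive session at least that large before launching with the local profile (the resource values here are hypothetical; adjust them to your workflow):

srun --pty -p interactive -t 0-12:00 -c 8 --mem=32G /bin/bash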
executor describes how the pipeline processes will be run (such as on local compute resources, on cloud resources, or by interacting with a cluster compute scheduler). The executor keeps track of each of the processes and whether they succeed or fail.

When using the slurm executor, Nextflow can submit each process in the workflow as an sbatch job.

Additional parameters that govern the Slurm job submission process are queueSize and submitRateLimit. queueSize is how many tasks can be queued or running at one time; here we use 1900 tasks. submitRateLimit is the maximum number of jobs that will be submitted over a given time interval; in our file, we limit it to 20 jobs submitted per second.
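These two settings live in the executor scope of the configuration file. A minimal sketch matching the values described above:

executor {
    queueSize = 1900           // at most 1900 tasks queued or running at once
    submitRateLimit = '20sec'  // Nextflow rate syntax: 20 job submissions per second
}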
When using the local executor, Nextflow will run each process using the resources available on the current compute node.
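A minimal sketch of what the corresponding profile looks like (your file's local profile may differ slightly):

profiles {
    local {
        process {
            // run each step on the current node, inside your existing allocation
            executor = 'local'
        }
    }
}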
Modifying the configuration file
As mentioned above, the following variables may be modified depending on your job submission preferences:
queue - if you would like to submit to a contributed or other specialized-access partition, you can replace this entire string with the appropriate partition name (e.g., highmem or gpu_quad). You will still need to make sure you have access to submit to that partition.

clusterOptions - this variable allows you to set additional sbatch parameters (or even other execution parameters for pipeline steps, but that is outside the scope of this section). More information can be found in the Nextflow documentation for the clusterOptions directive. An example of its use here could be something like:
profiles {
    ...
    cluster {
        process {
            executor = 'slurm'
            cache = 'lenient'
            queue = 'gpu_quad'
            clusterOptions = '--gres=gpu:1 -x compute-g-17-[166-171]'
        }
    }
    ...
}
Here, we are now submitting directly to the gpu_quad partition in all cases instead of dealing with the CPU-only partitions, and are excluding the compute nodes with L40S GPU cards (just as an example).
Do note that if you plan to use GPU-based partitions, you should make sure that your process runtime limits do not exceed the partition's specified limit (5 days in this case). If that is a concern, add the time directive to the block above to override the process limits specified by the workflow:
profiles {
    ...
    cluster {
        process {
            executor = 'slurm'
            cache = 'lenient'
            queue = 'gpu_quad'
            clusterOptions = '--gres=gpu:1 -x compute-g-17-[166-171]'
            time = '5d'
        }
    }
    ...
}
Executing Nextflow Pipelines
...