...
nf-core, as well as Nextflow, should now be available to use.
Creating/Using Custom Nextflow Pipelines
Unfortunately, HMS IT is unable to support custom workflow creation beyond a surface level due to the high degree of customization involved. If you are interested in creating your own Nextflow workflow, please see the Nextflow documentation for guidance on how to set up the structure correctly. HMS IT may be able to make recommendations related to resource requirements and similar concerns on O2.
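For orientation, a minimal Nextflow script (DSL2) looks something like the following sketch. This is illustrative only, not an O2-provided template; the process and value names are hypothetical:

```groovy
// main.nf - a minimal, illustrative DSL2 pipeline
nextflow.enable.dsl = 2

// One process that echoes a greeting for each input value
process SAY_HELLO {
    input:
    val name

    output:
    stdout

    script:
    """
    echo "Hello, ${name}!"
    """
}

workflow {
    Channel.of('O2', 'Nextflow') | SAY_HELLO | view
}
```

The Nextflow documentation covers processes, channels, and workflow composition in depth; real pipelines split processes into modules and add configuration files on top of this basic shape.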
Executing nf-core Pipelines
Users interested in leveraging existing nf-core workflows may do so using the `nf-core` utility installed via the instructions above. Generally, these workflows are invoked with the `singularity` profile for reproducibility purposes, though the `conda` profile is also supported on O2. If attempting to use an established Nextflow workflow that is independent of the official nf-core repositories, please refer to the instructions provided by the workflow maintainer.
Preparing Pipelines for Execution (using Singularity containers)
Note: O2 does not officially support software execution profiles other than `singularity` and `conda`.
If using the `singularity` profile, it is necessary to move the associated containers to a whitelisted directory, per O2 containerization policy. With the `nf-core` conda environment active, download the containers associated with the pipeline with the following command (from within an interactive session):
```
(nf-core)$ nf-core download -x none --container-system singularity --parallel-downloads 8 nf-core/PIPELINENAME
```
where `PIPELINENAME` is the name of the nf-core pipeline as notated on the nf-core website or in the associated GitHub repository.
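As a concrete illustration, a sketch of the command with a real pipeline name substituted (nf-core/rnaseq is used here purely as an example; `echo` is included so the sketch only prints the command - remove it to actually run the download inside the nf-core conda environment):

```shell
# Hypothetical example: download containers for the nf-core/rnaseq pipeline
PIPELINE=rnaseq
echo nf-core download -x none --container-system singularity \
    --parallel-downloads 8 "nf-core/${PIPELINE}"
```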
...
Official nf-core Pipelines
Users interested in leveraging existing nf-core workflows may do so using the `nf-core` utility installed via the instructions above. Generally, these workflows are invoked with the `singularity` profile for reproducibility purposes. However, manual intervention from HMS Research Computing is currently required to get the containers installed.

If the pipeline is part of the official nf-core repositories (e.g., it is listed at https://nf-co.re/pipelines/ ), please contact HMS Research Computing at rchelp@hms.harvard.edu for assistance with moving these containers to the whitelisted location.
Once the containers are in the appropriate location, the `NXF_SINGULARITY_CACHEDIR` environment variable needs to be set before executing the pipeline:
```
(nf-core)$ export NXF_SINGULARITY_CACHEDIR=/n/app/singularity/containers/HMSID/nf-core/PIPELINENAME/PIPELINEVERSION
```
This will allow you to execute containers associated with `PIPELINEVERSION` of `PIPELINENAME` without having to re-download them locally. This directory may change - desired paths can be negotiated on a per-request basis, but we recommend the `nf-core/PIPELINENAME/PIPELINEVERSION` organization so that users and labs can manage multiple pipelines and versions if desired (though if a user or lab no longer needs an older version, we request that they ask for those containers to be deleted). Note that this variable will need to be reset depending on the pipeline being executed.
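Because the cache path changes per pipeline and version, a small helper can reduce typos. This is a hypothetical convenience function, not something O2 provides; `set_nf_cache` and the `HMSID` value shown are illustrative:

```shell
# Hypothetical helper: compose and export the container cache path for a
# given pipeline and version. HMSID is your O2 account name.
set_nf_cache() {
    export NXF_SINGULARITY_CACHEDIR="/n/app/singularity/containers/${HMSID}/nf-core/$1/$2"
}

HMSID=abc123                 # hypothetical account name
set_nf_cache rnaseq 3.14.0   # hypothetical pipeline and version
echo "$NXF_SINGULARITY_CACHEDIR"
```

Re-running `set_nf_cache` with a different pipeline and version before each workflow keeps the variable in sync with the containers you intend to use.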
From there, modification of the pipeline configuration files may be necessary. To start, there should be a `nextflow.config` file located at `/path/to/nf-core-PIPELINENAME_VERSION/SLIGHTLYDIFFERENTLOOKINGVERSION/nextflow.config`. This file contains parameter settings associated with various steps in the workflow, as well as global maximum resource requirements.
Integrating Pipelines with slurm
Nextflow/nf-core does not provide HPC resource utilization out of the box via standard workflow configurations, but it can be configured manually.
Boilerplate O2 Configuration File
The following is an example of a configuration file that will allow you to submit individual steps of the pipeline as jobs to O2's `slurm` scheduler. Presently, only the `short`, `medium`, and `long` partitions are represented. If you would like to leverage a different partition (such as a contributed partition), please edit this file accordingly.

Presently, there is no boilerplate configuration available for GPU utilization. Please contact rchelp@hms.harvard.edu with inquiries about leveraging GPUs with your Nextflow/nf-core pipeline via `slurm`.

Paste this configuration into a text file on O2 (for example, in your current working directory) and save it as something like `nextflow_slurm.config`. You can then invoke your pipeline with `nextflow ... -c nextflow_slurm.config -profile cluster,singularity` to prioritize this configuration file in addition to the existing workflow configurations.
```groovy
// Use the params scope to define reference files, directories, and CLI options
params {
    config_profile_description = 'HMS RC test nextflow/nf-core config'
    config_profile_contact = 'rchelp@hms.harvard.edu'
    config_profile_url = 'rc.hms.harvard.edu'

    // maximum memory, number of cpus, and time for slurm jobs
    max_memory = 250.GB
    max_cpus = 20
    max_time = 30.d
}

profiles {
    singularity {
        singularity.enabled = true
        singularity.autoMounts = true
    }
    cluster {
        process {
            executor = 'slurm'
            cache = 'lenient'
            queue = { task.time > 5.d ? 'long' : task.time <= 12.h ? 'short' : 'medium' }
        }
    }
    local {
        process.executor = 'local'
    }
}

executor {
    $slurm {
        queueSize = 1900
        submitRateLimit = '20 sec'
    }
}

// On successful completion of a Nextflow run, automatically delete all
// intermediate files stored in the work/ directory
cleanup = true

// Allows overriding the default cleanup = true behaviour for debugging
debug {
    cleanup = false
}

// Miscellaneous CLI flags
resume = true
```
Configuration File Specification

The following is a brief summary of each section of this configuration file and its function:

- `params` designates global maximum job allocation parameters, as well as configuration metadata.
  - `max_memory` is a global limit on how much memory any single job can request from the scheduler.
  - `max_cpus` is a global limit on how many cores any single job can request from the scheduler.
  - `max_time` is a global limit on how much wall time (real-life duration) any single job can request from the scheduler. `30.d` (30 days) is the hard limit.
  - If you have access to resources that allow more than these values, you can modify them accordingly.
...
- `profiles` describes methods by which the pipeline can be invoked. This is specified at execution time via `nextflow ... -profile profilename1,profilename2,...`. At least one profile name must be specified. The profile names in this file are in addition to the default profiles (the `singularity` profile in this file augments the default `singularity` profile implemented by Nextflow, etc.).
  - The `singularity` profile sets parameters to allow usage of Singularity containers on O2 to execute pipeline steps. You shouldn't need to modify this profile.
  - The `cluster` profile sets parameters to allow submission of pipeline steps via O2's `slurm` scheduler.
    - The only parameter you may be interested in is `queue`, which governs which partition a pipeline step is submitted to. If a pipeline step requires less than 12 hours, it is submitted to `short`; if less than 5 days, `medium`; otherwise, `long`.
    - If you have access to additional partitions (such as `mpi`, `highmem`, or contributed partitions), set `queue` accordingly. Keep in mind that such special partitions do not have the same time governances (other than the 30-day limit), so if you would like to integrate one or more of these partitions with the existing `short`/`medium`/`long` paradigm, you will likely need to modify one or more of the pipeline-specific configuration files as well. Please contact rchelp@hms.harvard.edu with inquiries about this. If you plan to use a specialized partition exclusively, simply overwrite the queue specification with that partition name.
  - The `local` profile runs the pipeline within your existing resource allocation (e.g., inside the active interactive session). You need to make sure you have requested the MAXIMUM number of cores and amount of memory desired by any one step of the pipeline in order for this profile to execute successfully.
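The partition-selection closure in the `cluster` profile can be sketched as plain logic for clarity. This is a hypothetical helper (not part of the config; walltime is expressed in hours here) that mirrors the same thresholds: more than 5 days goes to `long`, 12 hours or less goes to `short`, everything in between goes to `medium`:

```shell
# Hypothetical sketch of the queue-selection closure, hours-based
pick_partition() {
    hours=$1
    if [ "$hours" -gt 120 ]; then    # more than 5 days (120 h) -> long
        echo long
    elif [ "$hours" -le 12 ]; then   # 12 hours or less -> short
        echo short
    else                             # otherwise -> medium
        echo medium
    fi
}

pick_partition 6     # -> short
pick_partition 48    # -> medium
pick_partition 200   # -> long
```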
- `executor` describes how the pipeline processes will be run (such as on local compute resources, on cloud resources, or by interacting with a cluster compute scheduler). The executor keeps track of each process and whether it succeeds or fails.
...
  - When using the `slurm` executor, Nextflow submits each process in the workflow as an `sbatch` job.
  - Additional parameters that govern the Slurm job submission process are `queueSize` and `submitRateLimit`. `queueSize` is how many tasks can be processed at one time; here we use 1900 tasks. `submitRateLimit` is the maximum number of jobs that will be submitted per specified time interval; in our file, we limit it to 20 jobs submitted per second.
...
with the pipeline name and version for which you would like the containers to be installed.
Custom or non-`nf-core` pipelines
Users attempting to set up a Nextflow pipeline that is not an official nf-core pipeline will need to download the associated containers by whatever means the pipeline maintainers suggest.
You may attempt to use the self-service container installation tool to install your containers to the whitelisted directory, as described here: Self-Install Singularity Containers. Note that this requires you to download the containers locally first. If this does not work for whatever reason, or the container installation is incomplete (e.g., you are also dealing with additional symbolic links or something else the tool cannot presently handle), manual intervention will be required.
At this point, please contact HMS Research Computing at rchelp@hms.harvard.edu for assistance with moving these containers to the whitelisted location, and please indicate the path to which you downloaded these containers, as well as whether the pipeline is going to be for your personal use or if it will be shared with fellow lab members.
After containers are installed
If the requested containers are associated with an official nf-core pipeline, they will be installed to:
```
/n/app/singularity/containers/nf-core/PIPELINENAME/PIPELINEVERSION
```
Note that this directory exists independent of individual user or lab membership. If you are looking to leverage a new nf-core pipeline, please look inside this directory tree to check whether containers for the pipeline and version you intend to use have already been installed; this will save you the time of contacting HMS IT.
For other pipelines, containers will be installed to:

```
/n/app/singularity/containers/HMSID/
```

or

```
/n/app/singularity/containers/shared/LABNAME
```
and possibly within some descriptive subdirectory, depending on preference.
In both cases, once the containers are installed, you must set the `NXF_SINGULARITY_CACHEDIR` environment variable prior to executing the workflow:
```
(nf-core)$ export NXF_SINGULARITY_CACHEDIR=/n/app/singularity/containers/CORRECTPATH
```
Note that this variable will need to be reset depending on the pipeline being executed.

From there, modification of the pipeline configuration files may be necessary. To start, there should be a `nextflow.config` file located at `$HOME/.nextflow/assets/CATEGORY/PIPELINENAME/nextflow.config`. This file contains parameter settings associated with various steps in the workflow, as well as global maximum resource requirements.
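For example, resource requests for a single named process can be overridden in a custom configuration file using a process selector. This is a hypothetical sketch; process names vary by pipeline, and `FASTQC` is used here only as an illustration:

```groovy
// Hypothetical sketch: override resources for one named process
process {
    withName: 'FASTQC' {
        cpus   = 4
        memory = 8.GB
        time   = 2.h
    }
}
```

Selectors like `withName` and `withLabel` let you tune individual steps without editing the pipeline's own configuration files.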
If you do not provide a configuration file that interacts with the Slurm scheduler on O2, Nextflow/nf-core will only use the `local` resources of your current job allocation (such as within an interactive `srun` job); the `slurm` integration and boilerplate configuration file described above address this. When using the `local` executor, Nextflow runs each process using the resources available on the current compute node.
Modifying the configuration file
As mentioned above, the following variables may be modified depending on your job submission preferences:
- `queue` - if you would like to submit to a contributed or other specialized-access partition, you can replace this entire string with the appropriate partition name (e.g., `highmem` or `gpu_quad`). You will still need to make sure you have access to submit to that partition.
- `clusterOptions` - a variable that allows you to set additional `sbatch` parameters (or even other execution parameters for pipeline steps, but that is outside the scope of this section). More information can be found in the Nextflow documentation for the clusterOptions flag. An example of its use could be something like:
```groovy
profiles {
    ...
    cluster {
        process {
            executor = 'slurm'
            cache = 'lenient'
            queue = 'gpu_quad'
            clusterOptions = '--gres=gpu:1 -x compute-g-17-[166-171]'
        }
    }
    ...
```
Here, we are now submitting directly to the `gpu_quad` partition in all cases instead of the CPU-only partitions, and are excluding the compute nodes with L40S GPU cards (just as an example).

Do note that if you plan to use GPU-based partitions, you should make sure that your process runtime limits do not exceed the partition's specified limits (5 days in this case). If that is a concern, add the `time` field to the above block to override the process limits specified by the workflow:
```groovy
profiles {
    ...
    cluster {
        process {
            executor = 'slurm'
            cache = 'lenient'
            queue = 'gpu_quad'
            clusterOptions = '--gres=gpu:1 -x compute-g-17-[166-171]'
            time = 5.d
        }
    }
    ...
```
Executing Nextflow Pipelines
Once the `NXF_SINGULARITY_CACHEDIR` environment variable is set (assuming you are using the `singularity` profile), you have two options for invoking your pipeline:
- If the pipeline is an official nf-core pipeline, you can simply paste the command from the pipeline's website and modify it to use the correct input, output, and profile settings.
- Otherwise, use `nextflow run`. A typical `nextflow run` command may look something like this:

```
nextflow run REPONAME/PIPELINENAME -profile cluster,singularity -c /path/to/slurm.config --input /path/to/input --outdir /path/to/output
```
You may need to refer to execution instructions provided by the pipeline maintainer.
To view a list of all pipelines you have ever downloaded or run, invoke the `nextflow list` command. These pipelines are located at `$HOME/.nextflow/assets`.
Cleaning Up After Execution
After your pipeline completes, there will be `work` and `.nextflow` directories at the location from which you executed the workflow (not to be confused with your output directory). You may find it useful to occasionally delete these directories, especially if you find you are using far more space than anticipated. You can keep track of your utilization with the `quota-v2` tool (see https://harvardmed.atlassian.net/wiki/spaces/O2/pages/1588662343/Filesystem+Quotas#Checking-Usage ).
Note that these directories will be present at every location where you have ever executed a pipeline, so you may need to remove multiple directories from different locations if you do not have an established way of organizing multiple workflows.
Also note that `resume` (checkpointing) functionality will not work if you remove the `work` OR `.nextflow` directory for a given workflow execution location - Nextflow will think you are starting over from the beginning, and you will see the following message:
```
WARN: It appears you have never run this project before -- Option `-resume` is ignored
```
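If you are certain you will not need to resume a run, the cleanup described above might look like the following sketch. `RUN_DIR` is a hypothetical variable; point it at the directory where you invoked the pipeline, and note that this removes the resume checkpoints for that run:

```shell
# Sketch: remove Nextflow intermediates from one execution directory.
# Defaults to the current directory if RUN_DIR is unset.
RUN_DIR="${RUN_DIR:-$PWD}"
rm -rf "$RUN_DIR/work" "$RUN_DIR/.nextflow"
```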
Troubleshooting Pipelines
Each workflow execution generates a `.nextflow.log` file in the directory where the pipeline is invoked. Subsequent executions cause `nextflow` to rename previous `.nextflow.log` files to `.nextflow.log.1`, `.nextflow.log.2`, etc., depending on how many executions have been performed in the current directory. `.nextflow.log` is always the log file associated with the most recent run, and files with increasing numbers correspond to older and older runs (`.2` happened before `.1`, etc.).
Workflows that are `resume`-d generate a NEW `.nextflow.log` file, so it may be necessary to reconcile the newest log with the most recent previously generated logs to view the full workflow output.
Some workflows also include a `debug` profile, which you can invoke alongside other profiles to get more verbose output while the workflow executes.
Some workflows may fail without showing an error explaining the failure unless you visit a log file within a subdirectory of the `work` folder. In such a case, you can refer to the output of `nextflow log runName -f status,name,workdir`. In that command, `runName` is the name automatically assigned when your workflow is executed, and the items after `-f` are the columns to display in the output.
```
$ nextflow log deadly_davinci -f status,name,workdir
COMPLETED  NFCORE_DEMO:DEMO:FASTQC (SAMPLE2_PE)      /n/groups/labname/abc123/nextflow_directory/work/7f/c4076aa7ac34ed830920cd6a38b7cc
COMPLETED  NFCORE_DEMO:DEMO:SEQTK_TRIM (SAMPLE2_PE)  /n/groups/labname/abc123/nextflow_directory/work/53/d42b6aed1d402fe707804dae414aba
COMPLETED  NFCORE_DEMO:DEMO:SEQTK_TRIM (SAMPLE3_SE)  /n/groups/labname/abc123/nextflow_directory/work/da/e3e73c94dd61553e52a3325ca025ef
COMPLETED  NFCORE_DEMO:DEMO:SEQTK_TRIM (SAMPLE1_PE)  /n/groups/labname/abc123/nextflow_directory/work/e7/34a6592fe45d20dc4c67ecbac661f1
COMPLETED  NFCORE_DEMO:DEMO:FASTQC (SAMPLE3_SE)      /n/groups/labname/abc123/nextflow_directory/work/4b/7249114061ce5255f622f027e94757
COMPLETED  NFCORE_DEMO:DEMO:FASTQC (SAMPLE1_PE)      /n/groups/labname/abc123/nextflow_directory/work/b9/2ee82e060885b2b33900767db61abd
FAILED     NFCORE_DEMO:DEMO:MULTIQC                  /n/groups/labname/abc123/nextflow_directory/work/f4/1b760137eca3bfa11cfd90cba9301b
```
In the above output, each process that was run is displayed on its own line. The first column has the `status` of the process, the second column reports the `name` of the process, and the final column reports the `workdir`. The available columns that can be reported with `nextflow log` can be seen via:
```
nextflow log -l
```
Definitions for each of these fields can be found in the Nextflow documentation here.
All but one of our processes have `COMPLETED` status, meaning everything executed as expected. We would need to troubleshoot steps that report `FAILED` or `ABORTED`, of which we have one (the MULTIQC step). To find the files associated with a process, look at the last column; this has the location of the associated subdirectory of the `work` folder. Depending on the workflow, there may or may not be a log file with useful error messages inside the process directory.
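Nextflow also stages hidden runtime files in each task's work subdirectory - including `.command.sh` (the generated script), `.command.err` (stderr), and `.exitcode` (the task's numeric exit status) - which are often the fastest route to the actual error. A sketch of inspecting the failed MULTIQC task above (the `WORKDIR` path is the hypothetical one from the example output; adjust it to your own run):

```shell
# Hypothetical example: inspect a failed task's work directory
WORKDIR="${WORKDIR:-/n/groups/labname/abc123/nextflow_directory/work/f4/1b760137eca3bfa11cfd90cba9301b}"
if [ -d "$WORKDIR" ]; then
    cat "$WORKDIR/.command.err"   # stderr from the failed process
    cat "$WORKDIR/.exitcode"      # numeric exit status of the task
fi
```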