...
Users who are interested in leveraging existing nf-core workflows may do so using the nf-core utility installed via the instructions above. Generally, these workflows are invoked with the singularity profile for reproducibility purposes. However, manual intervention from HMS Research Computing is still currently required to get the containers installed.
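For illustration, launching one of these workflows with the singularity profile generally looks like the following sketch (here nf-core/demo, the revision, and the output directory are placeholders; substitute your own pipeline and version):

# Launch an nf-core pipeline with the Singularity profile
# (pipeline name, revision, and paths are illustrative placeholders)
nextflow run nf-core/demo -r 1.0.0 -profile singularity --outdir results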
If the pipeline is part of the official nf-core repositories (e.g., it is listed at https://nf-co.re/pipelines/), then please contact HMS Research Computing at rchelp@hms.harvard.edu with the pipeline name and version for which you would like the containers to be installed.
...
Users attempting to set up a Nextflow pipeline that is not an official nf-core pipeline will need to download the containers associated with the pipeline using whatever means is suggested by the pipeline maintainers.
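For instance, if the maintainers distribute their images through a container registry, downloading one locally might look like the following sketch (the registry path and output filename are assumptions for illustration):

# Pull a container image to a local .sif file
# (the registry path and filename are illustrative placeholders)
singularity pull mytool-1.0.sif docker://quay.io/someorg/mytool:1.0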
You may attempt to use the self-service container installation tool to install your containers to the whitelisted directory as specified here: Self-Install Singularity Containers. Note that this does require you to download the containers locally first. If this does not work for whatever reason, or the container installation is incomplete (e.g., you are also dealing with additional symbolic links or something else that the tool cannot presently handle), manual intervention will be required.
At this point, please contact HMS Research Computing at rchelp@hms.harvard.edu for assistance with moving these containers to the whitelisted location. Please indicate the path to which you downloaded the containers, as well as whether the pipeline is for your personal use or will be shared with fellow lab members.
After containers are installed
If the requested containers were associated with an official nf-core pipeline, they will be installed to:

/n/app/singularity/containers/nf-core/PIPELINENAME/PIPELINEVERSION
Note that this directory exists independent of individual user or lab membership - if you are looking to leverage a new nf-core pipeline, please look inside this directory tree to check whether the pipeline and version you intend to use already has containers installed. This will save you the time of contacting HMS IT.
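For example, a quick check for an existing installation might look like the following (the pipeline name demo and version are placeholders):

# List pipelines that already have containers installed
ls /n/app/singularity/containers/nf-core/
# Check for a specific pipeline version (placeholder names)
ls /n/app/singularity/containers/nf-core/demo/1.0.0/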
For other pipelines, they will be installed to:

/n/app/singularity/containers/HMSID/
...
Nextflow/nf-core does not leverage HPC scheduler resources out of the box via its standard workflow configurations, but it can be configured to do so manually. If you do not provide a configuration file that interacts with the Slurm scheduler on O2, Nextflow/nf-core will only use the local resources of your current job allocation (such as within an interactive srun job).
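As a rough sketch of what such a configuration involves (the boilerplate file below is the version you should actually use; the partition name and resource values here are illustrative assumptions, not O2-specific recommendations):

// Minimal sketch of a Slurm-aware Nextflow configuration
process {
    executor = 'slurm'      // submit each process as a Slurm job
    queue    = 'short'      // assumed partition name; adjust as needed
    cpus     = 1
    memory   = '4 GB'
    time     = '1h'
}
executor {
    queueSize = 50          // limit on concurrently queued jobs
}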
Boilerplate O2 Configuration File
...
Each workflow execution will generate a .nextflow.log file in the directory where the pipeline is invoked. Subsequent executions will result in nextflow renaming previous .nextflow.log files to .nextflow.log.1, .nextflow.log.2, etc., depending on how many executions are performed in the current directory. .nextflow.log is always the log file associated with the most recent run, and files with increasing numbers correspond to older and older runs (.2 happened before .1, etc.).
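A quick way to line the logs up with runs is to sort them by modification time:

# List Nextflow logs newest-first; .nextflow.log is the most recent run
ls -lt .nextflow.log*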
Workflows that are resumed (via -resume) will generate a NEW .nextflow.log file, so it may be necessary to reconcile the newest log with the most recent previously generated logs to view full workflow output.
Some workflows may also include a debug profile, which you can invoke alongside other profiles, to get more verbose output while the workflow is executing.
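For example, combining it with the singularity profile would look like the following (the pipeline name and output directory are placeholders):

# Profiles are comma-separated; debug adds more verbose output
nextflow run nf-core/demo -profile singularity,debug --outdir results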
There are also some workflows where, when the workflow fails for some reason, you will not see an error explaining the failure unless you visit a log file within a subdirectory of the work folder. In such a case, you can refer to the pipeline_info directory where the workflow was started. Under pipeline_info, there will be a text file named like execution_trace_YEAR_MONTH_DAY_TIME.txt that has details on each process in the workflow. More details on this trace file can be found in the Nextflow documentation here.
Within execution_trace*txt, you can focus on the processes that report statuses of FAILED or ABORTED when troubleshooting. To find the associated folder and log for a process that did not complete successfully, look at the hash column: the name of the relevant work subdirectory will match the hash value. For example, let's say we have an execution trace file that contains:
$ cat pipeline_info/execution_trace_2024-10-21_15-57-46.txt
task_id hash native_id name status exit submit duration realtime %cpu peak_rss peak_vmem rchar wchar
2 43/26d4df 49933736 NFCORE_DEMO:DEMO:FASTQC (SAMPLE1_PE) FAILED 0 2024-10-21 15:57:59.516 32.1s 5s 247.0% 570 MB 63.7 GB 19.2 MB 4.6 MB
...
We would look for the corresponding folder for this process in: work/43/26d4df*. Depending on the workflow, there may or may not be a log file with useful error messages contained within the process directory.
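Nextflow also captures each task's output in hidden files inside its work subdirectory (such as .command.log, .command.err, and .exitcode), so inspecting the failed task above might look like:

# Locate the work directory whose name matches the hash column
ls -d work/43/26d4df*
# Inspect the captured task output for error messages
cat work/43/26d4df*/.command.log work/43/26d4df*/.command.err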