...

Users who are interested in leveraging existing nf-core workflows may do so using the nf-core utility installed via the instructions above. Generally, these workflows are invoked with the singularity profile for reproducibility. However, manual intervention from HMS Research Computing is currently required to get the containers installed.
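As an illustration, a typical invocation looks like the following; the pipeline name, version, and output directory are placeholders for your own choices, and `-profile singularity` selects the containerized execution described here:

```shell
# Hypothetical example: run an official nf-core pipeline with Singularity.
# nf-core/rnaseq, -r 3.14.0, and --outdir results are placeholders.
nextflow run nf-core/rnaseq -r 3.14.0 -profile singularity --outdir results
```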

If the pipeline is part of the official nf-core repositories (e.g., it is listed at https://nf-co.re/pipelines/ ), then please contact HMS Research Computing at rchelp@hms.harvard.edu with the pipeline name and version for which you would like the containers to be installed.

...

Users attempting to set up a Nextflow pipeline that is not an official nf-core pipeline will need to download the containers associated with the pipeline using whatever means is suggested by the pipeline maintainers.

You may attempt to use the self-service container installation tool to install your containers to the whitelisted directory, as specified here: Self-Install Singularity Containers. Note that this does require you to download the containers locally first. If this does not work for whatever reason, or the container installation is incomplete (e.g., you are also dealing with additional symbolic links or something else that the tool cannot presently handle), manual intervention will be required.

At this point, please contact HMS Research Computing at rchelp@hms.harvard.edu for assistance with moving these containers to the whitelisted location, and please indicate the path to which you downloaded these containers, as well as whether the pipeline is going to be for your personal use or if it will be shared with fellow lab members.

After containers are installed

If the requested containers were associated with an official nf-core pipeline, they will be installed to

Code Block
/n/app/singularity/containers/nf-core/PIPELINENAME/PIPELINEVERSION

Note that this directory exists independently of individual user or lab membership. If you are looking to leverage a new nf-core pipeline, please look inside this directory tree first to check whether the pipeline and version you intend to use already has containers installed. This will save you the time of contacting HMS IT.
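Checking amounts to listing the shared directory tree; the pipeline name and version in the second command are placeholders:

```shell
# List pipelines that already have containers installed
ls /n/app/singularity/containers/nf-core/
# Check a specific pipeline/version (names here are placeholders)
ls /n/app/singularity/containers/nf-core/PIPELINENAME/PIPELINEVERSION/
```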

For other pipelines, they will be installed to

Code Block
/n/app/singularity/containers/HMSID/

...

Out of the box, Nextflow/nf-core does not submit work to the HPC scheduler via standard workflow configurations, but this can be configured manually. If you do not provide a configuration file to interact with the Slurm scheduler on O2, Nextflow/nf-core will only use the local resources of your current job allocation (such as within an interactive srun job).
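As a sketch of what such a configuration can look like (the partition name, resource values, and queue size below are illustrative placeholders, not official O2 settings), a minimal nextflow.config that routes processes through Slurm might be:

```groovy
// nextflow.config -- minimal sketch; values are placeholders, adjust for O2
process {
    executor = 'slurm'   // submit each process as a Slurm job
    queue    = 'short'   // placeholder partition name
    cpus     = 1
    memory   = '4 GB'
    time     = '1h'
}

executor {
    queueSize = 50       // cap the number of jobs queued at once
}
```

Supply the file at launch with `nextflow run ... -c nextflow.config`, or rely on Nextflow picking up a nextflow.config in the launch directory.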

Boilerplate O2 Configuration File

...

Each workflow execution generates a .nextflow.log file in the directory where the pipeline is invoked. Subsequent executions cause Nextflow to rename previous .nextflow.log files to .nextflow.log.1, .nextflow.log.2, etc., depending on how many executions have been performed in the current directory. .nextflow.log is always the log file associated with the most recent run, and higher numbers correspond to older and older runs (.2 happened before .1, etc.).

Workflows that are resumed (via the -resume option) will generate a new .nextflow.log file, so it may be necessary to reconcile the newest log with the most recent previously generated logs to view the full workflow output.

Some workflows may also include a debug profile, which you can invoke alongside other profiles, to get more verbose output while the workflow is executing.

With some workflows, a run may fail without showing an error explaining the failure unless you visit a log file within a subdirectory of the work folder. In such a case, you can refer to the pipeline_info directory under the directory where the workflow was started. Under pipeline_info, there will be a text file named like execution_trace_YEAR_MONTH_DAY_TIME.txt that has details on each process in the workflow. More details on this trace file can be found in the Nextflow documentation here.

Within execution_trace*.txt, you can focus on the processes that report a status of FAILED or ABORTED when troubleshooting. To find the associated folder and log for a process that did not complete successfully, look at the hash column: the relevant work subdirectory will match it. For example, suppose we have an execution trace file that contains:

Code Block
$ cat pipeline_info/execution_trace_2024-10-21_15-57-46.txt
task_id	hash	native_id	name	status	exit	submit	duration	realtime	%cpu	peak_rss	peak_vmem	rchar	wchar
2	43/26d4df	49933736	NFCORE_DEMO:DEMO:FASTQC (SAMPLE1_PE)	FAILED	0	2024-10-21 15:57:59.516	32.1s	5s	247.0%	570 MB	63.7 GB	19.2 MB	4.6 MB
...

We would look for the corresponding folder for this process in: work/43/26d4df*. Depending on the workflow, there may or may not be a log file with useful error messages contained within the process directory.
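The hash-to-directory mapping can be illustrated with a throwaway directory structure; the hash 43/26d4df comes from the trace above, while the full directory name below is a made-up stand-in for what Nextflow generates:

```shell
# Simulated work directory: Nextflow names each task directory after the
# task hash, and the trace shows only its first characters (43/26d4df).
mkdir -p work/43/26d4df0123456789abcdef
touch work/43/26d4df0123456789abcdef/.command.log   # per-task log stand-in

# Resolve the full directory from the truncated hash with a glob:
ls -d work/43/26d4df*

# Then inspect any per-task log files there for error messages, e.g.:
cat work/43/26d4df*/.command.log
```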