...

Code Block
// Use the params to define reference files, directories, and CLI options
params {

    config_profile_description = 'HMS RC test nextflow/nf-core config'
    config_profile_contact = 'rchelp@hms.harvard.edu'
    config_profile_url = 'rc.hms.harvard.edu'
    max_memory = 250.GB
    // maximum number of cpus and time for slurm jobs
    max_cpus = 20
    max_time = 30.d

}

profiles {

    singularity {
        singularity.enabled = true
        singularity.autoMounts = true
    }

    cluster {
        process {
            executor = 'slurm'
            cache = 'lenient'
            queue = { task.time > 5.d ? 'long' : task.time <= 12.h ? 'short' : 'medium' }
        }
    }

    local {
        process.executor = 'local'
    }
}

executor {
    $slurm {
        queueSize = 1900
        submitRateLimit = '20 sec'
    }
}

// On successful completion of a Nextflow run, automatically delete all intermediate files stored in the work/ directory
cleanup = true

// Allows overriding the default cleanup = true behaviour for debugging
debug {
    cleanup = false
}

// Miscellaneous CLI flags
resume = true

...

  • profiles describes the methods by which the pipeline can be invoked. Profiles are selected at execution time via nextflow run ... -profile profilename1,profilename2,...; at least one profile name must be specified (an example launch command is shown after this list). The profile names in this file are in addition to the default profiles (the singularity profile in this file augments the default singularity profile implemented by Nextflow, etc.).

    • the singularity profile sets parameters to allow usage of Singularity containers on O2 to execute pipeline steps. You shouldn’t need to mess with this profile.

    • the cluster profile sets parameters to allow submission of pipeline steps via O2’s slurm scheduler.

      • the only parameter you may be interested in is the queue parameter, which governs which partition a pipeline step is submitted to.

        • If a pipeline step requests 12 hours or less, it is submitted to short; more than 12 hours and up to 5 days, medium; more than 5 days, long.

        • If you have access to additional partitions (such as mpi, highmem, contributed partitions, etc.), set queue accordingly.

          • Keep in mind that such special partitions do not have the same time limit policies (other than the 30-day maximum), so if you would like to integrate one or more of these partitions with the existing short / medium / long paradigm, you will likely need to modify one or more of the pipeline-specific configuration files as well. Please contact rchelp@hms.harvard.edu with inquiries about this.

          • If you are planning to use a specialized partition exclusively, then simply overwrite the queue specification with that partition name (a sketch of such an override is shown after this list).

    • the local profile causes the pipeline to be executed within your existing resource allocation (e.g., inside the active interactive session). You need to make sure you have requested the maximum number of cores and amount of memory required by any single step of the pipeline in order for this profile to execute successfully (an example is shown after this list).

  • executor describes how the pipeline processes will be run (such as on local compute resources, on cloud resources, or by interacting with a cluster compute scheduler). The executor keeps track of each process and whether it succeeds or fails.

    • When using the slurm executor, Nextflow can submit each process in the workflow as an sbatch job.

      • Additional parameters that govern the Slurm job submission process are queueSize and submitRateLimit. queueSize is the maximum number of tasks that can be queued or running at one time; here it is set to 1900. submitRateLimit is the maximum rate at which jobs are submitted over a given time interval; in our file, it is limited to 20 jobs per second.

    • When using the local executor, Nextflow will run each process using the resources available on the current compute node.
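
For example, a run that uses the singularity and cluster profiles from this file could be launched as sketched below. The pipeline name (nf-core/rnaseq), the config path, and the input/output arguments are placeholders; substitute your own pipeline, config location, and data.

# Sketch only: run each pipeline step in a Singularity container and
# submit it to Slurm via the profiles defined in this config
nextflow run nf-core/rnaseq \
    -c /path/to/this/config \
    -profile singularity,cluster \
    --input samplesheet.csv \
    --outdir results

Because this file already sets resume = true, there is no need to pass -resume on the command line; a re-launched run will pick up from its cached results.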
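
As a sketch of the exclusive-partition case above, a small additional config file passed with -c can overwrite the queue selection; the partition name priority_abc below is a placeholder for whichever partition you actually have access to.

// Sketch only: send all Slurm jobs to a single specialized partition
// instead of the short / medium / long selection above
process {
    executor = 'slurm'
    queue = 'priority_abc'
}

If the same setting appears in more than one config file, Nextflow's configuration precedence rules determine which value takes effect, so verify the partition of a submitted job before relying on the override.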
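
To use the local profile, launch the pipeline from inside an allocation that is already large enough for the biggest step. The partition name, core count, memory, and time below are placeholders; size them to your pipeline.

# Sketch only: request an interactive allocation sized for the largest step,
# then run the pipeline with the local executor inside it
srun --pty -p interactive -c 8 --mem=32G -t 0-08:00 /bin/bash
nextflow run nf-core/rnaseq -c /path/to/this/config -profile singularity,local \
    --input samplesheet.csv --outdir results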

Cleaning Up After Execution

After your pipeline completes, there will be a work directory at the location where you executed the workflow (not to be confused with your output directory). You may find it useful to occasionally delete this directory, especially if you find that you are using far more space than anticipated. You can keep track of your utilization with the quota-v2 tool (see https://harvardmed.atlassian.net/wiki/spaces/O2/pages/1588662343/Filesystem+Quotas#Checking-Usage ).

Note that there will be a work directory at every location from which you have ever executed a pipeline, so you may need to remove multiple work directories if you do not have an established way of organizing multiple workflows.
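
One way to reclaim this space is Nextflow's built-in clean command, run from the directory where the pipeline was launched; the path in the last command below is a placeholder.

# List previous runs launched from this directory
nextflow log

# Preview what nextflow clean would delete, then delete for real
nextflow clean -n
nextflow clean -f

# Or remove a work directory outright once you no longer need to resume that run
rm -rf /path/to/launch/directory/work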