On O2, we encourage cluster users to install the packages and software they need. One method to install packages and manage environments is to use conda, which is available through the conda2/4.2.13 module. Conda manages dependencies by default when you install packages, which can make it easier to install software. Packages that can be installed with conda include Python modules, libraries, and executable programs. Conda includes its own version of Python (2.7.12), though you can explicitly request Python 3 if you prefer.
Commonly used commands, examples
Command | Meaning |
---|---|
module spider conda | shows the versions of conda installed on O2 |
module load conda2/version | loads an individual conda module (replace version with an actual version) |
conda info --envs | see available conda environments |
conda create -n test_env | create conda environment named test_env (name the environment whatever you'd like) |
conda create -n aligners_env bwa bowtie star | create conda environment, and install some packages (bwa, bowtie, and star) on the fly |
source activate test_env | "activate" a conda environment named test_env |
source deactivate | exit current conda environment |
conda-env remove -n test_env | delete a conda environment named test_env |
conda search numpy | search for a package (replace numpy with the package of your choice) |
conda install numpy | install a package; run this inside an activated conda environment or the command will fail (replace numpy with the package of your choice) |
Setup
To install packages on O2 using conda, you must first create a conda environment. Environments are simply directories in ~/.conda/envs/ that contain the packages you installed. You activate an environment (with source activate) to use those packages, and deactivate it (with source deactivate) to exit the environment. You can have multiple environments and switch between them.
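Because environments are ordinary directories, you can inspect them with plain shell commands. The following is a minimal sketch under that assumption; env_exists and list_env_dirs are hypothetical helper names, not conda commands, and they only look at the default ~/.conda/envs/ location unless you pass another directory.

```shell
# Hypothetical helper (not part of conda): succeed if a directory for the
# named environment exists. Defaults to ~/.conda/envs; pass a second
# argument to check a different envs directory.
env_exists() {
    local name="$1"
    local envs_dir="${2:-$HOME/.conda/envs}"
    [ -n "$name" ] && [ -d "$envs_dir/$name" ]
}

# Hypothetical helper: print one environment name per line by listing the
# subdirectories of the envs directory. Prints nothing if there are none.
list_env_dirs() {
    local envs_dir="${1:-$HOME/.conda/envs}"
    local d
    for d in "$envs_dir"/*/; do
        [ -d "$d" ] && basename "$d"
    done
    return 0
}
```

For example, env_exists test_env && echo "test_env is present" checks for the environment used on this page. conda info --envs remains the authoritative listing.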
First let's get into an interactive session, as installing conda packages is resource intensive and should not be done on the login nodes.
mfk8@login01:~$ srun --pty -p interactive -t 0-2 bash
Next, load the conda module:
mfk8@compute-a-01-01:~$ module load conda2/4.2.13
Then the conda command will be available:
mfk8@compute-a-01-01:~$ which conda
/n/app/conda2/bin/conda
Running conda info will return information about the current conda installation:
mfk8@compute-a-01-01:~$ conda info
You can see available conda environments with conda info --envs.
If you have not created any conda environments yet, the only listing you will see is the root environment in /n/app/conda2. Cluster users cannot modify this environment.
mfk8@compute-a-01-01:~$ conda info --envs
You can create your own environment to install packages into. The environment name is whatever you specify after -n:
mfk8@compute-a-01-01:~$ conda create -n test_env
If you no longer want an environment, use conda-env remove to delete the environment and any packages installed to it:
mfk8@compute-a-01-01:~$ conda-env remove -n test_env
mfk8@compute-a-01-01:~$ conda info --envs
# test_env will no longer be listed
Basic usage
To use the conda environment, it must be activated. Note that your prompt will change:
mfk8@compute-a-01-01:~$ source activate test_env
(test_env) mfk8@compute-a-01-01:~$
To exit an environment, run source deactivate, and your prompt will return to normal:
(test_env) mfk8@compute-a-01-01:~$ source deactivate
mfk8@compute-a-01-01:~$
Because you have exited the environment, the packages installed in it are no longer available.
You can create as many conda environments as you need. Environments are independent (changing one environment won't affect another). They can be used for different analyses, or when you need more than one version of the same tool. You can run conda info --envs to list all of your conda environments.
Installing Packages
To search for available versions of a package that can be installed, use conda search:
(test_env) mfk8@compute-a-01-01:~$ conda search nameofpackage
With your conda environment activated, you can install a package with conda install. Conda will handle dependencies by default. Do not install conda packages on a login node. Only install packages after you have requested dedicated resources (i.e. you are on a compute node in an interactive job).
(test_env) mfk8@compute-a-01-01:~$ conda install nameofpackage
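One way to avoid installing on a login node by accident is to wrap the install command in a small guard. The sketch below assumes that login nodes have hostnames beginning with login, as in the login01 prompt shown earlier; is_login_node and safe_conda_install are hypothetical names, not O2 or conda commands.

```shell
# Hypothetical helper: succeed when the hostname looks like a login node.
# Assumes login nodes are named login* (e.g. login01). Pass a hostname as
# the first argument to check a name other than the current host.
is_login_node() {
    local host="${1:-$(hostname -s)}"
    case "$host" in
        login*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Hypothetical wrapper: refuse to run "conda install" on a login node,
# otherwise pass all arguments through to conda unchanged.
safe_conda_install() {
    if is_login_node; then
        echo "Refusing to install on a login node; start an interactive job first." >&2
        return 1
    fi
    conda install "$@"
}
```

With these defined in your shell, safe_conda_install nameofpackage behaves like conda install nameofpackage on a compute node and stops with a message on a login node.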
Conda and Python versions
Note that conda includes its own version of Python:
(test_env) mfk8@compute-a-01-01:~$ which python
/n/app/conda2/bin/python
The default version of Python that's available through the conda module is Python 2.7.12:
(test_env) mfk8@compute-a-01-01:~$ python --version
Python 2.7.12 :: Continuum Analytics, Inc.
If you want to use conda with Python 3, you can create a conda environment and install Python 3 into it. For example, to create an environment using Python 3.6.5:
mfk8@compute-a-01-01:~$ conda create -n python_3.6.5 python=3.6.5
mfk8@compute-a-01-01:~$ source activate python_3.6.5
(python_3.6.5) mfk8@compute-a-01-01:~$ which python
~/.conda/envs/python_3.6.5/bin/python
(python_3.6.5) mfk8@compute-a-01-01:~$ python --version
Python 3.6.5
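If a script needs to confirm which Python an environment provides, it can parse the output of python --version. The sketch below uses a hypothetical helper, python_major, that extracts the major version from a version string in the format shown above. Note that Python 2 prints its version to stderr while Python 3.4 and later print to stdout, so a script should capture both streams.

```shell
# Hypothetical helper: print the major version number from a
# "python --version" string, e.g.
#   "Python 2.7.12 :: Continuum Analytics, Inc."  ->  2
#   "Python 3.6.5"                                ->  3
python_major() {
    local version_string="$1"
    local version="${version_string#Python }"   # drop the leading "Python "
    echo "${version%%.*}"                        # keep the text before the first dot
}

# Usage (run inside an activated environment); 2>&1 captures the version
# whether it is written to stderr (Python 2) or stdout (Python 3.4+):
#   python_major "$(python --version 2>&1)"
```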
A full example
mfk8@login01:~$ srun --pty -p interactive -t 0-2 bash
mfk8@compute-a-01-01:~$ module load conda2/4.2.13
mfk8@compute-a-01-01:~$ module list
Currently Loaded Modules:
  1) conda2/4.2.13 (E)
Where:
  E: Experimental
mfk8@compute-a-01-01:~$ conda create -n my_env
# truncated
mfk8@compute-a-01-01:~$ source activate my_env
# install example python package, scipy, which is available through conda:
(my_env) mfk8@compute-a-01-01:~$ conda install scipy
# truncated
# see list of packages available in this conda environment:
(my_env) mfk8@compute-a-01-01:~$ conda list
# truncated
# will report scipy in the list
# test importing scipy in python to verify it is installed correctly
(my_env) mfk8@compute-a-01-01:~$ python -c "import scipy"
(my_env) mfk8@compute-a-01-01:~$
# exit environment
(my_env) mfk8@compute-a-01-01:~$ source deactivate
mfk8@compute-a-01-01:~$
Supported channels
The centralized conda installation, available through the conda2/4.2.13 module, includes several channels that we support. Channels are repositories where conda looks for packages. The supported channels are configured in a centralized .condarc file that lists them in this order:
- conda-forge
- defaults
- r
- bioconda
The order here matters, because conda pulls packages from channels according to channel "priority". The channel listed first in .condarc has the highest priority, and the channel listed last has the lowest. If the package you want to install is found in multiple channels in your .condarc, conda will default to installing the version found in the highest priority channel. See the conda documentation for more information on channel management.
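The priority rule can be illustrated with a few lines of shell. This is only a sketch of the rule as described here, not conda's actual dependency solver; pick_channel is a hypothetical function, and the list of channels that carry the package is made up for the example.

```shell
# Hypothetical illustration of channel priority (not conda's real solver):
# given the channels in priority order and the channels that carry a
# package, print the highest-priority channel that has it.
pick_channel() {
    local priority_order="$1"   # space-separated, highest priority first
    local available_in="$2"     # space-separated channels carrying the package
    local channel candidate
    for channel in $priority_order; do
        for candidate in $available_in; do
            if [ "$channel" = "$candidate" ]; then
                echo "$channel"
                return 0
            fi
        done
    done
    return 1   # the package is in none of the configured channels
}

# A package carried by both bioconda and defaults is taken from defaults,
# because defaults is listed before bioconda.
pick_channel "conda-forge defaults r bioconda" "bioconda defaults"   # prints: defaults
```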
Conda-forge is a repository of recipes, which are used to build conda packages. The defaults channel is necessary for several dependencies, including conda and conda-build. The r channel contains common R packages, some of which are dependencies for bioconda packages. Bioconda is a channel geared for bioinformatics packages.
If you wish, you can still maintain your own ~/.condarc file, but we may be unable to assist when using unsupported channels.
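As a starting point for a personal file, the sketch below writes a .condarc that lists the four supported channels in the same order as above, using conda's standard channels: key. write_condarc is a hypothetical helper. It takes the target path as an argument so that you can inspect the result before putting it in place, and it makes no claim to reproduce any other settings in the centralized file.

```shell
# Hypothetical helper: write a minimal .condarc listing the supported
# channels, highest priority first. The target path is an argument so
# nothing in your home directory is overwritten by accident.
write_condarc() {
    local target="$1"
    cat > "$target" <<'EOF'
channels:
  - conda-forge
  - defaults
  - r
  - bioconda
EOF
}

# Usage: write to a scratch file, review it, then copy it into place.
#   write_condarc ./condarc.draft
#   cat ./condarc.draft
#   cp ./condarc.draft ~/.condarc
```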