Using MATLAB in O2
MATLAB is a resource-intensive application and MUST ALWAYS be run on O2's compute nodes. This can be done by submitting a job through the SLURM scheduler, as explained in detail below.
Note that in order to start MATLAB you will first need to load the corresponding module, for example module load matlab/2017a.
Note: The content below assumes the user is already familiar with the O2 cluster and the SLURM scheduler. For more general information on how to submit jobs on the O2 cluster, the available partitions (queues), and the most useful submission flags, please review our O2 guide.
There are several ways to run MATLAB jobs.
...
This can be done using the command srun --pty -p interactive -t 60:00 matlab
For example
Code Block |
---|
rp189@login01:~ module load matlab/2019a
rp189@login01:~ srun --pty -p interactive -t 60:00 matlab
srun: job 1412768 queued and waiting for resources
srun: job 1412768 has been allocated resources
MATLAB is selecting SOFTWARE OPENGL rendering.
... |
MATLAB batch jobs on O2
If you don't need to interact with the MATLAB interface, you can instead run one or more jobs by submitting them to O2 as batch jobs. Below is a simple example of how to submit a 1-core MATLAB batch job to the short partition, requesting a 1-hour wall time and ~8 GB of memory:
Code Block |
---|
rp189@login01:~ sbatch jobscript |
where jobscript is a file that contains
Code Block |
---|
#!/bin/bash
#SBATCH -p short
#SBATCH -t 1:00:00
#SBATCH --mem=8000
#SBATCH -c 1
module load matlab/2018b
matlab -nodesktop -r "myfunction(my_inputs)"
#----------------------------------------------------- |
Another possibility is to use the --wrap flag to pass the MATLAB command directly on the sbatch command line.
The equivalent of the above example is
Code Block |
---|
rp189@login01:~ module load matlab/2018b
rp189@login01:~ sbatch -p short -c 1 -t 1:00:00 --mem=8000 --wrap="matlab -nodesktop -r \"myfunction(my_inputs)\"" |
where the escape character \ must be placed before each of the inner double quotes, so that they are passed to MATLAB instead of closing the outer pair of quotes.
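The quoting can be checked before anything is submitted. The following is a minimal sketch, assuming the same hypothetical myfunction(my_inputs) call used above; the variable names matlab_call and wrap_cmd are introduced here only for illustration. It builds the string once and prints it, so you can confirm that the inner double quotes survive.

```shell
#!/bin/bash
# Hypothetical MATLAB call, taken from the example above
matlab_call='myfunction(my_inputs)'

# The inner double quotes are escaped with \ so they are kept inside the outer pair
wrap_cmd="matlab -nodesktop -r \"${matlab_call}\""

# Print the exact string that would be handed to --wrap
echo "${wrap_cmd}"

# To submit, one would then run (left commented out in this sketch):
#   sbatch -p short -c 1 -t 1:00:00 --mem=8000 --wrap="${wrap_cmd}"
```

Keeping the MATLAB call in its own single-quoted variable avoids stacking several levels of escaping on one long command line.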
NOTE:
Starting from MATLAB version 2019a the flag -r should be replaced with the flag -batch, for example:
Code Block |
---|
#!/bin/bash
#SBATCH -p short
#SBATCH -t 1:00:00
#SBATCH --mem=8000
#SBATCH -c 1
module load matlab/2019a
matlab -batch "myfunction(my_inputs)"
#----------------------------------------------------- |
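If the same job script has to work with several MATLAB modules, the choice between -r and -batch can be made from the release name. This is only a sketch: the helper matlab_run_flags is hypothetical, and it assumes release names of the form used by the modules above (a year followed by a or b, such as 2018b).

```shell
#!/bin/bash
# matlab_run_flags RELEASE: print the startup flags for a release such as 2018b.
# Hypothetical helper, not part of MATLAB or SLURM.
matlab_run_flags() {
    release_year="${1%[ab]}"        # strip the trailing a/b: 2019a -> 2019
    if [ "$release_year" -ge 2019 ]; then
        echo "-batch"               # 2019a or later
    else
        echo "-nodesktop -r"        # 2018b or earlier
    fi
}

matlab_version="2018b"
# Print the resulting command line instead of running MATLAB
echo "matlab $(matlab_run_flags "$matlab_version") \"myfunction(my_inputs)\""
```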
How to propagate MATLAB errors to the SLURM scheduler when using version 2018b or earlier
By default a SLURM job containing a MATLAB script will be recorded as "COMPLETED" or "TIMEOUT" even when the executed MATLAB script fails. This happens because the scheduler executes and tracks the behavior of the command matlab -r "your_code" rather than the outcome of the actual function your_code.
To ensure that the outcome of a MATLAB job is captured by the scheduler, you can use the MATLAB try ... catch ... exit(1) ... end construct as shown in the example below:
Code Block |
---|
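% MATLAB wrapper to catch and propagate a non-zero exit status
try
    your_code        % the function or script to run
catch my_error
    my_error         % display the caught error
    exit(1)          % terminate MATLAB with a non-zero exit status
end
% no error: terminate MATLAB with exit status 0
exit |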
This script will run the function your_code. If no error is detected, the script exits and SLURM reports a successfully completed job. If instead your_code fails, the script catches and prints the error message, then terminates MATLAB with a non-zero exit status, which the scheduler records as a failed job.
Note that when using version 2019a or later with the flag -batch MATLAB will automatically propagate an error to the SLURM scheduler.
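One related point on the shell side: a bash script exits with the status of its last command, so if the job script runs further commands after MATLAB (for example to copy results), a MATLAB failure can still be hidden from the scheduler. A minimal sketch of one way to keep it visible, assuming a hypothetical helper named run_step that is not part of SLURM or MATLAB:

```shell
#!/bin/bash
# run_step CMD [ARGS...]: run a command, store its exit status in step_status,
# report a failure on stderr, and return the same status to the caller.
run_step() {
    step_status=0
    "$@" || step_status=$?
    if [ "$step_status" -ne 0 ]; then
        echo "step failed with exit status ${step_status}: $*" >&2
    fi
    return "$step_status"
}

# Demonstration with a stand-in command that exits with status 3
run_step sh -c 'exit 3' || echo "recorded status: ${step_status}"

# Example use inside a job script (not executed in this sketch):
#   run_step matlab -batch "myfunction(my_inputs)"
#   matlab_status=$step_status
#   ...any follow-up commands...
#   exit "$matlab_status"
```

Ending the job script with exit "$matlab_status" makes the script return MATLAB's own exit status regardless of what the follow-up commands return.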
Running parallel MATLAB jobs on the O2 cluster
It is possible to run MATLAB parallel jobs++ on the O2 cluster using either the local cluster profile or the O2 cluster profile.
(++ to run in parallel, the MATLAB script must contain parallel commands such as parfor or spmd)
MATLAB Parallel jobs using the default local cluster profile
...
This approach can be used on any O2 partition except the mpi partition.
Note 1: Several complex operations in MATLAB are already parallelized (intrinsic parallelization of libraries). If your script is serial but makes intensive use of these parallelized libraries, you might still want to request at least 2 or 3 cores using this approach in order to retain the associated speedup.
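For the serial-but-multithreaded case in Note 1, it may help to tell MATLAB how many cores the job was actually given. The sketch below is an assumption rather than an O2 recommendation: it reads the SLURM_CPUS_PER_TASK variable, which SLURM sets when -c is specified, and passes the value to MATLAB's maxNumCompThreads function before calling the hypothetical myfunction(my_inputs).

```shell
#!/bin/bash
# Number of cores requested with -c; falls back to 1 if the variable is not set
ncores="${SLURM_CPUS_PER_TASK:-1}"

# Limit MATLAB's built-in multithreading to the allocated cores, then run the code
matlab_cmd="maxNumCompThreads(${ncores}); myfunction(my_inputs)"

# Print the command line instead of running MATLAB in this sketch
echo "matlab -batch \"${matlab_cmd}\""
```

With #SBATCH -c 3 in the job script, the printed command would set the thread limit to 3.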
...