MATLAB Parallel jobs using the custom O2 cluster profile

It is possible to configure MATLAB so that it interacts with the SLURM scheduler. This allows MATLAB to submit parallel jobs directly to the SLURM scheduler and to leverage CPU and memory resources across different nodes (distributed memory).

To do so, you first need to configure the O2 cluster profile in the MATLAB version being used; this is done by running the command configCluster.

 

NOTE: 

It is strongly recommended to use MATLAB version 2019a or later when submitting multi-node jobs (mpi partition) with the MATLAB O2 cluster profile. Earlier versions of MATLAB use a mechanism to start the MATLAB workers that is not fully compatible with our existing SLURM epilog and could cause jobs to be killed.

Setting up the O2 MATLAB Cluster Profile 

 

>> configCluster

Must set WallTime and QueueName before submitting jobs to O2. E.g.

>> c = parcluster;
>> % 5 hour walltime
>> c.AdditionalProperties.WallTime = '05:00:00';
>> c.AdditionalProperties.QueueName = 'queue-name';
>> c.saveProfile
>>

Now your default cluster profile is set to o2 local R2019a, and you can verify it by running the command parcluster:

>> parcluster

ans =

 Generic Cluster

    Properties:

                       Profile: o2 R2019a
                      Modified: false
                          Host: compute-a-16-22
                    NumWorkers: 100000
                    NumThreads: 1

            JobStorageLocation: /home/abc123/MdcsDataLocation/o2/R2019a
             ClusterMatlabRoot: /n/app/matlab/2019a
               OperatingSystem: unix

       RequiresOnlineLicensing: false
    IntegrationScriptsLocation: /n/app/matlab/2019a/toolbox/local/IntegrationScripts/o2
          AdditionalProperties: List properties

    Associated Jobs:

                Number Pending: 0
                 Number Queued: 0
                Number Running: 0
               Number Finished: 0

>>



Note 1:  The configCluster command needs to be executed only one time.

Note 2:  After running the configCluster command, the default cluster profile is set to the O2 cluster. If you want to go back to the "local" cluster profile, you can change the default profile with the command parallel.defaultClusterProfile('local')

Note 3: Running the configCluster command sets the cluster profile only for the MATLAB version currently in use. If you later use a different version of MATLAB, you will need to run configCluster again.

Note 4: The O2 MATLAB cluster profile is not compatible with the Orchestra profile. If you plan to run on both clusters, it is recommended to use a different version of MATLAB on each cluster (for example, 2016b on Orchestra and 2017a on O2).



Setting the submission parameters for the O2 MATLAB cluster profile 



In order to use the O2 MATLAB cluster profile you must define at least two submission parameters: the partition to be used and the desired wall time. In MATLAB 2016b this can be done with the commands of the form ClusterInfo.set<Property>, for example:

>> ClusterInfo.setQueueName('mpi')
>> ClusterInfo.setWallTime('48:00')
>>

Note: In the above example the partition "mpi" is passed to ClusterInfo.setQueueName; however, the MATLAB O2 cluster profile can be used with any of the partitions available on the O2 cluster.

Several other parameters can be defined in a similar way; the complete list available is shown below:

>> ClusterInfo.
setArch                    setDiskSpace       setPrivateKeyFile                setRequireExclusiveNode   setUserNameOnCluster
setClusterHost             setEmailAddress    setPrivateKeyFileHasPassPhrase   setReservation            setWallTime
setConstraint              setGpusPerNode     setProcsPerNode                  setSshPort
setDataParallelism         setMemUsage        setProjectName                   setUseGpu
setDebugMessagesTurnedOn   setNameSpace       setQueueName                     setUserDefinedOptions

The command ClusterInfo.setUserDefinedOptions can be used to pass additional flags to the scheduler. For example, ClusterInfo.setUserDefinedOptions('-o output.log') will pass the flag -o output.log to the scheduler when submitting a job from within MATLAB. Similarly, the commands of the form ClusterInfo.get<Property> can be used to check the assigned property.
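As a minimal sketch (assuming MATLAB 2016b with the O2 profile configured; the values shown are the placeholders used earlier on this page), the getter form mirrors the setter form:

```matlab
% Set submission parameters (values are examples, not recommendations)
ClusterInfo.setQueueName('mpi')
ClusterInfo.setWallTime('48:00')
ClusterInfo.setUserDefinedOptions('-o output.log')

% Check the currently assigned properties with the matching get methods
ClusterInfo.getQueueName            % the partition set above
ClusterInfo.getWallTime             % the wall time set above
ClusterInfo.getUserDefinedOptions   % any extra scheduler flags
```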

Note that, once assigned, each property is saved in the user's ~/.matlab profile folder and does not need to be re-defined unless a change is desired (e.g. a different wall time, partition, amount of memory, etc.)





Define job submission flags for Version ≥ R2017a 

In order to use the O2 MATLAB cluster profile you must define at least two submission parameters: the partition to be used and the desired wall time. This can be done by assigning the properties directly to a parcluster object, as shown in the example below:

>> c=parcluster;

% Specify the walltime (e.g. 48 hours)
>> c.AdditionalProperties.WallTime = '48:00:00';

% Specify a partition to use for MATLAB jobs
>> c.AdditionalProperties.QueueName = 'partition-name';

% Optional flags

% Specify memory to use for MATLAB jobs, per core (MB)
>> c.AdditionalProperties.MemUsage = '4000';

% Specify the GPU card to run on
>> c.AdditionalProperties.GpuCard = 'gpu-card-to-use';

% Request 2 GPUs per node
>> c.AdditionalProperties.GpusPerNode = 2;

% Add any sbatch-supported flag manually (for example memory per node and number of tasks per node):
>> c.AdditionalProperties.AdditionalSubmitArgs = '--mem=4000 --tasks-per-node=2'

% Save changes after modifying AdditionalProperties for the above changes to persist between MATLAB sessions
>> c.saveProfile

Note that, by default, the parameters you set are not retained and will need to be re-entered if the c object is deleted. To save the submission parameters permanently you must execute the command c.saveProfile

Important: Use --mem-per-cpu (or the property c.AdditionalProperties.MemUsage) instead of --mem to request a custom amount of memory when using the mpi partition. The SLURM flag --mem requests a given amount of memory per node, so, unless you are enforcing a balanced distribution of tasks (i.e. MATLAB workers) per node, you might end up with too much or not enough memory on a given node, depending on how the tasks are allocated.
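A short sketch of the two approaches (the memory value of 4000 MB is a placeholder):

```matlab
c = parcluster;

% Preferred on the mpi partition: memory per core/worker (MB),
% which maps to a per-CPU request instead of a per-node one
c.AdditionalProperties.MemUsage = '4000';

% Equivalent manual alternative, passing the sbatch flag directly:
% c.AdditionalProperties.AdditionalSubmitArgs = '--mem-per-cpu=4000';

c.saveProfile
```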



Using the O2 MATLAB Cluster Profile 

Parpool() command

One way to use the O2 MATLAB cluster profile is to request a parallel pool of N_c workers with the command parpool(N_c). MATLAB will submit a SLURM job requesting N_c cores and will start the parallel pool once the requested cores are allocated. For example, requesting a parallel pool of 3 cores looks like:

>> parpool(3)
Starting parallel pool (parpool) using the 'o2 local R2017a' profile ...

additionalSubmitArgs =

    '--ntasks=3 -t 48:00:00 -p mpi --ntasks-per-node=1'

connected to 3 workers.

ans =

 Pool with properties:

            Connected: true
           NumWorkers: 3
              Cluster: o2 local R2017a
        AttachedFiles: {}
          IdleTimeout: 30 minutes (30 minutes remaining)
          SpmdEnabled: true

Any parallel part of a script will then be executed on the parallel workers allocated with the parpool command, for example:

% command executed locally on the current node:
>> system('hostname');
compute-a-16-68.o2.rc.hms.harvard.edu

% same command executed within a parallel MATLAB construct
>> spmd;system('hostname');end
Lab 1: compute-a-16-74.o2.rc.hms.harvard.edu
Lab 2: compute-a-16-75.o2.rc.hms.harvard.edu
Lab 3: compute-a-16-76.o2.rc.hms.harvard.edu

Note 1: If you run a non-interactive parallel job using the parpool() command with the O2 cluster profile, you will actually dispatch two jobs: first a serial job (1 core) to start your MATLAB script (i.e. matlab -nodesktop -r "my_function"), and then a second parallel job submitted directly from within MATLAB once the execution of my_function reaches the parpool() command. To avoid this double-job condition you can use the batch command described later on this page.

Note 2: The parpool command cannot be executed if MATLAB was started on a login node. Always make sure to start interactive MATLAB sessions from within interactive jobs (on actual compute nodes instead of login nodes).
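For example, an interactive session could be started like this (a sketch only; the partition name, time limit, memory request, and MATLAB module version are placeholders that depend on your cluster setup):

```shell
# Request an interactive job on a compute node (placeholder resources)
srun --pty -p interactive -t 0-02:00 --mem=4G bash

# Then, on the allocated compute node, load and start MATLAB
module load matlab/2019a
matlab -nodesktop
```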

Batch Command

Similarly, it is also possible to dispatch a batch of parallel jobs directly from within MATLAB using the command batch. In the example below we submit the simple parallel sleep function to the cluster:

function elapsed = sleep(cc)
  tic
  parfor kk=1:cc
    pause(1)
  end
  elapsed=toc;
end

We start 3 parallel jobs with 2, 5 and 10 cores respectively. Each job runs the above function with an input parameter of 20 (i.e. sleep for 20 seconds), and we measure the real elapsed time:

% set the cluster object
>> c=parcluster;

% submit three parallel jobs with 2, 5 and 10 cores
>> j1=c.batch(@sleep, 1, {20},'Pool',2);

additionalSubmitArgs =

    '--ntasks=3 -t 1:00:00 -p mpi'

>> j2=c.batch(@sleep, 1, {20},'Pool',5);

additionalSubmitArgs =

    '--ntasks=6 -t 1:00:00 -p mpi'

>> j3=c.batch(@sleep, 1, {20},'Pool',10);

additionalSubmitArgs =

    '--ntasks=11 -t 1:00:00 -p mpi'

% wait for the jobs to complete
>> j1.wait;j2.wait;j3.wait;

% gather the results
>> j1.fetchOutputs

ans =

  cell

    [10.7801]

>> j2.fetchOutputs

ans =

  cell

    [4.4421]

>> j3.fetchOutputs

ans =

  cell

    [2.4390]

Note 1: When using the function batch to run a parallel function, you do not need to add the parpool command explicitly inside the parallel function being executed (see the sleep.m example above).

Note 2: The function batch can also be used to submit non-parallel jobs, for example:

>> j4=c.batch(@cos,1,{pi});

additionalSubmitArgs =

    '--ntasks=1 -t 1:00:00 -p short'

>> j4.fetchOutputs

ans =

  cell

    [-1]