How to get started with distributed computing using qsub?

The FieldTrip qsub toolbox is a small stand-alone toolbox to facilitate distributed computing. The idea of the qsub toolbox is to provide you with an easy MATLAB interface to distribute your jobs and not have to go to the Linux command-line to use the qsub command from there. Besides the Torque cluster (which we have at the Donders in Nijmegen) it also supports Linux clusters with other PBS versions, Sun Grid Engine (SGE), Oracle Grid Engine, SLURM and LSF as batch queueing systems.

You should start by adding the qsub toolbox to your MATLAB path:

>> addpath /home/common/matlab/fieldtrip/qsub/

Submitting a single MATLAB job to the cluster

To submit a job to the cluster, you will use qsubfeval. It stands for “qsub – evaluation - function”. As an input you specify a name of your function, an argument, and time and memory requirements (see below).

Try the following:

qsubfeval('rand', 100, 'timreq', 60, 'memreq', 1024)
Besides the memory requirements for your computation, MATLAB also requires memory for itself. The qsubfeval and qsubcellfun functions have the option memoverhead for this, which is by default 1GB. The memreq option itself does not have a default value. The torque job is started with a memory reservation of memreq+memoverhead.

You will get the job ID as the output:

submitting job username_dccn_c004_p23910_b1_j001... qstat job id
ans =

Note, that Qsubfeval does not return the output of your function to the command window! You only get the job ID. Your function thus has to include a command for writing the output on disk.

You can check the status of your submitted job with qstat command in Linux terminal.

bash-3.2$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----  
1066196.dccn-l014          ...23910_b1_j001 username          00:00:00 C matlab   

For detailed information on the submitted job use qstat -f JobID:

bash-3.2$ qstat -f 1066196
Job Id:
    Job_Name = irisim_dccn_c004_p23910_b1_j001
    Job_Owner =
    resources_used.cput = 00:00:00
    resources_used.mem = 91232kb
    resources_used.vmem = 4181060kb
    resources_used.walltime = 00:00:02

The qsubfeval command creates a bunch of tempory files in your working directory. STDIN.oXXX is the standard output, i.e. the output that MATLAB normally prints in the command window. STDIN.eXXX is an error message file. For the job to complete successfully, this file should be empty.

Submitting a batch of jobs

To execute few jobs in parallel as a batch you will use qsubcellfun. It is very similar to qsubfeval, but instead of one input argument, you specify a cell array of arguments. Qsubcellfun then evaluates your function with each element of the array. In fact it calls qsubfeval as many times as the number of elements in the array.

Qsubcellfun is similar to the standard Matlab Cellfun. Try the following:

>> qsubcellfun(@randn, {1,1,1,1}, 'memreq', 1024, 'timreq', 60)
submitting job irisim_mentat284_p7284_b6_j001... qstat job id
submitting job irisim_mentat284_p7284_b6_j002... qstat job id
submitting job irisim_mentat284_p7284_b6_j003... qstat job id
submitting job irisim_mentat284_p7284_b6_j004... qstat job id
job irisim_mentat284_p7284_b6_j001 returned, it required 0 seconds and 832.0 KB
job irisim_mentat284_p7284_b6_j002 returned, it required 0 seconds and 828.0 KB
job irisim_mentat284_p7284_b6_j003 returned, it required 0 seconds and 830.0 KB
job irisim_mentat284_p7284_b6_j004 returned, it required 0 seconds and 829.0 KB
computational time = 0.1 sec, elapsed = 1.0 sec, speedup 0.0 x
ans =
[0.1194] [0.3965] [-0.2523] [0.3803]

and compare it with

>> cellfun(@randn, {1,1,1,1})
ans =
-2.2588 0.8622 0.3188 -1.3077

The difference in the output formats is due to the UniformOutput argument, which is default false in qsubcellfun and default true in CELLFUN.

When using qsubcellfun you have to specify the time and memory requirements of a single job.

Same as with qsubfeval, you can use

 qstat, qstat -f, qstat -al, cluster-qstat 

commands to check the status and info on the submitted jobs.

Qsubcellfun works as a wrapper for qsubfeval. If you use qsubcellfun, all the temporally files created by qsubfeval are automatically deleted when the job is completed, or when it is terminated with Ctrl+C, or with an error.

Time and memory management

You will have noticed that you have to specify the time and memory requirements for the individual jobs using the 'timreq' and 'memreq' arguments to qsubcellfun. These time and memory requirements are passed to the batch queueing system, which uses them to find an appropriate execution host (i.e one that has enough free memory) and to monitor the usage.

Do not set the requirements too tight, because if the job exceeds the requested resources, it will be killed. However, if you grossly overestimate them, your jobs will be scheduled in a “slow” queue, where only a few jobs can run simultaneously. The queueing and throttling policies on the number and the size of the jobs is to prevent a few large jobs from a single user from blocking all computational resources of the cluster. So the most optimal approach to get your jobs executed is to try and estimate the memory and time requirements as good as you can.

The help of qsubcellfun lists some suggestions on how to estimate the time and memory.

Stacking of jobs

The execution of each job involves writing the input arguments to a file, submitting the job, to Torque, starting MATLAB, reading the file, evaluate the function, writing the output arguments to file and at the end collecting all output arguments of all jobs and rearranging them. Starting MATLAB for each job imposes quite some overhead on the jobs if they are small, that is why qsubcellfun implements “stacking” to combine multiple MATLAB jobs into one job for the Linux cluster. If the jobs that you pass to qsubcellfun are small (less than 180 seconds) they will be stacked automatically. You can control it in detail with the “stack” option in qsubcellfun. For example

>> qsubcellfun(@randn, {1,1,1,1}, 'memreq', 1024, 'timreq', 60, 'stack', 4);
stacking 4 matlab jobs in each qsub job
submitting job irisim_mentat284_p7284_b7_j001... qstat job id
>> qsubcellfun(@randn, {1,1,1,1}, 'memreq', 1024, 'timreq', 60, 'stack', 1);
submitting job irisim_mentat284_p7284_b8_j001... qstat job id
submitting job irisim_mentat284_p7284_b8_j002... qstat job id
submitting job irisim_mentat284_p7284_b8_j003... qstat job id
submitting job irisim_mentat284_p7284_b8_j004... qstat job id

Note that the stacking implementation is not yet ideal, since with the default option it distributed the 4 jobs into 3+1, whereas 2+2 would be better.

Submitting a batch and don't wait within MATLAB for them to return

If you run your interactive MATLAB session on a torque execution host with a limited walltime and want to submit a batch of jobs with qsubcellfun, you don't know when the batch of jobs will finish. Consequently, you cannot predict the walltime that your interactive session requires in order to see all jobs returning.

Rather than waiting for all jobs to return, you can submit the batch and close the interactive MATLAB session. The next day or week, when all batch jobs have finished (use “qstat” to check on them) you can start MATLAB again to collect the results.

With a single job you could simply do:

jobid = qsubfeval(@myfunction, inputarg);
save jobid.mat jobid
% start MATLAB again
load jobid.mat
outputarg = qsubget(jobid);

A complete batch of jobs can be dealt with in a similar manner:

jobidarray = {};
for i=1:10
  jobidarray{i} = qsubfeval(@myfunction, inputarg{i});
save jobidarray.mat jobidarray
% start MATLAB again
load jobidarray.mat
outputarg = {};
for i=1:10
outputarg{i} = qsubget(jobidarray{i});

Or with fewer lines of code using the standard cellfun function as:

jobidarray = cellfun(@qsubfeval, repmat(@myfunction, size(inputarg)), inputarg, ...
   'UniformOutput', false);
save jobidarray.mat jobidarray
% start MATLAB again
load jobidarray.mat
outputarg = cellfun(@qsubget, jobidarray, 'UniformOutput', false);