Peer distributed computing allows multiple computers to run the same analysis at the same time, speeding up your entire analysis considerably. But how can I distribute my jobs?

Let's say we have an analysis script and we want to run that particular analysis for every subject, every condition (say we have 2), every event (again 2), and 5 different frequencies of interest. The analysis script should receive these parameters as input and then compute. For example, such an analysis script could look like this:

  function myanalysis(input)
  
  % our study parameters
  myconds    = {'A','B'};
  myevents   = {'con','incon'};
  myfreqs    = 10:10:50; % 10, 20, 30, 40, and 50 Hz
  
  % construct the variable name and load it from the matfile that contains it
  variable  = [myconds{input.cond}, '_', myevents{input.event}, '_', num2str(myfreqs(input.freq))];
  inputfile = ['/home/mystorage/', input.subj, '_data.mat'];
  tmp       = load(inputfile, variable); % load into a struct rather than straight into the workspace
  
  % perform analysis
  data = analyze(tmp.(variable));
  
  % store data in a matfile with a matching name
  outputfile = ['/home/mystorage/', input.subj, '_newdata', variable, '_.mat'];
  save(outputfile, 'data');
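
Before distributing anything, it helps to run the function once locally for a single parameter set. A minimal sketch (the index values and the subject name are just example choices):

  % hypothetical single-job test: condition 'A', event 'con', 10 Hz, subject01
  job = struct('cond', 1, 'event', 1, 'freq', 1, 'subj', 'subject01');
  myanalysis(job);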

'myanalysis' simply performs the analysis for the parameters it receives. Now we need to distribute this script, each time with different parameters (we don't want multiple computers to do exactly the same work, of course). Here is an example of what such a job-distribution script (all subjects, 2 conditions, 2 events, 5 frequencies) could look like. Basically, in our example we have 4 (2 conditions * 2 events) different task parameters, repeated for 5 different frequencies:

   % our study parameters
   myconds    = {'A','B'}; 
   myevents   = {'con','incon'};
   myfreqs    = 10:10:50;
  
   % list subjects
   mysubjs = {'subject01', 'subject02', 'subject03'};
   nsub = numel(mysubjs);
   
   ctr = 0; 
   for s = 1:nsub % all subjects
       
       for i = 1:20 % 2 conditions * 2 events * 5 frequencies
          
            % decode the job index i into condition, event, and frequency:
            % the condition switches every 2 jobs, the event every job,
            % and the frequency every 4 jobs
            cond  = mod(ceil(i/2)-1, 2) + 1;
            event = mod(i-1, 2) + 1;
            freq  = ceil(i/4);
      
           % check for existence
            variable   = [myconds{cond}, '_', myevents{event}, '_', num2str(myfreqs(freq))];
            outputfile = ['/home/mystorage/', mysubjs{s}, '_newdata', variable, '_.mat'];
            
            if ~exist(outputfile, 'file') % if the matfile does not yet exist, add the job to the joblist
               ctr = ctr + 1;
               input{ctr}.cond  = cond;
               input{ctr}.event = event;
               input{ctr}.freq  = freq;
               input{ctr}.subj  = mysubjs{s};
            end
       end
   end
   
   % distribute the jobs, assuming one job requires 2 GB of memory and 1 hour of CPU time
   % note: one should first test with the regular cellfun instead of peercellfun
   peercellfun(@myanalysis, input, 'memreq', 2*(1024^3), 'timreq', 1*3600);
  

Our joblist (i.e. 'input') should now contain 60 jobs (20 different settings * 3 subjects), which are distributed over the computer cluster. Having 'myanalysis' load and save the data itself is memory efficient: none of the output is returned to the MATLAB workspace, but is instead stored on disk.
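
As the comment in the script above suggests, first test the pipeline with the regular cellfun before handing the full joblist to peercellfun. A minimal sketch that executes just the first job on the local machine:

   % sanity check: run the first job locally before distributing
   cellfun(@myanalysis, input(1));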