NOTE: This document is under active development and is not yet fully complete.
VPAC has installed MATLAB Distributed Computing Server (DCS) and has purchased 32 Worker licenses.
MATLAB DCS is a version of MATLAB that can run MATLAB tasks as PBS jobs on Tango. These jobs have to be specially prepared by the MATLAB Parallel Computing Toolbox (PCT) installed on your desktop at your institute.
MATLAB PCT will copy the compiled jobs and data across, submit them to the queuing system on Tango, and can poll for them to complete. Once the jobs have finished, MATLAB PCT will retrieve the output and insert the results into your MATLAB session.
To make use of this you need:
Please note that VPAC cannot provide you with a license for MATLAB or the PCT. The license must belong to you or your University, and the software must be installed on your desktop or laptop computer there.
Please note that the instructions in the README file must be followed and the relevant files copied as requested. The respective paths for Linux and Windows clients are:
$matlabroot/toolbox/distcomp/examples/integration/pbs/nonshared/unix/README
$matlabroot\toolbox\distcomp\examples\integration\pbs\nonshared\windows\README
ssh-keygen to create a new key (if you don't already have one) and set a passphrase!
ssh-copy-id user@tango.vpac.org (replacing user with your VPAC user name) to copy your ID to VPAC.
ssh-add to add the newly created key to your existing ssh-agent configuration.
Windows is, sadly, far more complicated than Linux as it does not include the functionality needed by default. If you run into problems here please do drop an email to help@vpac.org for assistance!
First you will need to follow the Windows Pageant tutorial and set up a passphrase protected public/private key pair between your computer and VPAC.
Then you will need to download the programs plink.exe and pscp.exe from the main PuTTY Download Page and put them into your Windows PATH.
PSCP and Plink are command-line applications; you cannot just double-click their icons to run them. For PSCP and Plink to work they need to be on your PATH. To set your PATH on Windows NT, 2000, and XP, use the Environment tab of the System Control Panel.
PATH=C:\path\to\putty\directory;%PATH%
Use of PSCP is virtually identical to SCP, except that a 'p' is prepended to the command name. Thus the general form is:
pscp account@source.address:/path/to/file c:\path\to\destination.txt
Finally, you will need to establish a c:\matlab directory where you can put your data and results, as scp does not work with spaces in path names in this instance. The entire MATLAB Windows toolbox needs to be copied into this directory. e.g.,
cp c:\$matlabroot\toolbox\distcomp\examples\integration\pbs\nonshared\windows\ c:\matlab\windows
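The copy can also be done from inside MATLAB, which avoids relying on a cp command being available on Windows. The following is only a minimal sketch: it assumes the integration scripts are in the standard location under your MATLAB installation and that c:\matlab\windows is the destination used in the rest of this page. The variable names src and dst are purely illustrative.

```matlab
% Source: the Windows PBS non-shared integration scripts shipped with MATLAB
% (assumed to be in the standard location under matlabroot).
src = fullfile(matlabroot, 'toolbox', 'distcomp', 'examples', ...
               'integration', 'pbs', 'nonshared', 'windows');

% Destination: a path without spaces, as used elsewhere on this page.
dst = 'c:\matlab\windows';

% Create the parent directory if it does not exist yet.
if ~exist('c:\matlab', 'dir')
    mkdir('c:\matlab');
end

% Copy the whole folder and stop with a message if the copy fails.
[ok, msg] = copyfile(src, dst);
if ~ok
    error('Copy failed: %s', msg);
end
```

Any changes requested by the README should then be made to the copies in c:\matlab\windows.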
Before using MATLAB PCT you will need to configure the MATLAB path to include the correct set of submit functions for your system. To do this go to the File menu and select the "Set Path" option. Select the "Add Folder" option and then add one of the two options below depending on whether you use Linux or Windows. $MATLAB represents the location where MATLAB is installed on your computer; you will need to navigate to the directory below by hand.
Linux: $MATLAB/toolbox/distcomp/examples/integration/pbs/nonshared/unix
Again, on MS-Windows the task is a little trickier. First the existing path has to be removed, i.e.,
$MATLAB\toolbox\distcomp\examples\integration\pbs\nonshared\windows
Then the new path (see above) has to be added.
c:\matlab\windows
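The same path change can also be made from the MATLAB command line rather than through the Set Path dialog. This is a hedged sketch, not an official procedure: it assumes the integration folders are in the locations given above and, on Windows, that the copy in c:\matlab\windows already exists. The variable names are illustrative only.

```matlab
% Folder holding the PBS non-shared integration scripts shipped with MATLAB
% (assumed standard location under matlabroot).
integrationDir = fullfile(matlabroot, 'toolbox', 'distcomp', 'examples', ...
                          'integration', 'pbs', 'nonshared');

if ispc
    % Windows: remove the shipped folder from the path if it is there,
    % then add the copy kept in c:\matlab\windows.
    shipped = fullfile(integrationDir, 'windows');
    onPath = any(strcmpi(shipped, regexp(path, pathsep, 'split')));
    if onPath
        rmpath(shipped);
    end
    addpath('c:\matlab\windows');
else
    % Linux: add the shipped unix folder directly.
    addpath(fullfile(integrationDir, 'unix'));
end

% Uncomment to keep the change for future MATLAB sessions.
% savepath;
```

Without savepath the change only lasts for the current session, which can be convenient while testing the setup.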
The examples below demonstrate how to use PCT to submit either a set of serial, independent tasks or a larger parallel task.
MATLAB PCT requires some initial configuration of the scheduler to say what the name of the cluster is, where MATLAB is installed on the cluster, where the MATLAB temporary files are to be stored on the cluster, and where they should be created on your PC.
Linux users will need to use ssh-add to unlock their key for this session; Windows users will need to run Pageant and unlock their key.
This code is mostly common to both the examples below and should be run before each of them.
The key differences between the code below (which is Linux-based) and the MS-Windows version include:
%
% This is the name of the cluster to submit the jobs to
%
clusterHost = 'tango.vpac.org';
%
% This is where my temporary files are kept, replace "user" with
% your VPAC username and make sure the directory specified exists!
%
remoteDataLocation = '/home/user/matlab/';
%
% This then tells MATLAB that you are going to use a remote scheduler
%
sched = findResource('scheduler', 'type', 'generic');
%
% This is where your local files are kept
%
set(sched, 'DataLocation', '/home/user/MATLAB')
%
% Please do not change these three settings below!
%
set(sched, 'ClusterMatlabRoot', '/usr/local/matlab/default');
set(sched, 'HasSharedFilesystem', true)
set(sched, 'ClusterOsType', 'unix');

The following code sequence is an example of how to submit a number of individual tasks, each doing independent work, as a single job.
%
% Tell MATLAB to use the correct submit function for this type of job.
%
set(sched, 'SubmitFcn', {@pbsNonSharedSimpleSubmitFcn, clusterHost, remoteDataLocation});
%
% Create an overarching job which will contain all the tasks
%
j = createJob(sched)
%
% Create 4 individual tasks, each returning a 3 X 3 set of random numbers
% between 0 and 1.
%
createTask(j, @rand, 1, {3,3});
createTask(j, @rand, 1, {3,3});
createTask(j, @rand, 1, {3,3});
createTask(j, @rand, 1, {3,3});
%
% Submit the job to Tango (all 4 tasks)
%
submit(j)
%
% Wait for all those tasks to finish.
% NB: This could take some time if you have to wait in the queue!
%
waitForState(j)
%
% Get the results of all the tasks submitted.
%
results = getAllOutputArguments(j);
%
% Pretty print the results.
%
celldisp(results)

This submits a parallel job running on 9 CPUs that returns a series of random numbers.
%
% Tell MATLAB to use the correct function for submitting a parallel job.
%
set(sched, 'ParallelSubmitFcn', {@pbsNonSharedParallelSubmitFcn, clusterHost, remoteDataLocation});
%
% Create the parallel job object
%
pjob = createParallelJob(sched)
%
% Create a parallel task using the rand function; each worker returns
% a 3 X 3 matrix of random numbers.
%
createTask(pjob, 'rand', 1, {3});
%
% Now we specify that it should use 9 CPUs (workers) to run this task on.
% Remember you can't use more than 32 at once!
%
set(pjob,'MinimumNumberOfWorkers',9);
set(pjob,'MaximumNumberOfWorkers',9);
%
% Now we submit that job to run on Tango.
%
submit(pjob)
%
% We wait for it to complete.
% NB: This could take some time if you have to wait in the queue!
%
waitForState(pjob)
%
% Get the results of the job.
%
results = getAllOutputArguments(pjob)
%
% Pretty print the results.
%
celldisp(results);

For more information please see the MATLAB Parallel Computing Toolbox documentation.
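If a job returns empty results it can help to look at the job state and at any error messages recorded on its tasks. The following is a rough sketch rather than a tested recipe for Tango: it assumes j is the job object from the serial example above, already submitted and waited on, and it uses the State, Tasks and ErrorMessage properties from the same findResource/createJob generation of the PCT interface used on this page. Newer MATLAB releases use a different interface.

```matlab
% Assumes "j" is the job object from the serial example above,
% already submitted and waited on.
state = get(j, 'State');
fprintf('Job state: %s\n', state);

% Report any task that stopped with an error.
tasks = get(j, 'Tasks');
for k = 1:numel(tasks)
    msg = get(tasks(k), 'ErrorMessage');
    if ~isempty(msg)
        fprintf('Task %d failed: %s\n', k, msg);
    end
end

% Once the results have been retrieved and are no longer needed,
% the job and its stored data can be removed. This is permanent,
% so it is left commented out here.
% destroy(j);
```

The same checks should apply to the parallel job by substituting pjob for j.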