Compiling MPI Applications Tutorial


To create an example MPI application, we'll use the source code that is located in /common/examples/mpi.


Change into the directory we made previously, and copy the files:



trainXX@trifid:~/class> cp -v /common/examples/mpi/* .



The -v option makes the copy 'verbose', reporting back to the user as each file is copied. The '.' tells cp to copy into the current working directory.

Three files will be copied into your current directory. First, we'll compile the mpi-pong.c example MPI application.



Try:
trainXX@trifid:~/class> mpicc mpi-pong.c -o mpi-pong



This should produce the binary mpi-pong which we can execute, but before we do, we need to work out how to submit it as a job to be processed by the cluster.
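
For orientation, a bare-bones MPI ping-pong program generally looks something like the sketch below. This is illustrative only and is not necessarily the contents of mpi-pong.c; it simply shows the send/receive pattern the example is based on.

/* Minimal MPI ping-pong sketch (illustrative only).
 * Run with at least two processes via mpirun/mpiexec. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, msg = 0;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Rank 0 serves the "ball" to rank 1 and waits for the return. */
        MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        MPI_Recv(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &status);
        printf("Rank 0 got the ball back: %d\n", msg);
    } else if (rank == 1) {
        /* Rank 1 returns the ball, incrementing it on the way. */
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        msg++;
        MPI_Send(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}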

To launch a job on the cluster, you will require a script to specify the parameters of the job. There are a few examples of job scripts for PBS in:



/common/examples/PBS



Generally there are a number of things that will need checking:



Job Name, set with the -N option

Number of CPUs, using the -l nodes=[number of CPUs] option

Wall time of the Job, using the -l walltime=[hrs:min:sec] option

You may also want to check that you are collecting stdout and stderr. Many users keep a pbs-script file in their home directory, set up the way they like it, so that only a couple of things need changing after it is copied to their run directory. A minimal example is sketched below.
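
For reference, a minimal pbs-script might look something like the following sketch. The job name, CPU count and walltime are placeholders to be adjusted, and the exact launcher (mpirun or mpiexec) depends on the MPI build in use.

#!/bin/bash
#PBS -N mpi-pong
#PBS -l nodes=4
#PBS -l walltime=00:10:00
# Merge stderr into stdout so both end up in one output file
#PBS -j oe

# Run from the directory the job was submitted from
cd $PBS_O_WORKDIR

# Launch the MPI binary (mpirun or mpiexec, depending on the installation)
mpiexec ./mpi-pong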



Note that you must deal with both the number of nodes and the number of CPUs required. If you want to use 8 CPUs for your job, you need to specify that in the PBS script. For example:



#PBS -l nodes=8
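
Depending on how the cluster's queues are configured, you may instead need to spell out the nodes and the processors per node separately. For example, this standard PBS syntax requests 2 nodes with 4 CPUs each, for 8 CPUs in total:

#PBS -l nodes=2:ppn=4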



Then, submit the job like this:



trainXX@trifid:~/class> qsub pbs-pong

You will get a response that includes a job number; it's worth noting it down. You can see the overall picture with the showq command. Keep track of what's happening with your job using qstat: by itself it lists all jobs in the queue, with the job number on the command line it lists only that job, and with -f added it prints detailed information about the job. You can also get a detailed report from the scheduler with checkjob. So, supposing the job number is 4567, you would type:



trainXX@trifid:~> checkjob 4567
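
The qstat variants described above look like this:

trainXX@trifid:~/class> qstat
trainXX@trifid:~/class> qstat 4567
trainXX@trifid:~/class> qstat -f 4567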



Try compiling the msum.c source file we grabbed earlier (the -lm flag links in the maths library):



trainXX@trifid:~/class> mpicc msum.c -o msum -lm



Create a PBS script for it, say pbs_msum, based on one we have used before. Perhaps grab a pristine copy from /common/examples/PBS/pbs-script:



trainXX@trifid:~/class> cp /common/examples/PBS/pbs-script pbs_msum



Edit the PBS script to set your walltime, job name, number of CPUs, and the binary we just compiled. When you're happy with it, submit the job to the resource manager:



trainXX@trifid:~/class> qsub pbs_msum

You can see where your program is running by using a number of commands; the most useful are showq and qstat. You'll have to be quick, because the job won't last long!

We need to examine ways of running this job with long and short wall times and with one, two and four CPUs. We will play with the idea of sharing a node with another user or using both CPUs ourselves.

Let's see if we have time to fill in the table below. Try switching between the GCC-compiled build of MVAPICH2 and builds from other compilers (such as PGI or Intel) to compare the speed of our MPI code:

          GNU GCC    PGI    Intel
1 CPU
4 CPUs
8 CPUs
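
On many clusters, switching between compiler builds is done with environment modules. The module names below are hypothetical, so check module avail for the real ones on trifid; recompile and rerun after each swap so the timings in the table reflect the new build:

trainXX@trifid:~/class> module avail mvapich2
trainXX@trifid:~/class> module swap mvapich2-gcc mvapich2-intel
trainXX@trifid:~/class> mpicc mpi-pong.c -o mpi-pong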

Staging files

For jobs which require heavy, continuous I/O, we recommend staging files from your home directory to the compute nodes before processing.

The following lines in your PBS script stage in the file 'inputfile' to /tmp on the first compute node you have been assigned, and then stage out the file 'outputfile' back to your home directory.



#PBS -W stagein=/tmp/inputfile@trifid:$HOME/inputfile

#PBS -W stageout=/tmp/outputfile@trifid:$HOME/outputfile
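
Within the body of the same job script, the program then works on the staged copies in /tmp. A minimal sketch, where 'your-program' and its arguments are placeholders for your own binary:

# Work on the node-local copies staged into /tmp
cd /tmp
./your-program inputfile outputfile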
