The cluster has been configured to allow CUDA/OpenCL applications to be compiled on the head node (which has no GPUs) for later execution on a GPU node. This tutorial gives the steps to build and test the example applications provided with the NVIDIA SDK. It is written against CUDA Toolkit 3.1 but notes how to use other versions of the toolkit.
- Log on to the head node.
- Set up the environment with the module command.
- Obtain the SDK from NVIDIA.
- Install the SDK.
- Build the CUDA "C" examples.
- Build the OpenCL examples.
- Test the CUDA/OpenCL applications.
You can see which CUDA modules are available with the following command:
module avail cuda
Load the required module (e.g. for version 3.1):
module load cuda/3.1
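After loading a module, a quick sanity check can confirm that the toolkit is visible in the current shell. The sketch below assumes the cuda module places the nvcc compiler on PATH, which is not stated above; check_tool is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found"
    fi
}

# Assumption: loading the cuda module adds nvcc to PATH.
check_tool nvcc
```

If nvcc is reported as not found, the module may not have been loaded in this shell.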
The SDK can be found through the download links on the NVIDIA GPU computing website: http://developer.nvidia.com/object/gpucomputing.html
For example, the 3.1 "GPU Computing SDK code samples" can be found in the Linux section of the CUDA 3.1 download page.
You can download this SDK using the following command:
wget http://developer.download.nvidia.com/compute/cuda/3_1/sdk/gpucomputingsdk_3.1_linux.run
Make the downloaded file executable:
chmod +x gpucomputingsdk_3.1_linux.run
Then run the installer to install the SDK in your home directory (referred to as "~/").
By default, this installs the SDK into ~/NVIDIA_GPU_Computing_SDK and locates the CUDA version specified in step 2 (e.g. /usr/local/cuda/3.1/cuda).
To verify, the following executable should exist and produce "cudaGetDeviceCount FAILED" on the head node (with no GPU) and "There are 2 devices supporting CUDA" on a node with GPUs.
The SDK requires the correct version of libOpenCL.so to be placed in
The OpenCL library for a particular version is found under the location:
To copy the library and build the examples, execute the following commands:
cp /usr/local/cuda/3.1/cuda/lib64/libOpenCL.so ~/NVIDIA_GPU_Computing_SDK/OpenCL/common/lib
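When working with a toolkit version other than 3.1, the copy step can be parameterised. The following is a minimal sketch, assuming other versions follow the same /usr/local/cuda/<version>/cuda layout shown for 3.1 (this is an assumption, not confirmed above); opencl_lib_path is a hypothetical helper name.

```shell
# Hypothetical helper: build the libOpenCL.so path for a toolkit version.
# Assumption: every version uses the /usr/local/cuda/<version>/cuda layout
# that the tutorial shows for 3.1.
opencl_lib_path() {
    echo "/usr/local/cuda/$1/cuda/lib64/libOpenCL.so"
}

CUDA_VERSION=3.1
SRC=$(opencl_lib_path "$CUDA_VERSION")
DEST="$HOME/NVIDIA_GPU_Computing_SDK/OpenCL/common/lib"

# Copy only when both the library and the SDK directory are present.
if [ -f "$SRC" ] && [ -d "$DEST" ]; then
    cp "$SRC" "$DEST" && echo "copied $SRC to $DEST"
else
    echo "skipped: $SRC or $DEST not present"
fi
```

Set CUDA_VERSION to match the module loaded earlier, so the library copied into the SDK matches the toolkit in use.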
To verify, the following executable should exist and produce "Error -1001 in clGetPlatformIDs Call !!!" on the head node (with no GPU) and "2 devices found supporting OpenCL" on a node with GPUs.
Executing any CUDA or OpenCL code requires a GPU-enabled node. You can access one interactively with the command "qsub -q gpu -I", placing any other options you need before the "-I". The -I requests an interactive session.
For example, to request a GPU node for 1 hour, use:
qsub -q gpu -l walltime=1:0:0 -I
If you run on a node without a GPU, the application will report that no CUDA device was found.
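To illustrate where extra options go relative to "-I", the sketch below assembles the request as a string and prints it rather than submitting anything. gpu_qsub_cmd is a hypothetical helper, and the -N option (job name) is used only as an example of an additional qsub option.

```shell
# Hypothetical helper: print (not run) the interactive GPU request for a
# given number of hours, with any extra qsub options placed before -I.
gpu_qsub_cmd() {
    hours=$1
    shift
    # "$*" joins any remaining arguments with spaces; empty if there are none.
    extra="$*"
    echo "qsub -q gpu -l walltime=${hours}:0:0${extra:+ $extra} -I"
}

gpu_qsub_cmd 1            # prints: qsub -q gpu -l walltime=1:0:0 -I
gpu_qsub_cmd 2 -N mytest  # prints: qsub -q gpu -l walltime=2:0:0 -N mytest -I
```

Printing the command first makes it easy to review the options before running it by hand on the head node.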