GPU Jobs
The Scholar cluster nodes contain NVIDIA GPUs that support CUDA and OpenCL. See the detailed hardware overview for the specifics on the GPUs in Scholar.
This section illustrates how to use SLURM to submit a simple GPU program.
Suppose that you compiled the sample code gpu_hello.cu into an executable named gpu_hello (see the section on compiling NVIDIA GPU codes). Prepare a job submission file with an appropriate name, here gpu_hello.sub.
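A minimal sketch of such a submission file, written from the shell with a here-document. It assumes that CUDA is provided through a module named cuda, that the scheduler exports CUDA_VISIBLE_DEVICES for allocated GPUs, and that gpu_hello takes no arguments. None of these is confirmed by this section, so adjust them to your setup.

```shell
# Write the job submission file. The quoted 'EOF' keeps $CUDA_VISIBLE_DEVICES
# literal so that it is expanded when the job runs, not when the file is written.
cat > gpu_hello.sub <<'EOF'
#!/bin/bash
# FILENAME: gpu_hello.sub

# Assumption: the CUDA toolkit is available as a module named "cuda".
module load cuda

# Assumption: the scheduler sets this variable to the GPU(s) assigned to the job.
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"

# Run the executable built from gpu_hello.cu (assumed to take no arguments).
./gpu_hello
EOF
```

The file does not need to be executable, because sbatch reads the script rather than running it directly.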
Submit the job to the scheduler with sbatch.
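A submission could look like the sketch below. The account name myaccount is a placeholder, and the one-node, one-GPU, one-minute request is only an illustrative choice. The guard around sbatch lets the snippet run harmlessly on a machine without SLURM.

```shell
# Placeholder account name (an assumption); use the account you were given.
ACCOUNT="myaccount"

# One node, one GPU on that node, one minute of wall time (illustrative values).
SBATCH_OPTS="-A $ACCOUNT --nodes=1 --gpus-per-node=1 --time=00:01:00"

if command -v sbatch >/dev/null 2>&1; then
    # $SBATCH_OPTS is left unquoted on purpose so it splits into separate options.
    sbatch $SBATCH_OPTS gpu_hello.sub
else
    echo "sbatch not found; run this on a cluster front-end:"
    echo "sbatch $SBATCH_OPTS gpu_hello.sub"
fi
```

On success, sbatch prints the ID of the new job. That ID appears in the name of the output file.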
Requesting a GPU from the scheduler is required. You can specify the total number of GPUs, the number of GPUs per node, or the number of GPUs per task.
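The three forms correspond to standard sbatch options, shown here as #SBATCH directives that could be placed near the top of a submission file instead of on the command line. The counts are illustrative, and only one of the three forms should be kept in a given file.

```shell
# Option A: total number of GPUs for the whole job.
#SBATCH --gpus=1

# Option B: number of GPUs on each allocated node.
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1

# Option C: number of GPUs for each task (needs a task count as well).
#SBATCH --ntasks=1
#SBATCH --gpus-per-task=1
```

The #SBATCH lines must come before the first executable command in the script. The same options can be given directly to sbatch when submitting.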
After the job completes, look for the new output file in your directory.
All standard output from the job is written to the file slurm-myjobid.out, where myjobid is the ID of your job. View the results in that file.
If the job failed to run, then view error messages in the file slurm-myjobid.out.
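As a small convenience, the following sketch finds the most recent slurm-*.out file in the current directory and prints it. It assumes the default output file naming described above and that you are in the directory from which the job was submitted.

```shell
# Pick the newest SLURM output file in the current directory, if there is one.
latest=$(ls -t slurm-*.out 2>/dev/null | head -n 1)

if [ -n "$latest" ]; then
    echo "== $latest =="
    # Results and error messages both end up in this file.
    cat "$latest"
else
    echo "No slurm-*.out file here yet; the job may still be queued or running."
fi
```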
To use multiple GPUs in your job, specify a larger value for the GPU request option. However, be aware of the number of GPUs installed on the node(s) you are requesting. The scheduler cannot allocate more GPUs than physically exist. See the detailed hardware overview and the output of the sfeatures command for the specifics on the GPUs in Scholar.
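When requesting more than one GPU, it can help to confirm from inside the job how many were actually assigned. The sketch below counts the entries in CUDA_VISIBLE_DEVICES. It assumes the scheduler sets that variable to a comma-separated list, which depends on how the cluster is configured. The helper name count_gpus is hypothetical.

```shell
# count_gpus LIST: print the number of comma-separated entries in LIST (0 if empty).
count_gpus() {
    if [ -z "$1" ]; then
        echo 0
    else
        printf '%s\n' "$1" | awk -F',' '{ print NF }'
    fi
}

# Inside a job this reports the GPUs assigned to it; outside a job it reports 0.
echo "GPUs visible to this job: $(count_gpus "${CUDA_VISIBLE_DEVICES:-}")"
```

Placed in the submission file before the program runs, this line makes a mismatch between the requested and the assigned GPU count visible in the job output.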