OpenMP

A shared-memory job is a single process that takes advantage of a multi-core processor and its shared memory to achieve parallelization.

This example shows how to submit an OpenMP program compiled in the section Compiling OpenMP Programs.

When running OpenMP programs, all threads must be on the same compute node to take advantage of shared memory. The threads cannot communicate between nodes.

Set OMP_NUM_THREADS

To run an OpenMP program, set the environment variable OMP_NUM_THREADS to the desired number of threads.

In csh:

setenv OMP_NUM_THREADS 16

In bash:

export OMP_NUM_THREADS=16

This value should almost always equal the number of cores available to the job on the compute node. Choose a smaller value if you are running several processes in parallel in a single job or node, so that the total number of threads does not exceed the available cores.
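
Inside a job, the thread count can also be derived from the Slurm allocation rather than hard-coded. The following is a minimal sketch, not a required method: it assumes the standard Slurm environment variables SLURM_CPUS_PER_TASK (set only when --cpus-per-task is requested) and SLURM_NTASKS are present in the job environment, and the helper name pick_omp_threads is hypothetical.

```shell
# Hypothetical helper: choose a thread count from the Slurm allocation.
# Prefers SLURM_CPUS_PER_TASK, then SLURM_NTASKS, and falls back to 1
# when neither is set (for example, outside of a job).
pick_omp_threads() {
    echo "${SLURM_CPUS_PER_TASK:-${SLURM_NTASKS:-1}}"
}

export OMP_NUM_THREADS="$(pick_omp_threads)"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

With a request of --ntasks=16 and no --cpus-per-task, this would be expected to select 16 threads.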

Example Job Submission File

Create a job submission file named omp_hello.sub:

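#!/bin/bash
# FILENAME:  omp_hello.sub
# Request 16 cores on a single node so that all threads share memory.
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --gpus-per-node=1
#SBATCH --time=00:01:00

# Run one OpenMP thread per requested core.
export OMP_NUM_THREADS=16
./omp_hello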

Submit the Job

Submit the job:

sbatch omp_hello.sub
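
To locate the output file later, it can help to record the job ID at submission time. The sketch below assumes sbatch prints its usual "Submitted batch job <id>" line on success. The helper name jobid_from_sbatch is hypothetical, and the guard only keeps the snippet harmless on a machine without Slurm.

```shell
# Hypothetical helper: extract the job ID (the last field) from
# sbatch's "Submitted batch job <id>" message read on standard input.
jobid_from_sbatch() {
    awk '{ print $NF }'
}

# Submit only where sbatch is available (that is, on the cluster).
if command -v sbatch >/dev/null 2>&1; then
    jobid=$(sbatch omp_hello.sub | jobid_from_sbatch)
    echo "Job ID: $jobid"
    echo "Output file: slurm-${jobid}.out"   # Slurm's default output file name
fi
```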

View Results

View the results from one of the sample OpenMP programs about task parallelism. By default, Slurm writes the job's output to slurm-myjobid.out, where myjobid is the job's ID:

cat slurm-myjobid.out

Example output:

SERIAL REGION:     Runhost:gilbreth-a003.rcac.purdue.edu   Thread:0 of 1 thread    hello, world
PARALLEL REGION:   Runhost:gilbreth-a003.rcac.purdue.edu   Thread:0 of 16 threads   hello, world
PARALLEL REGION:   Runhost:gilbreth-a003.rcac.purdue.edu   Thread:1 of 16 threads   hello, world
   ...

If the job failed to run, view error messages in the file slurm-myjobid.out.

If an OpenMP program uses a lot of memory and 16 threads exhaust the memory of the compute node, run fewer OpenMP threads (that is, use fewer processor cores) on that compute node.
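
One way to choose a reduced thread count is to estimate how many threads fit in memory and cap that at the number of cores requested. This is only a rough sketch: it assumes memory use grows roughly in proportion to the number of threads, the helper name cap_threads is hypothetical, and the figures (64 GB available, about 6 GB per thread, 16 cores) are illustrative, not properties of any particular node.

```shell
# Hypothetical helper: largest thread count that fits in memory,
# capped at the number of cores requested and never below 1.
# Arguments: memory available (GB), estimated memory per thread (GB), cores requested.
cap_threads() {
    local threads=$(( $1 / $2 ))
    if [ "$threads" -gt "$3" ]; then threads=$3; fi
    if [ "$threads" -lt 1 ]; then threads=1; fi
    echo "$threads"
}

# Assumed figures for illustration only: 64 GB, about 6 GB per thread, 16 cores.
export OMP_NUM_THREADS="$(cap_threads 64 6 16)"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

With these assumed figures the sketch selects 10 threads, leaving the remaining requested cores idle.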
