Example scripts for multiple serial simulations, OpenMP jobs, and hybrid jobs


Here are examples of job submission files for various cases. You can copy one of these files to your home directory and modify it according to your needs. You will need to adapt it to the server you are using, since not all servers have the same number of cores or the same features.


Multiple serial jobs

When you need to run the same program on multiple datasets, it is best to group those runs into a single job. This is even required on some of our servers (Colosse and Mp2), which allow only one job per node, in order to avoid wasting resources.

To group your simulations efficiently, you ideally want compute times that are roughly the same for each run. You also need to make sure that enough memory is available on a single compute node; take this into account when choosing the number of runs to place on one node. For example, if each simulation needs about 3 GB of memory and a node offers 24 GB, at most 8 simulations fit on that node.

The principle is simple: you launch the tasks in the background using the & character, and the wait command then ensures that the script waits until all background tasks are done before terminating the job. In the example below, we assume that the program prog is located in the folder $HOME/program_dir and that each subdirectory contains different input files, so that each run writes its results to a different output file. We also assume that all necessary modules have already been loaded.


File: multiple_simulations.sh
#!/bin/bash
#PBS -l walltime=30:00:00
#PBS -l nodes=1:ppn=8
 
SRC=$HOME/program_dir
cd $SCRATCH/dir1 ; $SRC/prog > output &
cd $SCRATCH/dir2 ; $SRC/prog > output &
cd $SCRATCH/dir3 ; $SRC/prog > output &
cd $SCRATCH/dir4 ; $SRC/prog > output &
cd $SCRATCH/dir5 ; $SRC/prog > output &
cd $SCRATCH/dir6 ; $SRC/prog > output &
cd $SCRATCH/dir7 ; $SRC/prog > output &
cd $SCRATCH/dir8 ; $SRC/prog > output &
 
wait
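
Once adapted to your server, this script is submitted like any other batch job, for example:

qsub multiple_simulations.sh

(Use msub instead on servers where Moab is the scheduler.)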


Sequential task array

You may submit a batch of serial tasks with bqTools on the servers where it is installed. For the other servers, here is how you can run serial tasks on a larger number of datasets, located for example in directories $SCRATCH/dirX, where X goes from 0 to 255 and where at most 10 tasks run simultaneously.
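
If the dataset directories do not exist yet, they can be prepared beforehand with a simple loop such as the sketch below; the location $HOME/datasets and the file names dataset_X.dat and input.dat are hypothetical and must be adapted to your own data.

for X in $(seq 0 255)
do
  mkdir -p $SCRATCH/dir$X
  # hypothetical input file names; adapt to your own data
  cp $HOME/datasets/dataset_$X.dat $SCRATCH/dir$X/input.dat
done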

First, a Moab example, to be submitted with the msub command. The directive #PBS -t [0-255:8]%10 creates one array task for every eighth index (0, 8, 16, ..., 248) and runs at most 10 of them at the same time; each task then handles its own index and the seven following directories.

File: moab_sequential_taskarray.sh
#!/bin/bash
#PBS -A abc-123-aa
#PBS -l walltime=30:00:00
#PBS -l nodes=1:ppn=8
#PBS -t [0-255:8]%10
 
SRC=$HOME/program_dir
UPPER_BOUND=$(($MOAB_JOBARRAYINDEX + 7))
 
for i in $(seq $MOAB_JOBARRAYINDEX $UPPER_BOUND)
do
  cd $SCRATCH/dir$i ; $SRC/prog > output &
done
 
wait


The same example, with Torque, to be submitted with qsub. Here the array indices run from 0 to 31 and each array task computes the range of eight directories it is responsible for: task 0 handles dir0 to dir7, task 31 handles dir248 to dir255.

File: torque_sequential_taskarray.sh
#!/bin/bash
#PBS -l walltime=30:00:00
#PBS -l nodes=1:ppn=8
#PBS -t 0-31%10
 
SRC=$HOME/program_dir
LOWER_BOUND=$((8 * $PBS_ARRAYID))
UPPER_BOUND=$(($LOWER_BOUND + 7))
 
for i in $(seq $LOWER_BOUND $UPPER_BOUND)
do
  cd $SCRATCH/dir$i ; $SRC/prog > output &
done
 
wait
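
On servers running Torque, the state of the individual tasks of the array can usually be checked with the -t option of qstat, for example:

qstat -t <jobid>

where <jobid> is the job identifier returned by qsub.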



OpenMP job

Because OpenMP jobs only work on shared-memory architectures, they can only run on a single node at a time.

Note: The KMP_* parameters below are only valid for the Intel compiler. For more information, see the OpenMP page.

File: openmp_job.sh
#!/bin/bash
#PBS -l walltime=30:00:00
#PBS -l nodes=1:ppn=8
 
# We use the Intel compiler
module load compilers/intel/12.0.4 
# We define environment variables for this compiler
# - we avoid releasing cores after parallel sections
export KMP_LIBRARY=turnaround
# - we define the stack size for each thread. The default value is 1 MB. Accepted units are b, k, and m for bytes, kB and MB.
export KMP_STACKSIZE=1000m
# - we bind the OpenMP threads on specific cores
export KMP_AFFINITY=compact
 
export OMP_NUM_THREADS=8 
SRC=$HOME/program_dir
cd $SCRATCH/dir
$SRC/prog_openmp
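
For reference, assuming the program is built from a single source file (prog_openmp.c is a hypothetical name here), an OpenMP executable is typically compiled with the Intel compiler's -openmp flag:

module load compilers/intel/12.0.4
# -openmp enables OpenMP support in the Intel 12 compilers
icc -openmp -O2 -o prog_openmp prog_openmp.c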


Hybrid job

Here is an example of a job using MPI with one process per node and OpenMP on all the cores of each node. The example uses the Intel compiler.

File: hybrid_job.sh
#!/bin/bash
#PBS -l walltime=30:00:00
#PBS -l nodes=4:ppn=8
 
# We use the Intel compiler
module load compilers/intel/12.0.4
# We use an MPI library built with this compiler
module load mpi/openmpi/1.4.5_intel
 
# We define the same environment variables as before
export KMP_LIBRARY=turnaround
export KMP_STACKSIZE=1000m
export KMP_AFFINITY=compact
 
export OMP_NUM_THREADS=8
# We do not bind each MPI process to a single core, since doing so would effectively disable OpenMP threading.
export IPATH_NO_CPUAFFINITY=1
SRC=$HOME/program_dir
cd $SCRATCH/dir
mpiexec -n 4 -npernode 1 $SRC/prog_openmp
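
For reference, a hybrid MPI/OpenMP program is typically compiled with the MPI wrapper around the Intel compiler, adding the OpenMP flag; the source file name prog_openmp.c is again a hypothetical example:

module load compilers/intel/12.0.4
module load mpi/openmpi/1.4.5_intel
# mpicc calls the underlying Intel compiler; -openmp enables OpenMP support
mpicc -openmp -O2 -o prog_openmp prog_openmp.c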


Additional server-specific documentation

Briarée

Colosse

Cottos

Guillimin

Hadès

Mammouth parallèle II

Mammouth série II

Psi

