MVAPICH2


Description

MVAPICH2 (pronounced em-vah-pich) is a high-performance MPI implementation. The library is based on MPICH, whose name combines MPI with chameleon, a reference to how portable that implementation is. The MPICH variant adapted to the VAPI InfiniBand networking interface was given the name MVAPICH; note, however, that recent versions of MVAPICH are not restricted to the VAPI interface. The first version of MVAPICH implemented the first MPI standard, and the 2 was added when the library adopted the second MPI standard. The most recent version of MVAPICH2 (1.9) implements many elements of the MPI-3 standard. Over the years, the library's name has therefore become a poor description of what it actually is.

The most recent version of the MVAPICH2 library and its documentation are available online at The Ohio State University.

Advantages and disadvantages

MVAPICH2 was designed for InfiniBand networks and is among the best-performing MPI implementations. Compared with Open MPI, MVAPICH2 has fewer parameters that can only be set at run time. The most recent versions of MVAPICH2 support more features of version 3 of the MPI standard.

Compiler wrappers

Like other MPI implementations, MVAPICH2 comes with compiler wrappers that make it easier to compile code. To compile C, use mpicc; for C++, use mpicxx; and for Fortran, use mpif90.
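For example, here is a minimal MPI program (a hypothetical hello-mpi.c, shown only for illustration) and the wrapper command that could be used to compile it:

File : hello-mpi.c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    /* Initialize the MPI environment */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    printf("Hello from process %d of %d\n", rank, size);

    /* Clean up the MPI environment */
    MPI_Finalize();
    return 0;
}

[name@server $] mpicc -o hello-mpi hello-mpi.c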

These commands accept a few additional options that are useful when you run into problems at compile time.

Option Description
-show Show all the commands that would be run (without actually running them)
-compile_info Show the command used to compile the program
-link_info Show the command used to link the program
-cc={compiler} Use a different compiler (use with caution)
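For example, to see which underlying compiler command mpicc would run for the hypothetical hello-mpi.c above, without actually compiling anything:

[name@server $] mpicc -show -o hello-mpi hello-mpi.c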

Configuring SSH keys

The mpirun and mpiexec commands that are part of MVAPICH2 use SSH to launch MPI processes. Depending on the server, it may therefore be necessary to generate SSH keys in your account before using them. The version of mpiexec from the OSC, documented below, does not have this requirement.
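If key generation is needed, a typical sequence looks like the following; this is only a sketch, and the procedure recommended for your server may differ (see the page on generating SSH keys):

[name@server $] ssh-keygen -t rsa
[name@server $] cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys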

Execution with the default mpiexec

Unlike mpirun, which is documented below, MVAPICH2's mpiexec command does not require you to start MPD daemons. Execution simply takes place via the following command:

[name@server $] mpiexec hello-mpi


Equivalent options for Open MPI and MVAPICH2

For those used to Open MPI, here are some equivalent options in MVAPICH2.

Open MPI MVAPICH2
mpicc -showme mpicc -show
mpiexec --report-bindings ... export MV2_SHOW_CPU_BINDING=1
mpiexec --output-filename out.txt mpiexec -outfile-pattern out.txt.%g.%r
mpiexec --bind-to-socket mpiexec -bind-to-socket
mpiexec -npersocket 1 mpiexec -ppn 2 (for a node with two sockets)
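For example, to obtain the equivalent of Open MPI's --report-bindings with MVAPICH2, you could set the variable from the table above before launching the (hypothetical) hello-mpi program:

[name@server $] export MV2_SHOW_CPU_BINDING=1
[name@server $] mpiexec -n 4 ./hello-mpi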

Execution with the default mpirun

If you use Briarée or Guillimin, you can skip the following section, since you are probably using OSC's mpiexec command. If you are not sure, run the following command to find out:

[name@server $] mpiexec -version


If you see the word HYDRA at the beginning, you are using the default version; if not, you are using the OSC version.

MPD daemons

MVAPICH2, much like MPICH2, uses daemons to manage processes on each node (one daemon per node). These daemons are called MPD (MPI Process Manager Daemons). You must therefore start these daemons once before running a job. Your job submission script should contain instructions similar to these:

File : pbs.sh
#PBS -l walltime=1:00:00
#PBS -j oe
#PBS -N myJob
cd $PBS_O_WORKDIR
 
# defines the number of processes per node
ppn=6
 
# Adjust the number of threads (if hybrid)
# Replace the number 24 with the number of cores per node on your system
export OMP_NUM_THREADS=$((24/ppn))
 
# Start MVAPICH daemons
mpdboot -n $PBS_NUM_NODES -f $PBS_NODEFILE > /dev/null 2>&1
 
# Start the execution
mpiexec -f $PBS_NODEFILE -n $((PBS_NUM_NODES*ppn)) -ppn $ppn -env OMP_NUM_THREADS $OMP_NUM_THREADS ./myApplication


You do not need to specify any options; by default, the number of processes is equal to the number of available cores. The most commonly used options are given in the following table:

Option Description
-n number Specify the total number of processes
-f file Specify the name of the file containing the list of nodes to use
-env variable value Set an environment variable to propagate to the MPI processes
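For example, the following (hypothetical) command combines these options to launch 12 processes on the nodes listed in $PBS_NODEFILE while propagating OMP_NUM_THREADS:

[name@server $] mpiexec -f $PBS_NODEFILE -n 12 -env OMP_NUM_THREADS 2 ./myApplication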

Distribution of processes

Contrary to Open MPI, MVAPICH2 by default distributes processes to neighbouring cores. This distribution gives good performance for pure MPI codes, but less so for hybrid codes: if you are not careful, some nodes will do all the work while others have nothing to do. There is, however, an option equivalent to Open MPI's -npernode, called -ppn (processes per node). For example, to run 16 processes on two nodes with eight processes per node, use this command:

[name@server $] mpiexec -n 16 -ppn 8 ./my_application


The distribution of processes can also be controlled using environment variables. The variable MV2_CPU_BINDING_POLICY can be set to the values bunch or scatter. The bunch policy corresponds to the default distribution, whereas the scatter policy corresponds to the distribution used by Open MPI. To use the latter, you can run the following command:

[name@server $] mpirun -n 16 -env MV2_CPU_BINDING_POLICY scatter ./my_application


Another environment variable, MV2_CPU_MAPPING, offers a more precise way of specifying the distribution of processes and can improve performance for hybrid codes. It lets you pin each process to specific cores. For example, if you have a node with 12 cores and you want process 0 to use cores 0 to 2, process 1 to use cores 3 to 5, and so on, use the following command:

[name@server $] mpirun -n 4 -env MV2_CPU_MAPPING 0,1,2:3,4,5:6,7,8:9,10,11 ./my_application
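For a hybrid code, the same mapping can be combined with a thread count per process. The following sketch assumes three OpenMP threads per MPI process on a hypothetical 12-core node:

[name@server $] mpirun -n 4 -env OMP_NUM_THREADS 3 -env MV2_CPU_MAPPING 0,1,2:3,4,5:6,7,8:9,10,11 ./my_application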


Execution with OSC mpiexec

On certain servers, presently Briarée and Guillimin, the mpiexec that is used is not the default one, but the one provided by OSC. Using this version of mpiexec is very similar to using Open MPI. It is adapted to systems that use PBS as the batch system; for example, it is not necessary to pass it the list of nodes reserved for the job.

The main options are:

Option Description
-n number Specify the total number of processes
-verbose Display verbose information about the operations performed by mpiexec
-npernode number Number of processes per node
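For example, within a PBS job, a (hypothetical) invocation requesting eight processes per node could look like this:

[name@server $] mpiexec -npernode 8 ./my_application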