Python

From the Calcul Québec Wiki

Note: This documentation has been tested on Colosse. Some details may vary on other servers.

Description

Python is an interpreted programming language with a design philosophy stressing the readability of code. Its syntax is simple and expressive. Python has an extensive, easy-to-use library of standard modules.

The capabilities of Python can be extended with modules developed by third parties. In general, to simplify operations, it is left up to individual users and groups to install these third-party modules in their own directories. However, most systems offer several versions of Python as well as tools to help you install the third-party modules that you need.

The following sections discuss the Python interpreter, and how to install and use modules.

Loading an interpreter

To discover the versions of Python available:

[name@server $] module avail apps/python


You can then load the version of your choice using module load. For example, to load Python 2.7.10 on Colosse:

[name@server $] module load apps/python/2.7.10
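
To confirm that the module you loaded is the interpreter actually being run, a short script along these lines may help. It is only a sketch: the function names interpreter_summary and matches_expected are illustrative and not part of any Calcul Québec tool.

```python
import platform
import sys


def interpreter_summary():
    """Return the version and path of the interpreter running this script."""
    return {
        "version": platform.python_version(),
        "executable": sys.executable,
    }


def matches_expected(expected_version):
    """Return True when the running interpreter has exactly this version."""
    return platform.python_version() == expected_version


if __name__ == "__main__":
    info = interpreter_summary()
    print("Python {0} at {1}".format(info["version"], info["executable"]))
    # "2.7.10" is the version loaded in the example above; adjust as needed.
    print("Matches 2.7.10: {0}".format(matches_expected("2.7.10")))
```

Run it with the python command right after module load; if the reported path is not under the module's installation directory, another Python is probably earlier in your PATH.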


Creating and using a virtual environment

With each version of Python, we provide the tool virtualenv. This tool allows you to create virtual environments within which you can easily install Python modules. These environments make it possible, for example, to install several versions of the same module, or to compartmentalize a Python installation according to the needs of a specific project.

To create a virtual environment, enter the following command, where ENV is the name of the directory containing your environment:

[name@server $] virtualenv ENV


Note that since Python 3.4.0, virtualenv is integrated with Python under the name pyvenv. With these versions, use the following command instead:

[name@server $] pyvenv ENV


Once the virtual environment has been created, it must be activated:

[name@server $] source ENV/bin/activate
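
If you are unsure whether the activation worked, the interpreter itself can tell you. The following is a minimal sketch (the function name in_virtual_environment is illustrative) that relies on two standard attributes: sys.real_prefix, set by the classic virtualenv tool, and sys.base_prefix, which differs from sys.prefix inside environments created by the built-in tool on Python 3.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a virtual environment."""
    # The classic virtualenv tool records the original prefix in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # With the built-in tool (Python 3), sys.prefix points at the environment
    # while sys.base_prefix still points at the original installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Interpreter: {0}".format(sys.executable))
    print("Inside a virtual environment: {0}".format(in_virtual_environment()))
```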


To exit the virtual environment, simply enter the command deactivate:

[name@server $]  deactivate


Installing modules

Once you have activated a virtual environment, you will be able to run the pip command. This command takes care of compiling and installing most Python modules and their dependencies.

All of pip's commands are explained in detail in the pip user guide. We will cover only the most important commands and use the Numpy module as an example.

We first load the Python interpreter:

[name@server $] module load apps/python/2.7.10


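Next, we create a virtual environment named exp1:

[name@server $] virtualenv exp1


Note that since Python 3.4.0, virtualenv is integrated with Python under the name pyvenv. With these versions, use the following command instead:

[name@server $] pyvenv exp1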


Then, we activate the virtual environment:

[name@server $] source exp1/bin/activate


Finally, we install the latest stable version of Numpy:

[name@server $] pip install numpy


To install the development version of Numpy instead, we can give pip a link to its Git repository (GitHub no longer serves the unauthenticated git:// protocol, so the HTTPS form is used here):

[name@server $] pip install git+https://github.com/numpy/numpy.git


Installing Numpy with MKL bindings (on Colosse and Helios)

On Colosse and Helios, a simple

[name@server $] pip install numpy


will automatically install a version of numpy that uses MKL.
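
To see which BLAS and LAPACK libraries an installed Numpy was built against, you can ask Numpy itself with numpy.show_config(). The sketch below wraps that call so that it also behaves sensibly when Numpy is not installed in the active environment; the function name numpy_build_report is illustrative.

```python
import importlib


def numpy_build_report():
    """Print NumPy's build configuration and return its version string.

    Returns None when NumPy is not installed in the current environment.
    """
    try:
        numpy = importlib.import_module("numpy")
    except ImportError:
        return None
    # show_config() prints the libraries NumPy was linked with; for an MKL
    # build, the listing is expected to mention "mkl".
    numpy.show_config()
    return numpy.__version__


if __name__ == "__main__":
    version = numpy_build_report()
    if version is None:
        print("NumPy is not installed in this environment")
    else:
        print("NumPy version: {0}".format(version))
```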

Installing Numpy with MKL bindings (on other servers)

MKL contains a highly optimized implementation of BLAS and LAPACK that is developed and maintained by Intel. It is preferable to use MKL rather than a generic version of BLAS such as the one provided by Numpy. Installing Numpy with MKL bindings is straightforward. First, load the appropriate MKL module for the server you are using. For example:

[name@server $] module load libs/mkl


Then, you need to create the file .numpy-site.cfg in your home directory. This file can be created by copying the following lines:

[name@server $] cat > ~/.numpy-site.cfg << EOF
[mkl]
library_dirs = $MKLROOT/lib/intel64
include_dirs = $MKLROOT/include
mkl_libs = mkl_rt
lapack_libs =
EOF
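
Because the here-document expands $MKLROOT when the file is written, an unloaded MKL module silently produces paths such as /lib/intel64. A quick check of the generated file can catch this before a long build. The following is a sketch assuming Python 3 (it uses the configparser module); the function name check_mkl_config is illustrative.

```python
import configparser
import os


def check_mkl_config(path):
    """Check that the directories listed in the [mkl] section exist.

    Returns a dict {option: True/False}, or None when the file cannot be
    read or has no [mkl] section.
    """
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path) or not parser.has_section("mkl"):
        return None
    report = {}
    for option in ("library_dirs", "include_dirs"):
        value = parser.get("mkl", option, fallback="")
        # Several directories may be listed, separated by os.pathsep.
        dirs = [d.strip() for d in value.split(os.pathsep) if d.strip()]
        report[option] = bool(dirs) and all(os.path.isdir(d) for d in dirs)
    return report


if __name__ == "__main__":
    config_path = os.path.expanduser("~/.numpy-site.cfg")
    print(check_mkl_config(config_path))
```

A value of False for either option suggests that MKLROOT was empty or wrong when the file was created; reload the MKL module and create the file again.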


Then, simply run, as instructed above:

[name@server $] virtualenv exp1
[name@server $] source exp1/bin/activate


With GCC

[name@server $] pip install numpy


WARNING: The installation instructions depend on which compiler is used. By default, on Colosse, the compiler is Intel. Follow the instructions from the section With Intel.

With Intel[1]

You must first download the Numpy source code, since you will have to edit it. Once the source has been downloaded and extracted, move to the source directory and run the following command:

[name@server $] sed -i "s/self.cc_exe =.*/self.cc_exe = 'icc -O3 -g -fPIC -fp-model strict -fomit-frame-pointer -openmp -xhost'/g" numpy/distutils/intelccompiler.py


Finally, to compile and install, run the following command:

[name@server $] python setup.py config --compiler=intelem build_clib --compiler=intelem build_ext --compiler=intelem install


Installing PyCuda

To install PyCuda, you must first install Numpy. Follow the instructions from the section above to do so. Then, in the same virtual environment, run the command:

[name@server $] pip install pycuda


Profiling code

It is very simple to profile your code with Python, using the -m cProfile command line option:

[name@server $] python -m cProfile script.py 
 885812 function calls (883641 primitive calls) in 38.299 CPU seconds
   Ordered by: standard name
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   38.299   38.299 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 <string>:1(ParseResult)
        1    0.000    0.000    0.000    0.000 <string>:1(SplitResult)
        1    0.000    0.000    0.000    0.000 ConfigParser.py:106(Error)
        1    0.000    0.000    0.000    0.000 ConfigParser.py:133(NoSectionError)
        1    0.000    0.000    0.000    0.000 ConfigParser.py:140(DuplicateSectionError)
        1    0.000    0.000    0.000    0.000 ConfigParser.py:147(NoOptionError)
...


By default, the profile lists the functions in alphabetical order. You may change this order with the -s command line argument. For example, to sort them by time spent within the function, add -s time:

[name@server $] python -m cProfile -s time script.py 
 885206 function calls (883035 primitive calls) in 37.897 CPU seconds
   Ordered by: internal time
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1262   34.225    0.027   34.225    0.027 {time.sleep}
     1750    2.627    0.002    2.627    0.002 {select.select}
      335    0.167    0.000    0.167    0.000 {posix.forkpty}
   524253    0.110    0.000    0.110    0.000 Module.py:18(<lambda>)
     4745    0.099    0.000    0.209    0.000 {filter}
    47784    0.077    0.000    0.077    0.000 {built-in method sub}
        1    0.040    0.040    0.056    0.056 pexpect.py:64(<module>)
...


The possible sort options are listed in the documentation of the Python profilers (see the sort_stats method of pstats.Stats).
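
The same profiler can also be driven from inside a program, which is convenient when only one function is of interest. The sketch below assumes Python 3 (it collects the report in an io.StringIO buffer); slow_sum and profile_call are illustrative names, not part of the standard library.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """A deliberately simple workload to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # "time" sorts by internal time, like "-s time" on the command line;
    # print_stats(5) limits the listing to the first five entries.
    stats.sort_stats("time").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    value, report = profile_call(slow_sum, 200000)
    print(report)
```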

Going parallel with SCOOP

Some packages make it easier to parallelize Python code. One of these open-source projects is Scalable Concurrent Operations in Python (SCOOP). To use it, install it within your environment:

[name@server $] pip install pyzmq
[name@server $] pip install scoop


You will then be able to use SCOOP functions in order to easily parallelize your program.
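
As an illustration, here is a minimal sketch of a Monte Carlo estimate of pi that distributes its work with futures.map from SCOOP. It is not the examples/piCalc.py script referenced on this page; the names count_hits and estimate_pi are illustrative, and the fallback to the built-in map is an addition so that the sketch still runs serially where SCOOP is not installed.

```python
import random

try:
    # futures.map distributes the calls over the workers started by SCOOP.
    from scoop import futures
    parallel_map = futures.map
except ImportError:
    # Assumption for this sketch: fall back to a serial map without SCOOP.
    parallel_map = map


def count_hits(task):
    """Count random points of the unit square that fall in the quarter disc."""
    seed, samples = task
    rng = random.Random(seed)  # one independent generator per task
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(tasks=8, samples_per_task=20000):
    """Estimate pi from the fraction of points inside the quarter disc."""
    work = [(seed, samples_per_task) for seed in range(tasks)]
    hits = sum(parallel_map(count_hits, work))
    return 4.0 * hits / (tasks * samples_per_task)


if __name__ == "__main__":
    print("pi is roughly {0}".format(estimate_pi()))
```

The worker function is defined at module level and the entry point is guarded by the __name__ test because the SCOOP workers import the module themselves. To run the script in parallel, start it with python -m scoop followed by the script name.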

To use more than one node in your parallel computation, you need to generate SSH keys.

To start software developed with SCOOP, you may use a submit file such as the one below.

File: submit.tmpl.sh
#!/bin/bash
 
#PBS -S /bin/bash
#PBS -N TASK
#PBS -A xxx-yyy-zz (your project number)
#PBS -l nodes=2:ppn=8
#PBS -l walltime=300
 
python -m scoop examples/piCalc.py


References

  1. https://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl
