"R is a system for statistical computation and graphics. It consists of a language plus a run-time environment with graphics, a debugger, access to certain system functions, and the ability to run programs stored in script files."
Even though R was not developed for high performance computing (HPC), its popularity with scientists from a variety of disciplines, including engineering, mathematics, statistics and bioinformatics, makes it an essential tool on HPC installations dedicated to academic research. Features such as C extensions, byte-compiled code and parallelisation allow for reasonable performance in single-node jobs. Thanks to R’s modular nature, users can customize the R functions available to them by installing packages from the Comprehensive R Archive Network (CRAN) into their home directories.
The R interpreter
With R in your environment, you can start the R interpreter and type R code at its prompt:
[name@server $] R
R version 2.14.2 (2012-02-29)
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> values <- c(3,5,7,9)
> values[1]
[1] 3
> q()
To execute R scripts, use the Rscript front-end with the file containing the R commands as an argument:
[name@server $] Rscript computation.R
This front-end automatically passes the scripting-appropriate options --slave and --no-restore to the R interpreter (since R 4.0.0, --slave is named --no-echo). These options also imply --no-save, which prevents the creation of useless workspace files on exit.
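As an illustration of what a script run this way might contain, here is a minimal sketch. The contents of computation.R are not given on this page, so the function name parse_size, its default value and the placeholder computation are all hypothetical. It relies only on base R: commandArgs(trailingOnly = TRUE) returns the arguments placed after the script name on the Rscript command line.

```r
# computation.R: hypothetical contents, for illustration only.

# Convert the first command-line argument to a sample size, falling back
# to a default when it is missing, not a number, or smaller than 1.
parse_size <- function(args, default = 1000L) {
  size <- suppressWarnings(as.integer(args[1]))
  if (is.na(size) || size < 1L) default else size
}

# Arguments that follow the script name, e.g. "Rscript computation.R 5000"
args <- commandArgs(trailingOnly = TRUE)
n <- parse_size(args)

# Placeholder computation: summarize n normally distributed draws
set.seed(42)
draws <- rnorm(n)
cat("n =", n, " mean =", mean(draws), " sd =", sd(draws), "\n")
```

With this sketch, running Rscript computation.R 5000 would use 5000 draws, while running it without arguments would use the default.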
The commands previously described can be used inside jobs submitted to the scheduler. Here is an example of a minimal, single-node, R submit file for a parallel computation, combining the commands seen up to now, and also redirecting standard output and error to files:
Since this submit file will launch one instance of R on one 8-CPU Colosse node, the R script computation.R should be able to perform the bulk of its tasks with multi-threaded, parallel functions that will take advantage of the available computing power. If this is not the case for your R script, you should consider using task arrays or GNU parallel.
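A minimal sketch of such a script follows, assuming the work can be split into independent tasks. It uses mclapply from the parallel package, which has shipped with R since version 2.14.0 and runs tasks in forked processes (multi-process rather than multi-threaded). The task itself (a Monte Carlo estimate of pi), the function name estimate_pi and the choice of 8 workers to match the node described above are illustrative assumptions.

```r
library(parallel)

# Hypothetical CPU-bound task: Monte Carlo estimate of pi from n random points
estimate_pi <- function(n) {
  x <- runif(n)
  y <- runif(n)
  4 * mean(x^2 + y^2 <= 1)
}

# Use at most 8 workers (the node size assumed above) and never more than
# the number of cores R can detect; detectCores() may return NA
detected <- detectCores()
n_workers <- if (is.na(detected)) 1L else min(8L, detected)

# Run 8 independent tasks across the workers
# (mclapply relies on forking, so more than one worker is not available on Windows)
estimates <- mclapply(rep(1e5, 8), estimate_pi, mc.cores = n_workers)

# Combine the per-task results into a single estimate
pi_hat <- mean(unlist(estimates))
cat("Workers:", n_workers, " estimate of pi:", pi_hat, "\n")
```

Each task here is independent of the others, which is what lets mclapply spread them over the cores without any communication between workers.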
There is also a multitude of packages available on CRAN that can be used to distribute R computation. Most of these packages are listed in the CRAN Task View: High-Performance and Parallel Computing with R.
Installing R packages
To install packages from CRAN, you can use the install.packages facility inside the R interpreter. For example, to install the sp package that provides classes and methods for spatial data, use the following command on a login node:
[name@server $] R
[...]
> install.packages("sp")
When asked, select an appropriate mirror for download. Ideally, it will be geographically close to you.
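The mirror prompt can be avoided by naming a repository explicitly. The sketch below is an example under stated assumptions rather than a site-specific recipe: it creates a personal library directory and installs into it. The helper name prepare_user_library and the fallback path ~/R/library are hypothetical; https://cloud.r-project.org is the CRAN address that redirects to a nearby mirror.

```r
# Create a personal library directory and put it first on the search path.
# R_LIBS_USER is the variable R itself consults for the user library;
# the fallback path is a hypothetical choice for when it is unset.
prepare_user_library <- function(lib_dir = Sys.getenv("R_LIBS_USER")) {
  if (!nzchar(lib_dir)) lib_dir <- file.path("~", "R", "library")
  lib_dir <- path.expand(lib_dir)
  dir.create(lib_dir, recursive = TRUE, showWarnings = FALSE)
  .libPaths(c(lib_dir, .libPaths()))
  invisible(lib_dir)
}

lib_dir <- prepare_user_library()

# Guarded so that merely sourcing this sketch does not start a download;
# remove the condition to install unconditionally
if (interactive()) {
  install.packages("sp", lib = lib_dir, repos = "https://cloud.r-project.org")
}
```

Passing lib explicitly makes the destination unambiguous, which helps when the system-wide library is read-only, as is usually the case on a shared cluster.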
Some packages require defining the environment variable TMPDIR before installing.
To install a package that you downloaded yourself (i.e. not from CRAN), run R CMD INSTALL on the package file in a shell. For example, assuming the file is named archive_package.tgz:
[name@server $] R CMD INSTALL archive_package.tgz
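The same installation can also be started from within an R session. This sketch uses install.packages with repos = NULL, which tells R to treat the first argument as a local file rather than a package name to look up in a repository; type = "source" assumes the archive is a source package. The file name is the placeholder used above.

```r
# Placeholder file name from the text; adjust to the real archive
pkg_file <- "archive_package.tgz"

if (file.exists(pkg_file)) {
  # repos = NULL: install from the local file instead of a repository
  install.packages(pkg_file, repos = NULL, type = "source")
} else {
  message("Package archive not found: ", pkg_file)
}
```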