We have introduced a new software environment on Snellius and Lisa based on EasyBuild. It provides several improvements over the current environment:

  • Automated software installations.
  • Software packages and all their dependencies are built with a consistent set of compilers & libraries (known as toolchains).
  • Modules automatically load all their dependencies. The compiler used to build the module and its dependencies also gets loaded.
  • Tab completion for the module command.
  • The same environment on Cartesius and Lisa. Note that not all modules are installed on both systems.
  • Software is reinstalled when the OS is upgraded. This allows Lisa and Cartesius to have a mix of OS versions on the nodes, with the correct software build loaded automatically.
  • Better optimization, including support for the Knights Landing processors.
  • A convenient way for you to install complex software locally (EasyBuild supports over 1200 packages).

You'll notice many new modules. Some of the typical HPC applications now available are:

  • CP2K (with PLUMED)
  • GROMACS (with PLUMED)
  • Siesta

The available libraries include:

  • OpenFOAM
  • Boost
  • GDAL
  • ParMETIS
  • HDF5
  • PETSc

EasyBuild toolchains

Toolchains are one of the main concepts used by EasyBuild. An EasyBuild toolchain is a collection of compilers and (potentially) commonly used HPC libraries (e.g. MPI, BLAS). Toolchains allow software packages and all their dependencies to be built with a consistent set of compilers & libraries, avoiding any potential issues that may occur when linking libraries that are built with mixed compiler versions.

We use a number of main toolchains to build software: GCC(core), foss, and intel. The toolchains are organized in a hierarchical fashion and contain the following compilers/libraries:

  • GCCcore: the GNU compilers (gcc/g++/gfortran)
  • GCC: GCCcore, binutils
  • foss: GCC, OpenMPI, OpenBLAS, ScaLAPACK and FFTW
  • intel: GCCcore, binutils, icc, ifort, Intel MPI, Intel MKL. Note that the intel toolchain uses icc and ifort as compilers, NOT the GNU compilers (GCCcore is included in the toolchain only to provide the C++ standard library and headers that the Intel compilers rely on).

The advantage of this hierarchy is that software built with a lower-level toolchain (e.g. GCCcore) is fully compatible with software built with a higher-level toolchain that contains it (e.g. intel). For example, software built with the foss toolchain can use libraries built with GCCcore as dependencies, since GCC is part of foss and GCCcore is part of GCC. Here, GCCcore is known as a subtoolchain of foss.
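The containment rule above can be sketched as a small shell helper. This is only an illustration: the function names direct_subtoolchains and can_use_dependency are hypothetical, the hierarchy is copied from the list above, and toolchain versions are ignored entirely.

```shell
# Print the direct subtoolchain of a toolchain, following the list above.
# (Hypothetical helper; versions are not taken into account.)
direct_subtoolchains() {
    case $1 in
        GCC)   echo "GCCcore" ;;
        foss)  echo "GCC" ;;
        intel) echo "GCCcore" ;;
        *)     ;;   # GCCcore (or an unknown name) has no subtoolchain
    esac
}

# Succeed if software built with toolchain $2 can use a dependency
# built with toolchain $1, i.e. if $1 is $2 itself or is contained in $2.
can_use_dependency() {
    [ "$1" = "$2" ] && return 0
    for sub in $(direct_subtoolchains "$2"); do
        can_use_dependency "$1" "$sub" && return 0
    done
    return 1
}
```

With these definitions, can_use_dependency GCCcore foss succeeds, while can_use_dependency foss intel fails, because foss is not part of the intel toolchain.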

Just as there are different compiler versions, there are also different toolchain versions. In principle, EasyBuild releases new toolchain versions twice a year, e.g. foss/2017a and foss/2017b. Newer toolchains usually contain a newer version of the compiler.

For more information on EasyBuild toolchains, see https://easybuild.readthedocs.io/en/latest/Common-toolchains.html

EasyBuild module naming

EasyBuild names modules according to a fixed format: softwarename/softwareversion-toolchainname-toolchainversion-suffix. For example, Boost/1.76.0-GCC-10.3.0 is a module for the Boost library, version 1.76.0, which was compiled with the GCC toolchain, version 10.3.0. We sometimes use the suffix to provide additional information. For example, pkgconfig/1.5.4-GCCcore-10.3.0-python is a module for pkgconfig version 1.5.4, compiled with the GCCcore toolchain version 10.3.0; the python suffix indicates that it was built with a Python interface.

Module names are case-sensitive and are capitalized the same way as the original software packages, e.g. netCDF, CMake, etc.
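To make the naming format concrete, here is a minimal sketch that splits a module name into its fields. The function parse_module_name is hypothetical and not part of EasyBuild or the module command. It assumes that only the suffix may contain a hyphen, which will not hold for every module.

```shell
# Split softwarename/softwareversion-toolchainname-toolchainversion-suffix
# into its fields. Hypothetical helper; assumes that the version and
# toolchain fields themselves contain no hyphens.
parse_module_name() {
    name=${1%%/*}   # text before the first "/"
    rest=${1#*/}    # text after the first "/"
    # read splits on "-"; the last variable receives whatever is left over,
    # so a suffix that itself contains hyphens stays in one piece.
    IFS=- read -r version tc_name tc_version suffix <<EOF
$rest
EOF
    printf 'name=%s version=%s toolchain=%s toolchain_version=%s suffix=%s\n' \
        "$name" "$version" "$tc_name" "$tc_version" "${suffix:-none}"
}
```

For example, parse_module_name pkgconfig/1.5.4-GCCcore-10.3.0-python prints name=pkgconfig version=1.5.4 toolchain=GCCcore toolchain_version=10.3.0 suffix=python.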

EasyBuild modules

EasyBuild creates modules that automatically load the same dependencies that were used to compile the software. For example, the Python/3.9.5-GCCcore-10.3.0-bare module file contains statements such as this:

...
conflict	Python
module load	GCCcore/10.3.0
module load	zlib/1.2.11-GCCcore-10.3.0
...

When the Python module is loaded, it also loads all of its dependencies, such as the GCCcore compiler, zlib and binutils, to name a few.

Automatically loading dependencies has two main advantages:

  • You as an end user don't have to know which dependencies you need to load before running a program.
  • Problems can occur when a program runs with different versions of its dependencies than the ones it was compiled with. Automatic loading of dependencies avoids this.

As you can see, there is also a 'conflict' statement in the module. This means that no two modules with the same name can be loaded at the same time (having two versions of the same software loaded would make it ambiguous which of the two is actually used). For example, you cannot have two versions of Python loaded at the same time.

Note that conflicts also prevent you from loading modules that were compiled with different versions of a toolchain. For example, you cannot load the libraries Boost/1.60.0-foss-2015b and netCDF/4.4.1.1-foss-2017b at the same time, because they would load conflicting versions of the foss toolchain.
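The two conflict cases described above can be checked from the module names alone, as in the rough sketch below. The function modules_clash is hypothetical. It only detects two situations: the same software name, or the same toolchain name with different toolchain versions. It does not know which GCCcore version belongs to which foss or intel release, so the module command itself remains the authority.

```shell
# Succeed (and print a reason) if two module names clash in one of the
# two simple ways described above. Hypothetical helper; assumes the
# softwarename/softwareversion-toolchainname-toolchainversion[-suffix] format.
modules_clash() {
    name_a=${1%%/*}
    name_b=${2%%/*}
    # Pull the toolchain name and version out of each module name.
    IFS=- read -r skip tc_a tcver_a rest <<EOF
${1#*/}
EOF
    IFS=- read -r skip tc_b tcver_b rest <<EOF
${2#*/}
EOF
    if [ "$name_a" = "$name_b" ]; then
        echo "clash: both modules are $name_a"
        return 0
    fi
    if [ "$tc_a" = "$tc_b" ] && [ "$tcver_a" != "$tcver_b" ]; then
        echo "clash: $tc_a/$tcver_a vs $tc_b/$tcver_b"
        return 0
    fi
    echo "no obvious clash"
    return 1
}
```

For the example above, modules_clash Boost/1.60.0-foss-2015b netCDF/4.4.1.1-foss-2017b prints clash: foss/2015b vs foss/2017b.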

What toolchain should I use to compile my own software?

In general, we recommend that end users use the foss or intel toolchain. This allows you to use many of the libraries already installed on our systems as dependencies for your own projects. The choice between foss and intel is up to you. Some considerations:

  • Good software developers will test if their code compiles with various compilers. However, a lot of scientific software does not get such extensive testing. Since the GNU compilers are most commonly used, you may encounter build issues with Intel compilers that the developers of your code were not aware of.
  • The Intel compilers are sometimes believed to produce faster code (although this statement is also disputed). If fast code is important to you, we suggest you compile with both toolchains, and benchmark to see which of the two provides the fastest code for your application.
  • Check which libraries/software we have built with each toolchain to determine which toolchain provides most of the dependencies you need.
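One way to compare the toolchains on the last point is to count how many installed modules were built with each. The sketch below is an assumption-laden illustration: count_for_toolchain is a hypothetical helper that reads module names from standard input, and the usage shown in the comment assumes that module -t avail prints one module name per line (on many systems to stderr, hence the 2>&1).

```shell
# Count the module names on stdin that were built with toolchain $1.
# Hypothetical helper; relies on the -toolchainname- part of the module name.
count_for_toolchain() {
    # -F: match the text literally; --: the pattern starts with a hyphen.
    # grep exits non-zero when the count is 0, so ignore its exit status.
    grep -c -F -- "-$1-" || true
}

# Possible usage on a system with the module command (not run here):
#   module -t avail 2>&1 | count_for_toolchain foss
#   module -t avail 2>&1 | count_for_toolchain intel
```

A raw count is only a first indication. You still need to check that the specific libraries your project depends on are available for the toolchain you choose.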

Using EasyBuild to install the software in your home folder

If you need software that is not installed on our system, you can see if there is an EasyConfig (which is an EasyBuild build recipe) available in our repository. You'll need to have the EasyBuild module loaded:

module load eb/<latest_version> 

You can then search the repository using:

eb -S [software_name]

Once you have found a suitable EasyConfig, you can install the software using:

eblocalinstall [easyconfig_name]

For example, to install your own version of GROMACS:

eb -S GROMACS
eblocalinstall GROMACS-2021-foss-2020b.eb

Use eblocalinstall

Do not use the eb command directly for your installations. For example, a command like this will not work correctly:

eb GROMACS-2021-foss-2020b.eb --installpath-software=$HOME/.local/EasyBuild --installpath-modules=$HOME/.local/EasyBuild/modules

The problem is a clash between the wrappers used by EasyBuild, and the wrappers used by SURF. This problem is resolved by the eblocalinstall script.




