Synopsis

In this section we explain the general procedure for installing software from source, and how to install additional packages for R or Python. This is useful when the software stack that we have installed on the system is not sufficient for your particular needs and you want to install your own piece of software.

Installing software from source

Installing software on Unix systems can sometimes be a tedious process, because installation procedures may vary a lot between different software packages. For the same reason, it is hard to give general instructions in this manual.

Ideally, a software package has its own manual, with a section on how to install the software. If it does, that should be your starting point - though you may still want to read this manual to get a general idea of software installations on Unix systems. If your package does not have an installation manual, the common installation methods discussed in the following sections may provide a starting point.

In general, a software installation consists of three steps:

  1. Configuration of the installation
  2. Compilation (or 'building')
  3. Installing the compiled binaries (executables)

Software developers have many tools to choose from for the configuration and compilation steps. Most commonly, the GNU build system or CMake is used for the configuration step. In other cases, configuration is done manually, by editing a configuration file or by choosing from a set of template files. Virtually all installations on Unix, however, use Make for the compilation and installation steps.

For a list of compiler options for the AMD processors, see the Quick Reference guide.

Configuring the installation

The configuration step determines how the software should be compiled on the specific system you are building it on. For example, an automatic configuration may test whether certain software (e.g. a specific software library or compiler) is present and configure the installation accordingly: if a certain library is present (e.g. the MKL library for matrix algebra), the installation may be configured to use the functions from that library instead of the software's own implementation. Additionally, the user can sometimes configure parts of the installation manually.

Before doing any automated configuration step, whether it is through a configure script, or CMake, make sure to load the modules of libraries that your installation should be able to find. For example, if the software package you are installing has a dependency on zlib (i.e. it uses functions from the zlib library), make sure you load the most recent zlib release available on the system:

module load 2022
module load zlib/1.2.12-GCCcore-11.3.0

before the configuration step. Note that dependencies sometimes have specific version requirements (which should be mentioned in an installation manual), in which case you need to make sure to load the correct version of the module.

Note that loading the modules is no guarantee that the automatic configuration can find the required library: it depends on which environment variables the module sets, and which environment variables the automatic configuration uses to search for these libraries. If an automatic configuration cannot find a library, there are usually ways to pass the path to the library location explicitly, e.g. as arguments to the configure script.
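
As a minimal sketch of passing a library location explicitly: configure scripts generated by the GNU build system generally accept variable assignments such as CPPFLAGS and LDFLAGS as arguments. The example assumes that the loaded zlib module exports its installation root in a variable named EBROOTZLIB (an EasyBuild convention that may not hold on every system) and falls back to a placeholder path otherwise.

```shell
# Assumption: the loaded zlib module sets EBROOTZLIB to its install root.
# /path/to/zlib is a placeholder that is used when the variable is not set.
zlib_root="${EBROOTZLIB:-/path/to/zlib}"

# Header and library search paths handed to the compiler and the linker.
cppflags="-I${zlib_root}/include"
ldflags="-L${zlib_root}/lib"

# Print the command first, so that it can be inspected before it is run.
echo ./configure "CPPFLAGS=${cppflags}" "LDFLAGS=${ldflags}"
```

Some packages offer a dedicated option for a particular library instead; ./configure --help lists what a given script supports.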

Configure script / The GNU build system

One common way of configuring an installation is through a configure script. In that case, there is generally a file 'configure' in the source, which you may execute by typing

./configure

This will generate a makefile with the specific configuration for your system. The makefile can then be used to compile the source code.

The configure script can often accept certain parameters to allow the user some manual control in the configuration. Some of these arguments are pretty generic, and will be accepted by most configuration scripts. For example

./configure --help

generally shows you all the options that you may use for this particular configuration script, while

./configure --prefix=[PATH]

is a very common option which sets the path where you want your software to be installed (after it is compiled). You will generally need this option to set the install path to somewhere in your home directory, since the default path for installations is usually a system path (and for obvious reasons, users are not allowed to do system-wide installations).
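
The following sketch wraps the configure call in a small helper, so that a missing configure script is reported before anything else happens. The function name configure_with_prefix and the install location $HOME/my_prog are illustrative assumptions, not part of any package.

```shell
# Hypothetical install location inside the home directory.
prefix="$HOME/my_prog"

# Run the configure script of the given source directory with that prefix.
configure_with_prefix() {
    src_dir="$1"
    if [ ! -x "$src_dir/configure" ]; then
        echo "no executable configure script in $src_dir" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged.
    (cd "$src_dir" && ./configure --prefix="$prefix")
}

# Usage: configure_with_prefix "$HOME/my_prog_source"
```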

Most configure scripts are generated by software developers using the GNU build system, which is why many of them have these 'standard' options. However, developers may also write configure scripts by hand, and such scripts may behave differently.

CMake

Another common way to configure software is using CMake. CMake has a good manual on how to install software with CMake. Here, we provide a short summary.

There are two ways to do a CMake installation:

  • An in-place build, where the binaries are placed in the same directory as the source code
  • An out-of-place build, where the binaries (and libraries) are produced in a directory separate from the source code.

We recommend the second approach, as it makes it easy to start over with a clean installation if you didn't succeed the first time.

Let's suppose the source files are in the directory $HOME/my_prog_source. First, we create a new directory where we want the binaries to be built, and change directory to that folder:

mkdir $HOME/my_prog
cd $HOME/my_prog

There, we run ccmake, with as argument the location of the source folder:

ccmake $HOME/my_prog_source

This will open up a window with some instructions at the bottom. Press 'c' to start the configuration. You will now see a list of all configured variables and the values assigned to them. If desired, you can manually change variables at this point. As with the configure script, there may be a variable CMAKE_INSTALL_PREFIX (or similar) that determines where the binaries are going to be installed; this is one of the variables you may want to change. You can often also press 't' to toggle advanced mode and see more configuration items.

When you're satisfied with the configuration, press 'g'. This will generate a Makefile (in your $HOME/my_prog directory).

If you messed up and want to start with a clean configuration, simply remove the contents of the $HOME/my_prog directory (including its subdirectories) and start over:

rm -r $HOME/my_prog/*
cd $HOME/my_prog
ccmake $HOME/my_prog_source
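
For scripted builds, the same configuration can be done without the interactive interface by calling cmake directly. The sketch below assumes CMake 3.13 or newer (for the -S and -B options); the function name cmake_configure and the install location in the usage line are illustrative.

```shell
# Configure an out-of-source build without the interactive ccmake interface.
# Arguments: source directory, build directory, install prefix.
cmake_configure() {
    src_dir="$1"
    build_dir="$2"
    prefix="$3"
    mkdir -p "$build_dir" || return 1
    # -S and -B name the source and build directories;
    # -D presets a cache variable, here the install location.
    cmake -S "$src_dir" -B "$build_dir" -DCMAKE_INSTALL_PREFIX="$prefix"
}

# Usage, with a hypothetical separate install location:
# cmake_configure "$HOME/my_prog_source" "$HOME/my_prog" "$HOME/my_prog_install"
```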


Manual configuration

Of all methods to configure software, manual configuration is the hardest to give instructions for, as manual configuration methods may vary a lot between software packages. In such cases, it is particularly important to look for documentation: either a manual, or some readme/install text file in the source directory.

Some software will already have a generic (i.e. architecture-independent) Makefile present, and perform configuration by setting a number of options/variables in a configuration file which is included at the start of the Makefile (e.g. the configuration file may be called makefile.include or similar). Such packages may require you to edit this file by hand, setting the options you want, or they may provide templates for common architectures, with names that contain the architecture (e.g. linux_x86, SUN, etc.). If a template is present, you can generally copy the template that best matches your architecture to makefile.include and then tweak it if needed.
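
The template approach can be sketched as follows. The file names makefile.include and makefile.linux_x86, as well as the function name use_template, are hypothetical; the actual names differ per package and should be taken from its documentation.

```shell
# Copy an architecture template to the file that the Makefile includes.
# Arguments: source directory, template file name within that directory.
use_template() {
    src_dir="$1"
    template="$2"                        # e.g. makefile.linux_x86
    target="$src_dir/makefile.include"
    if [ ! -f "$src_dir/$template" ]; then
        echo "template $template not found in $src_dir" >&2
        return 1
    fi
    # Keep a backup of an existing configuration before overwriting it.
    if [ -f "$target" ]; then
        cp "$target" "$target.bak" || return 1
    fi
    cp "$src_dir/$template" "$target"
}

# Usage: use_template "$HOME/my_prog_source" makefile.linux_x86
```

After copying, open makefile.include in an editor and adjust compiler names, flags and library paths where needed.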

Which configuration/installation method should I use?

If your software lacks an installation manual, how can you know how to install it? Look in the source folder and check whether there is a 'readme' or 'install' text file with installation instructions. If so, follow these instructions. If not, look at the other files in the source folder. If you see

  1. A 'configure' file, the software should probably be installed according to the instructions for the GNU build system.
  2. Files like CMakeLists.txt, the software should probably be installed according to the instructions for CMake.
  3. None of these, but you do see a file 'Makefile' (not being Makefile.am or Makefile.in, those indicate a GNU build system!), it may need manual configuration or may not need configuration at all.
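
The checks in this list can be sketched as a small shell function. The function name detect_build_system and the labels it prints are our own choice; the function only inspects file names and is no substitute for reading a readme or install file first.

```shell
# Guess the configuration method from the files in a source directory.
# Prints one of: gnu, cmake, manual, unknown.
detect_build_system() {
    src_dir="$1"
    if [ -f "$src_dir/configure" ]; then
        echo "gnu"        # configure script: GNU build system
    elif [ -f "$src_dir/CMakeLists.txt" ]; then
        echo "cmake"      # CMake project
    elif [ -f "$src_dir/Makefile" ]; then
        echo "manual"     # plain Makefile: manual or no configuration
    else
        echo "unknown"
    fi
}

# Usage: detect_build_system "$HOME/my_prog_source"
```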

Compiling the application

As mentioned, regardless of the tool used for configuration, nearly all Unix software uses Unix Make to compile the application. To compile the source code, you simply type

make

in the same folder where you ran your configure script or ccmake (or, for a manual configuration, usually in the top directory of the source code). Optionally, you can add the -j option, followed by the number of parallel jobs (e.g. make -j 4), to compile in parallel. Parallel compilation is usually substantially faster, which is particularly beneficial for large software packages. However, if you receive errors during the compilation, it is advised to omit the -j option.
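
A minimal sketch of this advice: try a parallel build first and, if it fails, rerun make without -j so that the error message is not interleaved with the output of other jobs. The job count of 4, the BUILD_JOBS variable and the function name build are arbitrary examples.

```shell
# Number of parallel jobs; 4 is an arbitrary example value.
njobs="${BUILD_JOBS:-4}"

# Try a parallel build first; fall back to a serial build on failure.
build() {
    if make -j "$njobs"; then
        return 0
    fi
    echo "parallel build failed, retrying without -j" >&2
    make
}

# Usage (in the directory that contains the Makefile): build
```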

Installing the application and setting environment variables

The third step, 'installing' the application, is nothing more than copying all the compiled binaries to a predetermined folder (e.g. the one you specified using the --prefix argument). You can do this using

make install

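Suppose you installed a program 'dice' to $HOME/my_prog/ using the --prefix argument. Many packages place executables in a bin subdirectory of the prefix, in which case $HOME/my_prog/bin/ takes the place of $HOME/my_prog/ in the examples that follow. You can then run the program using the full path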

$HOME/my_prog/dice

If you want to run it using only the name of the executable ('dice'), you need to add the installation directory to the PATH environment variable:

export PATH=$PATH:$HOME/my_prog/


Note that you need to do this for every login session. For a more permanent solution, you can add the above line to your $HOME/.bashrc file: the $HOME/my_prog/ directory will then be added to your PATH variable as soon as you log in.
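
Because $HOME/.bashrc is read by every new interactive shell, a plain export there can add the same directory to PATH more than once. The following sketch, which could be placed in .bashrc, only appends the directory when it is not yet present; the variable name prog_dir is illustrative.

```shell
# Directory that contains the executable (hypothetical location).
prog_dir="$HOME/my_prog"

# Append the directory only if PATH does not contain it yet.
case ":$PATH:" in
    *":$prog_dir:"*) ;;                  # already present, nothing to do
    *) PATH="$PATH:$prog_dir" ;;
esac
export PATH
```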

In case you have built a library, for example because you want to use it for building another program that depends on this library, you may consider adding the path to the LD_LIBRARY_PATH variable in a similar way:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/my_library/

This ensures that the dynamic linker can find the library when a program that depends on it is run. For the build of your second program itself, you can often set such library paths (like $HOME/my_library) during the configuration step of your installation.
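
One detail worth guarding against: if LD_LIBRARY_PATH is empty or unset, the export above produces a value that starts with a colon, and the dynamic linker treats such an empty entry as the current working directory. The sketch below is a variant that only inserts the colon when there is an existing value; lib_dir is an illustrative variable name.

```shell
# Library directory to add (hypothetical location).
lib_dir="$HOME/my_library"

# ${VAR:+text} expands to text only if VAR is set and non-empty,
# so no leading colon appears when LD_LIBRARY_PATH was empty.
export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$lib_dir"
```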

Install R packages

There are many packages available for R, and many versions of R itself. We simply cannot install all of these system-wide. Generally, for the default version of R, we have the most common packages installed system-wide.

If you need specific packages for your analysis, you can install these in your home directory. To do that, you'll need to load the R modules for one of the toolchains (foss or intel), and use the install.packages() function in R. E.g. to install the far package:

module load 2023
module load R/4.3.2-gfbf-2023a
R

> install.packages('far')

R will automatically detect that you cannot write to the system folder, and will ask you if you want to install the package in a personal folder in your home directory. After installation, you can load the R package in the conventional way.
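
In a job script there is nobody to answer R's question, so the installation can also be done non-interactively. The sketch below assumes that the Rscript command is provided by the loaded R module; the function name install_r_package, the library location in the usage line and the choice of CRAN mirror are illustrative assumptions.

```shell
# Install an R package into a personal library without the interactive prompt.
# Arguments: package name, library directory.
install_r_package() {
    pkg="$1"
    lib_dir="$2"
    mkdir -p "$lib_dir" || return 1
    (
        # R adds an existing R_LIBS_USER directory to its library search path.
        export R_LIBS_USER="$lib_dir"
        Rscript -e "install.packages('$pkg', lib = Sys.getenv('R_LIBS_USER'), repos = 'https://cloud.r-project.org')"
    )
}

# Usage, with a hypothetical library location:
# install_r_package far "$HOME/R/library"
```

If you choose a library directory other than R's default personal library, R_LIBS_USER needs to point to the same directory in later sessions for R to find the package.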

A more extensive explanation of this procedure can be found in our R documentation.

Install Python packages

What applies to R also applies to Python. The large number of Python packages and Python versions makes it impractical for us to install every package system-wide. So again, for the default version of Python, the most common packages have been installed system-wide.

If you need any specific packages for your analysis, you can install these in your home directory. To do that, load the Python module, and use pip to install the required package in your home by passing the --user option. E.g. to install the svgutils module:

module load 2023
module load Python/3.11.3-GCCcore-12.3.0
pip install --user svgutils
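
As a small sketch, the installation can be combined with a check of where the package ends up. It assumes that the loaded module provides a python command; the function name pip_user_install is illustrative.

```shell
# Install a package for the current user and report where it ends up.
pip_user_install() {
    pkg="$1"
    # Calling pip through the interpreter ties it to the loaded Python module.
    python -m pip install --user "$pkg" || return 1
    # Print the per-user site-packages directory used by --user installs.
    python -m site --user-site
}

# Usage: pip_user_install svgutils
```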

A more extensive explanation of this procedure can be found in our Python documentation.
