Usage Agreement

To use the Snellius or Lisa services, you need to read and accept the Usage Agreement. To do so, visit https://portal.surfsara.nl/home/ and log in with your login and password.

When your project is finished

Expired home directories and project spaces will be deleted

The SURFsara Usage Agreement states that data will be removed within 6 months after the expiration date of an agreement (Contract, Project Agreement, NWO (EInfra)grant, etc.). If a login (and its home directory) has no association for longer than 15 weeks with an active account/budget on the basis of which access to our systems is granted, we will delete the login and its home directory.

Data access granted to others

In some cases, owners of home directories have granted others access to their data via group memberships or "access control lists" (ACLs). If the others still need these data, they need to take action to preserve the data for themselves.

Project spaces

For project spaces, by and large, the same applies as for home directories. The differences stem from the fact that project spaces, unlike home directories, are created as collectively owned by the logins that are members of a disk quota group. Project space allocation is an integral part of NWO grants. When the NWO grant (or other contractual basis) expires, and there is no new or prolonged grant or contract within 15 weeks, the project space expires as well and will be cleaned up. If there is a new grant or prolongation arrangement, the project space will remain. However, the logins that are no longer associated with the new account will be removed from the quota group. Files in the project space associated with the UID of an expired login should be assigned to another group member that is still active.

Principal investigators of an account are warned 90, 60 and 30 days before their account expires, so there is enough time to take the appropriate measures before the expiration date.


I want to acknowledge SURF for the usage of Snellius and/or Lisa and the support I got

We would appreciate it if you included a text like this in publications about projects in which Snellius or Lisa played a role:

We thank SURF (www.surf.nl) for the support in using the <Lisa Compute Cluster|National Supercomputer Snellius>.

What does Snellius mean?

[Image: refraction of light]


Mathematician Willebrord Snel van Royen


Willebrord Snel van Royen (Leiden, 1580-1626), also known by his Latin name Snellius, was a Dutch mathematician and physicist, humanist, linguist and astronomer. He was professor of mathematics at Leiden University from 1613 until his death in 1626. He is best known for Snell's law, named after him, which describes how light rays are refracted when light passes from one material into another (e.g. from air to glass, as in the image above).



[Image: portrait of Willebrord Snel van Royen]

What does Lisa mean?

We think the name 'Lisa' is appropriate for the system, because:

  • We like the name Lisa
  • The name is short and easy to type
  • 'Lisa' is easily understandable

If one wants, 'Lisa' can stand for:

  • Lisa Supercomputer Amsterdam
  • Linux Supercomputer Amsterdam

The first one honors the fact that large, essential portions of the software that make systems like Lisa possible come from the open source community: GNU. GNU stands for "GNU's not Unix", a recursive acronym just like the first expansion. The second honors the fact that the operating system on Lisa is Debian Linux. Without the availability of free, open source operating systems, a cluster like Lisa would be nearly impossible.

My job doesn't start and has the status 'ReqNodeNotAvail'

This usually happens when a maintenance session is planned. You can see planned maintenance on the system status page or in the message of the day when logging in on Snellius or Lisa. Jobs whose maximum wall clock time is longer than the time remaining until the start of the maintenance will not start until after the maintenance, and are shown with the status 'ReqNodeNotAvail' in the squeue output. A workaround is to use a shorter maximum wall clock time.
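The workaround can be sketched as follows: pick a wall clock limit that ends before the maintenance starts and pass it to sbatch with --time. The helper name slurm_time_limit, the script name job.sh and the 26-hour figure are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: format a number of seconds as a Slurm time limit
# in the days-hours:minutes:seconds form accepted by sbatch --time.
slurm_time_limit() {
    local total=$1
    local days=$(( total / 86400 ))
    local hours=$(( (total % 86400) / 3600 ))
    local mins=$(( (total % 3600) / 60 ))
    local secs=$(( total % 60 ))
    printf '%d-%02d:%02d:%02d\n' "$days" "$hours" "$mins" "$secs"
}

# Example (assumed numbers): maintenance starts in a bit more than 26 hours,
# so request at most 26 hours (93600 seconds) of wall clock time:
#   sbatch --time="$(slurm_time_limit 93600)" job.sh
```

A job submitted this way can still be scheduled before the maintenance, provided the program finishes or checkpoints within the shorter limit.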

Help, I can't login! Is my account blocked?

There can be different reasons why this happens:

  • One of the most common reasons is that the system is in maintenance. You can see planned and ongoing maintenance on the system status page.
  • Snellius interactive nodes only accept (GSI-)ssh connections from known, white-listed IP ranges. You may be trying to connect from an IP address that is not in a white-listed range.

So, you might find that the system cannot be accessed while traveling. For these moments, please use the doornode.

If you need access from a location that you will use regularly and long-term, please contact us through the service desk with your external IP address. Please take care that you report the CORRECT public IP address. As many sites nowadays use private IP space and a network address translation scheme, your public IP address is NOT necessarily an address that is configured directly on your local system and hence not necessarily known to your system. The following ranges are by definition private IP address ranges that cannot be whitelisted:

  • 192.168.0.0 - 192.168.255.255
  • 172.16.0.0 - 172.31.255.255
  • 10.0.0.0 - 10.255.255.255

You can easily find out which public IP address you are using by visiting sites like http://www.whatismyip.com, https://whatismyipaddress.com, or https://www.whatsmyip.org with your web browser.
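Before reporting an address to the service desk, you can check that it is not in one of the private ranges listed above. The following is a minimal sketch in bash; the function name is_private_ipv4 is hypothetical, and the check covers only the three private ranges named in this section.

```shell
#!/usr/bin/env bash
# Return 0 if the IPv4 address is in a private range, 1 if it is not,
# and 2 if the argument is not a valid dotted-quad address.
is_private_ipv4() {
    local a b c d octet
    IFS=. read -r a b c d <<< "$1"
    for octet in "$a" "$b" "$c" "$d"; do
        # Each octet must be 1-3 digits and at most 255 (10# forces base 10).
        [[ $octet =~ ^[0-9]{1,3}$ ]] && (( 10#$octet <= 255 )) || return 2
    done
    a=$(( 10#$a )); b=$(( 10#$b ))
    (( a == 10 )) && return 0                          # 10.0.0.0 - 10.255.255.255
    (( a == 172 && b >= 16 && b <= 31 )) && return 0   # 172.16.0.0 - 172.31.255.255
    (( a == 192 && b == 168 )) && return 0             # 192.168.0.0 - 192.168.255.255
    return 1
}

# Example:
#   if is_private_ipv4 192.168.1.20; then
#       echo "private address: cannot be white-listed"
#   fi
```

If the address reported by your own machine is private, use the address shown by one of the web sites above instead.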

How to connect to the Snellius system from abroad

At times you'll find yourself on the road with a good idea that you would like to test with a simulation on Snellius. When you try to log in, you'll often find that access is not possible, which can be quite frustrating. The reason is that Snellius uses a white-list of IP addresses, and you can only access the system from those locations. To help you in these situations, we have set up a separate login server that can be accessed from anywhere in the world: doornode.surfsara.nl (thus using `ssh user@doornode.surfsara.nl`). This server can be accessed with your usual login and password, after which you get a menu with systems that you can log in to. Select 'Snellius' and type your password a second time. You are now logged on to Snellius. Please note that you cannot copy files or use X11 when using the door node.

If you expect to access Snellius regularly from the same location, please send your IP address to our service desk and ask for it to be white-listed. You can find the blocked IP address in the output of ssh -v [login]@snellius.surf.nl when you try to log in to Snellius.

How to disconnect

Simply issue the command

logout

or

exit

in the terminal window. Do not forget to press 'Enter' after the command.

More information

More information about using Linux systems in general can be found on the web, for example:

  • The UNIX Tutorial for Beginners contains a useful intro to Unix. NOTE: some examples (especially those about variables) are for another shell (csh) than the default shell on Snellius (bash).
  • The Advanced Bash-Scripting Guide gives an in-depth but readable overview of the usage of the standard login shell 'bash', with examples.

I need to see the output of my batch job immediately while executing or my program crashed but the output seems cut off!

If your program is compiled with the GNU gfortran compiler, set the following environment variable:

export GFORTRAN_UNBUFFERED_ALL=y


For C programs, the buffering can be changed using the function setvbuf. E.g. standard output can be made unbuffered using:

#include <stdio.h>
...
setvbuf(stdout, NULL, _IONBF, 0);
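When recompiling is not an option, the buffering can sometimes be changed from the job script instead. The sketch below uses stdbuf from GNU coreutils, which this page does not mention; it assumes stdbuf is installed and that the program is dynamically linked, uses C stdio, and does not set its own buffering. The names run_unbuffered and my_program are hypothetical.

```shell
#!/usr/bin/env bash
# Run a command with unbuffered standard output and standard error.
# stdbuf -o0 / -e0 set the stdio buffer mode of the started program to "unbuffered".
run_unbuffered() {
    stdbuf -o0 -e0 "$@"
}

# Example inside a job script (my_program is a placeholder):
#   run_unbuffered ./my_program > output.log
```

Unbuffered output can slow down programs that print a lot, so it is best used while debugging.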

I can't use CVS.

Snellius does not support the default remote shell 'rsh' for security reasons. Please use:

export CVS_RSH=ssh


How can I determine the memory usage of my application?

The SLURM batch scheduler logs the memory usage of your application, which can be retrieved after your job has ended. The command

job-statistics -j <JOB_ID>

shows the average and maximum memory use per MPI task, which MPI task used the maximum memory, and on which node.

Example usage might look like this:

$ job-statistics -j 1155623
...
              AveRSS :  11077K
              MaxRSS :  11576K
          MaxRSSTask :  46
          MaxRSSNode :  tcn828
...

If you want to print the memory usage as part of your application, note that the system call getrusage() isn't fully implemented under Linux; see 'man 2 getrusage'. Instead, compile and use the C routine printmem() (listed below), which prints the memory usage.

C routine printmem()
#include <stdio.h>
#include <unistd.h>   /* getpid(), sysconf() */

/* Print the total program size in KB; return it, or -1 on failure. */
int printmem(void)
{
    char buf[64];
    unsigned long size;   /* total program size, in pages */
    long size_kb;
    FILE *pf;

    /* /proc/<pid>/statm fields (all in pages):
       size resident shared text lib data dt; only the first is read here. */
    snprintf(buf, sizeof buf, "/proc/%ld/statm", (long)getpid());
    pf = fopen(buf, "r");
    if (pf == NULL)
        return -1;
    if (fscanf(pf, "%lu", &size) != 1) {
        fclose(pf);
        return -1;
    }
    fclose(pf);

    /* Convert pages to KB using the system page size. */
    size_kb = (long)(size * (unsigned long)sysconf(_SC_PAGESIZE) / 1024);
    printf("KB used: %ld\n", size_kb);
    return (int)size_kb;
}
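A similar check can be made from a job script without compiling anything. The sketch below reads the same /proc/<pid>/statm file from bash; it assumes a Linux /proc filesystem and the getconf utility, and print_mem_kb is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print "<total program size in KB> <resident set size in KB>" for a process.
# Without an argument, the current shell is inspected.
print_mem_kb() {
    local pid=${1:-$$} size resident rest page_size
    [[ -r /proc/$pid/statm ]] || return 1
    # statm reports sizes in pages: size resident shared text lib data dt
    read -r size resident rest < "/proc/$pid/statm"
    page_size=$(getconf PAGESIZE)
    echo "$(( size * page_size / 1024 )) $(( resident * page_size / 1024 ))"
}

# Example: report the memory of a background program started by the job
# (my_program is a placeholder):
#   ./my_program &
#   print_mem_kb $!
```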


Which nodes are allocated to my job?

The environment variable $SLURM_NODELIST contains the names of the nodes in a compact format, something like: tcn[9006-9008]. Using the program scontrol you can obtain the node names, one name per line:

$ scontrol show hostnames
tcn9006
tcn9007
tcn9008

Using the command nodeset, you get all node names from the $SLURM_NODELIST variable on one line:

$ nodeset -e $SLURM_NODELIST
tcn9006 tcn9007 tcn9008
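Inside a job script, the one-name-per-line output is convenient for looping over the allocated nodes. The sketch below numbers the nodes of the current job; list_job_nodes is a hypothetical helper name, and the function only works where scontrol is available and $SLURM_NODELIST is set, i.e. inside a job.

```shell
#!/usr/bin/env bash
# Print one "<index> <hostname>" line per node allocated to the job.
list_job_nodes() {
    local i=0 node
    while read -r node; do
        printf '%d %s\n' "$i" "$node"
        i=$(( i + 1 ))
    done < <(scontrol show hostnames "$SLURM_NODELIST")
}

# Example inside a job script:
#   list_job_nodes > "nodes.$SLURM_JOBID.txt"
```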

Connection refused

A "connection refused" error on Lisa always means that you have been locked out by the Intrusion Detection System, either because you entered a wrong password 5 consecutive times or because you or your program tried to reconnect to a session that was not closed yet (MobaXterm, WinSCP and similar). The lock-out lasts 24 hours and cannot be lifted by us.

To avoid this kind of issue, we recommend that you use SSH public key authentication, as explained on the SSH page.

What does maintenance mean?

A few times per year, you will see in the 'message of the day' (the message you get when you log in) that maintenance is planned. During this period the system will be upgraded or adapted.

Consequences for you:

  • During maintenance, you cannot log in
  • Jobs that would still be running at the start of the maintenance will not be started

Can I receive mail on my login?

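No, you can't receive messages from outside the system. The batch nodes can send mail to your login, but in order to read it you have to forward mail sent to your login to an external address, using the $HOME/.forward file. To let the batch system send its job notifications directly to an external address, use the sbatch option:

--mail-user=me@home.nl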

To send a mail from within your job, put the following line in your job script:

echo "Job $SLURM_JOBID started at $(date)" | mail -s "Job $SLURM_JOBID" "$USER"

and put your forwarding address in the file $HOME/.forward, for example:

me@home.nl


