Trouble with connecting to Snellius

There are a few common issues that can prevent you from connecting to Snellius.

Usage Agreement

In order to make use of the Snellius service, you need to read and accept the Usage Agreement.

To do so, visit https://portal.cua.surf.nl/ and log in with your login and password.

The system is down:

Please check the system status page.

You are temporarily banned:

You will receive a 24-hour ban on a login node after 5 failed login attempts.
The interactive nodes are protected by fail2ban; because there are multiple login nodes, you can be banned on one login host but not on another.

We encourage you to set up SSH public keys to access Snellius. See this information on how to upload your public key to Snellius.

Attempting to connect from a non-whitelisted IP

The interactive nodes only accept (GSI-)ssh connections from known, white-listed IP ranges. You may be trying to connect from an IP address that is not in a white-listed range.

As a result, you may find that the system cannot be accessed while traveling. In that case, please use the doornode.

If you need access from a location that you will use regularly and long-term, please contact us through the service desk with your external IP address. Please take care that you report the CORRECT public IP address. As many sites nowadays use private IP space and a network address translation scheme, your public IP address is NOT necessarily an address that is configured directly on your local system and hence not necessarily known to your system. The following ranges are by definition private IP address ranges that cannot be whitelisted:

  • 192.168.0.0 - 192.168.255.255
  • 172.16.0.0 - 172.31.255.255
  • 10.0.0.0 - 10.255.255.255

You can easily find out which public IP address you are using by visiting https://echoip.cua.surf.nl with your web browser.
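To double-check an address before reporting it, the private ranges listed above can be matched with a small shell function. This is only a sketch: the name is_private_ipv4 and the example address are made up for illustration, and the function matches on the textual prefix without validating that its argument is a well-formed IPv4 address.

```shell
# Hypothetical helper (not a Snellius tool): succeed if the dotted-quad
# address given as $1 lies in one of the three private ranges listed above.
# Assumption: $1 is already a well-formed IPv4 address; no validation is done.
is_private_ipv4() {
    case "$1" in
        10.*)                                   return 0 ;;  # 10.0.0.0 - 10.255.255.255
        172.1[6-9].*|172.2[0-9].*|172.3[01].*)  return 0 ;;  # 172.16.0.0 - 172.31.255.255
        192.168.*)                              return 0 ;;  # 192.168.0.0 - 192.168.255.255
        *)                                      return 1 ;;
    esac
}

# Example: warn before sending a private address to the service desk.
addr="192.168.1.20"
if is_private_ipv4 "$addr"; then
    echo "$addr is private and cannot be white-listed"
fi
```

If the function succeeds for the address configured on your own machine, the address to report is the public one shown by the web page mentioned above.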

Using a doornode

At times you'll find yourself on the road with a good idea that you would like to test with a simulation on Snellius. When you try to log in, you'll often find that access is not possible, which can be quite frustrating. The reason is that Snellius uses a white-list of IP addresses, and the system can only be accessed from those locations. To help you in these situations, we have set up a separate login server that can be accessed from anywhere in the world: doornode.surfsara.nl (thus using `ssh user@doornode.surfsara.nl`). You log in to this server with your usual login and password, after which you get a menu of systems that you can log in to. Select 'Snellius' and type your password a second time. You are now logged on to Snellius. Please note that you cannot copy files or use X11 when using the doornode.

More information

More information about using Linux systems in general can be found on the web, for example:

  • The UNIX Tutorial for Beginners contains a useful introduction to Unix. NOTE: some examples (especially those about variables) are for another shell (csh) than the default shell on Snellius (bash).
  • The Advanced Bash-Scripting Guide gives an in-depth but readable overview of the usage of the standard login shell 'bash', with examples.

Data management policy

Expired home directories and project spaces will be deleted

The SURF Usage Agreement states that data will be removed within 6 months after the expiration date of an agreement (Contract, Project Agreement, NWO (E-infra)grant, etc.). If a login (and its home directory) has no association for longer than 15 weeks with an active account/budget on the basis of which access to our systems is granted, we will delete the login and its home directory.

Data access granted to others

In some cases, owners of home directories have granted others access to their data via group memberships or "access control lists" (ACLs). If those others still need the data, they need to take action to preserve the data for themselves.

Project spaces

For project spaces, by and large, the same applies as for home directories. The differences stem from the fact that project spaces, unlike home directories, are created as collectively owned by the logins that are members of a disk quota group. Project space allocation is an integral part of NWO grants. When the NWO grant (or other contractual basis) expires and there is no new or prolonged grant or contract within 15 weeks, the project space expires as well and will be cleaned up. If there is a new grant or prolongation arrangement, the project space will remain. However, the logins that are no longer associated with the new account will be removed from the quota group. Files in the project space associated with the UID of an expired login should be assigned to another group member that is still active.

Principal investigators of an account are warned 90, 60 and 30 days before their account expires, so there is enough time to take the appropriate measures before the expiration date.


Trouble with running jobs

My job doesn't start and shows the status 'ReqNodeNotAvail'

This usually happens when a maintenance session is planned. You can see planned maintenance on the system status page or in the message of the day when logging in on Snellius. Jobs whose maximum wall clock time is longer than the time remaining until the start of the maintenance will not start until after the maintenance, and are shown with the status 'ReqNodeNotAvail' in the squeue output. A workaround is to use a shorter maximum wall clock time.
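Whether a given wall clock limit still fits before the maintenance is simple arithmetic. The sketch below assumes you look up the maintenance start yourself; the function name, the example timestamps and the script name job.sh are hypothetical, and the scheduler's own decision also depends on when nodes become free.

```shell
# Hypothetical helper: succeed if a job that starts at start_s and runs for
# at most walltime_s seconds ends no later than the maintenance start maint_s.
# All three arguments are integers; start_s and maint_s are epoch seconds.
fits_before_maintenance() {
    start_s="$1"
    maint_s="$2"
    walltime_s="$3"
    [ $(( start_s + walltime_s )) -le "$maint_s" ]
}

# Example with made-up times: maintenance begins 6 hours after the job could start.
start=1700000000
maint=$(( start + 6 * 3600 ))
if fits_before_maintenance "$start" "$maint" $(( 4 * 3600 )); then
    echo "a 4-hour limit fits before the maintenance"
fi
# A shorter limit is then requested at submission, e.g.:
#   sbatch --time=04:00:00 job.sh
```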

My program crashed but the output seems cut off!

Python

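Output is usually buffered, so anything still in the buffer is lost when the program crashes. In Python, buffering can be turned off with the built-in -u flag: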

python -u <program.py>

Or with the environment variable PYTHONUNBUFFERED:

 PYTHONUNBUFFERED=1 python <program.py>

Fortran

If your program is compiled with the GNU gfortran compiler, set the following environment variable:
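A minimal sketch, assuming the variable meant here is gfortran's documented runtime variable GFORTRAN_UNBUFFERED_ALL (the program name in the comment is a placeholder):

```shell
# Assumption: the variable meant here is gfortran's GFORTRAN_UNBUFFERED_ALL;
# setting it to 1 asks the gfortran runtime to leave all I/O units unbuffered.
export GFORTRAN_UNBUFFERED_ALL=1

# Then start the program as usual, e.g. (placeholder name):
#   ./my_fortran_program
```

Unbuffered I/O can slow down programs that write a lot, so it may be best to set this only while tracking down a crash.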


C

For C programs, the buffering can be changed with the library function setvbuf. For example, calling setvbuf(stdout, NULL, _IONBF, 0) before anything is written makes standard output unbuffered.


I can't use CVS.

Snellius does not support the default remote shell 'rsh' for security reasons. Please use:
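A minimal sketch, assuming the intended replacement is ssh selected through CVS's CVS_RSH environment variable:

```shell
# Assumption: the intended setting is CVS's CVS_RSH variable, which selects
# the program CVS uses for remote (:ext:) connections instead of rsh.
export CVS_RSH=ssh
```

Placing this line in your shell startup file would make the setting persistent across logins.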


How can I determine the memory usage of my application?

The SLURM batch scheduler logs the memory usage of your application, and it can be retrieved after your job has ended, for example with SLURM's accounting command sacct. The accounting data include the average and maximum memory use per MPI task, which MPI task used the maximum memory, and on what node.

Example usage might look like this
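As a sketch, assuming SLURM's sacct accounting command is what is meant: the wrapper name job_memory_usage is hypothetical, and the field list is an assumption chosen to match the quantities described above. No sample output is shown, since it depends on the job.

```shell
# Hypothetical wrapper around sacct; the field list is an assumption chosen to
# match the description above (average/maximum memory, task and node of the maximum).
job_memory_usage() {
    sacct -j "$1" --format=JobID,JobName,AveRSS,MaxRSS,MaxRSSTask,MaxRSSNode
}

# Usage after the job has ended (replace the placeholder with a real job ID):
#   job_memory_usage <jobid>
```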


If you want to print the memory usage as part of your application, note that the system call getrusage() is not fully implemented on Linux; see 'man 2 getrusage'. Please compile and use the C routine printmem() (listed below), which prints the memory usage.

C routine printmem()

Which nodes are allocated to my job?

The environment variable $SLURM_NODELIST contains the names of the nodes. The format is something like: tcn[9006-9008]. Using the program scontrol you can obtain the nodenames, one name per line:
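A minimal sketch of the scontrol approach, using its `show hostnames` subcommand; the wrapper name list_job_nodes is hypothetical:

```shell
# Print the nodes of the current job, one name per line.
# Assumes it runs inside a job, where SLURM sets SLURM_NODELIST.
list_job_nodes() {
    scontrol show hostnames "$SLURM_NODELIST"
}

# Usage inside a job script:
#   list_job_nodes
```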


Using the command nodeset, you get all node names from the $SLURM_NODELIST variable on one line:
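A minimal sketch, assuming nodeset here is the ClusterShell tool and that its -e (expand) option is the one intended; the wrapper name is hypothetical:

```shell
# Print the nodes of the current job on a single line.
# Assumes the ClusterShell nodeset tool is available and that the function
# runs inside a job, where SLURM sets SLURM_NODELIST.
list_job_nodes_one_line() {
    nodeset -e "$SLURM_NODELIST"
}

# Usage inside a job script:
#   list_job_nodes_one_line
```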


What does maintenance mean?

A few times per year, you will see in the 'message of the day' (the message you get when you log in) that maintenance is planned. During this period the system will be upgraded or adapted.

Consequences for you:

  • During maintenance, you cannot log in

  • Jobs that would still be running at the start of the maintenance will not be started

Can I receive mail on my login?

No, you can't receive messages from outside the system. The batch nodes can send mail to your login, but in order to read it, you have to forward mail sent to your login (using the $HOME/.forward file).


Put the following line in your job:


and edit the file $HOME/.forward, example:
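Both steps can be sketched as follows. The choice of mail events (END,FAIL), the helper name write_forward_file and the example address are assumptions for illustration; --mail-type is a standard sbatch option.

```shell
# In the job script: ask SLURM to send mail when the job ends or fails.
# (Assumed choice of events. In a real job script, #SBATCH lines must come
# before the first executable command; they are read by sbatch, not the shell.)
#SBATCH --mail-type=END,FAIL

# Hypothetical helper: write a forwarding address to a .forward file.
# $1 = address to forward to, $2 = target file (defaults to $HOME/.forward).
write_forward_file() {
    printf '%s\n' "$1" > "${2:-$HOME/.forward}"
}

# Example (placeholder address):
#   write_forward_file your.name@example.org
```

The resulting $HOME/.forward file then contains a single line with the address that mail for your login should be forwarded to.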


What information should be present in NWO Small Requests (Pilot grants) for Snellius?

We expect certain information to be present in NWO small grant applications. Including it up front helps us evaluate applications quickly and efficiently, which in turn means shorter processing times for the applicant and fewer follow-up questions. Please refer to this page for tips on the details we require in the application form, and see the examples on that page.

Acknowledge SURF for the usage of Snellius and support

We would appreciate it if you included a text like this in publications about projects in which Snellius played a role:

We thank SURF (www.surf.nl) for the support in using the National Supercomputer Snellius.
