Info
The configuration of Snellius allows users to use a node in a "shared" mode, in which they can use a subset of the resources of a full node. This page explains the partitions available to users and the accounting for each partition. Please refer to the HPC user guide for a general introduction to partition usage and how to submit a job using SLURM.

Snellius partitions

Compute nodes are grouped into partitions in order to allow the user to select different hardware to run their software on. Each partition includes a subset of nodes with a different type of hardware and a specific maximum wall time.

The partitions can be selected by users via the SLURM option:

Code Block

#SBATCH --partition=<partition name>

or its short form:

Code Block
#SBATCH -p <partition name>

The partitions available on Snellius are summarised in the table below. For details of the different hardware available on each node, please look at the Snellius hardware and file systems page.

Partition name	Node type	# cores per node	Memory per node	Smallest possible allocation	Max wall time	Notes
thin	tcn (thin compute node)	128	256 GiB	1/4 node: 32 cores + 64 GiB memory	120 h (5 days)
fat	fcn (fat compute node)	128	1 TiB	1/4 node: 32 cores + 250 GiB memory	120 h (5 days)
himem_4tb	hcn, PH1.hcn4T (High memory node 4 TiB)	128	4 TiB	1/4 node: 32 cores + 1 TiB memory	120 h (5 days)
himem_8tb	hcn, PH1.hcn8T (High memory node 8 TiB)	128	8 TiB	1/4 node: 32 cores + 2 TiB memory	120 h (5 days)
gpu	gcn	72	512 GiB	1/4 node: 18 cores + 1 GPU + 128 GiB memory	120 h (5 days)
staging	srv	16 (32 threads with SMT)	256 GiB	1 thread + 8 GiB memory	120 h (5 days)	SMT is enabled on the srv nodes, allowing up to 32 threads per node (2 threads per core).

Short jobs

Whenever you submit a job that requests at most 1 hour of wall time to the "thin", "fat" or "gpu" partitions, SLURM will schedule the job on a node that is only available for such short jobs. This effectively reduces the wait time for short jobs compared to longer jobs, which is useful for testing the setup and correctness of your jobs before submitting long-running production runs.

Note that the number of nodes that can run short jobs is relatively small, so submitting a short-running job that uses many (e.g. tens or hundreds of) nodes will not work.
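
The short-job rule can be written down as a small check. The Python sketch below is only an illustration of the rule as stated on this page, not the scheduler's actual logic; the function name and constants are hypothetical, and it does not model the limited number of short-job nodes.

```python
from datetime import timedelta

# Partitions for which this page says short jobs get dedicated nodes.
SHORT_JOB_PARTITIONS = {"thin", "fat", "gpu"}
# Maximum requested wall time for a job to count as "short".
SHORT_JOB_LIMIT = timedelta(hours=1)


def qualifies_for_short_job_nodes(partition: str, walltime: timedelta) -> bool:
    """Return True if a job matches the short-job rule described above."""
    return partition in SHORT_JOB_PARTITIONS and walltime <= SHORT_JOB_LIMIT


print(qualifies_for_short_job_nodes("thin", timedelta(minutes=30)))       # True
print(qualifies_for_short_job_nodes("himem_4tb", timedelta(minutes=30)))  # False
print(qualifies_for_short_job_nodes("gpu", timedelta(hours=2)))           # False
```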

Accounting

Resource usage is measured in SBUs (System Billing Units). An SBU can be thought of as a "weighted" or "normalised" core hour. Because nodes differ in the type of CPUs, the amount of memory, and attached resources like a GPU or a local NVMe disk, SBUs are assigned and weighted per node type. On Snellius, charging for resource usage is based on how long a resource was used (wall-clock time) in addition to the type and number of nodes (or partial nodes) used.

The table below shows the "SBU pricing" of core hours for the various node types.

Node type	weight, SBUs per core hour	# cores per node	Smallest possible allocation	SBUs per 1 hour, full node	SBUs per 1 hour, smallest possible allocation
tcn (thin compute node)	1.00	128	1/4 node: 32 cores	128 SBUs	32 SBUs
fcn (fat compute node)	1.50	128	1/4 node: 32 cores	192 SBUs	48 SBUs
gcn (4 GPUs enhanced compute node)	7.11	72	1/4 node: 18 cores + 1 GPU	512 SBUs	128 SBUs
hcn, PH1.hcn4T (High memory node 4 TiB)	2.00	128	1/4 node: 32 cores	256 SBUs	64 SBUs
hcn, PH1.hcn8T (High memory node 8 TiB)	3.00	128	1/4 node: 32 cores	384 SBUs	96 SBUs
srv (service node, for data transfer jobs)	1.00	16 (32 threads with SMT)	1 thread	32 SBUs	1 SBU
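
As a worked example of the pricing table, the Python sketch below computes the SBUs for a given node type, node fraction and duration. The function name and the short dictionary keys (`hcn4T`, `hcn8T`) are labels chosen here for illustration; the per-hour values are copied from the "SBUs per 1 hour, full node" column. The srv node is left out because its smallest unit is a thread rather than a quarter node.

```python
from fractions import Fraction

# SBUs charged per hour for one full node, copied from the table above.
SBU_PER_NODE_HOUR = {
    "tcn": 128,
    "fcn": 192,
    "gcn": 512,
    "hcn4T": 256,
    "hcn8T": 384,
}


def sbu_cost(node_type, node_fraction, hours, nodes=1):
    """SBUs for `nodes` nodes (or one partial node) used for `hours` hours."""
    return SBU_PER_NODE_HOUR[node_type] * Fraction(node_fraction) * hours * nodes


print(sbu_cost("tcn", Fraction(1, 4), 1))      # 32  (quarter thin node, 1 hour)
print(sbu_cost("fcn", Fraction(1, 2), 2))      # 192 (half fat node, 2 hours)
print(sbu_cost("gcn", 1, 3, nodes=2))          # 3072 (two full GPU nodes, 3 hours)
```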

Shared usage accounting

Info
It is possible to submit a single-node job on Snellius that uses only part of the resources of a full node. Resources here means either the cores or the memory of a node. The rules for shared resource accounting are given below. Example shared usage job scripts can be found here.

For single-node jobs (only), users can request part of a node's resources. Jobs that require multiple nodes will always allocate (and get charged for) full nodes, i.e. there are no multi-node jobs that share nodes with other jobs.

The requested resources, i.e. CPU and memory, will be enforced by cgroups limits. This means that when you request, say, 1 CPU core and 1 GB of memory, those will be the hardware resources your job gets access to.

However, the accounting of shared jobs using less than a full node is done in increments of a quarter of a node. Any combination of memory and/or cores (or GPUs) is rounded up to the next quarter node, up to a full node. A quarter of a node's resources is defined as a quarter of the node's total cores or a quarter of the node's total memory. The resource (memory or cores) requested at the highest fraction determines the accounted allocation of the job. For example, requesting a quarter of the memory and half of the CPU cores leads to half the node being accounted.

For nodes with attached GPUs, a quarter of a node means 1 GPU plus a quarter of the node's CPU cores and memory.
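
The rounding rule above can be sketched in Python as follows. This is a minimal illustration for single-node jobs only, assuming 4 GPUs per GPU node and the quarter-node memory sizes from the partition table (for example 64 GiB for thin and 250 GiB for fat nodes); the function and parameter names are hypothetical and this is not the accounting system's own code.

```python
import math
from fractions import Fraction


def accounted_fraction(cores, mem_gib, node_cores, quarter_mem_gib, gpus=0):
    """Fraction of a node charged for a single-node job, in quarter-node steps."""
    # Quarters needed to cover the requested cores.
    q_cores = math.ceil(Fraction(cores * 4, node_cores))
    # Quarters needed to cover the requested memory (integer GiB values).
    q_mem = math.ceil(Fraction(mem_gib, quarter_mem_gib))
    # Each GPU counts as one quarter (assumes 4 GPUs per node).
    q_gpu = gpus
    # The largest request decides; at least one quarter, at most a full node.
    quarters = min(4, max(1, q_cores, q_mem, q_gpu))
    return Fraction(quarters, 4)


# Thin node: half the cores, a quarter of the memory -> half a node.
print(accounted_fraction(64, 64, node_cores=128, quarter_mem_gib=64))   # 1/2
# Thin node: 1 core and 1 GiB -> still a quarter node.
print(accounted_fraction(1, 1, node_cores=128, quarter_mem_gib=64))     # 1/4
# GPU node: 18 cores but 3 GPUs -> three quarters of a node.
print(accounted_fraction(18, 128, node_cores=72, quarter_mem_gib=128, gpus=3))  # 3/4
```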

Here is a list of example shared usage allocations:

Shared CPU accounting examples

1/4 node reservation
- Single-node jobs requesting up to and including 32 cores for a thin or high memory node
- Single-node jobs requesting up to and including 64 GiB memory on a thin node or 250 GiB on a fat node

1/2 node reservation
- Single-node jobs requesting up to and including 64 cores for a thin or high memory node
- Single-node jobs requesting up to and including 128 GiB memory on a thin node or 500 GiB on a fat node

3/4 node reservation
- Single-node jobs requesting up to and including 96 cores for a thin or high memory node
- Single-node jobs requesting up to and including 192 GiB memory on a thin node or 750 GiB on a fat node

Full node reservation
- Jobs requesting all the cores in the node
- Jobs requesting all the memory of a node

Shared GPU accounting examples

1/4 node reservation
- Single-node jobs requesting up to and including 18 cores (or 1/4 of the node memory, or 1 GPU) for a GPU node

1/2 node reservation
- Single-node jobs requesting up to and including 36 cores (or 1/2 of the node memory, or 2 GPUs) for a GPU node

3/4 node reservation
- Single-node jobs requesting up to and including 54 cores (or 3/4 of the node memory, or 3 GPUs) for a GPU node

Full node reservation
- Multi-node jobs, independent of the number of cores (memory, GPUs) requested

Note
You will be charged for this share of the node regardless of the number of cores actually used.

Jobs requesting more than 1 node will get exclusive access to the allocated nodes (only one job can run on them at a time), independent of the number of cores or amount of memory requested. The batch system will accept jobs that request 1 node, 2 nodes, 3 nodes, and so on, providing exclusive use of all the cores, GPUs and memory on the node(s). It is important to note that Snellius is a machine designed for large compute jobs. We encourage users to develop workflows that schedule jobs running on at least a full node of a particular type.

The "odd one out" node type is the service node (srv node). Srv nodes are dedicated to the automation of data transfer tasks. Transferring data into or out of the system is a task that involves very little "compute". It is usually limited by network bandwidth rather than by CPU resources. Therefore, jobs submitted to srv nodes by default use just a single thread out of the 32 available per node (SMT is enabled on srv nodes).

The use of the unit "core hour" above does not imply anything about the minimum or maximum duration of a job. The job scheduling and accounting systems have a time resolution of 1 second. Accounts are charged only for the time the resources were actually used, independent of the requested wall time.

How resources are accounted, in terms of SBU budget subtracted, differs between regular jobs and jobs run within a reservation:

- For regular jobs (i.e. not part of a reservation), the wall-clock time that is accounted runs from the actual start of the allocation of the resources to the actual end and de-allocation of the resources. If such a job ends before its requested time limit (as specified with -t <duration> to sbatch) is over, then only the SBUs for the actual run time in wall-clock time are consumed. Jobs that are submitted and subsequently cancelled before they were ever given an allocation of nodes do not consume any SBU budget.
- A reservation is always accounted for the full duration and the full set of resources reserved. This is the case even when all or part of the reserved resources are left idle, e.g. because the jobs run within the reservation are smaller than it allows.
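
The difference between the two cases can be illustrated with a short Python sketch. It assumes charging is simply proportional to the billed seconds (the page states a 1-second resolution but not whether any rounding is applied); the function and parameter names are hypothetical.

```python
from fractions import Fraction


def charged_sbus(sbu_per_hour, run_seconds, in_reservation=False, reserved_seconds=0):
    """SBUs charged for a job or reservation.

    Regular jobs are billed for the actual run time; reservations are
    billed for the full reserved duration, even if resources sat idle.
    """
    billed_seconds = reserved_seconds if in_reservation else run_seconds
    return Fraction(sbu_per_hour * billed_seconds, 3600)


FULL_THIN_NODE = 128  # SBUs per hour, from the pricing table

# Regular job: 5 h requested, finished after 1.5 h -> billed for 1.5 h.
print(charged_sbus(FULL_THIN_NODE, run_seconds=5400))  # 192
# Job cancelled before it ever started -> nothing is billed.
print(charged_sbus(FULL_THIN_NODE, run_seconds=0))     # 0
# 5 h reservation in which only 1.5 h of work was run -> billed for 5 h.
print(charged_sbus(FULL_THIN_NODE, run_seconds=5400,
                   in_reservation=True, reserved_seconds=18000))  # 640
```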

Our HPC User Guide contains guidelines and several examples on how to request resources on our HPC systems. Check the Creating and running jobs section or the Example job scripts for more details.

Note

Costs of inefficient use

You will be charged for all cores in the node(s) that you reserved, regardless of the actual number of cores used by the job/application. So if your application uses only a few (or even one) of the CPU cores of a node then it makes sense to write a job script that runs multiple instances of this application in parallel, in order to fully utilize the reserved resources.
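
To put a number on this, the Python sketch below compares the SBU cost per application instance when one or several instances share the same allocation. It assumes all instances run concurrently and take the same time; the function name is hypothetical and the 32 SBUs per hour figure is the quarter thin node price from the pricing table.

```python
def sbus_per_instance(sbu_per_hour, hours, instances):
    """SBUs charged per application instance sharing one allocation."""
    return sbu_per_hour * hours / instances


QUARTER_THIN_NODE = 32  # SBUs per hour for 32 cores of a thin node

# One single-core instance for 10 hours: the whole quarter node is charged.
print(sbus_per_instance(QUARTER_THIN_NODE, 10, instances=1))   # 320.0
# 32 single-core instances side by side: same charge, spread over 32 runs.
print(sbus_per_instance(QUARTER_THIN_NODE, 10, instances=32))  # 10.0
```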

Getting account and budget information

You can view your account details using

Code Block
accinfo

This shows information such as the e-mail associated with the account, the initial and remaining budget, and until when the account is valid.

An overview of the SBU consumption can be obtained with

Code Block
accuse

By default, consumption is shown for the current login, per month, over the last year. Per-day usage can be obtained by adding the -d flag. The start and end of the period shown in the overview can be changed with the -s DD-MM-YYYY and -e DD-MM-YYYY flags, respectively. Finally, consumption for a specific account or login can be obtained using -a accountname and -u username, respectively.
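
When scripting usage reports, the flags above can be assembled programmatically. The Python helper below is a hypothetical sketch that only builds the argument list from the flags described on this page; it assumes the flags can be combined freely, and the dates are illustrative. It does not run the command itself.

```python
from datetime import date


def accuse_command(start=None, end=None, per_day=False, account=None, user=None):
    """Build an argument list for `accuse` from the flags described above."""
    args = ["accuse"]
    if per_day:
        args.append("-d")
    if start is not None:
        args += ["-s", start.strftime("%d-%m-%Y")]  # DD-MM-YYYY
    if end is not None:
        args += ["-e", end.strftime("%d-%m-%Y")]    # DD-MM-YYYY
    if account is not None:
        args += ["-a", account]
    if user is not None:
        args += ["-u", user]
    return args


cmd = accuse_command(date(2024, 1, 1), date(2024, 1, 31), per_day=True)
print(" ".join(cmd))  # accuse -d -s 01-01-2024 -e 31-01-2024
```

On a login node, the resulting list could be passed to `subprocess.run` to execute the command.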
