We encourage learning by doing! That is why we prepare our courses so that you can do real things yourself, rather than just listen to a trainer talking.

This webinar guides you through the steps you need to complete in order to get your first workspaces running in SURF Research Cloud, and hopefully also to try some basic operations within a few workspace types.

You can find the general user documentation here: https://servicedesk.surfsara.nl/wiki/x/HIKV

If you ever need help in the future, you can contact our Servicedesk here: https://servicedesk.surfsara.nl

0. Onboarding

SURF Research Cloud relies on two external services: SURF Research Access Management (SRAM) and our Central Budgeting and Accounting (CBA). You need to be properly set up in both before you can do anything in Research Cloud. We will take care of that in this section.

0.1 Log in to SRAM

SURF Research Cloud revolves around a web portal. You can log in with your institute's credentials if your institute is connected to a different service called SURF Research Access Management (or SRAM, for short). If your institute is not connected (yet?), you can choose the EduID identity provider.

For this webinar, if you registered through the right channels then we already sent you an invitation to accept a membership in SRAM, which you should have received in your mailbox. If you have not received that invitation, please contact the facilitator of the webinar now. If you have not accepted the invitation yet, do so now, by following the links you received by e-mail. 

After accepting the invitation, verify that you are all set in SRAM. Please follow the instructions from https://wiki.surfnet.nl/x/Nq12Ag now to log into SRAM (we recommend that you use a new private window in your browser). Verify that you are a member of at least one Collaboration, and that at least one Collaboration's name is related to this training.

0.2 Obtain a Wallet

To create workspaces yourself, you need a wallet. To create a wallet for you, we need to know your identity in Research Cloud, so let us get you set up now.

  1. Open the Research Cloud portal in your browser: https://portal.live.surfresearchcloud.nl
  2. Click on the Log In button. A list of identity providers will show.
  3. Select the same identity provider from the list that you registered with in SRAM in the previous section (remember you may have chosen EduID if your institute is not yet connected to SRAM).
  4. Follow your identity provider's steps to log in.
  5. Click Continue or Next until you reach the Research Cloud portal, where you are welcomed with the title: "Welcome to your SURF research cloud dashboard"
  6. On the top of the screen, you find the Wallet section. Click there.
  7. You can now see the list of wallets that you can use. It may be empty. Verify whether you can see that there is a wallet listed for trainings. If you do not have a training wallet yet, click on the yellow Request button, which you can find under the heading "Quick actions". A form pops up.
  8. Fill in the form indicating that you are requesting a training wallet, making sure the existing contract reads as follows: "src-webinar"

This last step will trigger a ticket in our helpdesk system (with you in copy of the e-mail). That e-mail contains your user identity in Research Cloud. The trainers will grant you a wallet soon, by coupling your user identity to a budget in CBA. This is still a manual process, so please bear with us. If it takes longer than a few minutes, please reach out to the trainers.

Exercise 1. First steps in an Ubuntu workspace

Once you are on board (i.e.: set up in the external services), we will be creating a first workspace.

1.1 Create a workspace

You can create your first Ubuntu-based workspace by following our step-by-step guide from our general user documentation.

SSH or TOTP?

Before starting to create your workspace, please decide if you want to start a workspace with a Linux Desktop or a "headless" Linux server:

If you are not yet acquainted with the Linux command line or with SSH key pairs, choose the application "Ubuntu Desktop 20.04" while following the steps of the manual. You can then log in to your workspace using two-factor authentication (e.g. with the Google Authenticator app on your smartphone).

Otherwise, you can choose a Linux server application like "Ubuntu 18.04 (SUDO enabled)", where you log in using an SSH key pair.

This is the manual to create your workspace: https://servicedesk.surfsara.nl/wiki/x/kIKV .

For completeness' sake, that guide includes steps to delete the workspace. For the time being, only follow the steps to create a workspace. Put differently: do NOT delete it yet.

1.2 Log in to your workspace

Once you have an Ubuntu-based workspace, you can log in by following our step-by-step guide here: https://servicedesk.surfsara.nl/wiki/x/koKV

Which steps you follow depends on your earlier choice: for a "Desktop" application follow the steps for TOTP, and for a "Server" application follow the steps for SSH.
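
If you take the SSH route and do not have a key pair yet, the following is a minimal sketch of how one could be generated with OpenSSH's ssh-keygen. The helper name make_keypair, the key comment and the file path in the usage comment are illustrative assumptions, not part of Research Cloud. The login guide linked above remains the reference for how the key is used.

```shell
# make_keypair FILE [extra ssh-keygen options...]
# Hypothetical helper: creates an Ed25519 key pair at FILE (private key)
# and FILE.pub (public key). Without extra options, ssh-keygen asks
# interactively for a passphrase to protect the private key.
make_keypair() {
  key_file="$1"
  shift
  ssh-keygen -t ed25519 -C "src-webinar" -f "$key_file" "$@"
}

# Usage on your laptop (the path is only an example):
#   make_keypair ~/.ssh/id_ed25519_src
#   cat ~/.ssh/id_ed25519_src.pub    # the public key, safe to share
```

Only the .pub file is ever shared; the private key stays on your own machine.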

1.3 Working in your workspace

Following chapter 4 of the online book Data Science at the Command Line as an example, we will run a few commands to show that you are in a real computing environment.

If you are not yet logged in to your workspace, log in as you did before.

If you chose the "Desktop" option earlier, open a terminal for this exercise by pressing Ctrl+Alt+T.

Run this list of commands:

curl -sL http://www.gutenberg.org/files/76/76-0.txt |
tr '[:upper:]' '[:lower:]' |
grep -oE '\w+' |
sort |
uniq -c |
sort -nr |
head -n 10

That should yield the 10 most frequent words in the Adventures of Huckleberry Finn, by Mark Twain. Something like this:

   6443 and
   5098 the
   3665 i
   3258 a
   3024 to
   2567 it
   2087 t
   2046 was
   1848 he
   1781 of

In order for this to happen, your command has:

  1. downloaded the .txt file with the text
  2. converted all text to lowercase
  3. extracted all words individually
  4. sorted the words alphabetically
  5. collapsed duplicate words, counting how often each occurs
  6. sorted the list by count, highest first
  7. kept only the top 10 counts

See? That is quite some data cleansing done with a handful of chained Unix commands!
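
To experiment with the same steps without downloading anything, the pipeline can be wrapped in a small shell function that reads text from standard input. This is only a sketch: the function name top_words and the sample sentence are made up for illustration.

```shell
# top_words [N]: read text on stdin and print the N most frequent words
# (default 10), each preceded by its count.
top_words() {
  tr '[:upper:]' '[:lower:]' |   # convert everything to lowercase
  grep -oE '\w+' |               # put every word on its own line
  sort |                         # group identical words together
  uniq -c |                      # collapse duplicates, prefixing the count
  sort -nr |                     # highest count first
  head -n "${1:-10}"             # keep only the top N
}

# Sample input: "the" occurs 3 times, "and" twice, every other word once.
printf 'The cat and the dog and the bird\n' | top_words 2
```

The example call prints "the" with a count of 3, followed by "and" with a count of 2. To analyse a file of your own, redirect it into the function, for example: top_words 20 < myfile.txt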

You can now try to run your own commands. As an example: can you install the software packages that you need to run some of your own pipelines?
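
As a starting point, a short sketch like the one below can tell you which command-line tools are still missing before you install anything. The function name missing_tools and the tool names are placeholders chosen for illustration. The install line in the comment assumes an Ubuntu workspace where you have sudo rights, and package names do not always match command names.

```shell
# missing_tools TOOL...: print every listed command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: "sh" exists on any Unix system, the second name is a made-up tool.
missing_tools sh no_such_tool_xyz

# On an Ubuntu workspace with sudo rights, a missing package could then be
# installed along these lines (replace <package> with the real package name):
#   sudo apt-get update && sudo apt-get install -y <package>
```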

1.4 Delete your workspace

On the Dashboard section of the Research Cloud portal, you can see all workspaces in your Collaborative Organisations. You can also see logs and delete workspaces that belong to you. As long as a workspace is running, it consumes budget from your wallet.

Once you no longer need a workspace, it is best to delete it to release the resources that it keeps busy. Let us do that now for the workspace that you just created.

  1. In your dashboard, locate your Ubuntu-based workspace
  2. Click on the arrow to the right of your workspace line, so that you display the workspace's details
  3. Click on the Delete button

Exercise 2. Running a first scientific workflow in Jupyter

2.1 Create a workspace

You can create your first Jupyter-based workspace by following steps similar to the ones that you followed in the previous section (remember our step-by-step guide here: https://servicedesk.surfsara.nl/wiki/x/kIKV ).

2.2 Log in to your workspace

Once you have a Jupyter-based workspace, you can log in by following our step-by-step guide here: https://servicedesk.surfsara.nl/wiki/x/koKV


For your Jupyter workspace you will need to follow the steps in the section titled "Workspace Access with TOTP". If you did not complete that section in the previous exercise, you can do so now.

2.3 Do some work

We invite you to bring your own Jupyter-based workflow into the workspace. If you are short of ideas or would simply like to see what others are doing, choose one from the following list:

https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks

You can now play around as if you were working in a real scenario. For example: run some of those notebooks, or try to launch a task that takes a few minutes to complete.

2.4 Delete your workspace

Once you no longer need a workspace, it is best to delete it to release the resources that it keeps busy. Let us do that now for the workspace that you just created.

  1. In your dashboard, locate your Jupyter-based workspace
  2. Click on the arrow to the right of your workspace line, so that you display the workspace's details
  3. Click on the Delete button

Exercise 3. Running a first scientific workflow in R-studio

Repeat a scenario similar to Exercise 2. This time we leave it up to you to find a nice exercise to carry out in R-studio. If you find nice tips, please share them with the room!

Remember to release resources when you are done.

Exercise 4. Working with persistent data

4.1 Create a volume

Note that once you delete a workspace, all data that existed in the workspace is deleted as well. To keep data beyond the lifetime of a workspace, you can create a storage volume. Let us do that now:

  1. On the dashboard, look for the "Create new storage" card, and click the Create New button.
  2. Follow the wizard as you did when creating a workspace. Make sure to choose the small size.

Once you finish the wizard, go back to the dashboard. Under the Storage tab you can now see a new volume that you own. Wait until it has the state "Available".

4.2 Attach a volume to a workspace

If you now follow the wizard to launch a workspace for the application "Ubuntu 18.04 with storage", as you did in Exercise 1, step 5 lets you choose the volume that you created a moment ago. Do so.

Choose any Ubuntu-based application to try this.

Can you see that the volume's state is now In-use?

Wait until the workspace is shown as running. Then you can connect to it (see Exercise 1).

Check that you can see the volume, by running the following command:

df -hT /data/volume_1

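That should deliver a result similar to this (the listing shows all filesystems, as printed by df -hT without a path; with the path given, only the header and the /data/volume_1 line appear):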

Filesystem     Type      Size  Used Avail Use% Mounted on
udev           devtmpfs  3.9G     0  3.9G   0% /dev
tmpfs          tmpfs     798M  652K  798M   1% /run
/dev/sda1      ext4       15G  2.2G   13G  16% /
...
/dev/sda15     vfat      105M  3.6M  101M   4% /boot/efi
/dev/sdb       xfs       250G  288M  250G   1% /data/volume_1
...

See that you have the volume mounted under /data/volume_1. Let us see if you can write there:

date > /data/volume_1/first.txt

Then verify that you have written the time into that first.txt file:

cat /data/volume_1/first.txt

Do you see the date and time of a few seconds ago?
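
The write-and-read-back check above can also be sketched as a small reusable function, which is handy if you attach volumes more often. The function name check_writable and the name of the temporary probe file are assumptions made for this example.

```shell
# check_writable DIR: write a timestamp to a temporary probe file in DIR,
# confirm that it was actually written (the file is non-empty), then clean up.
check_writable() {
  dir="$1"
  probe="$dir/.write_probe_$$"
  if { date > "$probe"; } 2>/dev/null && [ -s "$probe" ]; then
    rm -f "$probe"
    echo "writable: $dir"
  else
    echo "NOT writable: $dir"
    return 1
  fi
}

# Usage on the workspace:
#   check_writable /data/volume_1
```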

4.3 Upload data to a volume

You will need SSH-based login to do this.

Imagine you now want to upload some files from your laptop to this workspace into the persistent volume, so that you can process them later. Let us do that now.

  1. On your laptop, open a new terminal
  2. Create a new file called second.txt in your local home directory, which will contain the word "hello", like this:
    • echo hello > ~/second.txt
  3. Now upload the file to your workspace, like this:
    • (note: pay attention to replacing your_ssh_username and workspace_ip with the right values, which you can get from your workspace's details! They are the same as the ones for your ssh connection)

      scp ~/second.txt your_ssh_username@workspace_ip:/data/volume_1

Now that you have uploaded a file (pretend that this was a large dataset), you can verify in the workspace that the file is actually there.

  1. Go back to the original terminal, where you were connected to the workspace via SSH.
  2. On the workspace, run the same command as before, but this time to show the contents of the second file:
    • cat /data/volume_1/second.txt

Can you see that the command returns the word "hello"? Congratulations! That proves that you have uploaded the right file!
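
For a real dataset, reading the file back by eye does not scale. One common approach is to compare checksums on both sides of the transfer. The sketch below assumes that the sha256sum tool is available (it ships with Ubuntu; on macOS you may need shasum -a 256 instead), and the function name file_sha256 is made up for this example.

```shell
# file_sha256 FILE: print only the SHA-256 checksum of FILE.
file_sha256() {
  sha256sum "$1" | cut -d' ' -f1
}

# On your laptop:
#   file_sha256 ~/second.txt
# On the workspace:
#   file_sha256 /data/volume_1/second.txt
# If both commands print the same value, the upload arrived intact.
```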

4.4 Delete the workspace

You are now ready to delete the workspace. Do it as you did in previous exercises.

See that the volume is still there! Can you see that it is now back in state Available?

You can now launch a new workspace and attach the volume to it again (make sure you choose the app "Ubuntu 18.04 with storage"). Can you still see the data there?

Make sure to delete the workspace once you no longer need it.

4.5 Delete the volume

You can delete your data volumes by following the same steps as for deleting workspaces. Now delete the volume that you were using.