Cluster Introduction (psycl, pirol)

This markdown document contains an introduction to the MPCDF clusters available for MPI-P users.

1. Disclaimer
2. Overview of the cluster structure
3. How to get your MPCDF account
4. First login to MPCDF
5. Create an alias for pirol/psycl login (and use FileZilla for easy up- and downloads)
6. Transfer data from MPI-P cluster to any of the MPCDF clusters
7. Working with the cluster
8. Copy data from/to cluster
9. Useful resources

1. Disclaimer

Most of the code is written for Mac (Unix) users and may need some adjustments for Windows users.
This document suggests ways of working with the cluster, but there are of course other ways to use and interact with it.

2. Overview of the cluster structure

Psycl will be discontinued in December 2024. After that, all users have to use pirol. The nexus storage is accessible from both clusters, so it is advisable to store your data on nexus until the folder structure on pirol is ready.

3. How to get your MPCDF account

You can apply for access to any of the MPCDF clusters by following this link. Apply for an account there and, in addition, e-mail Benno Pütz to be assigned to one or more groups. For instance, you may wish to have access to /g/mpsnmr (neuroimaging core unit). Additional permissions can be granted later, so you do not need to know every potentially relevant folder structure when you apply. Once your request has been approved, you will receive two e-mails, one with your user name and one with a password. You have to change the password and set up two-factor authentication (e.g. with Google Authenticator); follow the instructions in the e-mails.

4. First login to MPCDF

Open a terminal on your local machine.
Type ssh -XA yourUserName@gate1.mpcdf.mpg.de, replacing yourUserName with your actual user name.
You will be asked for your password.
You will be asked for your OTP (the one-time password from your two-factor app).
Once both are provided you arrive at gate1. From here you can connect to either psycl or pirol. Enter one of the following:

ssh -XA yourUserName@psycl01.bc.rzg.mpg.de OR
ssh -XA yourUserName@pirol01.hpccloud.mpcdf.mpg.de
don't forget to replace your user name

Upon your first login your home directory will be created automatically (/u/yourUserName).
A much faster and easier way to connect to the cluster is to create an alias for the login. The next section deals with this.
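
The two-step login above can also be written as a single command with the -J (ProxyJump) option of OpenSSH, the same mechanism that the scp command in section 6 uses. This is only a sketch, assuming OpenSSH 7.3 or newer on your local machine; the helper name login_cmd is hypothetical, and it only prints the command so that you can check it before running it:

```shell
#!/bin/sh
# Hypothetical helper: print a one-step login command that hops through gate1.
# $1 = your MPCDF user name, $2 = login node you want to reach
login_cmd() {
    printf 'ssh -XA -J %s@gate1.mpcdf.mpg.de %s@%s\n' "$1" "$1" "$2"
}

# Print the command for each cluster (replace yourUserName before running it).
login_cmd yourUserName pirol01.hpccloud.mpcdf.mpg.de
login_cmd yourUserName psycl01.bc.rzg.mpg.de
```

You will still be asked for your password and OTP at gate1; the command only saves you the second ssh call.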

5. Create an alias for pirol/psycl login (and use FileZilla for easy up- and downloads)

On your local machine you need to modify the config file in your ssh folder (~/.ssh/config).

Show hidden files by typing: find /path/to/folder -name ".*" (Unix) OR dir /A:H /B /S "C:\path\to\folder" (Windows; command suggested by ChatGPT)
Open the config file (e.g. by typing nano config inside ~/.ssh) and add the lines below for psycl:

# This is ~/.ssh/config
Host *
ServerAliveInterval 10
ServerAliveCountMax 5

Host psycl01.bc.rzg.mpg.de
User yourUserName
ProxyCommand ssh -W %h:%p gate1.mpcdf.mpg.de 2>/dev/null
GSSAPIAuthentication yes
GSSAPIDelegateCredentials yes
ForwardAgent yes
ForwardX11 yes
  
Host gate1.mpcdf.mpg.de
User yourUserName
GSSAPIAuthentication yes  
GSSAPIDelegateCredentials yes
ControlMaster auto
ControlPath ~/.ssh/control:%h:%p:%r
ForwardAgent yes
ForwardX11 yes

or for pirol:

# This is ~/.ssh/config
Host *
ServerAliveInterval 10
ServerAliveCountMax 5

Host pirol01.hpccloud.mpcdf.mpg.de
User yourUserName
ProxyCommand ssh -W %h:%p gate1.mpcdf.mpg.de 2>/dev/null
GSSAPIAuthentication yes  
GSSAPIDelegateCredentials yes
ForwardAgent yes  
ForwardX11 yes
  
Host gate1.mpcdf.mpg.de
User yourUserName
GSSAPIAuthentication yes
GSSAPIDelegateCredentials yes
ControlMaster auto
ControlPath ~/.ssh/control:%h:%p:%r

Don't forget to replace yourUserName with your actual user name.

If you wish to add both clusters (pirol and psycl), copy one full block from above and then add only the middle block of the other cluster (starting with Host pirol... or Host psycl...) below it. The Host * and Host gate1.mpcdf.mpg.de blocks are needed only once.
Save all changes and close your config file.
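
As an illustration of the combined setup, the following sketch writes a config with both clusters to a scratch file so that you can inspect it before copying it into ~/.ssh/config. The host blocks are taken from the examples above; the variable name CONFIG_OUT and the indentation are my own choices, and yourUserName still has to be replaced:

```shell
#!/bin/sh
# CONFIG_OUT is a hypothetical variable; by default a scratch file is used,
# so your real ~/.ssh/config is never touched by this sketch.
CONFIG_OUT="${CONFIG_OUT:-$(mktemp)}"

# The quoted 'EOF' keeps %h, %p, %r and ~ literal, as ssh expects them.
cat > "$CONFIG_OUT" <<'EOF'
Host *
    ServerAliveInterval 10
    ServerAliveCountMax 5

Host psycl01.bc.rzg.mpg.de
    User yourUserName
    ProxyCommand ssh -W %h:%p gate1.mpcdf.mpg.de 2>/dev/null
    GSSAPIAuthentication yes
    GSSAPIDelegateCredentials yes
    ForwardAgent yes
    ForwardX11 yes

Host pirol01.hpccloud.mpcdf.mpg.de
    User yourUserName
    ProxyCommand ssh -W %h:%p gate1.mpcdf.mpg.de 2>/dev/null
    GSSAPIAuthentication yes
    GSSAPIDelegateCredentials yes
    ForwardAgent yes
    ForwardX11 yes

Host gate1.mpcdf.mpg.de
    User yourUserName
    GSSAPIAuthentication yes
    GSSAPIDelegateCredentials yes
    ControlMaster auto
    ControlPath ~/.ssh/control:%h:%p:%r
    ForwardAgent yes
    ForwardX11 yes
EOF

echo "wrote $CONFIG_OUT"
grep '^Host ' "$CONFIG_OUT"   # list the host entries that were written
```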
Next, modify your ~/.bashrc file (or the equivalent startup file for the shell you are using) as follows:

alias psycl='ssh psycl01.bc.rzg.mpg.de'
alias pirol='ssh pirol01.hpccloud.mpcdf.mpg.de'

You can use any alias you like as long as it is unique. Save and close the file.
For your next login, open a terminal and type the alias (e.g. psycl if you want to connect to psycl). You only need to enter the password and OTP once.
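
If you prefer to add the aliases from the command line, the sketch below appends each alias line only when it is not already in the file, so running it twice does no harm. The helper name add_alias and the variable RC_FILE are hypothetical; by default the demo writes to a scratch file rather than to your real ~/.bashrc:

```shell
#!/bin/sh
# Hypothetical helper: append an alias line to a startup file only if that
# exact line is not already present.
# $1 = startup file, $2 = alias name, $3 = host to connect to
add_alias() {
    line="alias $2='ssh $3'"
    grep -qxF "$line" "$1" 2>/dev/null || printf '%s\n' "$line" >> "$1"
}

# Demonstrated on a scratch file; set RC_FILE=~/.bashrc to apply it for real.
RC_FILE="${RC_FILE:-$(mktemp)}"
add_alias "$RC_FILE" psycl psycl01.bc.rzg.mpg.de
add_alias "$RC_FILE" pirol pirol01.hpccloud.mpcdf.mpg.de
add_alias "$RC_FILE" pirol pirol01.hpccloud.mpcdf.mpg.de   # second call adds nothing
cat "$RC_FILE"
```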
Once you have set your alias you can also use FileZilla, and most likely PuTTY (I have not tried this myself), for easy drag-and-drop uploads. First install the software (e.g. FileZilla) on your local machine. Then open an additional terminal and type 'ssh yourUserName@gate1.mpcdf.mpg.de -L 2002:pirol01.hpccloud.mpcdf.mpg.de:22 -N' or 'ssh yourUserName@gate1.mpcdf.mpg.de -L 2002:psycl01.bc.rzg.mpg.de:22 -N', depending on the cluster you wish to connect to. If you are not already logged in in another terminal, you will be asked for your password and OTP. After successfully providing both you have created a tunnel and can switch to FileZilla.
Add a new host in FileZilla that points at the local end of the tunnel: because the tunnel above listens on local port 2002 and forwards to the cluster's ssh port, the host is localhost, the port is 2002 and the protocol is SFTP.
Now you can use FileZilla as usual. Navigate, view, upload or download files with the visual interface.

6. Transfer data from MPI-P cluster to any of the MPCDF clusters

On the MPI-P cluster you first have to create .tar files of the data you wish to transfer. Put at most approx. 500 GB into each .tar file; otherwise creating the files takes too long and is prone to errors. Do the following:
- open a new screen on the cluster by typing screen -S meaningfulScreenName
- start the job with srun, e.g. srun tar -cf NameOfTarFileToBeCreated.tar NameOfFolder1 NameOfFolder2 NameOfFolder3. The NameOfFolder* placeholders need to be replaced with the actual names of the folders you want to pack. You can also tar only one folder or more than three. You may want to use a loop if you wish to store many different folders in one .tar file.
- detach from the screen and wait until the folders have been packed
- when the .tar file is ready, check it with: tar -tf nameOfTarFile.tar > /dev/null; echo $?
- the result should be 0 if everything has been tarred properly
- run an md5sum command on your .tar file and save the resulting checksum somewhere (e.g. md5sum nameOfTarFile.tar)
- now transfer the file with the following command (here shown for psycl, but you may replace it with pirol):

scp -o "ProxyJump yourUserName@gate1.mpcdf.mpg.de" nameOfTarFile.tar yourUserName@psycl01.bc.rzg.mpg.de:/your/destination/folder

Make sure to replace all yourUserName placeholders and the nameOfTarFile.tar placeholder with the actual names.

Once the file has arrived at psycl (or pirol), run another md5sum command on the file and check that the checksum is the same as the one you saved before.
Then unpack your file with tar -xvf nameOfTarFile.tar and delete the .tar file from both clusters as well as the original un-tarred folder you moved.
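
The pack, check, transfer and unpack sequence can be rehearsed locally before you touch real data. The sketch below runs entirely in a scratch directory and uses cp as a stand-in for the scp transfer; all folder and file names are hypothetical, and it assumes the md5sum tool from GNU coreutils, as found on Linux clusters:

```shell
#!/bin/sh
# Self-contained rehearsal in a scratch directory; nothing outside it is touched.
WORK=$(mktemp -d)
mkdir -p "$WORK/src/Folder1" "$WORK/src/Folder2" "$WORK/dest"
echo "alpha" > "$WORK/src/Folder1/a.txt"
echo "beta"  > "$WORK/src/Folder2/b.txt"

# 1. Pack the folders (on the cluster: prefix with srun inside a screen session).
tar -cf "$WORK/demo.tar" -C "$WORK/src" Folder1 Folder2

# 2. Integrity check: exit status 0 means the archive could be read to the end.
tar -tf "$WORK/demo.tar" > /dev/null
echo "tar check exit status: $?"

# 3. Save the checksum in a file next to the archive.
( cd "$WORK" && md5sum demo.tar > demo.tar.md5 )

# 4. Stand-in for the scp transfer: copy archive and checksum file to "dest".
cp "$WORK/demo.tar" "$WORK/demo.tar.md5" "$WORK/dest/"

# 5. On the receiving side, md5sum -c compares the file with the saved checksum.
( cd "$WORK/dest" && md5sum -c demo.tar.md5 )

# 6. Unpack only after the checksum has been confirmed.
( cd "$WORK/dest" && tar -xf demo.tar )
```

Saving the checksum in a file and using md5sum -c avoids comparing the long hash by eye.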

7. Working with the cluster

7.1 Submitting a job

In general, to submit a job on the cluster you should use the batch system (.sh files). Your job will then be allocated to one or several nodes, depending on the job's demands and the availability of the nodes. To run a simple script (without parallelization) you can use the following as a template:


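#!/bin/bash
# memory per CPU, in megabytes
#SBATCH --mem-per-cpu=20000
#SBATCH --job-name=YourJobName
#SBATCH --output=../YourOutfileName.out
#SBATCH --error=../YourErrorFileName.err

#~~~~~  adjust as needed ~~~~~~~~~~~~~
BASEDIR=/Your/Base/Dir
PROJECT=Your/Project/Dir    # relative to BASEDIR
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# load your shell settings and make matlab available
source ~/.bashrc
module load matlab

# stop early if the working directory does not exist
WORKINGPATH=${BASEDIR}/${PROJECT}
cd "${WORKINGPATH}" || exit 1

# run without a display and quit matlab when the script has finished,
# otherwise the job keeps waiting at the matlab prompt
matlab -nodisplay -r "NameOfYourMatlabScript; exit"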

7.2 Use parallel processing on the cluster

If you want to run several computations in parallel, e.g. a first-level analysis for a sample of n=500 subjects, you can start all 500 jobs at the same time by using the batch system. This is much more efficient than starting 500 jobs individually. For parallel processing you need a scheme that translates the task ID (only a number) into your subject's name/folder. There are many different ways to do this. Here I will show my way, but you may prefer another solution. To keep track of each subject's progress and potential errors I have a table that I keep updating. It looks like this:

When using parallel processing I first create a task ID, which is basically an array of numbers from 1 to n of my sample. An example shell script for this may look like this:

#!/bin/sh
sbatch --array=1-15 example.sh

This shell script calls another shell script called example.sh, which starts the actual jobs, one for each subject. As output you will get an error file and an output file for each subject. It is advisable to print text after important points of your computation, or to print error statements that you can look for in your output file. If some subjects do not run, these "check points" are useful hints for debugging the code.

#!/bin/bash
#SBATCH --mem-per-cpu=20000 
#SBATCH --job-name=exampleScript
#SBATCH --output=ID-%j.out
#SBATCH --error=ID-%j.err
source ~/.bashrc
module load matlab

cd /psycl/u/erhartm/examples
echo "slurm task id = $SLURM_ARRAY_TASK_ID"
matlab -nodisplay -r "A_function_for_single_sub($SLURM_ARRAY_TASK_ID); exit"
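
One simple way to translate the task ID into a subject name is a plain text file with one subject per line, where line N belongs to task N. This is only a sketch of that idea and not necessarily the table-based approach described above; the file name subjects.txt, the subject labels and the helper subject_for_task are hypothetical:

```shell
#!/bin/sh
# Hypothetical lookup: line N of the subject list holds the name for task N.
# $1 = task ID (1-based), $2 = path to the subject list
subject_for_task() {
    sed -n "${1}p" "$2"
}

# Demo list in a scratch directory; in a real job it would sit next to example.sh.
LIST="$(mktemp -d)/subjects.txt"
printf '%s\n' sub-001 sub-002 sub-003 > "$LIST"

# Inside a Slurm array job SLURM_ARRAY_TASK_ID is set for you;
# outside of Slurm this demo falls back to task 2.
TASK_ID="${SLURM_ARRAY_TASK_ID:-2}"
SUBJECT=$(subject_for_task "$TASK_ID" "$LIST")
echo "task $TASK_ID -> $SUBJECT"

# The array range can be derived from the list length instead of hard-coding it.
N=$(grep -c '' "$LIST")
echo "sbatch --array=1-$N example.sh"
```

Inside example.sh you could then pass the subject name, rather than the bare number, on to your analysis function.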

7.3 Debug in interactive mode

If you have trouble identifying a bug in your code, you may want to debug interactively and "see" what you are doing or what a variable looks like before the error occurs. If you have added the two lines ForwardAgent yes and ForwardX11 yes to your ssh config, you can open matlab with a graphical interface and use it as on a local machine. For this you need an X server on your local machine: XQuartz on a Mac, or e.g. Xming on Windows.

Log in to one of the clusters
type module load matlab (or module load R)
type matlab (or R)
be a bit patient. A window will open with the usual matlab interface.

8. Copy data from/to cluster

To copy data from the cluster to your local machine use: rsync -rauP psycl01.bc.rzg.mpg.de:/psycl/path/to/folder/or/file /destination/on/your/computer

To copy data from your local machine to the cluster use: rsync -rauP /path/to/folder/or/file/to/copy psycl01.bc.rzg.mpg.de:/psycl/destination/folder

Make sure to replace the source and destination paths with your actual path/file names, and to replace the psycl host with the one for pirol (pirol01.hpccloud.mpcdf.mpg.de) if you work there.
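
Two rsync details are worth trying out locally before copying large folders: the -n option (dry run), which only lists what would be transferred, and the trailing slash on the source path, which decides whether the folder itself or only its contents are copied. The sketch below uses scratch directories only; all paths are hypothetical:

```shell
#!/bin/sh
# Local demo of rsync's dry run and trailing-slash rule in a scratch directory.
DEMO=$(mktemp -d)
mkdir -p "$DEMO/data" "$DEMO/with_slash" "$DEMO/without_slash"
echo "x" > "$DEMO/data/file.txt"

if command -v rsync >/dev/null 2>&1; then
    # -n (dry run): show what would be transferred without copying anything.
    rsync -rauPn "$DEMO/data/" "$DEMO/with_slash"

    # "data/" copies the contents of data straight into the destination ...
    rsync -rau "$DEMO/data/" "$DEMO/with_slash"
    # ... while "data" copies the folder itself, creating without_slash/data.
    rsync -rau "$DEMO/data" "$DEMO/without_slash"

    # Show where the copies ended up.
    find "$DEMO" -name file.txt
else
    echo "rsync is not installed on this machine"
fi
```

The same rule applies when one side is a cluster path such as psycl01.bc.rzg.mpg.de:/psycl/path/to/folder.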

9. Useful resources

If you face an obstacle, there are many well-documented resources you can check:

And if you are really stuck you can e-mail the MPCDF helpdesk. Login credentials are the same as for the clusters.
