This markdown document is an introduction to the MPCDF clusters available to MPI-P users.
1. Disclaimer
2. Overview of the cluster structure
3. How to get your mpcdf account
4. First login to MPCDF
5. Create an alias for pirol/psycl login
6. Transfer data from MPI-P cluster to any of the MPCDF clusters (and use FileZilla for easy up- and downloads)
7. Working with the cluster
8. Copy data from/to cluster
9. Useful resources
- Most of the code is written for Mac (Unix) users and may need some adjustments for Windows users.
- The document provides suggestions for how things can be done when using the cluster. Of course, there are other ways to use and interact with it.
Psycl will be discontinued in December 2024; all users will then have to use pirol. The nexus storage is accessible from both sides, so it is advisable to store your data on nexus until the folder structure on pirol is ready.
You can apply for access to any of the MPCDF clusters following this link. Apply for an account there and additionally e-mail Benno Pütz to get assigned to one or more groups. For instance, you may wish to have access to /g/mpsnmr
(neuroimaging core unit). Additional permissions can be granted later, too; you don't have to know all potentially relevant folder structures when applying for an account.
Once your request has been approved you will receive two e-mails, one with your user name and one with a password. You have to change the password and set up two-factor authentication (e.g. with Google Authenticator). For this follow the instructions in the e-mails.
- open a terminal on your local machine
- type
ssh -XA yourUserName@gate1.mpcdf.mpg.de
(don't forget to replace 'yourUserName' with your actual user name)
- you will be asked for your password
- you will be asked for your OTP
- once both are provided you will arrive at gate1. From here you may connect to either psycl or pirol. Enter one of the following:
ssh -XA yourUserName@psycl01.bc.rzg.mpg.de
OR
ssh -XA yourUserName@pirol01.hpccloud.mpcdf.mpg.de
- don't forget to replace your user name
- upon your first login your home directory will be created automatically (/u/yourUserName)
- a much faster and easier way to connect to the cluster is to create an alias for the login. The next section deals with this.
On your local machine you need to modify (or create) the config file in your ~/.ssh folder.
- Show hidden files by typing:
find /path/to/folder -name ".*"
(Unix) OR
dir /A:H /B /S "C:\path\to\folder"
(Windows)
- Open the config file (e.g. by typing
nano ~/.ssh/config
) and add the lines below for psycl:
# This is ~/.ssh/config
Host *
ServerAliveInterval 10
ServerAliveCountMax 5
Host psycl01.bc.rzg.mpg.de
User yourUserName
ProxyCommand ssh -W %h:%p gate1.mpcdf.mpg.de 2>/dev/null
GSSAPIAuthentication yes
GSSAPIDelegateCredentials yes
ForwardAgent yes
ForwardX11 yes
Host gate1.mpcdf.mpg.de
User yourUserName
GSSAPIAuthentication yes
GSSAPIDelegateCredentials yes
ControlMaster auto
ControlPath ~/.ssh/control:%h:%p:%r
ForwardAgent yes
ForwardX11 yes
- or for pirol:
# This is ~/.ssh/config
Host *
ServerAliveInterval 10
ServerAliveCountMax 5
Host pirol01.hpccloud.mpcdf.mpg.de
User yourUserName
ProxyCommand ssh -W %h:%p gate1.mpcdf.mpg.de 2>/dev/null
GSSAPIAuthentication yes
GSSAPIDelegateCredentials yes
ForwardAgent yes
ForwardX11 yes
Host gate1.mpcdf.mpg.de
User yourUserName
GSSAPIAuthentication yes
GSSAPIDelegateCredentials yes
ControlMaster auto
ControlPath ~/.ssh/control:%h:%p:%r
ForwardAgent yes
ForwardX11 yes
- don't forget to replace yourUserName with your actual user name
- If you wish to add both (pirol and psycl) to your list you only need to copy the full block listed above once; for the second host (e.g. pirol) you simply add the middle block (starting with Host pirol...) below that.
- save all changes and close your config file
- Next you have to modify your ~/.bashrc file (or another shell startup file, depending on the system you are using) as follows:
alias psycl='ssh psycl01.bc.rzg.mpg.de'
alias pirol='ssh pirol01.hpccloud.mpcdf.mpg.de'
- you can use any alias you like as long as it is unique. Save and close the file.
- for your next login you simply open a terminal and type the alias (e.g. psycl if you want to connect to psycl). You only need to enter the password and OTP once.
- Once you have set your alias you can also use 'FileZilla' and most likely (though I have not tried it myself) 'PuTTY' for easy uploads in a drag-and-drop manner. For this you first need the corresponding software, e.g. FileZilla, installed on your local machine. Then open an additional terminal and type 'ssh yourUserName@gate1.mpcdf.mpg.de -L 2002:pirol01.hpccloud.mpcdf.mpg.de:22 -N' or 'ssh yourUserName@gate1.mpcdf.mpg.de -L 2002:psycl01.bc.rzg.mpg.de:22 -N' depending on the cluster you wish to connect to. If you are not already logged in on another terminal, you will be asked for your password and OTP. After successfully providing both you have created a tunnel and can switch to FileZilla.
- Add a new host in FileZilla matching the tunnel: host localhost (protocol SFTP), port 2002, with your cluster user name and password.
- Now you can use 'FileZilla' as usual: navigate, view, upload, or download files via the visual interface.
- on the MPI-P cluster you first have to create .tar files of the data you wish to transfer. Put no more than approx. 500 GB into each .tar file; otherwise creating the files takes too long and is prone to errors. Do the following:
- open a new screen on the cluster by typing
screen -S meaningfulScreenName
- start the job with srun, e.g.:
srun tar -cf NameOfTarFileToBeCreated.tar NameOfFolder1 NameOfFolder2 NameOfFolder3
NameOfFolder* needs to be replaced with the actual folder names you want to create a .tar file for. You can also tar only one folder or more than 3. You may want to use a loop if you wish to store many different folders in one .tar file.
- detach from the screen and wait until the folders have been packed
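The loop mentioned above could look like the sketch below. The subj* folder names and dummy files are purely illustrative; on the cluster you would adapt the names and wrap the tar calls in srun inside a screen session, as described above.

```shell
# Sketch: pack several folders into one .tar archive via a loop.
# The subj00* folder names are placeholders - adapt them to your data.
mkdir -p subj001 subj002 subj003
touch subj001/a.txt subj002/b.txt subj003/c.txt   # dummy content

first=1
for d in subj0*; do
    if [ "$first" -eq 1 ]; then
        tar -cf batch01.tar "$d"   # create the archive with the first folder
        first=0
    else
        tar -rf batch01.tar "$d"   # append every further folder
    fi
done

tar -tf batch01.tar > /dev/null; echo $?   # 0 means the archive is intact
```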
- If the .tar file is ready, check the file with:
tar -tf nameOfTarFile.tar > /dev/null; echo $?
- the result should be 0 if everything has been tarred properly
- run an md5sum command for your .tar file and save the resulting code somewhere (e.g. md5sum nameOfTarFile.tar)
- Now transfer the file with the following command (here shown for psycl, but you may replace it with pirol):
scp -o "ProxyJump yourUserName@gate1.mpcdf.mpg.de" nameOfTarFile.tar yourUserName@psycl01.bc.rzg.mpg.de:/your/destination/folder
- make sure to replace all yourUserName placeholders and the nameOfTarFile.tar placeholder with the actual names.
- Once the file has arrived at psycl (or pirol) you should run another md5sum command on the file and check whether the code is the same as the one you had before.
- Then unpack your file with tar -xvf nameOfTarFile.tar and delete the .tar file from both clusters as well as the original untarred folder you moved.
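The checksum comparison from the steps above can be scripted with md5sum's check mode. The sketch below runs locally; a dummy file stands in for the real .tar archive.

```shell
# On the source side: record the checksum next to the archive.
echo "demo content" > nameOfTarFile.tar        # dummy stand-in for a real archive
md5sum nameOfTarFile.tar > nameOfTarFile.tar.md5

# ... transfer both files (e.g. with scp as shown above) ...

# On the destination side: verify the copy against the recorded checksum.
md5sum -c nameOfTarFile.tar.md5                # prints "nameOfTarFile.tar: OK"
```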
In general, to submit a job on the cluster you should use the batch system (.sh files). Your job will then be allocated to one or several nodes depending on the job's demands and the availability of the nodes. To run a simple script (without parallelization) you can use the following as a template:
#!/bin/sh
#SBATCH --mem-per-cpu=20000
#SBATCH --job-name=YourJobName
#SBATCH --output=../YourOutfileName.out
#SBATCH --error=../YourErrorFileName.err
#~~~~~ adjust as needed ~~~~~~~~~~~~~
BASEDIR=/Your/Base/Dir
PROJECT=Your/Project/Dir
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
source ~/.bashrc
module load matlab
WORKINGPATH=${BASEDIR}/${PROJECT}
cd ${WORKINGPATH}
matlab -r "NameOfYourMatlabScript"
If you want to run several computations in parallel, e.g. a first-level analysis for a sample of n=500 subjects, you can start all 500 jobs at the same time by using the batch system. This is much more efficient than starting 500 jobs individually. For parallel processing you have to think of a system that "translates" the task ID (only a number) into your subject's name/folder. There are many different ways this can be done. Here I will show my way, but you may prefer another solution. To keep track of each subject's progress and potential errors I maintain a table that I keep updating.
When using parallel processing I first create the task IDs, which are basically an array of numbers from 1 to n (the sample size). An example shell script for this may look like this:
#!/bin/sh
sbatch --array=1-15 example.sh
This shell script calls another shell script called example.sh
, which starts the actual jobs, one for each subject. As output you will get an error file and an out file for each subject. It is advisable to print text after important points of your computation or to implement printed error statements that you can go through in your out file. In case some subjects do not run, these "check points" are useful hints for debugging the code.
#!/bin/sh
#SBATCH --mem-per-cpu=20000
#SBATCH --job-name=exampleScript
#SBATCH --output=ID-%j.out
#SBATCH --error=ID-%j.err
source ~/.bashrc
module load matlab
cd /psycl/u/erhartm/examples
echo "slurm task id = $SLURM_ARRAY_TASK_ID"
matlab -r "A_function_for_single_sub($SLURM_ARRAY_TASK_ID)"
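The "translation" of the numeric task ID into a subject folder can also happen in the shell script itself, e.g. via a plain text lookup list with one subject per line. In this sketch, subjects.txt and the sub-* names are hypothetical placeholders, and the task ID is set by hand so the snippet runs outside SLURM.

```shell
# Build a lookup list: one subject folder name per line (hypothetical names).
printf 'sub-001\nsub-002\nsub-003\n' > subjects.txt

SLURM_ARRAY_TASK_ID=2   # set automatically by SLURM inside a real array job

# Pick the line whose line number equals the task ID.
SUBJECT=$(sed -n "${SLURM_ARRAY_TASK_ID}p" subjects.txt)
echo "subject for task $SLURM_ARRAY_TASK_ID: $SUBJECT"   # -> sub-002
```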
If you have trouble identifying a bug in your code you may want to use an interactive mode for debugging and "see" what you are doing or what a variable looks like before the error occurs. If you have added the two lines ForwardAgent yes
and ForwardX11 yes
you can open matlab with a visual screen and use it as on a local machine. On Mac you need to have 'XQuartz' installed ('Xming' is the Windows equivalent).
- Login to one of the clusters
- type module load matlab (or module load R)
- type matlab (or R)
- be a bit patient. A screen will open with the usual matlab interface.
To copy data from the cluster to your local machine use:
rsync -rauP psycl01.bc.rzg.mpg.de:/psycl/path/to/folder/or/file /destination/on/your/computer
To copy data from your local machine to the cluster use:
rsync -rauP /path/to/folder/or/file/to/copy psycl01.bc.rzg.mpg.de:/psycl/destination/folder
Make sure to replace the source and destination paths with your actual path/file names, and to replace the psycl settings with the ones for pirol (pirol01.hpccloud.mpcdf.mpg.de) if you use pirol.
If you face an obstacle you can check out many different resources that are very well documented:
- The MPCDF overview page
- The MPCDF overview about gateway machines
- The MPCDF instruction for psycl and pirol
- Our self-made IT-Wiki on github molgen
And if you are really stuck you can e-mail the MPCDF helpdesk. Login credentials are the same as for the clusters.