+ + +Objectives
+Note
+Guides and documentation for the batch system at HPC2N are available in HPC2N’s batch system documentation.
+Using a job script is often recommended.
+Note
+When you submit a job, the system will return the Job ID. You can also get it with squeue --me. See below.
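In the following, JOBSCRIPT is the name you have given your job script and JOBID is the job ID for your job, assigned by Slurm. USERNAME is your username.
+Submit a job: sbatch JOBSCRIPT
List your jobs: squeue -u USERNAME or squeue --me
Run a command or program through Slurm: srun commands-for-your-job/program
Show details about a specific job: scontrol show job JOBID
Cancel a specific job: scancel JOBID
Cancel all your own jobs: scancel -u USERNAME
Request an interactive allocation: salloc -A PROJECT-ID .......
Use srun to run on the allocated resources: srun MYPROGRAM
Get more detailed information about a job: sacct -l -j JOBID -o jobname,NTasks,nodelist,MaxRSS,MaxVMSize
More sacct flags are described in man sacct
The sacct output is very wide. To scroll sideways, pipe it to less: sacct -l -j JOBID -o jobname,NTasks,nodelist,MaxRSS,MaxVMSize | less -S
Usage information for a job: job-usage JOBID
More information: man sbatch, man srun, man ....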
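Several of the commands above need the job ID. As a minimal sketch, the ID can be picked out of the line that sbatch prints on submission. The helper name extract_jobid is hypothetical, and the sketch only parses a sample line so that it runs on a machine without Slurm.

```shell
#!/bin/bash
# Pull the job ID out of the line that sbatch prints on submission.
# Expected input format (as in the examples in this section):
#   Submitted batch job 27774872
extract_jobid() {
    printf '%s\n' "$1" | awk '/^Submitted batch job/ { print $NF }'
}

# On a system with Slurm this could be combined with the commands above:
#   jobid=$(extract_jobid "$(sbatch JOBSCRIPT)")
#   scontrol show job "$jobid"
# Here only a sample line is parsed.
sample="Submitted batch job 27774872"
jobid=$(extract_jobid "$sample")
echo "Job ID: $jobid"
```

sbatch also has a --parsable option that prints the job ID without the surrounding text, which may be simpler in scripts.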
Example
+Submit job with sbatch
Check status with squeue --me
b-an01 [~]$ squeue --me
+ JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
+ 27774852 cpu_zen4 simple.s bbrydsoe R 0:00 1 b-cn1701
+
Submit several jobs (here several instances of the same), check on the status
+b-an01 [~]$ sbatch simple.sh
+Submitted batch job 27774872
+b-an01 [~]$ sbatch simple.sh
+Submitted batch job 27774873
+b-an01 [~]$ sbatch simple.sh
+Submitted batch job 27774874
+b-an01 [~]$ squeue --me
+ JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
+ 27774873 cpu_zen4 simple.s bbrydsoe R 0:02 1 b-cn1702
+ 27774874 cpu_zen4 simple.s bbrydsoe R 0:02 1 b-cn1702
+ 27774872 cpu_zen4 simple.s bbrydsoe CG 0:04 1 b-cn1702
+
The status “R” means it is running. “CG” means completing. When a job is pending it has the state “PD”.
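The state codes can also be read programmatically. The following is a small sketch that translates the three codes mentioned above; the function name describe_state is hypothetical, and the sample lines are copied from the squeue output shown earlier rather than fetched from a live system.

```shell
#!/bin/bash
# Translate the squeue state codes mentioned in this section.
describe_state() {
    case "$1" in
        R)  echo "running" ;;
        CG) echo "completing" ;;
        PD) echo "pending" ;;
        *)  echo "other (see man squeue)" ;;
    esac
}

# Sample lines copied from the squeue output above (header removed).
sample_output='27774873 cpu_zen4 simple.s bbrydsoe R 0:02 1 b-cn1702
27774874 cpu_zen4 simple.s bbrydsoe R 0:02 1 b-cn1702
27774872 cpu_zen4 simple.s bbrydsoe CG 0:04 1 b-cn1702'

# Column 1 is the job ID and column 5 is the state (ST).
printf '%s\n' "$sample_output" | while read -r jobid _ _ _ state _; do
    echo "$jobid: $(describe_state "$state")"
done
```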
+In these examples the jobs all ended up on nodes in the partition cpu_zen4. We will soon talk more about different types of nodes.
+The official name for batch scripts in Slurm is Job Submission Files, but here we will use both names interchangeably. If you search the internet, you will find several other names used, including Slurm submit file, batch submit file, batch script, job script.
+A job submission file can contain any of the commands that you would otherwise issue yourself from the command line. It is, for example, possible to both compile and run a program and also to set any necessary environment variables (though remember that Slurm by default exports the environment variables in your shell, so you can also set them there before submitting the job).
+Note
+The results from compiling or running your programs can generally be seen after the job has completed, though as Slurm will write to the output file during the run, some results will be available quicker.
+Output and any errors will by default be placed in the directory you are running from, though this can be changed.
+Note
+This directory should preferably be placed under your project storage, since your home directory only has 25 GB of space.
+The output file from the job run will by default be named slurm-JOBID.out. It will contain both output and any errors. You can look at the content with vi, nano, emacs, cat, less, …
The exception is if your program creates its own output files, or if you name the output file(s) differently within your jobscript.
+Note
+You can use Slurm directives within your job script to split the error and output into separate files, and name them as you want. It is highly recommended to include the filename pattern %J (which Slurm replaces with the job ID) in the name, as that is an easy way to get a new name each time you run the script and thus avoid overwriting the previous output.
Example, using the filename pattern %J:
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out
A job submission file can either be very simple, with most of the job attributes specified on the command line, or it may consist of several Slurm directives, comments and executable statements. A Slurm directive provides a way of specifying job attributes in addition to the command line options.
+Naming: You can name your script anything, including the suffix. It does not matter. Just name it something that makes sense to you and helps you remember what the script is for. The standard is to name it with a suffix of .sbatch
or .sh
.
Simple, serial job script
+#!/bin/bash
+# The name of the account you are running in, mandatory.
+#SBATCH -A hpc2nXXXX-YYY
+# Request resources - here for a serial job
+# cores per task is 1 by default (can be changed with -c)
+#SBATCH -n 1
+# Request runtime for the job (HHH:MM:SS) where 168 hours is the maximum. Here asking for 15 min.
+#SBATCH --time=00:15:00
+
+# Clear the environment from any previously loaded modules
+module purge > /dev/null 2>&1
+
+# Load the module environment suitable for the job - here foss/2022b
+module load foss/2022b
+
+# And finally run the serial jobs
+./my_serial_program
+
Note
+The script should start with #!/bin/bash at the beginning, since bash is the only supported shell. Some things may work under other shells, but not everything.
All Slurm directives start with #SBATCH.
A # in front of a text line means it is a comment, with the exception of the string #SBATCH. In order to comment out the Slurm directives, you need to put one more # in front of the #SBATCH.
Use capital letters for #SBATCH. Otherwise the line will be considered a comment, and ignored.
Let us go through the most commonly used arguments:
+projinfo. The PROJ-ID argument is of the form
Simple MPI program
+#!/bin/bash
+# The name of the account you are running in, mandatory.
+#SBATCH -A hpc2nXXXX-YYY
+# Request resources - here for eight MPI tasks
+#SBATCH -n 8
+# Request runtime for the job (HHH:MM:SS) where 168 hours is the maximum. Here asking for 15 min.
+#SBATCH --time=00:15:00
+
+# Clear the environment from any previously loaded modules
+module purge > /dev/null 2>&1
+
+# Load the module environment suitable for the job - here foss/2022b
+module load foss/2022b
+
+# And finally run the job - use srun for MPI jobs, but not for serial jobs
+srun ./my_mpi_program
+
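Since only lines starting with #SBATCH in capital letters are read as directives, it can be useful to check which lines in a script are active. This is a minimal sketch using grep; the function name active_directives and the demo script are hypothetical.

```shell
#!/bin/bash
# Print only the lines that start with exactly "#SBATCH"
# (capital letters, a single leading #).
active_directives() {
    grep '^#SBATCH' "$1"
}

# Hypothetical demo script written to a temporary file.
demo=$(mktemp)
cat > "$demo" <<'EOF'
#!/bin/bash
#SBATCH -A hpc2nXXXX-YYY
#SBATCH -n 8
##SBATCH --time=01:00:00
#sbatch --time=00:30:00
#SBATCH --time=00:15:00
srun ./my_mpi_program
EOF

# The commented-out (##SBATCH) and lowercase (#sbatch) lines are skipped.
directives=$(active_directives "$demo")
echo "$directives"
rm -f "$demo"
```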
If you have not already done so, clone the material from the website https://github.com/hpc2n/intro-course:
+/proj/nobackup/intro-hpc2n/
. You will now find several small programs and batch scripts which are used in this section and the next, “Simple examples”.
+In this section, we are just going to try submitting a few jobs, checking their status, cancelling a job, and looking at the output.
+Preparations
+Load the module foss/2022b (ml foss/2022b) on the regular login node. This module is available on all nodes.
hello.c, mpi_hello.c, mpi_greeting.c, and mpi_hi.c
+simple.sh, mpi_greeting.sh, mpi_hello.sh, mpi_hi.sh, multiple-parallel-sequential.sh, multiple-parallel.sh, or multiple-parallel-simultaneous.sh.
Exercise: sbatch and squeue
+Submit (sbatch
) one of the batch scripts listed in 3. under preparations. Check with squeue --me
if it is running, pending, or completing.
Exercise: sbatch and scontrol show job
+Submit a few instances of multiple-parallel.sh
and multiple-parallel-sequential.sh
(so they do not finish running before you have time to check on them).
Do scontrol show job JOBID
on one or more of the job IDs. You should be able to see node assigned (unless the job has not yet had one allocated), expected runtime, etc. If the job is running, you can see how long it has run. You will also get paths to submit directory etc.
Exercise: sbatch and scancel
+Submit a few instances of multiple-parallel.sh
and multiple-parallel-sequential.sh
(so they do not finish running before you have time to check on them).
Do squeue --me
and see the jobs listed. Pick one and do scancel JOBID
on it. Do squeue --me
again to see it is no longer there.
Exercise: check output
+Use nano
to open one of the output files slurm-JOBID.out
.
Try adding #SBATCH --error=job.%J.err
and #SBATCH --output=job.%J.out
to one of the batch scripts (you can edit it with nano
). Submit the batch script again. See that the expected files get created.
As mentioned in the introduction, Kebnekaise is a very heterogeneous system, comprising several different types of CPUs and GPUs. The batch system reflects these different types of resources.
+At the top we have partitions, which are similar to queues. Each partition is made up of a specific set of nodes. At HPC2N we have three classes of partitions, one for CPU-only nodes, one for GPU nodes and one for large memory nodes. Each node type also has a set of features that can be used to select which node(s) the job should run on.
+The three types of nodes also have corresponding resources one must apply for in SUPR to be able to use them.
+While Kebnekaise has multiple partitions, one for each major type of resource, there is only a single partition, batch
, that users can submit jobs to. The system then figures out which partition(s) the job should be sent to, based on the requested features.
Node overview
+The “Type” can be used if you need a specific type of node. More about that later.
+CPU-only nodes
CPU | Memory/core | Number of nodes | Type |
---|---|---|---|
2 x 14 core Intel broadwell | 4460 MB | 48 | broadwell (intel_cpu) |
2 x 14 core Intel skylake | 6785 MB | 52 | skylake (intel_cpu) |
2 x 64 core AMD zen3 | 8020 MB | 1 | zen3 (amd_cpu) |
2 x 128 core AMD zen4 | 2516 MB | 8 | zen4 (amd_cpu) |
GPU enabled nodes
CPU | Memory/core | GPU card | Number of nodes | Type |
---|---|---|---|---|
2 x 14 core Intel broadwell | 9000 MB | 2 x Nvidia A40 | 4 | a40 |
2 x 14 core Intel skylake | 6785 MB | 2 x Nvidia V100 | 10 | v100 |
2 x 24 core AMD zen3 | 10600 MB | 2 x Nvidia A100 | 2 | a100 |
2 x 24 core AMD zen3 | 10600 MB | 2 x AMD MI100 | 1 | mi100 |
2 x 24 core AMD zen4 | 6630 MB | 2 x Nvidia A6000 | 1 | a6000 |
2 x 24 core AMD zen4 | 6630 MB | 2 x Nvidia L40s | 10 | l40s |
2 x 48 core AMD zen4 | 6630 MB | 4 x Nvidia H100 SXM5 | 2 | h100 |
Large memory nodes
CPU | Memory/core | Number of nodes | Type |
---|---|---|---|
4 x 18 core Intel broadwell | 41666 MB | 8 | largemem |
To make it possible to target nodes in more detail there are a couple of features defined on each group of nodes. To select a feature one can use the -C
option to sbatch
or salloc
. This sets constraints on the job.
There are several reasons why one might want to do that, including for benchmarks, to be able to replicate results (in some cases), because specific modules are only available for certain architectures, etc.
+To constrain a job to a certain feature, use
+ +Note
+Features can be combined using “and” (&) or “or” (|). They should be wrapped in single quotes (').
Example:
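A minimal sketch of a job script using a combined constraint, assuming the type names zen3 and zen4 from the node overview tables can be used as features. The account string is the same placeholder used elsewhere in this section, and the body only prints the node name so the script also runs outside the batch system.

```shell
#!/bin/bash
#SBATCH -A hpc2nXXXX-YYY
#SBATCH -n 1
#SBATCH --time=00:05:00
# Run on either type of AMD CPU node ("or" is written |, wrapped in quotes)
#SBATCH -C 'zen3|zen4'

# uname -n prints the name of the machine the script runs on
node_name=$(uname -n)
echo "This job ran on: $node_name"
```

The same constraint can be given on the command line instead, for instance sbatch -C 'zen3|zen4' JOBSCRIPT.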
+ +List of constraints:
+For selecting type of CPU
+Type is:
+For selecting type of GPU
+Type is:
+For GPUs, the above GPU list of constraints can be used either as a specifier to --gpus=type:number or as a constraint together with an unspecified GPU request --gpus=number.
For selecting GPUs with certain features
+Type is:
+For selecting large memory nodes
+Type is:
+Nodes with a combination of features: a Zen4 CPU and a GPU with AI features
+ +To use GPU resources one has to explicitly ask for one or more GPUs. Requests for GPUs can be done either in total for the job or per node of the job.
+ + +Asking for a specific type of GPU
+As mentioned before, for GPUs, constraints can be used either as a specifier to
+--gpus=type:number
or as a constraint together with an unspecified GPU request
+--gpus=number
where Type is, as mentioned:
+Simple GPU Job - V100
+#!/bin/bash
+#SBATCH -A hpc2nXXXX-YYY
+# Expected time for job to complete
+#SBATCH --time=00:10:00
+# Number of GPU cards needed. Here asking for 2 V100 cards
+#SBATCH --gpus=v100:2
+
+# Clear the environment from any previously loaded modules
+module purge > /dev/null 2>&1
+# Load modules needed for your program - here fosscuda/2021b
+ml fosscuda/2021b
+
+./my-gpu-program
+
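The other form mentioned above, a constraint together with an untyped GPU request, might look like the following sketch. It uses the --gpus spelling from the sbatch manual and assumes that l40s from the node overview table is accepted as a constraint; the nvidia-smi call is guarded so the script also runs on a machine without NVIDIA tools.

```shell
#!/bin/bash
#SBATCH -A hpc2nXXXX-YYY
#SBATCH --time=00:10:00
# Ask for one GPU without naming a type...
#SBATCH --gpus=1
# ...and let the constraint pick the node type (assumed here: l40s)
#SBATCH -C l40s

# nvidia-smi -L lists the GPUs visible to the job.
if command -v nvidia-smi >/dev/null 2>&1; then
    gpu_report=$(nvidia-smi -L 2>&1)
else
    gpu_report="nvidia-smi not found (probably not on a GPU node)"
fi
echo "$gpu_report"
```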
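Important
+The course project is hpc2n2024-084. To use it in a batch job, add this to the batch script: #SBATCH -A hpc2n2024-084
The course project storage is /proj/nobackup/intro-hpc2n.
Keypoints
+Submit a job with sbatch SUBMIT-SCRIPT.
List your jobs with squeue --me.
For parallel programs, put srun in front of your executable in the batch script (unless you use software which handles the parallelization itself).
Objectives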
+There are compilers available for Fortran 77, Fortran 90, Fortran 95, C, and C++. The compilers can produce both general-purpose code and architecture-specific optimized code to improve performance (loop-level optimizations, inter-procedural analysis and cache optimizations).
+Note
+You need to load a compiler suite (and possibly libraries, depending on what you need) before you can compile and link.
+Use ml av
to get a list of available compiler toolchains
as mentioned in the modules - compiler toolchains section.
You load a compiler toolchain the same way you load any other module. They are always available directly, without the need to load prerequisites first.
+Hint
+Code-along!
+Example: Loading foss/2023b
+This compiler toolchain contains: GCC/13.2.0
, BLAS
(with LAPACK
), ScaLAPACK
, and FFTW
.
b-an01 [~]$ ml foss/2023b
+b-an01 [~]$ ml
+
+Currently Loaded Modules:
+ 1) snicenvironment (S) 7) numactl/2.0.16 13) libevent/2.1.12 19) FlexiBLAS/3.3.1
+ 2) systemdefault (S) 8) XZ/5.4.4 14) UCX/1.15.0 20) FFTW/3.3.10
+ 3) GCCcore/13.2.0 9) libxml2/2.11.5 15) PMIx/4.2.6 21) FFTW.MPI/3.3.10
+ 4) zlib/1.2.13 10) libpciaccess/0.17 16) UCC/1.2.0 22) ScaLAPACK/2.2.0-fb
+ 5) binutils/2.40 11) hwloc/2.9.2 17) OpenMPI/4.1.6 23) foss/2023b
+ 6) GCC/13.2.0 12) OpenSSL/1.1 18) OpenBLAS/0.3.24
+
+ Where:
+ S: Module is Sticky, requires --force to unload or purge
+
+
+
+b-an01 [~]$
+
Note
+OpenMP: All compilers have this included, so it is enough to load the module for a specific compiler toolchain and then add the appropriate flag.
+Note
+If you do not name the executable (with the flag -o SOMENAME), it will be named a.out by default.
This also means that the next time you compile something, if you also do not name that executable, it will overwrite the previous a.out file.
Language | Compiler name | MPI |
---|---|---|
Fortran77 | gfortran | mpif77 |
Fortran90 | gfortran | mpif90 |
Fortran95 | gfortran | N/A |
C | gcc | mpicc |
C++ | g++ | mpiCC |
In order to access the MPI compilers, load a compiler toolchain which contains an MPI library.
+Hint
+Code-along!
+Example: compiling a C program
+You can find the file hello.c
in the exercises directory, in the subdirectory “simple”. Or you can download it here: hello.c.
In this example we compile the C program hello.c
and name the output (the executable) hello
.
You can run the executable with ./hello
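The compile-and-run step can be sketched as follows. The C source written here is a stand-in and not the course's hello.c, and the script falls back to a message if gcc is not available (for instance because no compiler toolchain has been loaded).

```shell
#!/bin/bash
# Write a small stand-in C program to a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/hello.c" <<'EOF'
#include <stdio.h>
int main(void) {
    printf("Hello World\n");
    return 0;
}
EOF

# Compile with gcc and name the executable "hello" using -o.
if command -v gcc >/dev/null 2>&1 &&
   gcc "$workdir/hello.c" -o "$workdir/hello" 2>/dev/null; then
    result=$("$workdir/hello")
else
    result="could not compile; load a compiler toolchain first (for example ml foss/2023b)"
fi
echo "$result"
```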
Example: compiling an MPI C program
+You can find the file mpi_hello.c
in the exercises directory, in the subdirectory “simple”. Or you can download it here: mpi_hello.c.
In this example we compile the MPI C program mpi_hello.c
and name the output (the executable) mpi_hello
.
You then run it with mpirun mpi_hello
Important
+If you have since loaded a different compiler than the one your program was compiled with, you should recompile your program before running it.
+Exercise
+Try loading foss/2023b
and compiling mpi_hello.c
, then unload the module and instead load the module intel/2023b
and see what happens if you try to run with mpirun mpi_hello
.
Note
+List of commonly used flags:
+Hint
+Code-along!
+Example: compiling an OpenMP C program
+You can find the file omp_hello.c
in the exercises directory, in the subdirectory “simple”. Or you can download it here: omp_hello.c.
In this example we compile the OpenMP C program omp_hello.c
and name the output (executable) omp_hello
.
Note
+You can change the number of threads with export OMP_NUM_THREADS=#threads
Hint
+Code-along!
+Example
+Run the binary omp_hello
that we got in the previous example. Set the number of threads to 4 and then rerun the binary.
b-an01 [~]$ ./omp_hello
+Thread 0 says: Hello World
+Thread 0 reports: the number of threads are 1
+b-an01 [~]$ export OMP_NUM_THREADS=4
+b-an01 [~]$ ./omp_hello
+Thread 1 says: Hello World
+Thread 0 says: Hello World
+Thread 0 reports: the number of threads are 4
+Thread 3 says: Hello World
+Thread 2 says: Hello World
+b-an01 [~]$
+
Exercise
+Try yourself! Rerun with OMP_NUM_THREADS set to 1, 2, 4, 8.
+NOTE: Normally you are not supposed to run programs directly on the login node, but these are very short and light-weight programs.
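One way to work through the thread counts is a small loop. This sketch compiles a stand-in OpenMP program (not the course's omp_hello.c) with the GCC flag -fopenmp and reruns it with OMP_NUM_THREADS set to 1, 2, 4 and 8. It assumes gcc with OpenMP support is on the path and reports "not compiled" otherwise.

```shell
#!/bin/bash
# Stand-in OpenMP program: each thread prints one line.
workdir=$(mktemp -d)
cat > "$workdir/omp_demo.c" <<'EOF'
#include <stdio.h>
#include <omp.h>
int main(void) {
    #pragma omp parallel
    {
        printf("Thread %d says: Hello World\n", omp_get_thread_num());
    }
    return 0;
}
EOF

status="not compiled"
if command -v gcc >/dev/null 2>&1 &&
   gcc -fopenmp "$workdir/omp_demo.c" -o "$workdir/omp_demo" 2>/dev/null; then
    status="compiled"
    for n in 1 2 4 8; do
        # The OpenMP runtime reads the thread count from this variable.
        export OMP_NUM_THREADS=$n
        echo "--- OMP_NUM_THREADS=$n ---"
        "$workdir/omp_demo"
    done
fi
echo "Status: $status"
```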
+Exercise
+You could try with a different toolchain (or version). Remember to unload/purge, load the new toolchain, compile the program again, and then run.
Language | Compiler name | MPI |
---|---|---|
Fortran77 | ifort | mpiifort |
Fortran90 | ifort | mpiifort |
Fortran95 | ifort | N/A |
C | icc | mpiicc |
C++ | icpc | mpiicpc |
In order to access the MPI compilers, load a compiler toolchain which contains an MPI library.
+Example: compiling a C program
+We are again compiling the hello.c
program from before. This time we name the executable hello_intel
to not overwrite the previously created executable.
Note
+List of commonly used flags:
+Using a compiler toolchain by itself is possible but requires a fair bit of manual work, figuring out which paths to add to -I or -L for including files and libraries, and similar.
+To make life as a software builder easier there is a special module available, buildenv
, that can be loaded on top of any toolchain. If it is missing for some toolchain, send a mail to support@hpc2n.umu.se and let us know.
This module defines a large number of environment variables with the relevant settings for the used toolchain. Among other things it sets CC, CXX, F90, FC, MPICC, MPICXX, MPIF90, CFLAGS, FFLAGS, and much more.
+To see all of them, after loading a toolchain do:
+ +To use the environment variables, load buildenv:
+ +Using the environment variable (prefaced with $) for linking is highly recommended!
+Example
+Linking with LAPACK (gcc, C program).
+ +OR use the environment variable $LIBLAPACK
:
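As a hedged sketch of the second variant, the link command can be assembled from the variables that buildenv sets. The function name lapack_link_command is hypothetical, and the fallback value -llapack is only an assumption for machines where buildenv is not loaded. The command is printed rather than executed, so the sketch runs anywhere.

```shell
#!/bin/bash
# Build (but do not run) a link command for a C program that uses LAPACK.
# CC and LIBLAPACK are expected to come from the buildenv module.
lapack_link_command() {
    src=$1
    exe=${src%.c}    # strip the .c suffix to get the executable name
    echo "${CC:-gcc} -o $exe $src ${LIBLAPACK:--llapack}"
}

lapack_link_command my_program.c
```

Using the variable instead of a hard-coded library name means the same command keeps working when you switch toolchain.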
Note
+You can see a list of all the libraries on Kebnekaise (June 2024) here: https://docs.hpc2n.umu.se/documentation/compiling/#libraries.
+Keypoints
+To see the environment variables set by buildenv, run ml show buildenv after loading a compiler toolchain.
Objectives
 | Project storage | $HOME | /scratch |
---|---|---|---|
Recommended for batch jobs | Yes | No (size) | Yes |
Backed up | No | Yes | No |
Accessible by batch system | Yes | Yes | Yes (node only) |
Performance | High | High | Medium |
Default readability | Group only | Owner | Owner |
Permissions management | chmod, chgrp, ACL | chmod, chgrp, ACL | N/A for batch jobs |
Notes | Storage your group gets allocated through the storage projects | Your home directory | Per node |
This is your home directory (pointed to by the $HOME variable). It has a quota limit of 25 GB by default. Your home directory is backed up regularly.
Note
+Since the home directory is quite small, it should not be used for most production jobs. These should instead be run from project storage directories.
+To find the path to your home directory, either run pwd
just after logging in, or do the following:
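A minimal sketch, relying only on the $HOME variable mentioned above:

```shell
#!/bin/bash
# $HOME holds the path to your home directory.
home_dir=${HOME:-"(HOME is not set)"}
echo "Home directory: $home_dir"

# To see how much of the quota is in use (can take a while on large
# directories), uncomment the next line:
# du -sh "$HOME"
```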
Project storage is where a project’s members have the majority of their storage. It is applied for through SUPR, as a storage project. While storage projects need to be applied for separately, they are usually linked to a compute project.
+This is where you should keep your data and run your batch jobs from. It offers high performance when accessed from the nodes, making it suitable for data that is accessed from parallel jobs, whereas your home directory (usually) has too little space.
+Project storage is located below /proj/nobackup/
in the directory name selected during the creation of the proposal.
Note
+The project storage is not intended for permanent storage and there is NO BACKUP of /proj/nobackup
.
/proj/nobackup/NAME-YOU-PICKED
Exercise
+Go to the course project storage and create a subdirectory for yourself.
+Now is a good time to prepare the course material and download the exercises. The easiest way to do so is by cloning the whole intro-course repository from GitHub.
+Exercise
+/proj/nobackup/intro-hpc2n
git clone https://github.com/hpc2n/intro-course.git
You will get a directory called intro-course
. Below it you will find a directory called “exercises” where the majority of the exercises for the batch system section is located.
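The two exercises above can be sketched as one script. The variable COURSE_DIR and the use of $USER as the subdirectory name are assumptions made for illustration; on a machine without the course storage the script only prints what it would do.

```shell
#!/bin/bash
# Course project storage (override with the hypothetical COURSE_DIR variable).
course_dir=${COURSE_DIR:-/proj/nobackup/intro-hpc2n}
# Personal subdirectory; using the username is just one possible choice.
my_dir="$course_dir/${USER:-myname}"
repo_url=https://github.com/hpc2n/intro-course.git

if [ -d "$course_dir" ] && [ -w "$course_dir" ]; then
    mkdir -p "$my_dir"
    cd "$my_dir" || exit 1
    # Clone only if it has not been done already.
    [ -d intro-course ] || git clone "$repo_url"
    ls intro-course/exercises
else
    echo "Would run: mkdir -p $my_dir && cd $my_dir && git clone $repo_url"
fi
```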
The size of the storage depends on the allocation. There are small, medium, and large storage projects, each with their own requirements. You can read about this on SUPR. The quota limits apply to the project as a whole; there are no user-level quotas on that space.
+Our recommendation is that you use the project storage instead of /scratch
when working on Compute nodes or Login nodes.
On the computers at HPC2N there is a directory called /scratch
. It is a small local area split between the users using the node and it can be used for saving (temporary) files you create or need during your computations. Please do not save files in /scratch
you don’t need when not running jobs on the machine, and please make sure your job removes any temporary files it creates.
Note
+When anybody needs more space than is available on /scratch, we will remove the oldest/largest files without notice.
More information about the file system, as well as archiving and compressing files, is available in the HPC2N documentation about File Systems.
+Keypoints
+Your home directory is /home/u/username and is pointed to by the environment variable $HOME.
Your project storage is /proj/nobackup/NAME-YOU-PICKED
The course project storage is /proj/nobackup/intro-hpc2n.
This material
+Here you will find the content of the workshop “Introduction to Kebnekaise”.
+You can download the markdown files for the presentation as well as the exercises from https://github.com/hpc2n/intro-course
+git clone https://github.com/hpc2n/intro-course.git in a terminal window
Some useful links:
+Prerequisites
+Content
+Application examples (batch system)
+This course will consist of lectures and type-alongs, as well as a few exercises where you get to try out what you have just learned.
+Instructors
Time | Topic | Activity |
---|---|---|
11:15 | Welcome+Syllabus | |
11:20 | Introduction to Kebnekaise and HPC2N | Lecture |
11:45 | Projects and accounts | Lecture |
11:50 | Logging in & editors | Lecture+exercise |
12:05 | The File System | Lecture+code along |
12:15 | LUNCH BREAK | |
13:15 | The Module System | Lecture+code along+exercise |
13:35 | Compiling | Lecture+code along+exercise |
13:50 | The Batch System | Lecture+code along |
14:10 | Simple Examples | Lecture+exercises |
14:45 | COFFEE BREAK | |
15:00 | Application Examples | Lecture+code along+exercises |
16:40 | Questions+Summary | |
17:00 | END OF COURSE | |
+ +
+Note
+High Performance Computing Center North (HPC2N) is
+HPC2N provides state-of-the-art resources and expertise:
+Primary objective
+To raise the national and local level of HPC competence and transfer HPC knowledge and technology to new users in academia and industry.
+HPC2N is hosted by:
++
Partners:
++ + +
+Funded mainly by Umeå University, with contributions from the other HPC2N partners.
+Involved in several projects and collaborations:
++ +
++ +
++
+Management:
+Application experts:
+Others:
+System and support:
+The current supercomputer at HPC2N. It is a very heterogeneous system.
+Kebnekaise was
+In 2024 Kebnekaise was extended with
+Kebnekaise will be continuously upgraded, as old hardware gets retired.
+Kebnekaise has CPU-only, GPU enabled and large memory nodes.
+The CPU-only nodes are:
+The GPU enabled nodes are:
+The large memory nodes are:
+GPUs can have different types of cores:
GPU Type | CUDA cores / stream processors | TENSOR cores / matrix cores | RT cores |
---|---|---|---|
A40 | 10752 | 336 | |
V100 | 5120 | 640 | |
A100 | 6912 | 432 | |
MI100 | 7680 | 480 | |
A6000 | 10752 | 336 | |
L40S | 18176 | 568 | 142 |
H100 | 16896 | 528 | |
NOTE that just as you cannot really compare CPU cores directly (speed etc.), you also cannot compare CUDA/TENSOR/RT cores directly (more efficient design, faster, etc.).
+Basically four types of storage are available at HPC2N:
+/home/X/Xyz
, $HOME
, ~
/proj/nobackup/abc
$SNIC_TMP
Also
+Compute projects
+To use Kebnekaise, you must be a member of a compute project.
+A compute project contains a certain amount of storage. If more storage is required, you must be a member of a storage project.
+Note
+As Kebnekaise is a local cluster, you need to be affiliated with UmU, IRF, SLU, Miun, or LTU to use it.
+Projects are applied for through SUPR (https://supr.naiss.se).
+I will cover this in more detail in a later section about HPC2N and Kebnekaise.
+What is HPC?
+High Performance Computing (definition)
+“High Performance Computing most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business.”
+ +A problem can be large for two main reasons:
+The former can be remedied by increasing the performance
+The latter by adding more memory / storage
+Two memory models are relevant for HPC:
+
+
The programming model changes when we aim for extra performance and/or memory:
+Complexity grows when we aim for extra performance and/or memory/storage:
When you have your account, you can log in to Kebnekaise. This can be done with any number of SSH clients or with ThinLinc (the easiest option if you need a graphical interface).
+Objectives
+Note
+SSH login node: kebnekaise.hpc2n.umu.se
ThinLinc: kebnekaise-tl.hpc2n.umu.se
ThinLinc Web Access: https://kebnekaise-tl.hpc2n.umu.se:300/
In addition, there is a login node for the AMD-based nodes. We will talk more about this later: kebnekaise-amd.hpc2n.umu.se. For ThinLinc access: kebnekaise-amd-tl.hpc2n.umu.se
ThinLinc is recommended for this course
+ThinLinc: a cross-platform remote desktop server from Cendio AB. Especially useful when you need software with a graphical interface.
+This is what we recommend you use for this course, unless you have a preferred SSH client.
+sudo dpkg -i PATH-TO-FILE/FILE-YOU-DOWNLOADED.deb
kebnekaise-tl.hpc2n.umu.se
. Enter your username.
+You get your first, temporary HPC2N password from this page: HPC2N passwords.
+That page can also be used to reset your HPC2N password if you have forgotten it.
+Note that you are authenticating through SUPR, using that service’s login credentials!
+Warning
+The HPC2N password and the SUPR password are separate! The HPC2N password and your university/department password are also separate!
+Exercise
+Log in to Kebnekaise.
+Exercise: Change your password after first login
+ONLY do this if you have logged in for the first time/are still using the temporary password you got from the HPC2N password reset service!
+Changing password is done using the passwd command:
+ +Use a good password that combines letters of different case. Do not use dictionary words. Avoid using the same password that you also use in other places.
+It will first ask for your current password. Type in that and press enter. Then type in the new password, enter, and repeat. You have changed the password.
+We are not going to transfer any files as part of this course, but you may have to do so as part of your workflow when using Kebnekaise (or another HPC centre) for your research.
+This section will only talk briefly about file transfers. You can find more information and examples on HPC2N’s File transfer documentation.
+SCP (Secure CoPy) is a simple way of transferring files between two machines that use the SSH (Secure SHell) protocol. You may use SCP to connect to any system where you have SSH (log-in) access.
+These examples show how to use scp from the command line. Graphical programs exist for doing scp transfers.
+The command-line scp program should already be installed.
+Remote to local
+Transfer a file from Kebnekaise to your local system, while on your local system
+ +Local to remote
+Transfer a local file to Kebnekaise, while on your local system
+ +Recursive directory copy from a local system to a remote system
+The directory sourcedirectory is here copied as a subdirectory to somedir
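The three transfer directions above can be sketched as scp command lines. This script only prints the commands (a dry run), so it can be tried without network access. The helper names, the file names, and the REMOTE_USER variable are hypothetical; replace USERNAME with your own HPC2N username before running the printed commands.

```shell
#!/bin/bash
remote_host=kebnekaise.hpc2n.umu.se
remote_user=${REMOTE_USER:-USERNAME}

# Remote to local: copy one file from Kebnekaise to the current directory
fetch_cmd() { echo "scp $remote_user@$remote_host:$1 ."; }
# Local to remote: copy one local file to a directory on Kebnekaise
push_cmd() { echo "scp $1 $remote_user@$remote_host:$2"; }
# Recursive directory copy (-r) from the local system to Kebnekaise
push_dir_cmd() { echo "scp -r $1 $remote_user@$remote_host:$2"; }

fetch_cmd results/output.txt
push_cmd input.dat somedir/
push_dir_cmd sourcedirectory/ somedir/
```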
SFTP (SSH File Transfer Protocol or sometimes called Secure File Transfer Protocol) is a network protocol that provides file transfer over a reliable data stream.
+SFTP is a command-line program on most Unix, Linux, and Mac OS X systems. It is also available as a protocol choice in some graphical file transfer programs.
+Example: From a local system to a remote system
+enterprise-d [~]$ sftp user@kebnekaise.hpc2n.umu.se
+Connecting to kebnekaise.hpc2n.umu.se...
+user@kebnekaise.hpc2n.umu.se's password:
+sftp> put file.c C/file.c
+Uploading file.c to /home/u/user/C/file.c
+file.c 100% 1 0.0KB/s 00:00
+sftp> put -P irf.png pic/
+Uploading irf.png to /home/u/user/pic/irf.png
+irf.png 100% 2100 2.1KB/s 00:00
+sftp>
+
Here you need to download a client: WinSCP, FileZilla (sftp), PSCP/PSFTP, …
+You can transfer with sftp or scp.
+There is documentation in HPC2N’s documentation pages for Windows file transfers.
+Since the editors on a Linux system are different to those you may be familiar with from Windows or macOS, here follows a short overview.
+There are command-line editors and graphical editors. If you are connecting with a regular SSH client, it will be simplest to use a command-line editor. If you are using ThinLinc, you can use command-line editors or graphical editors as you want.
+These are all good editors for using on the command line:
+ +They are all installed on Kebnekaise.
+Of these, vi/vim as well as emacs are probably the most powerful, though the latter is better in a GUI environment. The easiest editor to use if you are not familiar with any of them is nano.
Nano
+Type nano FILENAME on the command line and press Enter. FILENAME is whatever you want to call your file.
If FILENAME is a file that already exists, nano will open the file. If it does not exist, it will be created.
The ^ before the letter-commands means you should press CTRL and then the letter (while keeping CTRL down).
To exit, press CTRL and then x while holding CTRL down (this is written CTRL-x or ^x). nano will ask you if you want to save the content of the buffer to the file. After that it will exit.
There is a manual for nano here.
If you are connecting with ThinLinc, you will be presented with a graphical user interface (GUI).
+From there you can either
+open a terminal window (Applications -> System Tools -> MATE Terminal)
or choose an editor under Applications -> Accessories. This gives several editor options, of which these have a graphical interface:
If you are not familiar with any of these, a good recommendation would be to use Text Editor/gedit.
Text Editor/gedit
+Start “gedit”: Applications -> Accessories -> Text Editor.
Open a file with “Open” in the top menu.
Save with “Save” in the menu.
Search with “Find” and “Find and Replace”.
Keypoints
+Objectives
+Most programs are accessed by first loading them as a ‘module’.
+Modules are:
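+See which modules exist: module spider or ml spider
See which versions exist of a specific module: module spider MODULE or ml spider MODULE
See prerequisites and how to load a specific version: module spider MODULE/VERSION or ml spider MODULE/VERSION
List modules that can be loaded directly: module avail or ml av
See which modules are currently loaded: module list or ml
Load a module: module load MODULE or ml MODULE
Load a specific version of a module: module load MODULE/VERSION or ml MODULE/VERSION
Unload a module: module unload MODULE or ml -MODULE
Show more information about a module: ml show MODULE or module show MODULE
Unload all modules except the sticky ones: module purge or ml purge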
Important!
+Not all the modules (and versions) are the same on the skylake/broadwell nodes and the zen3/zen4 nodes.
+The regular login node kebnekaise.hpc2n.umu.se has the modules available on skylake/broadwell nodes. (ThinLinc: kebnekaise-tl.hpc2n.umu.se)
In order to check if a module is available on the zen3/zen4 nodes, log in to kebnekaise-amd.hpc2n.umu.se. (ThinLinc: kebnekaise-amd-tl.hpc2n.umu.se).
Hint
+Code-along!
+b-an01 [~]$ ml spider Python
+
+---------------------------------------------------------------------------------------------------------
+ Python:
+---------------------------------------------------------------------------------------------------------
+ Description:
+ Python is a programming language that lets you work more quickly and integrate your systems more effectively.
+
+ Versions:
+ Python/2.7.15
+ Python/2.7.16
+ Python/2.7.18-bare
+ Python/2.7.18
+ Python/3.7.2
+ Python/3.7.4
+ Python/3.8.2
+ Python/3.8.6
+ Python/3.9.5-bare
+ Python/3.9.5
+ Python/3.9.6-bare
+ Python/3.9.6
+ Python/3.10.4-bare
+ Python/3.10.4
+ Python/3.10.8-bare
+ Python/3.10.8
+ Python/3.11.3
+ Python/3.11.5
+ Other possible modules matches:
+ Biopython Boost.Python Brotli-python GitPython IPython Python-bundle-PyPI flatbuffers-python ...
+
+---------------------------------------------------------------------------------------------------------
+ To find other possible module matches execute:
+
+ $ module -r spider '.*Python.*'
+
+---------------------------------------------------------------------------------------------------------
+ For detailed information about a specific "Python" package (including how to load the modules) use the module's full name.
+ Note that names that have a trailing (E) are extensions provided by other modules.
+ For example:
+
+ $ module spider Python/3.11.5
+---------------------------------------------------------------------------------------------------------
+
+
+
+b-an01 [~]$
+
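With a version list as long as the one above, it can help to pick out the newest full version automatically. The following is a minimal sketch and not part of the course material: `newest_version` is a hypothetical helper name, the sample list is a subset copied from the listing above, and `sort -V` assumes GNU coreutils.

```bash
#!/bin/bash
# newest_version (hypothetical helper, not part of the module system):
# reads MODULE/VERSION names on standard input, one per line, drops the
# "-bare" variants, and prints the entry with the highest version number.
newest_version() {
    grep -v -- '-bare$' | sort -V | tail -n 1
}

# Subset of the versions shown in the "ml spider Python" listing.
versions="Python/2.7.18
Python/3.9.6
Python/3.10.8-bare
Python/3.10.8
Python/3.11.3
Python/3.11.5"

printf '%s\n' "$versions" | newest_version   # prints Python/3.11.5
```

On a real system the list would have to come from the module system itself. Since the exact output format can differ between installations, check it by hand before relying on a script like this.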
b-an01 [~]$ ml spider Python/3.11.5
+
+---------------------------------------------------------------------------------------------------------
+ Python: Python/3.11.5
+---------------------------------------------------------------------------------------------------------
+ Description:
+ Python is a programming language that lets you work more quickly and integrate your systems more effectively.
+
+ You will need to load all module(s) on any one of the lines below before the "Python/3.11.5" module is available to load.
+
+ GCCcore/13.2.0
+
+ This module provides the following extensions:
+
+ flit_core/3.9.0 (E), packaging/23.2 (E), pip/23.2.1 (E), setuptools-scm/8.0.4 (E), setuptools/68.2.2 (E), tomli/2.0.1 (E), typing_extensions/4.8.0 (E), wheel/0.41.2 (E)
+
+ Help:
+ Description
+ ===========
+ Python is a programming language that lets you work more quickly and integrate your systems more effectively.
+
+ More information
+ ================
+ - Homepage: https://python.org/
+
+
+ Included extensions
+ ===================
+ flit_core-3.9.0, packaging-23.2, pip-23.2.1, setuptools-68.2.2, setuptools-
+ scm-8.0.4, tomli-2.0.1, typing_extensions-4.8.0, wheel-0.41.2
+
+
+
+
+
+b-an01 [~]$
+
Here we also show the loaded modules before and after the load. For illustration, we first use ml and then module list:
b-an01 [~]$ ml
+
+Currently Loaded Modules:
+ 1) snicenvironment (S) 2) systemdefault (S)
+
+ Where:
+ S: Module is Sticky, requires --force to unload or purge
+
+
+
+b-an01 [~]$ module load GCCcore/13.2.0 Python/3.11.5
+b-an01 [~]$ module list
+
+Currently Loaded Modules:
+ 1) snicenvironment (S) 4) zlib/1.2.13 7) ncurses/6.4 10) SQLite/3.43.1 13) OpenSSL/1.1
+ 2) systemdefault (S) 5) binutils/2.40 8) libreadline/8.2 11) XZ/5.4.4 14) Python/3.11.5
+ 3) GCCcore/13.2.0 6) bzip2/1.0.8 9) Tcl/8.6.13 12) libffi/3.4.4
+
+ Where:
+ S: Module is Sticky, requires --force to unload or purge
+
+
+
+b-an01 [~]$
+
Python/3.11.5 (on the regular login node)
In this example we unload the module Python/3.11.5, but not the prerequisite GCCcore/13.2.0. We also look at the output of module list before and after.
b-an01 [~]$ module list
+
+Currently Loaded Modules:
+ 1) snicenvironment (S) 4) zlib/1.2.13 7) ncurses/6.4 10) SQLite/3.43.1 13) OpenSSL/1.1
+ 2) systemdefault (S) 5) binutils/2.40 8) libreadline/8.2 11) XZ/5.4.4 14) Python/3.11.5
+ 3) GCCcore/13.2.0 6) bzip2/1.0.8 9) Tcl/8.6.13 12) libffi/3.4.4
+
+ Where:
+ S: Module is Sticky, requires --force to unload or purge
+
+
+b-an01 [~]$ ml unload Python/3.11.5
+b-an01 [~]$ module list
+
+Currently Loaded Modules:
+ 1) snicenvironment (S) 2) systemdefault (S) 3) GCCcore/13.2.0
+
+ Where:
+ S: Module is Sticky, requires --force to unload or purge
+
+
+
+b-an01 [~]$
+
As you can see, the prerequisite did not get unloaded. This is on purpose, because you may have other things loaded that use the prerequisite.
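To check from a script whether a prerequisite is still loaded after an unload, one option is to inspect the `LOADEDMODULES` environment variable, a colon-separated list that module systems such as Lmod maintain. This is a minimal sketch under that assumption: `is_loaded` is a hypothetical helper, and the value assigned below only illustrates the state shown above (the exact entries for the sticky modules may differ on the real system).

```bash
#!/bin/bash
# is_loaded NAME (hypothetical helper): succeed if NAME, e.g. GCCcore/13.2.0,
# appears as a complete entry in the colon-separated $LOADEDMODULES list.
is_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Illustrative value only; in a real session the module system sets this.
LOADEDMODULES="snicenvironment:systemdefault:GCCcore/13.2.0"

for m in GCCcore/13.2.0 Python/3.11.5; do
    if is_loaded "$m"; then
        echo "$m is loaded"
    else
        echo "$m is not loaded"
    fi
done
```

Wrapping the list in colons before matching means that only complete entries match, so a bare `GCCcore` would not be reported as loaded here.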
+module purge except the ‘sticky’ modules (some needed things for the environment) (on the regular login node)
First we load some modules: here Python 3.11.5, SciPy-bundle, and their prerequisites. We also run module list after loading the modules and after using module purge.
b-an01 [~]$ ml GCC/13.2.0
+b-an01 [~]$ ml Python/3.11.5 SciPy-bundle/2023.11
+b-an01 [~]$ ml list
+
+Currently Loaded Modules:
+ 1) snicenvironment (S) 7) bzip2/1.0.8 13) libffi/3.4.4 19) cffi/1.15.1
+ 2) systemdefault (S) 8) ncurses/6.4 14) OpenSSL/1.1 20) cryptography/41.0.5
+ 3) GCCcore/13.2.0 9) libreadline/8.2 15) Python/3.11.5 21) virtualenv/20.24.6
+ 4) zlib/1.2.13 10) Tcl/8.6.13 16) OpenBLAS/0.3.24 22) Python-bundle-PyPI/2023.10
+ 5) binutils/2.40 11) SQLite/3.43.1 17) FlexiBLAS/3.3.1 23) pybind11/2.11.1
+ 6) GCC/13.2.0 12) XZ/5.4.4 18) FFTW/3.3.10 24) SciPy-bundle/2023.11
+
+ Where:
+ S: Module is Sticky, requires --force to unload or purge
+
+
+
+b-an01 [~]$ ml purge
+The following modules were not unloaded:
+ (Use "module --force purge" to unload all):
+
+ 1) snicenvironment 2) systemdefault
+b-an01 [~]$ ml list
+
+Currently Loaded Modules:
+ 1) snicenvironment (S) 2) systemdefault (S)
+
+ Where:
+ S: Module is Sticky, requires --force to unload or purge
+
+
+
+b-an01 [~]$
+
Note
+module load on the same line. Or you can do them one at a time, as you want.
GCC/13.2.0 and Python/3.11.5. You can now do ml av to see which versions of other modules you want to load, say SciPy-bundle, are compatible. If you know the name of the module you want, you can even start writing module load SciPy-bundle/ and press TAB - the system will then autocomplete to the compatible one(s).
Exercise
+Log in to kebnekaise-amd (this is easily done with ssh kebnekaise-amd from a terminal window on the regular login node). Check whether the available versions of Python differ from those on the regular login node.
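One way to make the comparison in this exercise systematic is to save the version list from each login node to a text file and compare the two files. This is a minimal sketch, assuming one MODULE/VERSION name per line in each file. `only_in_one` is a hypothetical helper, and the two sample lists are invented purely for illustration; they are not the actual contents of either login node.

```bash
#!/bin/bash
# only_in_one FILE1 FILE2 (hypothetical helper): print the entries that occur
# in only one of the two lists. Entries unique to FILE1 start at the left
# margin; entries unique to FILE2 are indented with a tab (comm's layout).
only_in_one() {
    _sorted2="$(mktemp)"
    LC_ALL=C sort "$2" > "$_sorted2"
    LC_ALL=C sort "$1" | LC_ALL=C comm -3 - "$_sorted2"
    rm -f "$_sorted2"
}

# Invented sample lists; replace them with lists saved on the two login nodes.
workdir="$(mktemp -d)"
printf '%s\n' Python/3.10.8 Python/3.11.5 > "$workdir/regular.txt"
printf '%s\n' Python/3.11.5 Python/3.12.3 > "$workdir/amd.txt"

only_in_one "$workdir/regular.txt" "$workdir/amd.txt"
rm -r "$workdir"
```

Both inputs are sorted with the same locale setting because `comm` expects its inputs in the same order that `sort` produced.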
Compiler toolchains load bundles of software that make up a complete environment for compiling or using specific prebuilt software. A toolchain includes some or all of: compiler suite, MPI, BLAS, LAPACK, ScaLAPACK, FFTW, CUDA.
+Some currently available toolchains (check ml av for versions and the full, updated list):
Exercise
+Check which versions of the foss toolchain exist. Load one of them. Check which modules you now have loaded. Remove all the (non-sticky) modules.
Keypoints
+module load MODULE
module unload MODULE
module purge
module spider
module spider MODULE
module spider MODULE/VERSION
module list
More information
+Note
+In order to have an account at HPC2N, you need to be a member of a compute project.
+You can either join a project or apply for one yourself (if you fulfill the requirements).
+There are both storage projects and compute projects. The storage projects are for when the amount of storage included with the compute project is not enough.
+Important
+You cannot have a storage project without a compute project!
+Kebnekaise is only open for local project requests!
+Apply for compute projects in SUPR.
+Info
+After applying on SUPR, the project(s) will be reviewed.
+
+
+2. Pick a compute project to link:
+
+3. Showing linked projects:
+
+4. Members of the storage project after linking:
+
When you have a project or have become a member of a project, you can apply for an account at HPC2N. This is done in SUPR, under “Accounts”: https://supr.naiss.se/account/.
+Your account request will be processed within a week. You will then get an email with information about logging in and links to getting started information.
+More information on the account process can be found on HPC2N’s documentation pages: https://www.hpc2n.umu.se/documentation/access-and-accounts/users
+ +' + escapeHtml(summary) +'
' + noResultsText + '
'); + } +} + +function doSearch () { + var query = document.getElementById('mkdocs-search-query').value; + if (query.length > min_search_length) { + if (!window.Worker) { + displayResults(search(query)); + } else { + searchWorker.postMessage({query: query}); + } + } else { + // Clear results for short queries + displayResults([]); + } +} + +function initSearch () { + var search_input = document.getElementById('mkdocs-search-query'); + if (search_input) { + search_input.addEventListener("keyup", doSearch); + } + var term = getSearchTermFromLocation(); + if (term) { + search_input.value = term; + doSearch(); + } +} + +function onWorkerMessage (e) { + if (e.data.allowSearch) { + initSearch(); + } else if (e.data.results) { + var results = e.data.results; + displayResults(results); + } else if (e.data.config) { + min_search_length = e.data.config.min_search_length-1; + } +} + +if (!window.Worker) { + console.log('Web Worker API not supported'); + // load index in main thread + $.getScript(joinUrl(base_url, "search/worker.js")).done(function () { + console.log('Loaded worker'); + init(); + window.postMessage = function (msg) { + onWorkerMessage({data: msg}); + }; + }).fail(function (jqxhr, settings, exception) { + console.error('Could not load worker.js'); + }); +} else { + // Wrap search in a web worker + var searchWorker = new Worker(joinUrl(base_url, "search/worker.js")); + searchWorker.postMessage({init: true}); + searchWorker.onmessage = onWorkerMessage; +} diff --git a/search/search_index.json b/search/search_index.json new file mode 100644 index 00000000..548a84b2 --- /dev/null +++ b/search/search_index.json @@ -0,0 +1 @@ +{"config":{"indexing":"full","lang":["en"],"min_search_length":3,"prebuild_index":false,"separator":"[\\s\\-]+"},"docs":[{"location":"","text":"Welcome to the course: Introduction to Kebnekaise \u00b6 This material Here you will find the content of the workshop \u201cIntroduction to Kebnekaise\u201d. 
You can download the markdown files for the presentation as well as the exercises from https://github.com/hpc2n/intro-course Click the green \u201cCode\u201d button Either copy the url for the repo under HTTPS and do git clone https://github.com/hpc2n/intro-course.git in a terminal window OR pick \u201cDownload zip\u201d to get a zip file with the content. Some useful links: Documentation about Linux at HPC2N: https://docs.hpc2n.umu.se/tutorials/linuxguide/ Get started guide: https://docs.hpc2n.umu.se/tutorials/quickstart/ Documentation pages at HPC2N: https://docs.hpc2n.umu.se/ Prerequisites Basic knowledge about Linux (if you need a refresher, you could take the course \u201cIntroduction to Linux\u201d which runs immediately before this course. Info and registration here: https://www.hpc2n.umu.se/events/courses/2024/fall/intro-linux ). An account at SUPR and at HPC2N. You should have already been contacted about getting these if you did not have them already. Content This course aims to give a brief but comprehensive introduction to Kebnekaise. You will learn about HPC2N, HPC, and Kebnekaise hardware How to use our systems: Logging in & editors The File System The Module System Compiling and linking The Batch System Simple examples (batch system) Application examples (batch system) This course will consist of lectures and type-alongs, as well as a few exercises where you get to try out what you have just learned. 
Instructors Birgitte Bryds\u00f6, HPC2N Pedro Ojeda-May, HPC2N Preliminary schedule \u00b6 Time Topic Activity 11:15 Welcome+Syllabus 11:20 Introduction to Kebnekaise and HPC2N Lecture 11:45 Projects and accounts Lecture 11:50 Logging in & editors Lecture+exercise 12:05 The File System Lecture+code along 12:15 LUNCH BREAK 13:15 The Module System Lecture+code along+exercise 13:35 Compiling Lecture+code along+exercise 13:50 The Batch System Lecture+code along 14:10 Simple Examples Lecture+exercises 14:45 COFFEE BREAK 15:00 Application Examples Lecture+code along+exercises 16:40 Questions+Summary 17:00 END OF COURSE","title":"Home"},{"location":"#welcome__to__the__course__introduction__to__kebnekaise","text":"This material Here you will find the content of the workshop \u201cIntroduction to Kebnekaise\u201d. You can download the markdown files for the presentation as well as the exercises from https://github.com/hpc2n/intro-course Click the green \u201cCode\u201d button Either copy the url for the repo under HTTPS and do git clone https://github.com/hpc2n/intro-course.git in a terminal window OR pick \u201cDownload zip\u201d to get a zip file with the content. Some useful links: Documentation about Linux at HPC2N: https://docs.hpc2n.umu.se/tutorials/linuxguide/ Get started guide: https://docs.hpc2n.umu.se/tutorials/quickstart/ Documentation pages at HPC2N: https://docs.hpc2n.umu.se/ Prerequisites Basic knowledge about Linux (if you need a refresher, you could take the course \u201cIntroduction to Linux\u201d which runs immediately before this course. Info and registration here: https://www.hpc2n.umu.se/events/courses/2024/fall/intro-linux ). An account at SUPR and at HPC2N. You should have already been contacted about getting these if you did not have them already. Content This course aims to give a brief but comprehensive introduction to Kebnekaise. 
You will learn about HPC2N, HPC, and Kebnekaise hardware How to use our systems: Logging in & editors The File System The Module System Compiling and linking The Batch System Simple examples (batch system) Application examples (batch system) This course will consist of lectures and type-alongs, as well as a few exercises where you get to try out what you have just learned. Instructors Birgitte Bryds\u00f6, HPC2N Pedro Ojeda-May, HPC2N","title":"Welcome to the course: Introduction to Kebnekaise"},{"location":"#preliminary__schedule","text":"Time Topic Activity 11:15 Welcome+Syllabus 11:20 Introduction to Kebnekaise and HPC2N Lecture 11:45 Projects and accounts Lecture 11:50 Logging in & editors Lecture+exercise 12:05 The File System Lecture+code along 12:15 LUNCH BREAK 13:15 The Module System Lecture+code along+exercise 13:35 Compiling Lecture+code along+exercise 13:50 The Batch System Lecture+code along 14:10 Simple Examples Lecture+exercises 14:45 COFFEE BREAK 15:00 Application Examples Lecture+code along+exercises 16:40 Questions+Summary 17:00 END OF COURSE","title":"Preliminary schedule"},{"location":"batch/","text":"The Batch System (SLURM) \u00b6 Objectives Get information about what a batch system is and which one is used at HPC2N. Learn basic commands for the batch system used at HPC2N. How to create a basic batch script. Managing your job: submitting, status, cancelling, checking\u2026 Learn how to allocate specific parts of Kebnekaise: skylake, zen3/zen4, GPUs\u2026 Large/long/parallel jobs must be run through the batch system. Kebnekaise is running Slurm . Slurm is an Open Source job scheduler, which provides three key functions. Keeps track of available system resources. Enforces local system resource usage and job scheduling policies. Manages a job queue, distributing work across resources according to policies. 
In order to run a batch job, you need to create and submit a SLURM submit file (also called a batch submit file, a batch script, or a job script). Note Guides and documentation for the batch system at HPC2N can be found here: HPC2N\u2019s batch system documentation . Basic commands \u00b6 Using a job script is often recommended. If you ask for the resources on the command line, you will wait for the program to run before you can use the window again (unless you can send it to the background with &). If you use a job script you have an easy record of the commands you used, to reuse or edit for later use. Note When you submit a job, the system will return the Job ID. You can also get it with squeue --me . See below. In the following, JOBSCRIPT is the name you have given your job script and JOBID is the job ID for your job, assigned by Slurm. USERNAME is your username. Submit job : sbatch JOBSCRIPT Get list of your jobs : squeue -u USERNAME or squeue --me Give the Slurm commands on the command line : srun commands-for-your-job/program Check on a specific job : scontrol show job JOBID Delete a specific job : scancel JOBID Delete all your own jobs : scancel -u USERNAME Request an interactive allocation : salloc -A PROJECT-ID ....... Note that you will still be on the login node when the prompt returns and you MUST preface with srun to run on the allocated resources. I.e. srun MYPROGRAM Get more detailed info about jobs : sacct -l -j JOBID -o jobname,NTasks,nodelist,MaxRSS,MaxVMSize More flags etc. can be found with man sacct The output will be very wide. To view in a friendlier format, use sacct -l -j JOBID -o jobname,NTasks,nodelist,MaxRSS,MaxVMSize | less -S this makes it sideways scrollable, using the left/right arrow key Web url with graphical info about a job: job-usage JOBID More information: man sbatch , man srun , man .... 
Example Submit job with sbatch b-an01 [ ~ ] $ sbatch simple.sh Submitted batch job 27774852 Check status with squeue --me b-an01 [ ~ ] $ squeue --me JOBID PARTITION NAME USER ST TIME NODES NODELIST ( REASON ) 27774852 cpu_zen4 simple.s bbrydsoe R 0 :00 1 b-cn1701 Submit several jobs (here several instances of the same), check on the status b-an01 [ ~ ] $ sbatch simple.sh Submitted batch job 27774872 b-an01 [ ~ ] $ sbatch simple.sh Submitted batch job 27774873 b-an01 [ ~ ] $ sbatch simple.sh Submitted batch job 27774874 b-an01 [ ~ ] $ squeue --me JOBID PARTITION NAME USER ST TIME NODES NODELIST ( REASON ) 27774873 cpu_zen4 simple.s bbrydsoe R 0 :02 1 b-cn1702 27774874 cpu_zen4 simple.s bbrydsoe R 0 :02 1 b-cn1702 27774872 cpu_zen4 simple.s bbrydsoe CG 0 :04 1 b-cn1702 The status \u201cR\u201d means it is running. \u201cCG\u201d means completing. When a job is pending it has the state \u201cPD\u201d. In these examples the jobs all ended up on nodes in the partition cpu_zen4. We will soon talk more about different types of nodes. Job scripts and output \u00b6 The official name for batch scripts in Slurm is Job Submission Files, but here we will use both names interchangeably. If you search the internet, you will find several other names used, including Slurm submit file, batch submit file, batch script, job script. A job submission file can contain any of the commands that you would otherwise issue yourself from the command line. It is, for example, possible to both compile and run a program and also to set any necessary environment values (though remember that Slurm exports the environment variables in your shell per default, so you can also just set them all there before submitting the job). Note The results from compiling or running your programs can generally be seen after the job has completed, though as Slurm will write to the output file during the run, some results will be available quicker. 
Outputs and any errors will per default be placed in the directory you are running from, though this can be changed. Note This directory should preferably be placed under your project storage, since your home directory only has 25 GB of space. The output file from the job run will by default be named slurm-JOBID.out . It will contain both output as well as any errors. You can look at the content with vi , nano , emacs , cat , less \u2026 The exception is if your program creates its own output files, or if you name the output file(s) differently within your jobscript. Note You can use Slurm commands within your job script to split the error and output in separate files, and name them as you want. It is highly recommended to include the filename pattern %J (the job ID) in the name, as that is an easy way to get a new name each time you run the script and thus avoid the previous output being overwritten. Example, using the filename pattern %J : Error file: #SBATCH --error=job.%J.err Output file: #SBATCH --output=job.%J.out Job scripts \u00b6 A job submission file can either be very simple, with most of the job attributes specified on the command line, or it may consist of several Slurm directives, comments and executable statements. A Slurm directive provides a way of specifying job attributes in addition to the command line options. Naming : You can name your script anything, including the suffix. It does not matter. Just name it something that makes sense to you and helps you remember what the script is for. The standard is to name it with a suffix of .sbatch or .sh . Simple, serial job script #!/bin/bash # The name of the account you are running in, mandatory. #SBATCH -A hpc2nXXXX-YYY # Request resources - here for a serial job # cores per task is 1 as default (can be changed with ``-c``) #SBATCH -n 1 # Request runtime for the job (HHH:MM:SS) where 168 hours is the maximum. Here asking for 15 min. 
#SBATCH --time=00:15:00 # Clear the environment from any previously loaded modules module purge > /dev/null 2 > & 1 # Load the module environment suitable for the job - here foss/2022b module load foss/2022b # And finally run the serial job ./my_serial_program Note You always have to include #!/bin/bash at the beginning of the script, since bash is the only supported shell. Some things may work under other shells, but not everything. All Slurm directives start with #SBATCH . One (or more) # in front of a text line means it is a comment, with the exception of the string #SBATCH . In order to comment out the Slurm directives, you need to put one more # in front of the #SBATCH . It is important to use capital letters for #SBATCH . Otherwise the line will be considered a comment, and ignored. Let us go through the most commonly used arguments: -A PROJ-ID : The project that should be accounted. It is a simple conversion from the SUPR project id. You can also find your project account with the command projinfo . The PROJ-ID argument is of the form hpc2nXXXX-YYY (HPC2N local project) -N : number of nodes. If this is not given, enough will be allocated to fulfill the requirements of -n and/or -c. A range can be given. If you ask for, say, 1-1, then you will get 1 and only 1 node, no matter what you ask for otherwise. It will also ensure that all the processors will be allocated on the same node. -n : number of tasks. -c : cores per task. Request that a specific number of cores be allocated to each task. This can be useful if the job is multi-threaded and requires more than one core per task for optimal performance. The default is one core per task. Simple MPI program #!/bin/bash # The name of the account you are running in, mandatory. #SBATCH -A hpc2nXXXX-YYY # Request resources - here for eight MPI tasks #SBATCH -n 8 # Request runtime for the job (HHH:MM:SS) where 168 hours is the maximum. Here asking for 15 min. 
#SBATCH --time=00:15:00 # Clear the environment from any previously loaded modules module purge > /dev/null 2 > & 1 # Load the module environment suitable for the job - here foss/2022b module load foss/2022b # And finally run the job - use srun for MPI jobs, but not for serial jobs srun ./my_mpi_program Exercises \u00b6 If you have not already done so, clone the material from the website https://github.com/hpc2n/intro-course : Change to the storage area you created under /proj/nobackup/intro-hpc2n/ . Clone the material: git clone https://github.com/hpc2n/intro-course.git Change to the subdirectory with the exercises: cd intro-course/exercises/simple You will now find several small programs and batch scripts which are used in this section and the next, \u201cSimple examples\u201d. In this section, we are just going to try submitting a few jobs, checking their status, cancelling a job, and looking at the output. Preparations Load the module foss/2022b ( ml foss/2022b ) on the regular login node. This module is available on all nodes. Compile the following programs: hello.c , mpi_hello.c , mpi_greeting.c , and mpi_hi.c gcc -o hello hello.c mpicc -o mpi_hello mpi_hello.c mpicc -o mpi_greeting mpi_greeting.c mpicc -o mpi_hi mpi_hi.c If you compiled and named the executables as above, you should be able to submit the following batch scripts directly: simple.sh , mpi_greeting.sh , mpi_hello.sh , mpi_hi.sh , multiple-parallel-sequential.sh , multiple-parallel.sh , or multiple-parallel-simultaneous.sh . Exercise: sbatch and squeue Submit ( sbatch ) one of the batch scripts listed in 3. under preparations. Check with squeue --me if it is running, pending, or completing. Exercise: sbatch and scontrol show job Submit a few instances of multiple-parallel.sh and multiple-parallel-sequential.sh (so they do not finish running before you have time to check on them). Do scontrol show job JOBID on one or more of the job IDs. 
You should be able to see node assigned (unless the job has not yet had one allocated), expected runtime, etc. If the job is running, you can see how long it has run. You will also get paths to submit directory etc. Exercise: sbatch and scancel Submit a few instances of multiple-parallel.sh and multiple-parallel-sequential.sh (so they do not finish running before you have time to check on them). Do squeue --me and see the jobs listed. Pick one and do scancel JOBID on it. Do squeue --me again to see it is no longer there. Exercise: check output Use nano to open one of the output files slurm-JOBID.out . Try adding #SBATCH --error=job.%J.err and #SBATCH --output=job.%J.out to one of the batch scripts (you can edit it with nano ). Submit the batch script again. See that the expected files get created. Using the different parts of Kebnekaise \u00b6 As mentioned under the introduction, Kebnekaise is a very heterogeneous system, comprised of several different types of CPUs and GPUs. The batch system reflects these several different types of resources. At the top we have partitions, which are similar to queues. Each partition is made up of a specific set of nodes. At HPC2N we have three classes of partitions, one for CPU-only nodes, one for GPU nodes and one for large memory nodes. Each node type also has a set of features that can be used to select which node(s) the job should run on. The three types of nodes also have corresponding resources one must apply for in SUPR to be able to use them. While Kebnekaise has multiple partitions, one for each major type of resource, there is only a single partition, batch , that users can submit jobs to. The system then figures out which partition(s) the job should be sent to, based on the requested features. Node overview The \u201cType\u201d can be used if you need a specific type of node. More about that later. 
CPU-only nodes CPU Memory/core number nodes Type 2 x 14 core Intel broadwell 4460 MB 48 broadwell (intel_cpu) 2 x 14 core Intel skylake 6785 MB 52 skylake (intel_cpu) 2 x 64 core AMD zen3 8020 MB 1 zen3 (amd_cpu) 2 x 128 core AMD zen4 2516 MB 8 zen4 (amd_cpu) GPU enabled nodes CPU Memory/core GPU card number nodes Type 2 x 14 core Intel broadwell 9000 MB 2 x Nvidia A40 4 a40 2 x 14 core Intel skylake 6785 MB 2 x Nvidia V100 10 v100 2 x 24 core AMD zen3 10600 MB 2 x Nvidia A100 2 a100 2 x 24 core AMD zen3 10600 MB 2 x AMD MI100 1 mi100 2 x 24 core AMD zen4 6630 MB 2 x Nvidia A6000 1 a6000 2 x 24 core AMD zen4 6630 MB 2 x Nvidia L40s 10 l40s 2 x 48 core AMD zen4 6630 MB 4 x Nvidia H100 SXM5 2 h100 Large memory nodes CPU Memory/core number nodes Type 4 x 18 core Intel broadwell 41666 MB 8 largemem Requesting features \u00b6 To make it possible to target nodes in more detail there are a couple of features defined on each group of nodes. To select a feature one can use the -C option to sbatch or salloc . This sets constraints on the job. There are several reasons why one might want to do that, including for benchmarks, to be able to replicate results (in some cases), because specific modules are only available for certain architectures, etc. To constrain a job to a certain feature, use #SBATCH -C Type Note Features can be combined using \u201cand\u201d ( & ) or \u201cor\u201d ( | ). They should be wrapped in ' \u2019s. Example: #SBATCH -C 'zen3|zen4' List of constraints: For selecting type of CPU Type is: intel_cpu broadwell skylake amd_cpu zen3 zen4 For selecting type of GPU Type is: v100 a40 a6000 a100 l40s h100 mi100 For GPUs, the above GPU list of constraints can be used either as a specifier to --gpu=type:number or as a constraint together with an unspecified gpu request --gpu=number . 
For selecting GPUs with certain features Type is: nvidia_gpu (Any Nvidia GPU) amd_gpu (Any AMD GPU) GPU_SP (GPU with single precision capability) GPU_DP (GPU with double precision capability) GPU_AI (GPU with AI features, like half precisions and lower) GPU_ML (GPU with ML features, like half precisions and lower) For selecting large memory nodes Type is: largemem Examples, constraints \u00b6 Only nodes with Zen4 #SBATCH -C zen4 Nodes with a combination of features: a Zen4 CPU and a GPU with AI features #SBATCH -C 'zen4&GPU_AI' Nodes with either a Zen3 CPU or a Zen4 CPU #SBATCH -C 'zen3|zen4' Examples, requesting GPUs \u00b6 To use GPU resources one has to explicitly ask for one or more GPUs. Requests for GPUs can be done either in total for the job or per node of the job. Ask for one GPU of any kind #SBATCH --gpus=1 Another way to ask for one GPU of any kind #SBATCH --gpus-per-node=1 Asking for a specific type of GPU As mentioned before, for GPUs, constraints can be used either as a specifier to --gpu=type:number or as a constraint together with an unspecified gpu request --gpu=number . #SBATCH --gpus=Type:NUMBER where Type is, as mentioned: v100 a40 a6000 a100 l40s h100 mi100 Simple GPU Job - V100 #!/bin/bash #SBATCH -A hpc2nXXXX-YYY # Expected time for job to complete #SBATCH --time=00:10:00 # Number of GPU cards needed. Here asking for 2 V100 cards #SBATCH --gpu=v100:2 # Clear the environment from any previously loaded modules module purge > /dev/null 2 > & 1 # Load modules needed for your program - here fosscuda/2021b ml fosscuda/2021b ./my-gpu-program Important The course project has the following project ID: hpc2n2024-084 In order to use it in a batch job, add this to the batch script: #SBATCH -A hpc2n2024-084 We have a storage project linked to the compute project: intro-hpc2n . You find it in /proj/nobackup/intro-hpc2n . Remember to create your own directory under it. 
Keypoints To submit a job, you first need to create a batch submit script, which you then submit with sbatch SUBMIT-SCRIPT . You can get a list of your running and pending jobs with squeue --me . Kebnekaise has many different nodes, both CPU and GPU. It is possible to constrain the job to run only on specific types of nodes. If your job is an MPI job, you need to use srun in front of your executable in the batch script (unless you use software which handles the parallelization itself).","title":"The Batch System"},{"location":"batch/#the__batch__system__slurm","text":"Objectives Get information about what a batch system is and which one is used at HPC2N. Learn basic commands for the batch system used at HPC2N. How to create a basic batch script. Managing your job: submitting, status, cancelling, checking\u2026 Learn how to allocate specific parts of Kebnekaise: skylake, zen3/zen4, GPUs\u2026 Large/long/parallel jobs must be run through the batch system. Kebnekaise is running Slurm . Slurm is an Open Source job scheduler, which provides three key functions. Keeps track of available system resources. Enforces local system resource usage and job scheduling policies. Manages a job queue, distributing work across resources according to policies. In order to run a batch job, you need to create and submit a SLURM submit file (also called a batch submit file, a batch script, or a job script). Note Guides and documentation for the batch system at HPC2N can be found here: HPC2N\u2019s batch system documentation .","title":"The Batch System (SLURM)"},{"location":"batch/#basic__commands","text":"Using a job script is often recommended. If you ask for the resources on the command line, you will wait for the program to run before you can use the window again (unless you can send it to the background with &). If you use a job script you have an easy record of the commands you used, to reuse or edit for later use. Note When you submit a job, the system will return the Job ID. 
You can also get it with squeue --me . See below. In the following, JOBSCRIPT is the name you have given your job script and JOBID is the job ID for your job, assigned by Slurm. USERNAME is your username. Submit job : sbatch JOBSCRIPT Get list of your jobs : squeue -u USERNAME or squeue --me Give the Slurm commands on the command line : srun commands-for-your-job/program Check on a specific job : scontrol show job JOBID Delete a specific job : scancel JOBID Delete all your own jobs : scancel -u USERNAME Request an interactive allocation : salloc -A PROJECT-ID ....... Note that you will still be on the login node when the prompt returns and you MUST preface with srun to run on the allocated resources. I.e. srun MYPROGRAM Get more detailed info about jobs : sacct -l -j JOBID -o jobname,NTasks,nodelist,MaxRSS,MaxVMSize More flags etc. can be found with man sacct The output will be very wide. To view in a friendlier format, use sacct -l -j JOBID -o jobname,NTasks,nodelist,MaxRSS,MaxVMSize | less -S this makes it sideways scrollable, using the left/right arrow key Web url with graphical info about a job: job-usage JOBID More information: man sbatch , man srun , man .... 
Example Submit job with sbatch b-an01 [~]$ sbatch simple.sh Submitted batch job 27774852 Check status with squeue --me b-an01 [~]$ squeue --me JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) 27774852 cpu_zen4 simple.s bbrydsoe R 0:00 1 b-cn1701 Submit several jobs (here several instances of the same), check on the status b-an01 [~]$ sbatch simple.sh Submitted batch job 27774872 b-an01 [~]$ sbatch simple.sh Submitted batch job 27774873 b-an01 [~]$ sbatch simple.sh Submitted batch job 27774874 b-an01 [~]$ squeue --me JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON) 27774873 cpu_zen4 simple.s bbrydsoe R 0:02 1 b-cn1702 27774874 cpu_zen4 simple.s bbrydsoe R 0:02 1 b-cn1702 27774872 cpu_zen4 simple.s bbrydsoe CG 0:04 1 b-cn1702 The status \u201cR\u201d means it is running. \u201cCG\u201d means completing. When a job is pending it has the state \u201cPD\u201d. In these examples the jobs all ended up on nodes in the partition cpu_zen4. We will soon talk more about different types of nodes.","title":"Basic commands"},{"location":"batch/#job__scripts__and__output","text":"The official name for batch scripts in Slurm is Job Submission Files, but here we will use both names interchangeably. If you search the internet, you will find several other names used, including Slurm submit file, batch submit file, batch script, job script. A job submission file can contain any of the commands that you would otherwise issue yourself from the command line. It is, for example, possible to both compile and run a program and also to set any necessary environment values (though remember that Slurm exports the environment variables in your shell per default, so you can also just set them all there before submitting the job). Note The results from compiling or running your programs can generally be seen after the job has completed, though as Slurm will write to the output file during the run, some results will be available quicker. 
Outputs and any errors will by default be placed in the directory you are running from, though this can be changed. Note This directory should preferably be placed under your project storage, since your home directory only has 25 GB of space. The output file from the job run will by default be named slurm-JOBID.out . It will contain both output as well as any errors. You can look at the content with vi , nano , emacs , cat , less \u2026 The exception is if your program creates its own output files, or if you name the output file(s) differently within your jobscript. Note You can use Slurm directives within your job script to split the error and output into separate files, and name them as you want. It is highly recommended to include the filename pattern %J (the job ID) in the name, as that is an easy way to get a new name each time you run the script and thus avoid overwriting the previous output. Example, using the filename pattern %J : Error file: #SBATCH --error=job.%J.err Output file: #SBATCH --output=job.%J.out","title":"Job scripts and output"},{"location":"batch/#job__scripts","text":"A job submission file can either be very simple, with most of the job attributes specified on the command line, or it may consist of several Slurm directives, comments and executable statements. A Slurm directive provides a way of specifying job attributes in addition to the command line options. Naming : You can name your script anything, including the suffix. It does not matter. Just name it something that makes sense to you and helps you remember what the script is for. The standard is to name it with a suffix of .sbatch or .sh . Simple, serial job script #!/bin/bash # The name of the account you are running in, mandatory. #SBATCH -A hpc2nXXXX-YYY # Request resources - here for a serial job # cores per task is 1 by default (can be changed with ``-c``) #SBATCH -n 1 # Request runtime for the job (HHH:MM:SS) where 168 hours is the maximum. 
Here asking for 15 min. #SBATCH --time=00:15:00 # Clear the environment from any previously loaded modules module purge > /dev/null 2>&1 # Load the module environment suitable for the job - here foss/2022b module load foss/2022b # And finally run the serial jobs ./my_serial_program Note You have to always include #!/bin/bash at the beginning of the script, since bash is the only supported shell. Some things may work under other shells, but not everything. All Slurm directives start with #SBATCH . One (or more) # in front of a text line means it is a comment, with the exception of the string #SBATCH . In order to comment out the Slurm directives, you need to put one more # in front of the #SBATCH . It is important to use capital letters for #SBATCH . Otherwise the line will be considered a comment, and ignored. Let us go through the most commonly used arguments: -A PROJ-ID : The project that should be accounted. It is a simple conversion from the SUPR project id. You can also find your project account with the command projinfo . The PROJ-ID argument is of the form hpc2nXXXX-YYY (HPC2N local project) -N : number of nodes. If this is not given, enough will be allocated to fulfill the requirements of -n and/or -c. A range can be given. If you ask for, say, 1-1, then you will get 1 and only 1 node, no matter what you ask for otherwise. It will also ensure that all the processors will be allocated on the same node. -n : number of tasks. -c : cores per task. Request that a specific number of cores be allocated to each task. This can be useful if the job is multi-threaded and requires more than one core per task for optimal performance. The default is one core per task. Simple MPI program #!/bin/bash # The name of the account you are running in, mandatory. #SBATCH -A hpc2nXXXX-YYY # Request resources - here for eight MPI tasks #SBATCH -n 8 # Request runtime for the job (HHH:MM:SS) where 168 hours is the maximum. Here asking for 15 min. 
#SBATCH --time=00:15:00 # Clear the environment from any previously loaded modules module purge > /dev/null 2>&1 # Load the module environment suitable for the job - here foss/2022b module load foss/2022b # And finally run the job - use srun for MPI jobs, but not for serial jobs srun ./my_mpi_program","title":"Job scripts"},{"location":"batch/#exercises","text":"If you have not already done so, clone the material from the website https://github.com/hpc2n/intro-course : Change to the storage area you created under /proj/nobackup/intro-hpc2n/ . Clone the material: git clone https://github.com/hpc2n/intro-course.git Change to the subdirectory with the exercises: cd intro-course/exercises/simple You will now find several small programs and batch scripts which are used in this section and the next, \u201cSimple examples\u201d. In this section, we are just going to try submitting a few jobs, checking their status, cancelling a job, and looking at the output. Preparations Load the module foss/2022b ( ml foss/2022b ) on the regular login node. This module is available on all nodes. Compile the following programs: hello.c , mpi_hello.c , mpi_greeting.c , and mpi_hi.c gcc -o hello hello.c mpicc -o mpi_hello mpi_hello.c mpicc -o mpi_greeting mpi_greeting.c mpicc -o mpi_hi mpi_hi.c If you compiled and named the executables as above, you should be able to submit the following batch scripts directly: simple.sh , mpi_greeting.sh , mpi_hello.sh , mpi_hi.sh , multiple-parallel-sequential.sh , multiple-parallel.sh , or multiple-parallel-simultaneous.sh . Exercise: sbatch and squeue Submit ( sbatch ) one of the batch scripts listed in 3. under preparations. Check with squeue --me if it is running, pending, or completing. Exercise: sbatch and scontrol show job Submit a few instances of multiple-parallel.sh and multiple-parallel-sequential.sh (so they do not finish running before you have time to check on them). Do scontrol show job JOBID on one or more of the job IDs. 
You should be able to see node assigned (unless the job has not yet had one allocated), expected runtime, etc. If the job is running, you can see how long it has run. You will also get paths to submit directory etc. Exercise: sbatch and scancel Submit a few instances of multiple-parallel.sh and multiple-parallel-sequential.sh (so they do not finish running before you have time to check on them). Do squeue --me and see the jobs listed. Pick one and do scancel JOBID on it. Do squeue --me again to see it is no longer there. Exercise: check output Use nano to open one of the output files slurm-JOBID.out . Try adding #SBATCH --error=job.%J.err and #SBATCH --output=job.%J.out to one of the batch scripts (you can edit it with nano ). Submit the batch script again. See that the expected files get created.","title":"Exercises"},{"location":"batch/#using__the__different__parts__of__kebnekaise","text":"As mentioned under the introduction, Kebnekaise is a very heterogeneous system, comprised of several different types of CPUs and GPUs. The batch system reflects these several different types of resources. At the top we have partitions, which are similar to queues. Each partition is made up of a specific set of nodes. At HPC2N we have three classes of partitions, one for CPU-only nodes, one for GPU nodes and one for large memory nodes. Each node type also has a set of features that can be used to select which node(s) the job should run on. The three types of nodes also have corresponding resources one must apply for in SUPR to be able to use them. While Kebnekaise has multiple partitions, one for each major type of resource, there is only a single partition, batch , that users can submit jobs to. The system then figures out which partition(s) the job should be sent to, based on the requested features. Node overview The \u201cType\u201d can be used if you need a specific type of node. More about that later. 
CPU-only nodes CPU Memory/core number nodes Type 2 x 14 core Intel broadwell 4460 MB 48 broadwell (intel_cpu) 2 x 14 core Intel skylake 6785 MB 52 skylake (intel_cpu) 2 x 64 core AMD zen3 8020 MB 1 zen3 (amd_cpu) 2 x 128 core AMD zen4 2516 MB 8 zen4 (amd_cpu) GPU enabled nodes CPU Memory/core GPU card number nodes Type 2 x 14 core Intel broadwell 9000 MB 2 x Nvidia A40 4 a40 2 x 14 core Intel skylake 6785 MB 2 x Nvidia V100 10 v100 2 x 24 core AMD zen3 10600 MB 2 x Nvidia A100 2 a100 2 x 24 core AMD zen3 10600 MB 2 x AMD MI100 1 mi100 2 x 24 core AMD zen4 6630 MB 2 x Nvidia A6000 1 a6000 2 x 24 core AMD zen4 6630 MB 2 x Nvidia L40s 10 l40s 2 x 48 core AMD zen4 6630 MB 4 x Nvidia H100 SXM5 2 h100 Large memory nodes CPU Memory/core number nodes Type 4 x 18 core Intel broadwell 41666 MB 8 largemem","title":"Using the different parts of Kebnekaise"},{"location":"batch/#requesting__features","text":"To make it possible to target nodes in more detail there are a couple of features defined on each group of nodes. To select a feature one can use the -C option to sbatch or salloc . This sets constraints on the job. There are several reasons why one might want to do that, including for benchmarks, to be able to replicate results (in some cases), because specific modules are only available for certain architectures, etc. To constrain a job to a certain feature, use #SBATCH -C Type Note Features can be combined using \u201cand\u201d ( & ) or \u201cor\u201d ( | ). They should be wrapped in ' \u2019s. Example: #SBATCH -C 'zen3|zen4' List of constraints: For selecting type of CPU Type is: intel_cpu broadwell skylake amd_cpu zen3 zen4 For selecting type of GPU Type is: v100 a40 a6000 a100 l40s h100 mi100 For GPUs, the above GPU list of constraints can be used either as a specifier to --gpus=type:number or as a constraint together with an unspecified GPU request --gpus=number . 
For selecting GPUs with certain features Type is: nvidia_gpu (Any Nvidia GPU) amd_gpu (Any AMD GPU) GPU_SP (GPU with single precision capability) GPU_DP (GPU with double precision capability) GPU_AI (GPU with AI features, like half precision and lower) GPU_ML (GPU with ML features, like half precision and lower) For selecting large memory nodes Type is: largemem","title":"Requesting features"},{"location":"batch/#examples__constraints","text":"Only nodes with Zen4 #SBATCH -C zen4 Nodes with a combination of features: a Zen4 CPU and a GPU with AI features #SBATCH -C 'zen4&GPU_AI' Nodes with either a Zen3 CPU or a Zen4 CPU #SBATCH -C 'zen3|zen4'","title":"Examples, constraints"},{"location":"batch/#examples__requesting__gpus","text":"To use GPU resources one has to explicitly ask for one or more GPUs. Requests for GPUs can be done either in total for the job or per node of the job. Ask for one GPU of any kind #SBATCH --gpus=1 Another way to ask for one GPU of any kind #SBATCH --gpus-per-node=1 Asking for a specific type of GPU As mentioned before, for GPUs, constraints can be used either as a specifier to --gpus=type:number or as a constraint together with an unspecified GPU request --gpus=number . #SBATCH --gpus=Type:NUMBER where Type is, as mentioned: v100 a40 a6000 a100 l40s h100 mi100 Simple GPU Job - V100 #!/bin/bash #SBATCH -A hpc2nXXXX-YYY # Expected time for job to complete #SBATCH --time=00:10:00 # Number of GPU cards needed. Here asking for 2 V100 cards #SBATCH --gpus=v100:2 # Clear the environment from any previously loaded modules module purge > /dev/null 2>&1 # Load modules needed for your program - here fosscuda/2021b ml fosscuda/2021b ./my-gpu-program Important The course project has the following project ID: hpc2n2024-084 In order to use it in a batch job, add this to the batch script: #SBATCH -A hpc2n2024-084 We have a storage project linked to the compute project: intro-hpc2n . You find it in /proj/nobackup/intro-hpc2n . 
Remember to create your own directory under it. Keypoints To submit a job, you first need to create a batch submit script, which you then submit with sbatch SUBMIT-SCRIPT . You can get a list of your running and pending jobs with squeue --me . Kebnekaise has many different nodes, both CPU and GPU. It is possible to constrain the job to run only on specific types of nodes. If your job is an MPI job, you need to use srun in front of your executable in the batch script (unless you use software which handles the parallelization itself).","title":"Examples, requesting GPUs"},{"location":"compilers/","text":"Compiling and Linking with Libraries \u00b6 Objectives Learn about the compilers at HPC2N How to load the compiler toolchains How to use the compilers What are the popular flags How to link with libraries. Installed compilers \u00b6 There are compilers available for Fortran 77, Fortran 90, Fortran 95, C, and C++. The compilers can produce both general-purpose code and architecture-specific optimized code to improve performance (loop-level optimizations, inter-procedural analysis and cache optimizations). Loading compilers \u00b6 Note You need to load a compiler suite (and possibly libraries, depending on what you need) before you can compile and link. Use ml av to get a list of available compiler toolchains as mentioned in the modules - compiler toolchains section. You load a compiler toolchain the same way you load any other module. They are always available directly, without the need to load prerequisites first. Hint Code-along! Example: Loading foss/2023b This compiler toolchain contains: GCC/13.2.0 , BLAS (with LAPACK ), ScaLAPACK , and FFTW . 
b-an01 [~]$ ml foss/2023b b-an01 [~]$ ml Currently Loaded Modules: 1) snicenvironment (S) 7) numactl/2.0.16 13) libevent/2.1.12 19) FlexiBLAS/3.3.1 2) systemdefault (S) 8) XZ/5.4.4 14) UCX/1.15.0 20) FFTW/3.3.10 3) GCCcore/13.2.0 9) libxml2/2.11.5 15) PMIx/4.2.6 21) FFTW.MPI/3.3.10 4) zlib/1.2.13 10) libpciaccess/0.17 16) UCC/1.2.0 22) ScaLAPACK/2.2.0-fb 5) binutils/2.40 11) hwloc/2.9.2 17) OpenMPI/4.1.6 23) foss/2023b 6) GCC/13.2.0 12) OpenSSL/1.1 18) OpenBLAS/0.3.24 Where: S: Module is Sticky, requires --force to unload or purge b-an01 [~]$ Compiling \u00b6 Note OpenMP : All compilers have this included, so it is enough to load the module for a specific compiler toolchain and then add the appropriate flag. Note If you do not name the executable (with the flag -o SOMENAME ), it will be named a.out by default. This also means that the next time you compile something, if you also do not name that executable, it will overwrite the previous a.out file. Compiling with GCC \u00b6 Language Compiler name MPI Fortran77 gfortran mpif77 Fortran90 gfortran mpif90 Fortran95 gfortran N/A C gcc mpicc C++ g++ mpiCC In order to access the MPI compilers, load a compiler toolchain which contains an MPI library . Hint Code-along! Example: compiling a C program You can find the file hello.c in the exercises directory, in the subdirectory \u201csimple\u201d. Or you can download it here: hello.c . In this example we compile the C program hello.c and name the output (the executable) hello . b-an01 [~]$ gcc hello.c -o hello You can run the executable with ./hello Example: compiling an MPI C program You can find the file mpi_hello.c in the exercises directory, in the subdirectory \u201csimple\u201d. Or you can download it here: mpi_hello.c . In this example we compile the MPI C program mpi_hello.c and name the output (the executable) mpi_hello . 
b-an01 [~]$ mpicc mpi_hello.c -o mpi_hello You then run with mpirun mpi_hello Important If you later have loaded a different compiler than the one your program was compiled with, you should recompile your program before running it. Exercise Try loading foss/2023b and compiling mpi_hello.c , then unload the module and instead load the module intel/2023b and see what happens if you try to run with mpirun mpi_hello . Flags \u00b6 Note List of commonly used flags: -o file Place output in file \u2018file\u2019. -c Compile or assemble the source files, but do not link. -fopenmp Enable handling of the OpenMP directives. -g Produce debugging information in the operating system\u2019s native format. -O or -O1 Optimize. The compiler tries to reduce code size and execution time. -O2 Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. -O3 Optimize even more. The compiler will also do loop unrolling and function inlining. RECOMMENDED -O0 Do not optimize. This is the default. -Os Optimize for size. -Ofast Disregard strict standards compliance. -Ofast enables all -O3 optimizations. It also enables optimizations that are not valid for all standard-compliant programs. It turns on -ffast-math and the Fortran-specific -fno-protect-parens and -fstack-arrays. -ffast-math Sets the options -fno-math-errno , -funsafe-math-optimizations , -ffinite-math-only , -fno-rounding-math , -fno-signaling-nans and -fcx-limited-range . -l library Search the library named \u2018library\u2019 when linking. Hint Code-along! Example: compiling an OpenMP C program You can find the file omp_hello.c in the exercises directory, in the subdirectory \u201csimple\u201d. Or you can download it here: omp_hello.c . In this example we compile the OpenMP C program omp_hello.c and name the output (executable) omp_hello . 
b-an01 [~]$ gcc -fopenmp omp_hello.c -o omp_hello Note You can change the number of threads with export OMP_NUM_THREADS=#threads Hint Code-along! Example Run the binary omp_hello that we got in the previous example. Set the number of threads to 4 and then rerun the binary. b-an01 [~]$ ./omp_hello Thread 0 says: Hello World Thread 0 reports: the number of threads are 1 b-an01 [~]$ export OMP_NUM_THREADS=4 b-an01 [~]$ ./omp_hello Thread 1 says: Hello World Thread 0 says: Hello World Thread 0 reports: the number of threads are 4 Thread 3 says: Hello World Thread 2 says: Hello World b-an01 [~]$ Exercise Try yourself! Rerun with OMP_NUM_THREADS set to 1, 2, 4, 8. NOTE : Normally you are not supposed to run anything on the command line, but these are very short and light-weight programs. Exercise You could try with a different toolchain (or version). Remember to unload/purge, load the new toolchain, compile the program again, and then run. Compiling with Intel \u00b6 Language Compiler name MPI Fortran77 ifort mpiifort Fortran90 ifort mpiifort Fortran95 ifort N/A C icc mpiicc C++ icpc mpiicpc In order to access the MPI compilers, load a compiler toolchain which contains an MPI library . Example: compiling a C program We are again compiling the hello.c program from before. This time we name the executable hello_intel to not overwrite the previously created executable. b-an01 [~]$ icc hello.c -o hello_intel Flags \u00b6 Note List of commonly used flags: -fast This option maximizes speed across the entire program. -g Produce symbolic debug information in an object file. The -g option changes the default optimization from -O2 to -O0 . It is often a good idea to add -traceback also, so the compiler generates extra information in the object file to provide source file traceback information. -debug all Enables generation of enhanced debugging information. You need to also specify -g -O0 Disable optimizations. Use if you want to be certain of getting correct code. 
Otherwise use -O2 for speed. -O Same as -O2 -O1 Optimize to favor code size and code locality. Disables loop unrolling. -O1 may improve performance for applications with very large code size, many branches, and execution time not dominated by code within loops. In most cases, -O2 is recommended over -O1 . -O2 (default) Optimize for code speed. This is the generally recommended optimization level. -O3 Enable -O2 optimizations and in addition, enable more aggressive optimizations such as loop and memory access transformation, and prefetching. The -O3 option optimizes for maximum speed, but may not improve performance for some programs and may in some cases even slow down code. -Os Enable speed optimizations, but disable some optimizations that increase code size for small speed benefit. -fpe{0,1,3} Allows some control over floating-point exception (divide by zero, overflow, invalid operation, underflow, denormalized number, positive infinity, negative infinity or a NaN) handling for the main program at runtime. Fortran only. -qopenmp Enable the parallelizer to generate multi-threaded code based on the OpenMP directives. -parallel Enable the auto-parallelizer to generate multi-threaded code for loops that can be safely executed in parallel. Linking \u00b6 Build environment \u00b6 Using a compiler toolchain by itself is possible but requires a fair bit of manual work, figuring out which paths to add to -I or -L for including files and libraries, and similar. To make life as a software builder easier there is a special module available, buildenv , that can be loaded on top of any toolchain. If it is missing for some toolchain, send a mail to support@hpc2n.umu.se and let us know. This module defines a large number of environment variables with the relevant settings for the used toolchain. Among other things it sets CC, CXX, F90, FC, MPICC, MPICXX, MPIF90, CFLAGS, FFLAGS, and much more. 
To see all of them, after loading a toolchain do: ml show buildenv To use the environment variables, load buildenv: ml buildenv Using the environment variable (prefaced with $) for linking is highly recommended! Example Linking with LAPACK (gcc, C program). gcc -o PROGRAM PROGRAM.c -lflexiblas -lgfortran OR use the environment variable $LIBLAPACK : gcc -o PROGRAM PROGRAM.c $LIBLAPACK Note You can see a list of all the libraries on Kebnekaise (June 2024) here: https://docs.hpc2n.umu.se/documentation/compiling/#libraries . Keypoints In order to compile a program, you must first load a \u201ccompiler toolchain\u201d module Kebnekaise has both GCC and Intel compilers installed The GCC compilers are: gfortran gcc g++ The Intel compilers are: ifort icc icpc Compiling MPI programs can be done after loading a compiler toolchain which contains MPI libraries The easiest way to figure out how to link with a library is to use ml show buildenv after loading a compiler toolchain","title":"Compiling"},{"location":"compilers/#compiling__and__linking__with__libraries","text":"Objectives Learn about the compilers at HPC2N How to load the compiler toolchains How to use the compilers What are the popular flags How to link with libraries.","title":"Compiling and Linking with Libraries"},{"location":"compilers/#installed__compilers","text":"There are compilers available for Fortran 77, Fortran 90, Fortran 95, C, and C++. The compilers can produce both general-purpose code and architecture-specific optimized code to improve performance (loop-level optimizations, inter-procedural analysis and cache optimizations).","title":"Installed compilers"},{"location":"compilers/#loading__compilers","text":"Note You need to load a compiler suite (and possibly libraries, depending on what you need) before you can compile and link. Use ml av to get a list of available compiler toolchains as mentioned in the modules - compiler toolchains section. 
You load a compiler toolchain the same way you load any other module. They are always available directly, without the need to load prerequisites first. Hint Code-along! Example: Loading foss/2023b This compiler toolchain contains: GCC/13.2.0 , BLAS (with LAPACK ), ScaLAPACK , and FFTW . b-an01 [~]$ ml foss/2023b b-an01 [~]$ ml Currently Loaded Modules: 1) snicenvironment (S) 7) numactl/2.0.16 13) libevent/2.1.12 19) FlexiBLAS/3.3.1 2) systemdefault (S) 8) XZ/5.4.4 14) UCX/1.15.0 20) FFTW/3.3.10 3) GCCcore/13.2.0 9) libxml2/2.11.5 15) PMIx/4.2.6 21) FFTW.MPI/3.3.10 4) zlib/1.2.13 10) libpciaccess/0.17 16) UCC/1.2.0 22) ScaLAPACK/2.2.0-fb 5) binutils/2.40 11) hwloc/2.9.2 17) OpenMPI/4.1.6 23) foss/2023b 6) GCC/13.2.0 12) OpenSSL/1.1 18) OpenBLAS/0.3.24 Where: S: Module is Sticky, requires --force to unload or purge b-an01 [~]$","title":"Loading compilers"},{"location":"compilers/#compiling","text":"Note OpenMP : All compilers have this included, so it is enough to load the module for a specific compiler toolchain and then add the appropriate flag. Note If you do not name the executable (with the flag -o SOMENAME ), it will be named a.out by default. This also means that the next time you compile something, if you also do not name that executable, it will overwrite the previous a.out file.","title":"Compiling"},{"location":"compilers/#compiling__with__gcc","text":"Language Compiler name MPI Fortran77 gfortran mpif77 Fortran90 gfortran mpif90 Fortran95 gfortran N/A C gcc mpicc C++ g++ mpiCC In order to access the MPI compilers, load a compiler toolchain which contains an MPI library . Hint Code-along! Example: compiling a C program You can find the file hello.c in the exercises directory, in the subdirectory \u201csimple\u201d. Or you can download it here: hello.c . In this example we compile the C program hello.c and name the output (the executable) hello . 
b-an01 [~]$ gcc hello.c -o hello You can run the executable with ./hello Example: compiling an MPI C program You can find the file mpi_hello.c in the exercises directory, in the subdirectory \u201csimple\u201d. Or you can download it here: mpi_hello.c . In this example we compile the MPI C program mpi_hello.c and name the output (the executable) mpi_hello . b-an01 [~]$ mpicc mpi_hello.c -o mpi_hello You then run with mpirun mpi_hello Important If you later have loaded a different compiler than the one your program was compiled with, you should recompile your program before running it. Exercise Try loading foss/2023b and compiling mpi_hello.c , then unload the module and instead load the module intel/2023b and see what happens if you try to run with mpirun mpi_hello .","title":"Compiling with GCC"},{"location":"compilers/#flags","text":"Note List of commonly used flags: -o file Place output in file \u2018file\u2019. -c Compile or assemble the source files, but do not link. -fopenmp Enable handling of the OpenMP directives. -g Produce debugging information in the operating system\u2019s native format. -O or -O1 Optimize. The compiler tries to reduce code size and execution time. -O2 Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. -O3 Optimize even more. The compiler will also do loop unrolling and function inlining. RECOMMENDED -O0 Do not optimize. This is the default. -Os Optimize for size. -Ofast Disregard strict standards compliance. -Ofast enables all -O3 optimizations. It also enables optimizations that are not valid for all standard-compliant programs. It turns on -ffast-math and the Fortran-specific -fno-protect-parens and -fstack-arrays. -ffast-math Sets the options -fno-math-errno , -funsafe-math-optimizations , -ffinite-math-only , -fno-rounding-math , -fno-signaling-nans and -fcx-limited-range . -l library Search the library named \u2018library\u2019 when linking. Hint Code-along! 
Example: compiling an OpenMP C program You can find the file omp_hello.c in the exercises directory, in the subdirectory \u201csimple\u201d. Or you can download it here: omp_hello.c . In this example we compile the OpenMP C program omp_hello.c and name the output (executable) omp_hello . b-an01 [~]$ gcc -fopenmp omp_hello.c -o omp_hello Note You can change the number of threads with export OMP_NUM_THREADS=#threads Hint Code-along! Example Run the binary omp_hello that we got in the previous example. Set the number of threads to 4 and then rerun the binary. b-an01 [~]$ ./omp_hello Thread 0 says: Hello World Thread 0 reports: the number of threads are 1 b-an01 [~]$ export OMP_NUM_THREADS=4 b-an01 [~]$ ./omp_hello Thread 1 says: Hello World Thread 0 says: Hello World Thread 0 reports: the number of threads are 4 Thread 3 says: Hello World Thread 2 says: Hello World b-an01 [~]$ Exercise Try yourself! Rerun with OMP_NUM_THREADS set to 1, 2, 4, 8. NOTE : Normally you are not supposed to run anything on the command line, but these are very short and light-weight programs. Exercise You could try with a different toolchain (or version). Remember to unload/purge, load the new toolchain, compile the program again, and then run.","title":"Flags"},{"location":"compilers/#compiling__with__intel","text":"Language Compiler name MPI Fortran77 ifort mpiifort Fortran90 ifort mpiifort Fortran95 ifort N/A C icc mpiicc C++ icpc mpiicpc In order to access the MPI compilers, load a compiler toolchain which contains an MPI library . Example: compiling a C program We are again compiling the hello.c program from before. This time we name the executable hello_intel to not overwrite the previously created executable. b-an01 [~]$ icc hello.c -o hello_intel","title":"Compiling with Intel"},{"location":"compilers/#flags_1","text":"Note List of commonly used flags: -fast This option maximizes speed across the entire program. -g Produce symbolic debug information in an object file. 
The -g option changes the default optimization from -O2 to -O0 . It is often a good idea to add -traceback also, so the compiler generates extra information in the object file to provide source file traceback information. -debug all Enables generation of enhanced debugging information. You need to also specify -g -O0 Disable optimizations. Use if you want to be certain of getting correct code. Otherwise use -O2 for speed. -O Same as -O2 -O1 Optimize to favor code size and code locality. Disables loop unrolling. -O1 may improve performance for applications with very large code size, many branches, and execution time not dominated by code within loops. In most cases, -O2 is recommended over -O1 . -O2 (default) Optimize for code speed. This is the generally recommended optimization level. -O3 Enable -O2 optimizations and in addition, enable more aggressive optimizations such as loop and memory access transformation, and prefetching. The -O3 option optimizes for maximum speed, but may not improve performance for some programs and may in some cases even slow down code. -Os Enable speed optimizations, but disable some optimizations that increase code size for small speed benefit. -fpe{0,1,3} Allows some control over floating-point exception (divide by zero, overflow, invalid operation, underflow, denormalized number, positive infinity, negative infinity or a NaN) handling for the main program at runtime. Fortran only. -qopenmp Enable the parallelizer to generate multi-threaded code based on the OpenMP directives. -parallel Enable the auto-parallelizer to generate multi-threaded code for loops that can be safely executed in parallel.","title":"Flags"},{"location":"compilers/#linking","text":"","title":"Linking"},{"location":"compilers/#build__environment","text":"Using a compiler toolchain by itself is possible but requires a fair bit of manual work, figuring out which paths to add to -I or -L for including files and libraries, and similar. 
To make life as a software builder easier there is a special module available, buildenv , that can be loaded on top of any toolchain. If it is missing for some toolchain, send a mail to support@hpc2n.umu.se and let us know. This module defines a large number of environment variables with the relevant settings for the used toolchain. Among other things it sets CC, CXX, F90, FC, MPICC, MPICXX, MPIF90, CFLAGS, FFLAGS, and much more. To see all of them, after loading a toolchain do: ml show buildenv To use the environment variables, load buildenv: ml buildenv Using the environment variable (prefaced with $) for linking is highly recommended! Example Linking with LAPACK (gcc, C program). gcc -o PROGRAM PROGRAM.c -lflexiblas -lgfortran OR use the environment variable $LIBLAPACK : gcc -o PROGRAM PROGRAM.c $LIBLAPACK Note You can see a list of all the libraries on Kebnekaise (June 2024) here: https://docs.hpc2n.umu.se/documentation/compiling/#libraries . Keypoints In order to compile a program, you must first load a \u201ccompiler toolchain\u201d module Kebnekaise has both GCC and Intel compilers installed The GCC compilers are: gfortran gcc g++ The Intel compilers are: ifort icc icpc Compiling MPI programs can be done after loading a compiler toolchain which contains MPI libraries The easiest way to figure out how to link with a library is to use ml show buildenv after loading a compiler toolchain","title":"Build environment"},{"location":"filesystem/","text":"The File System \u00b6 Objectives Learn about the file system on Kebnekaise Find the project storage for this course and create your own subdirectory Overview \u00b6 Project storage $HOME /scratch Recommended for batch jobs Yes No (size) Yes Backed up No Yes No Accessible by batch system Yes Yes Yes (node only) Performance High High Medium Default readability Group only Owner Owner Permissions management chmod, chgrp, ACL chmod, chgrp, ACL N/A for batch jobs Notes Storage your group gets allocated through the 
storage projects Your home-directory Per node $HOME \u00b6 This is your home-directory (pointed to by the $HOME variable). It has a quota limit of 25GB per default. Your home directory is backed up regularly. Note Since the home directory is quite small, it should not be used for most production jobs. These should instead be run from project storage directories. To find the path to your home directory, either run pwd just after logging in, or do the following: b-an01 [ ~/store ] $ cd b-an01 [ ~ ] $ pwd /home/u/username b-an01 [ ~ ] $ Project storage \u00b6 Project storage is where a project\u2019s members have the majority of their storage. It is applied for through SUPR, as a storage project. While storage projects need to be applied for separately, they are usually linked to a compute project. This is where you should keep your data and run your batch jobs from. It offers high performance when accessed from the nodes, making it suitable for data that is accessed from parallel jobs, and your home directory (usually) has too little space. Project storage is located below /proj/nobackup/ in the directory name selected during the creation of the proposal. Note The project storage is not intended for permanent storage and there is NO BACKUP of /proj/nobackup . Using project storage \u00b6 If you have a storage project, you should use that to run your jobs. You (your PI) will either choose a directory name when you/they apply for the storage project or get the project id as default name. The location of the storage project in the file system is /proj/nobackup/NAME-YOU-PICKED Since the storage project is shared between all users of the project, you should go to that directory and create a subdirectory for your things, which you will then be using. For this course the storage is in /proj/nobackup/intro-hpc2n Exercise Go to the course project storage and create a subdirectory for yourself. 
Now is a good time to prepare the course material and download the exercises. The easiest way to do so is by cloning the whole intro-course repository from GitHub. Exercise Go to the subdirectory you created under /proj/nobackup/intro-hpc2n Clone the repository for the course: git clone https://github.com/hpc2n/intro-course.git You will get a directory called intro-course . Below it you will find a directory called \u201cexercises\u201d where the majority of the exercises for the batch system section are located. Quota \u00b6 The size of the storage depends on the allocation. There are small, medium, and large storage projects, each with their own requirements. You can read about this on SUPR. The quota limits apply to the project as a whole; there are no user-level quotas on that space. /scratch \u00b6 Our recommendation is that you use the project storage instead of /scratch when working on Compute nodes or Login nodes. On the computers at HPC2N there is a directory called /scratch . It is a small local area split between the users using the node and it can be used for saving (temporary) files you create or need during your computations. Please do not save files in /scratch you don\u2019t need when not running jobs on the machine, and please make sure your job removes any temporary files it creates. Note When anybody needs more space than is available on /scratch , we will remove the oldest/largest files without notice. More information about the file system, as well as archiving and compressing files, is available in the HPC2N documentation about File Systems . Keypoints When you log in to Kebnekaise, you will end up in your home-directory. Your home-directory is in /home/u/username and is pointed to by the environment variable $HOME . Your project storage is located in /proj/nobackup/NAME-YOU-PICKED For this course it is /proj/nobackup/intro-hpc2n . The project storage is NOT backed up. 
You should run the batch jobs from your project storage.","title":"The File System"},{"location":"filesystem/#the__file__system","text":"Objectives Learn about the file system on Kebnekaise Find the project storage for this course and create your own subdirectory","title":"The File System"},{"location":"filesystem/#overview","text":"Project storage $HOME /scratch Recommended for batch jobs Yes No (size) Yes Backed up No Yes No Accessible by batch system Yes Yes Yes (node only) Performance High High Medium Default readability Group only Owner Owner Permissions management chmod, chgrp, ACL chmod, chgrp, ACL N/A for batch jobs Notes Storage your group gets allocated through the storage projects Your home-directory Per node","title":"Overview"},{"location":"filesystem/#home","text":"This is your home-directory (pointed to by the $HOME variable). It has a quota limit of 25GB per default. Your home directory is backed up regularly. Note Since the home directory is quite small, it should not be used for most production jobs. These should instead be run from project storage directories. To find the path to your home directory, either run pwd just after logging in, or do the following: b-an01 [ ~/store ] $ cd b-an01 [ ~ ] $ pwd /home/u/username b-an01 [ ~ ] $","title":"$HOME"},{"location":"filesystem/#project__storage","text":"Project storage is where a project\u2019s members have the majority of their storage. It is applied for through SUPR, as a storage project. While storage projects need to be applied for separately, they are usually linked to a compute project. This is where you should keep your data and run your batch jobs from. It offers high performance when accessed from the nodes, making it suitable for data that is accessed from parallel jobs, and your home directory (usually) has too little space. Project storage is located below /proj/nobackup/ in the directory name selected during the creation of the proposal. 
Note The project storage is not intended for permanent storage and there is NO BACKUP of /proj/nobackup .","title":"Project storage"},{"location":"filesystem/#using__project__storage","text":"If you have a storage project, you should use that to run your jobs. You (your PI) will either choose a directory name when you/they apply for the storage project or get the project id as default name. The location of the storage project in the file system is /proj/nobackup/NAME-YOU-PICKED Since the storage project is shared between all users of the project, you should go to that directory and create a subdirectory for your things, which you will then be using. For this course the storage is in /proj/nobackup/intro-hpc2n Exercise Go to the course project storage and create a subdirectory for yourself. Now is a good time to prepare the course material and download the exercises. The easiest way to do so is by cloning the whole intro-course repository from GitHub. Exercise Go to the subdirectory you created under /proj/nobackup/intro-hpc2n Clone the repository for the course: git clone https://github.com/hpc2n/intro-course.git You will get a directory called intro-course . Below it you will find a directory called \u201cexercises\u201d where the majority of the exercises for the batch system section are located.","title":"Using project storage"},{"location":"filesystem/#quota","text":"The size of the storage depends on the allocation. There are small, medium, and large storage projects, each with their own requirements. You can read about this on SUPR. The quota limits apply to the project as a whole; there are no user-level quotas on that space.","title":"Quota"},{"location":"filesystem/#scratch","text":"Our recommendation is that you use the project storage instead of /scratch when working on Compute nodes or Login nodes. On the computers at HPC2N there is a directory called /scratch . 
It is a small local area split between the users using the node and it can be used for saving (temporary) files you create or need during your computations. Please do not save files in /scratch you don\u2019t need when not running jobs on the machine, and please make sure your job removes any temporary files it creates. Note When anybody needs more space than is available on /scratch , we will remove the oldest/largest files without notice. More information about the file system, as well as archiving and compressing files, is available in the HPC2N documentation about File Systems . Keypoints When you log in to Kebnekaise, you will end up in your home-directory. Your home-directory is in /home/u/username and is pointed to by the environment variable $HOME . Your project storage is located in /proj/nobackup/NAME-YOU-PICKED For this course it is /proj/nobackup/intro-hpc2n . The project storage is NOT backed up. You should run the batch jobs from your project storage.","title":"/scratch"},{"location":"intro/","text":"Introduction to HPC2N, Kebnekaise and HPC \u00b6 Welcome page and syllabus: https://hpc2n.github.io/intro-linux/index.html Also link at the House symbol at the top of the page. HPC2N \u00b6 Note High Performance Computing Center North (HPC2N) is a competence center for Scientific and Parallel Computing part of National Academic Infrastructure for Super\u00adcomputing in Sweden (NAISS) HPC2N provides state-of-the-art resources and expertise: Scalable and parallel HPC Large-scale storage facilities (Project storage (Lustre), SweStore, Tape) Grid and cloud computing (WLCG NT1, Swedish Science Cloud) National Data Science Node in \u201dEpidemiology and Biology of Infections\u201d (DDLS) Software for e-Science applications All levels of user support Primary, advanced, dedicated Application Experts (AEs) Primary objective To raise the national and local level of HPC competence and transfer HPC knowledge and technology to new users in academia and industry. 
HPC2N partners \u00b6 HPC2N is hosted by: Partners: HPC2N funding and collaborations \u00b6 Funded mainly by Ume\u00e5 University , with contributions from the other HPC2N partners . Involved in several projects and collaborations : HPC2N training and other services \u00b6 User support (primary, advanced, dedicated) Research group meetings @ UmU Also at the partner sites Online \u201cHPC2N fika\u201d User training and education program 0.5 \u2013 5 days; ready-to-run exercises intro courses: our system, Linux, R, Python, Julia, Matlab, Git intermediate courses Parallel programming and tools (OpenMP, MPI, debugging, perf. analyzers, Matlab, R, MD simulation, ML, GPU, \u2026) Courses this fall Introduction to Linux, 16 September 2024 Introduction to HPC2N and Kebnekaise, 16 September 2024 Basic Singularity, 16 October 2024 Introduction to running R, Python, Julia, and Matlab in HPC, 22-25 October 2024 Introduction to Git, 25-29 November 2024 Using Python in an HPC environment, 5-6 December 2024 Updated list: https://www.hpc2n.umu.se/events/courses Workshops and seminars NGSSC / SeSE & university courses HPC2N personnel \u00b6 Management: Paolo Bientinesi, director Bj\u00f6rn Torkelsson, deputy director Lena Hellman, administrator Application experts: Jerry Eriksson Pedro Ojeda May Birgitte Bryds\u00f6 \u00c5ke Sandgren Others: Mikael R\u00e4nnar (WLCG coord) Research Engineers under DDLS, HPC2N/SciLifeLab Paul Dulaud, System Developer, IT Abdullah Aziz, Data Engineer Nalina Hamsaiyni Venkatesh, Data Steward System and support: Erik Andersson Birgitte Bryds\u00f6 Niklas Edmundsson (Tape coord) My Karlsson Roger Oscarsson \u00c5ke Sandgren Mattias Wadenstein (NeIC, Tier1) HPC2N application experts \u00b6 HPC2N provides advanced and dedicated support in the form of Application Experts (AEs) : Jerry Eriksson: Profiling, Machine learning (DNN), MPI, OpenMP, OpenACC Pedro Ojeda May: Molecular dynamics, Profiling, QM/MM, NAMD, Amber, Gromacs, GAUSSIAN, R, Python \u00c5ke 
Sandgren: General high level programming assistance, VASP, Gromacs, Amber Birgitte Bryds\u00f6: General HPC, R, Python Contact through regular support HPC2N users by discipline \u00b6 Users from several scientific disciplines: Biosciences and medicine Chemistry Computing science Engineering Materials science Mathematics and statistics Physics including space physics ML, DL, and other AI HPC2N users by discipline, largest users \u00b6 Users from several scientific disciplines: Biosciences and medicine Chemistry Computing science Engineering Materials science Mathematics and statistics Physics including space physics Machine learning and artificial intelligence (several new projects) HPC2N users by software \u00b6 Kebnekaise \u00b6 The current supercomputer at HPC2N. It is a very heterogeneous system. Named after a massif (contains some of Sweden\u2019s highest mountain peaks) Kebnekaise was delivered by Lenovo and installed during the summer 2016 Opened up for general availability on November 7, 2016 In 2018, Kebnekaise was extended with 52 Intel Xeon Gold 6132 (Skylake) nodes, as well as 10 NVIDIA V100 (Volta) GPU nodes In 2023, Kebnekaise was extended with 2 dual NVIDIA A100 GPU nodes one many-core AMD Zen3 CPU node In 2024 Kebnekaise was extended with 2 Dual socket GPU-nodes: Lenovo ThinkSystem SR675 V3 2 x AMD EPYC 9454 48C 290W 2.75GHz Processor 768GB [24x 32GB TruDDR5 4800MHz RDIMM-A] 1 x 3.84TB Read Intensive NVMe PCIe 4.0 x4 HS SSD 1 x NVIDIA H100 SXM5 700W 80G HBM3 GPU Board 10 dual-socket GPU-nodes: ThinkSystem SR665 V3 2 x AMD EPYC 9254 24C 200W 2.9GHz Processor 384GB [24x 16GB TruDDR5 4800MHz RDIMM-A] 1 x 1.92TB Read Intensive NVMe PCIe 5.0 x4 HS SSD 2 x NVIDIA L40S 48GB PCIe Gen4 Passive GPU 8 dual-socket CPU only: ThinkSystem SR645 V3 2 x AMD EPYC 9754 128C 360W 2.25GHz Processor 768GB [24x 32GB TruDDR5 4800MHz RDIMM-A] 1 x 3.84TB Read Intensive NVMe PCIe 4.0 x4 HS SSD Kebnekaise will be continuously upgraded, as old hardware gets retired. 
Current hardware in Kebnekaise \u00b6 Kebnekaise has CPU-only, GPU enabled and large memory nodes. The CPU-only nodes are: 2 x 14 core Intel broadwell 4460 MB memory / core 48 nodes Total of 41.6 TFlops/s 2 x 14 core Intel skylake 6785 MB memory / core 52 nodes Total of 87 TFlops/s 2 x 64 core AMD zen3 8020 MB / core 1 node Total of 11 TFlops/s 2 x 128 core AMD zen4 2516 MB / core 8 nodes Total of 216 TFlops/s The GPU enabled nodes are: 2 x 14 core Intel broadwell 9000 MB memory / core 2 x Nvidia A40 4 nodes Total of 83 TFlops/s 2 x 14 core Intel skylake 6785 MB memory / core 2 x Nvidia V100 10 nodes Total of 75 TFlops/s 2 x 24 core AMD zen3 10600 MB / core 2 x Nvidia A100 2 nodes 2 x 24 core AMD zen3 10600 MB / core 2 x AMD MI100 1 node 2 x 24 core AMD zen4 6630 MB / core 2 x Nvidia A6000 1 node 2 x 24 core AMD zen4 6630 MB / core 2 x Nvidia L40s 10 nodes 2 x 48 core AMD zen4 6630 MB / core 4 x Nvidia H100 SXM5 2 nodes The large memory nodes are: 4 x 18 core Intel broadwell 41666 MB memory / core 8 nodes Total of 13.6 TFlops/s for all these nodes GPUs can have different types of cores: CUDA cores : General-purpose cores for a variety of parallel computing tasks. Not as efficient as specialized cores. CUDA cores are found only on NVidia GPUs. The (mostly) equivalent is called stream processors on AMD. Tensor cores : Made for matrix multiplications. Good for deep learning and AI workloads involving large matrix operations. Can be used for general-purpose as well, but less efficient for this. Tensor cores is the NVidia name. AMD has a somewhat equivalent core type called matrix cores . RT (ray tracing) cores : Cores that are optimized for tasks involving ray tracing, like rendering images or video. GPU Type CUDA cores / stream processors TENSOR cores / matrix cores RT cores A40 10752 336 V100 5120 640 A100 6912 432 MI100 7680 480 A6000 10752 386 L40S 18176 568 142 H100 16896 528 NOTE that just like you cannot really compare CPU cores directly (speed etc.) 
you also cannot just compare CUDA/TENSOR/RT etc. cores directly (more efficient design, faster, etc.) Kebnekaise - HPC2N storage \u00b6 Basically four types of storage are available at HPC2N: Home directory /home/X/Xyz , $HOME , ~ 25 GB, user owned Project storage /proj/nobackup/abc Shared among project members Local scratch space $SNIC_TMP SSD (170GB), per job, per node, \u201cvolatile\u201d Tape Storage Backup Long term storage Also SweStore \u2014 disk based (dCache) Research Data Storage Infrastructure, for active research data and operated by NAISS, WLCG Kebnekaise - projects \u00b6 Compute projects To use Kebnekaise, you must be a member of a compute project . A compute project has a certain number of core hours allocated for it per month A regular CPU core costs 1 core hour per hour, other resources (e.g., GPUs) cost more Not a hard limit but projects that go over the allocation get lower priority A compute project contains a certain amount of storage. If more storage is required, you must be a member of a storage project . Note As Kebnekaise is a local cluster, you need to be affiliated with UmU, IRF, SLU, Miun, or LTU to use it. Projects are applied for through SUPR ( https://supr.naiss.se ). I will cover more details in a later section, where we go more into detail about HPC2N and Kebnekaise. HPC \u00b6 What is HPC? 
High Performance Computing (definition) \u201cHigh Performance Computing most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business.\u201d From: https://insidehpc.com/hpc-basic-training/what-is-hpc/ High Performance Computing - opening the definition \u00b6 Aggregating computing power \u00b6 147 nodes totalling 6808 CPU cores and 64 GPUs (totalling 751616 CUDA cores, 33472 TENSOR cores + 960 matrix cores, 284 RT cores) Compared to 4-8 cores in a common modern laptop Higher performance \u00b6 More than 527,000,000,000,000 arithmetical operations per second (527 trillion (billion)) in the CPU cores Compared to 200,000,000,000 Flops in a modern laptop (200 billion (milliard)) Solve large problems \u00b6 When does a problem become large enough for HPC? Are there other reasons for using HPC resources? (Memory, software, support, etc.) 
High Performance Computing - large problems \u00b6 A problem can be large for two main reasons: Execution time : The time required to form a solution to the problem is very long Memory / storage use : The solution of the problem requires a lot of memory and/or storage The former can be remedied by increasing the performance More cores, more nodes, GPUs, \u2026 The latter by adding more memory / storage More memory per node (including large memory nodes), more nodes, \u2026 Kebnekaise: 128GB - 192GB, 384GB, 512GB, 768GB, 3TB Large storage solutions, \u2026 High Performance Computing - what counts as HPC \u00b6 High Performance Computing - other reasons \u00b6 Specialized (expensive) hardware GPUs, including those optimized for AI Kebnekaise has V100, A100, A40, MI100, A6000, L40S, H100 High-end CPUs (AVX-512 etc) and ECC memory Software HPC2N holds licenses for several software packages Software is pre-configured and ready-to-use Support and documentation High Performance Computing - memory models \u00b6 Two memory models are relevant for HPC: Shared memory: Single memory space for all data. Everyone can access the same data Straightforward to use Distributed memory: Multiple distinct memory spaces. 
Everyone has direct access only to the local data Requires communication High Performance Computing - programming models \u00b6 The programming model changes when we aim for extra performance and/or memory: Single-core: Matlab, Python, C, Fortran, \u2026 Single stream of operations Multi-core: Vectorized Matlab, pthreads, OpenMP Multiple streams of operations Work distribution, coordination (synchronization, etc), \u2026 Distributed memory: MPI, \u2026 Multiple streams of operations Work distribution, coordination (synchronization, etc), \u2026 Data distribution and communication GPUs: CUDA, OpenCL, OpenACC, OpenMP, \u2026 Many lightweight streams of operations Work distribution, coordination (synchronization, etc), \u2026 Data distribution across memory spaces and movement High Performance Computing - software \u00b6 Complexity grows when we aim for extra performance and/or memory/storage: Single-core: LAPACK, \u2026 Load correct toolchain etc Multi-core: LAPACK + parallel BLAS, \u2026 Load correct toolchain etc Allocate correct number of cores, configure software to use correct number of cores, \u2026 Distributed memory: ScaLAPACK, \u2026 Load correct toolchain etc Allocate correct number of nodes and cores , configure software to use correct number of nodes and cores , \u2026 Data distribution, storage, \u2026 GPUs: MAGMA, TensorFlow, \u2026 Load correct toolchain etc Allocate correct number of cores and GPUs , configure software to use correct number of cores and GPUs , \u2026","title":"Introduction to Kebnekaise and HPC2N"},{"location":"intro/#introduction__to__hpc2n__kebnekaise__and__hpc","text":"Welcome page and syllabus: https://hpc2n.github.io/intro-linux/index.html Also link at the House symbol at the top of the page.","title":"Introduction to HPC2N, Kebnekaise and HPC"},{"location":"intro/#hpc2n","text":"Note High Performance Computing Center North (HPC2N) is a competence center for Scientific and Parallel Computing part of National Academic 
Infrastructure for Super\u00adcomputing in Sweden (NAISS) HPC2N provides state-of-the-art resources and expertise: Scalable and parallel HPC Large-scale storage facilities (Project storage (Lustre), SweStore, Tape) Grid and cloud computing (WLCG NT1, Swedish Science Cloud) National Data Science Node in \u201dEpidemiology and Biology of Infections\u201d (DDLS) Software for e-Science applications All levels of user support Primary, advanced, dedicated Application Experts (AEs) Primary objective To raise the national and local level of HPC competence and transfer HPC knowledge and technology to new users in academia and industry.","title":"HPC2N"},{"location":"intro/#hpc2n__partners","text":"HPC2N is hosted by: Partners:","title":"HPC2N partners"},{"location":"intro/#hpc2n__funding__and__collaborations","text":"Funded mainly by Ume\u00e5 University , with contributions from the other HPC2N partners . Involved in several projects and collaborations :","title":"HPC2N funding and collaborations"},{"location":"intro/#hpc2n__training__and__other__services","text":"User support (primary, advanced, dedicated) Research group meetings @ UmU Also at the partner sites Online \u201cHPC2N fika\u201d User training and education program 0.5 \u2013 5 days; ready-to-run exercises intro courses: our system, Linux, R, Python, Julia, Matlab, Git intermediate courses Parallel programming and tools (OpenMP, MPI, debugging, perf. 
analyzers, Matlab, R, MD simulation, ML, GPU, \u2026) Courses this fall Introduction to Linux, 16 September 2024 Introduction to HPC2N and Kebnekaise, 16 September 2024 Basic Singularity, 16 October 2024 Introduction to running R, Python, Julia, and Matlab in HPC, 22-25 October 2024 Introduction to Git, 25-29 November 2024 Using Python in an HPC environment, 5-6 December 2024 Updated list: https://www.hpc2n.umu.se/events/courses Workshops and seminars NGSSC / SeSE & university courses","title":"HPC2N training and other services"},{"location":"intro/#hpc2n__personnel","text":"Management: Paolo Bientinesi, director Bj\u00f6rn Torkelsson, deputy director Lena Hellman, administrator Application experts: Jerry Eriksson Pedro Ojeda May Birgitte Bryds\u00f6 \u00c5ke Sandgren Others: Mikael R\u00e4nnar (WLCG coord) Research Engineers under DDLS, HPC2N/SciLifeLab Paul Dulaud, System Developer, IT Abdullah Aziz, Data Engineer Nalina Hamsaiyni Venkatesh, Data Steward System and support: Erik Andersson Birgitte Bryds\u00f6 Niklas Edmundsson (Tape coord) My Karlsson Roger Oscarsson \u00c5ke Sandgren Mattias Wadenstein (NeIC, Tier1)","title":"HPC2N personnel"},{"location":"intro/#hpc2n__application__experts","text":"HPC2N provides advanced and dedicated support in the form of Application Experts (AEs) : Jerry Eriksson: Profiling, Machine learning (DNN), MPI, OpenMP, OpenACC Pedro Ojeda May: Molecular dynamics, Profiling, QM/MM, NAMD, Amber, Gromacs, GAUSSIAN, R, Python \u00c5ke Sandgren: General high level programming assistance, VASP, Gromacs, Amber Birgitte Bryds\u00f6: General HPC, R, Python Contact through regular support","title":"HPC2N application experts"},{"location":"intro/#hpc2n__users__by__discipline","text":"Users from several scientific disciplines: Biosciences and medicine Chemistry Computing science Engineering Materials science Mathematics and statistics Physics including space physics ML, DL, and other AI","title":"HPC2N users by 
discipline"},{"location":"intro/#hpc2n__users__by__discipline__largest__users","text":"Users from several scientific disciplines: Biosciences and medicine Chemistry Computing science Engineering Materials science Mathematics and statistics Physics including space physics Machine learning and artificial intelligence (several new projects)","title":"HPC2N users by discipline, largest users"},{"location":"intro/#hpc2n__users__by__software","text":"","title":"HPC2N users by software"},{"location":"intro/#kebnekaise","text":"The current supercomputer at HPC2N. It is a very heterogeneous system. Named after a massif (contains some of Sweden\u2019s highest mountain peaks) Kebnekaise was delivered by Lenovo and installed during the summer 2016 Opened up for general availability on November 7, 2016 In 2018, Kebnekaise was extended with 52 Intel Xeon Gold 6132 (Skylake) nodes, as well as 10 NVIDIA V100 (Volta) GPU nodes In 2023, Kebnekaise was extended with 2 dual NVIDIA A100 GPU nodes one many-core AMD Zen3 CPU node In 2024 Kebnekaise was extended with 2 Dual socket GPU-nodes: Lenovo ThinkSystem SR675 V3 2 x AMD EPYC 9454 48C 290W 2.75GHz Processor 768GB [24x 32GB TruDDR5 4800MHz RDIMM-A] 1 x 3.84TB Read Intensive NVMe PCIe 4.0 x4 HS SSD 1 x NVIDIA H100 SXM5 700W 80G HBM3 GPU Board 10 dual-socket GPU-nodes: ThinkSystem SR665 V3 2 x AMD EPYC 9254 24C 200W 2.9GHz Processor 384GB [24x 16GB TruDDR5 4800MHz RDIMM-A] 1 x 1.92TB Read Intensive NVMe PCIe 5.0 x4 HS SSD 2 x NVIDIA L40S 48GB PCIe Gen4 Passive GPU 8 dual-socket CPU only: ThinkSystem SR645 V3 2 x AMD EPYC 9754 128C 360W 2.25GHz Processor 768GB [24x 32GB TruDDR5 4800MHz RDIMM-A] 1 x 3.84TB Read Intensive NVMe PCIe 4.0 x4 HS SSD Kebnekaise will be continuously upgraded, as old hardware gets retired.","title":"Kebnekaise"},{"location":"intro/#current__hardware__in__kebnekaise","text":"Kebnekaise has CPU-only, GPU enabled and large memory nodes. 
The CPU-only nodes are: 2 x 14 core Intel broadwell 4460 MB memory / core 48 nodes Total of 41.6 TFlops/s 2 x 14 core Intel skylake 6785 MB memory / core 52 nodes Total of 87 TFlops/s 2 x 64 core AMD zen3 8020 MB / core 1 node Total of 11 TFlops/s 2 x 128 core AMD zen4 2516 MB / core 8 nodes Total of 216 TFlops/s The GPU enabled nodes are: 2 x 14 core Intel broadwell 9000 MB memory / core 2 x Nvidia A40 4 nodes Total of 83 TFlops/s 2 x 14 core Intel skylake 6785 MB memory / core 2 x Nvidia V100 10 nodes Total of 75 TFlops/s 2 x 24 core AMD zen3 10600 MB / core 2 x Nvidia A100 2 nodes 2 x 24 core AMD zen3 10600 MB / core 2 x AMD MI100 1 node 2 x 24 core AMD zen4 6630 MB / core 2 x Nvidia A6000 1 node 2 x 24 core AMD zen4 6630 MB / core 2 x Nvidia L40s 10 nodes 2 x 48 core AMD zen4 6630 MB / core 4 x Nvidia H100 SXM5 2 nodes The large memory nodes are: 4 x 18 core Intel broadwell 41666 MB memory / core 8 nodes Total of 13.6 TFlops/s for all these nodes GPUs can have different types of cores: CUDA cores : General-purpose cores for a variety of parallel computing tasks. Not as efficient as specialized cores. CUDA cores are found only on NVidia GPUs. The (mostly) equivalent is called stream processors on AMD. Tensor cores : Made for matrix multiplications. Good for deep learning and AI workloads involving large matrix operations. Can be used for general-purpose as well, but less efficient for this. Tensor cores is the NVidia name. AMD has a somewhat equivalent core type called matrix cores . RT (ray tracing) cores : Cores that are optimized for tasks involving ray tracing, like rendering images or video. GPU Type CUDA cores / stream processors TENSOR cores / matrix cores RT cores A40 10752 336 V100 5120 640 A100 6912 432 MI100 7680 480 A6000 10752 386 L40S 18176 568 142 H100 16896 528 NOTE that just like you cannot really compare CPU cores directly (speed etc.) you also cannot just compare CUDA/TENSOR/RT etc. 
cores directly (more efficient design, faster, etc.)","title":"Current hardware in Kebnekaise"},{"location":"intro/#kebnekaise__-__hpc2n__storage","text":"Basically four types of storage are available at HPC2N: Home directory /home/X/Xyz , $HOME , ~ 25 GB, user owned Project storage /proj/nobackup/abc Shared among project members Local scratch space $SNIC_TMP SSD (170GB), per job, per node, \u201cvolatile\u201d Tape Storage Backup Long-term storage Also SweStore \u2014 disk-based (dCache) Research Data Storage Infrastructure, for active research data and operated by NAISS, WLCG","title":"Kebnekaise - HPC2N storage"},{"location":"intro/#kebnekaise__-__projects","text":"Compute projects To use Kebnekaise, you must be a member of a compute project . A compute project has a certain number of core hours allocated for it per month A regular CPU core costs 1 core hour per hour, other resources (e.g., GPUs) cost more Not a hard limit but projects that go over the allocation get lower priority A compute project contains a certain amount of storage. If more storage is required, you must be a member of a storage project . Note As Kebnekaise is a local cluster, you need to be affiliated with UmU, IRF, SLU, Miun, or LTU to use it. Projects are applied for through SUPR ( https://supr.naiss.se ). I will cover more details in a later section, where we go more into detail about HPC2N and Kebnekaise.","title":"Kebnekaise - projects"},{"location":"intro/#hpc","text":"What is HPC?
High Performance Computing (definition) \u201cHigh Performance Computing most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business.\u201d From: https://insidehpc.com/hpc-basic-training/what-is-hpc/","title":"HPC"},{"location":"intro/#high__performance__computing__-__opening__the__definition","text":"","title":"High Performance Computing - opening the definition"},{"location":"intro/#aggregating__computing__power","text":"147 nodes totalling 6808 CPU cores and 64 GPUs (totalling 751616 CUDA cores, 33472 TENSOR cores + 960 matrix cores, 284 RT cores) Compared to 4-8 cores in a common modern laptop","title":"Aggregating computing power"},{"location":"intro/#higher__performance","text":"More than 527,000,000,000,000 arithmetical operations per second (527 trillion (billion)) in the CPU cores Compared to 200,000,000,000 Flops in a modern laptop (200 billion (milliard))","title":"Higher performance"},{"location":"intro/#solve__large__problems","text":"When does a problem become large enough for HPC? Are there other reasons for using HPC resources?
(Memory, software, support, etc.)","title":"Solve large problems"},{"location":"intro/#high__performance__computing__-__large__problems","text":"A problem can be large for two main reasons: Execution time : The time required to form a solution to the problem is very long Memory / storage use : The solution of the problem requires a lot of memory and/or storage The former can be remedied by increasing the performance More cores, more nodes, GPUs, \u2026 The latter by adding more memory / storage More memory per node (including large memory nodes), more nodes, \u2026 Kebnekaise: 128GB - 192GB, 384GB, 512GB, 768GB, 3TB Large storage solutions, \u2026","title":"High Performance Computing - large problems"},{"location":"intro/#high__performance__computing__-__what__counts__as__hpc","text":"","title":"High Performance Computing - what counts as HPC"},{"location":"intro/#high__performance__computing__-__other__reasons","text":"Specialized (expensive) hardware GPUs, including those optimized for AI Kebnekaise has V100, A100, A40, MI100, A6000, L40S, H100 High-end CPUs (AVX-512 etc.) and ECC memory Software HPC2N holds licenses for several software packages Software is pre-configured and ready to use Support and documentation","title":"High Performance Computing - other reasons"},{"location":"intro/#high__performance__computing__-__memory__models","text":"Two memory models are relevant for HPC: Shared memory: Single memory space for all data. Everyone can access the same data Straightforward to use Distributed memory: Multiple distinct memory spaces.
Everyone has direct access only to the local data Requires communication","title":"High Performance Computing - memory models"},{"location":"intro/#high__performance__computing__-__programming__models","text":"The programming model changes when we aim for extra performance and/or memory: Single-core: Matlab, Python, C, Fortran, \u2026 Single stream of operations Multi-core: Vectorized Matlab, pthreads, OpenMP Multiple streams of operations Work distribution, coordination (synchronization, etc), \u2026 Distributed memory: MPI, \u2026 Multiple streams of operations Work distribution, coordination (synchronization, etc), \u2026 Data distribution and communication GPUs: CUDA, OpenCL, OpenACC, OpenMP, \u2026 Many lightweight streams of operations Work distribution, coordination (synchronization, etc), \u2026 Data distribution across memory spaces and movement","title":"High Performance Computing - programming models"},{"location":"intro/#high__performance__computing__-__software","text":"Complexity grows when we aim for extra performance and/or memory/storage: Single-core: LAPACK, \u2026 Load correct toolchain etc Multi-core: LAPACK + parallel BLAS, \u2026 Load correct toolchain etc Allocate correct number of cores, configure software to use correct number of cores, \u2026 Distributed memory: ScaLAPACK, \u2026 Load correct toolchain etc Allocate correct number of nodes and cores , configure software to use correct number of nodes and cores , \u2026 Data distribution, storage, \u2026 GPUs: MAGMA, TensorFlow, \u2026 Load correct toolchain etc Allocate correct number of cores and GPUs , configure software to use correct number of cores and GPUs , \u2026","title":"High Performance Computing - software"},{"location":"login/","text":"Logging in \u00b6 When you have your account, you can log in to Kebnekaise. This can be done with any number of SSH clients or with ThinLinc (the easiest option if you need a graphical interface).
Objectives Log in to Kebnekaise, either with ThinLinc or your SSH client of choice. Kebnekaise login servers \u00b6 Note The main login node of Kebnekaise: kebnekaise.hpc2n.umu.se ThinLinc login node: kebnekaise-tl.hpc2n.umu.se ThinLinc through a browser (fewer features): https://kebnekaise-tl.hpc2n.umu.se:300/ In addition, there is a login node for the AMD-based nodes. We will talk more about this later: kebnekaise-amd.hpc2n.umu.se . For ThinLinc access: kebnekaise-amd-tl.hpc2n.umu.se ThinLinc is recommended for this course ThinLinc: a cross-platform remote desktop server from Cendio AB. Especially useful when you need software with a graphical interface. This is what we recommend you use for this course, unless you have a preferred SSH client. Using ThinLinc \u00b6 Download the client from https://www.cendio.com/thinlinc/download . Install it. Windows: Run the downloaded .exe file to install. macOS: Information on the ThinLinc macOS info page . Linux Ubuntu: Download the .deb file. Run sudo dpkg -i PATH-TO-FILE/FILE-YOU-DOWNLOADED.deb Start the client. Enter the name of the server: kebnekaise-tl.hpc2n.umu.se . Enter your username. Go to \u201cOptions\u201d \\(->\\) \u201cSecurity\u201d. Check that the authentication method is set to password. Go to \u201cOptions\u201d \\(->\\) \u201cScreen\u201d. Uncheck \u201cFull screen mode\u201d. Enter your HPC2N password. Click \u201cConnect\u201d Click \u201cContinue\u201d when you are told that the server\u2019s host key is not in the registry. Wait for the ThinLinc desktop to open. Password \u00b6 You get your first, temporary HPC2N password from this page: HPC2N passwords . That page can also be used to reset your HPC2N password if you have forgotten it. Note that you are authenticating through SUPR, using that service\u2019s login credentials! Warning The HPC2N password and the SUPR password are separate! The HPC2N password and your university/department password are also separate! Exercise Log in to Kebnekaise.
If you are using ThinLinc, first install the ThinLinc client. If you are using another SSH client, install it first if you have not already done so. Change password \u00b6 Exercise: Change your password after first login ONLY do this if you have logged in for the first time/are still using the temporary password you got from the HPC2N password reset service! Changing password is done using the passwd command: passwd Use a good password that combines letters of different case. Do not use dictionary words. Avoid using the same password that you also use in other places. It will first ask for your current password. Type that in and press enter. Then type in the new password, enter, and repeat. You have changed the password. File transfers \u00b6 We are not going to transfer any files as part of this course, but you may have to do so as part of your workflow when using Kebnekaise (or another HPC centre) for your research. This section will only talk briefly about file transfers. You can find more information and examples on HPC2N\u2019s File transfer documentation . Linux, OS X \u00b6 scp \u00b6 SCP (Secure CoPy) is a simple way of transferring files between two machines that use the SSH (Secure SHell) protocol. You may use SCP to connect to any system where you have SSH (log-in) access. These examples show how to use scp from the command line. Graphical programs exist for doing scp transfer. The command-line scp program should already be installed. Remote to local Transfer a file from Kebnekaise to your local system, while on your local system scp username@kebnekaise.hpc2n.umu.se:file .
Local to remote Transfer a local file to Kebnekaise, while on your local system scp file username@kebnekaise.hpc2n.umu.se:file Recursive directory copy from a local system to a remote system The directory sourcedirectory is here copied as a subdirectory to somedir scp -r sourcedirectory/ username@kebnekaise.hpc2n.umu.se:somedir/ sftp \u00b6 SFTP (SSH File Transfer Protocol or sometimes called Secure File Transfer Protocol) is a network protocol that provides file transfer over a reliable data stream. SFTP is a command-line program on most Unix, Linux, and Mac OS X systems. It is also available as a protocol choice in some graphical file transfer programs. Example: From a local system to a remote system enterprise-d [ ~ ] $ sftp user@kebnekaise.hpc2n.umu.se Connecting to kebnekaise.hpc2n.umu.se... user@kebnekaise.hpc2n.umu.se's password: sftp> put file.c C/file.c Uploading file.c to /home/u/user/C/file.c file.c 100% 1 0.0KB/s 00:00 sftp> put -P irf.png pic/ Uploading irf.png to /home/u/user/pic/irf.png irf.png 100% 2100 2.1KB/s 00:00 sftp> Windows \u00b6 Here you need to download a client: WinSCP, FileZilla (sftp), PSCP/PSFTP, \u2026 You can transfer with sftp or scp. There is documentation in HPC2N\u2019s documentation pages for Windows file transfers . Editors \u00b6 Since the editors on a Linux system are different from those you may be familiar with from Windows or macOS, here follows a short overview. There are command-line editors and graphical editors. If you are connecting with a regular SSH client, it will be simplest to use a command-line editor. If you are using ThinLinc, you can use command-line editors or graphical editors as you want. Command-line \u00b6 These are all good editors for using on the command line: nano vi , vim emacs They are all installed on Kebnekaise. Of these, vi/vim as well as emacs are probably the most powerful, though the latter is better in a GUI environment.
The easiest editor to use if you are not familiar with any of them is nano . Nano Starting \u201cnano\u201d: Type nano FILENAME on the command line and press Enter . FILENAME is whatever you want to call your file. If FILENAME is a file that already exists, nano will open the file. If it does not exist, it will be created. You now get an editor that looks like this: The first thing to notice is that many of the commands are listed at the bottom. The ^ before the letter-commands means you should press CTRL and then the letter (while keeping CTRL down). Your prompt is in the editor window itself, and you can just type (or copy and paste) the content you want in your file. When you want to exit (and possibly save), you press CTRL and then x while holding CTRL down (this is written CTRL-x or ^x ). nano will ask you if you want to save the content of the buffer to the file. After that it will exit. There is a manual for nano here . GUI \u00b6 If you are connecting with ThinLinc , you will be presented with a graphical user interface (GUI). From there you can either open a terminal window/shell ( Applications -> System Tools -> MATE Terminal ) or you can choose editors from the menu by going to Applications -> Accessories . This gives several editor options, of which these have a graphical interface: Text Editor (gedit) Pluma - the default editor on the MATE desktop environment (that ThinLinc runs) Atom - not just an editor, but an IDE Emacs (GUI) NEdit \u201cNirvana Text Editor\u201d If you are not familiar with any of these, a good recommendation would be to use Text Editor/gedit . Text Editor/gedit Starting \u201c gedit \u201d: From the menu, choose Applications -> Accessories -> Text Editor . You then get a window that looks like this: You can open files by clicking \u201c Open \u201d in the top menu. Clicking the small file icon with a green plus will create a new document. Save by clicking \u201c Save \u201d in the menu.
The menu on the top right (the three horizontal lines) gives you several other options, including \u201c Find \u201d and \u201c Find and Replace \u201d. Keypoints You can log in with ThinLinc or another SSH client ThinLinc is easiest if you need a GUI There are several command-line editors: vi/vim, nano, emacs, \u2026 And several GUI editors, which work best when using ThinLinc: gedit, pluma, atom, emacs (gui), nedit, \u2026","title":"Logging in"},{"location":"login/#logging__in","text":"When you have your account, you can log in to Kebnekaise. This can be done with any number of SSH clients or with ThinLinc (the easiest option if you need a graphical interface). Objectives Log in to Kebnekaise, either with ThinLinc or your SSH client of choice.","title":"Logging in"},{"location":"login/#kebnekaise__login__servers","text":"Note The main login node of Kebnekaise: kebnekaise.hpc2n.umu.se ThinLinc login node: kebnekaise-tl.hpc2n.umu.se ThinLinc through a browser (fewer features): https://kebnekaise-tl.hpc2n.umu.se:300/ In addition, there is a login node for the AMD-based nodes. We will talk more about this later: kebnekaise-amd.hpc2n.umu.se . For ThinLinc access: kebnekaise-amd-tl.hpc2n.umu.se ThinLinc is recommended for this course ThinLinc: a cross-platform remote desktop server from Cendio AB. Especially useful when you need software with a graphical interface. This is what we recommend you use for this course, unless you have a preferred SSH client.","title":"Kebnekaise login servers"},{"location":"login/#using__thinlinc","text":"Download the client from https://www.cendio.com/thinlinc/download . Install it. Windows: Run the downloaded .exe file to install. macOS: Information on the ThinLinc macOS info page . Linux Ubuntu: Download the .deb file. Run sudo dpkg -i PATH-TO-FILE/FILE-YOU-DOWNLOADED.deb Start the client. Enter the name of the server: kebnekaise-tl.hpc2n.umu.se . Enter your username. Go to \u201cOptions\u201d \\(->\\) \u201cSecurity\u201d.
Check that the authentication method is set to password. Go to \u201cOptions\u201d \\(->\\) \u201cScreen\u201d. Uncheck \u201cFull screen mode\u201d. Enter your HPC2N password. Click \u201cConnect\u201d Click \u201cContinue\u201d when you are told that the server\u2019s host key is not in the registry. Wait for the ThinLinc desktop to open.","title":"Using ThinLinc"},{"location":"login/#password","text":"You get your first, temporary HPC2N password from this page: HPC2N passwords . That page can also be used to reset your HPC2N password if you have forgotten it. Note that you are authenticating through SUPR, using that service\u2019s login credentials! Warning The HPC2N password and the SUPR password are separate! The HPC2N password and your university/department password are also separate! Exercise Log in to Kebnekaise. If you are using ThinLinc, first install the ThinLinc client. If you are using another SSH client, install it first if you have not already done so.","title":"Password"},{"location":"login/#change__password","text":"Exercise: Change your password after first login ONLY do this if you have logged in for the first time/are still using the temporary password you got from the HPC2N password reset service! Changing password is done using the passwd command: passwd Use a good password that combines letters of different case. Do not use dictionary words. Avoid using the same password that you also use in other places. It will first ask for your current password. Type that in and press enter. Then type in the new password, enter, and repeat. You have changed the password.","title":"Change password"},{"location":"login/#file__transfers","text":"We are not going to transfer any files as part of this course, but you may have to do so as part of your workflow when using Kebnekaise (or another HPC centre) for your research. This section will only talk briefly about file transfers.
You can find more information and examples on HPC2N\u2019s File transfer documentation .","title":"File transfers"},{"location":"login/#linux__os__x","text":"","title":"Linux, OS X"},{"location":"login/#scp","text":"SCP (Secure CoPy) is a simple way of transferring files between two machines that use the SSH (Secure SHell) protocol. You may use SCP to connect to any system where you have SSH (log-in) access. These examples show how to use scp from the command line. Graphical programs exist for doing scp transfer. The command-line scp program should already be installed. Remote to local Transfer a file from Kebnekaise to your local system, while on your local system scp username@kebnekaise.hpc2n.umu.se:file . Local to remote Transfer a local file to Kebnekaise, while on your local system scp file username@kebnekaise.hpc2n.umu.se:file Recursive directory copy from a local system to a remote system The directory sourcedirectory is here copied as a subdirectory to somedir scp -r sourcedirectory/ username@kebnekaise.hpc2n.umu.se:somedir/","title":"scp"},{"location":"login/#sftp","text":"SFTP (SSH File Transfer Protocol or sometimes called Secure File Transfer Protocol) is a network protocol that provides file transfer over a reliable data stream. SFTP is a command-line program on most Unix, Linux, and Mac OS X systems. It is also available as a protocol choice in some graphical file transfer programs. Example: From a local system to a remote system enterprise-d [ ~ ] $ sftp user@kebnekaise.hpc2n.umu.se Connecting to kebnekaise.hpc2n.umu.se... user@kebnekaise.hpc2n.umu.se's password: sftp> put file.c C/file.c Uploading file.c to /home/u/user/C/file.c file.c 100% 1 0.0KB/s 00:00 sftp> put -P irf.png pic/ Uploading irf.png to /home/u/user/pic/irf.png irf.png 100% 2100 2.1KB/s 00:00 sftp>","title":"sftp"},{"location":"login/#windows","text":"Here you need to download a client: WinSCP, FileZilla (sftp), PSCP/PSFTP, \u2026 You can transfer with sftp or scp.
There is documentation in HPC2N\u2019s documentation pages for Windows file transfers .","title":"Windows"},{"location":"login/#editors","text":"Since the editors on a Linux system are different from those you may be familiar with from Windows or macOS, here follows a short overview. There are command-line editors and graphical editors. If you are connecting with a regular SSH client, it will be simplest to use a command-line editor. If you are using ThinLinc, you can use command-line editors or graphical editors as you want.","title":"Editors"},{"location":"login/#command-line","text":"These are all good editors for using on the command line: nano vi , vim emacs They are all installed on Kebnekaise. Of these, vi/vim as well as emacs are probably the most powerful, though the latter is better in a GUI environment. The easiest editor to use if you are not familiar with any of them is nano . Nano Starting \u201cnano\u201d: Type nano FILENAME on the command line and press Enter . FILENAME is whatever you want to call your file. If FILENAME is a file that already exists, nano will open the file. If it does not exist, it will be created. You now get an editor that looks like this: The first thing to notice is that many of the commands are listed at the bottom. The ^ before the letter-commands means you should press CTRL and then the letter (while keeping CTRL down). Your prompt is in the editor window itself, and you can just type (or copy and paste) the content you want in your file. When you want to exit (and possibly save), you press CTRL and then x while holding CTRL down (this is written CTRL-x or ^x ). nano will ask you if you want to save the content of the buffer to the file. After that it will exit. There is a manual for nano here .","title":"Command-line"},{"location":"login/#gui","text":"If you are connecting with ThinLinc , you will be presented with a graphical user interface (GUI).
From there you can either open a terminal window/shell ( Applications -> System Tools -> MATE Terminal ) or you can choose editors from the menu by going to Applications -> Accessories . This gives several editor options, of which these have a graphical interface: Text Editor (gedit) Pluma - the default editor on the MATE desktop environment (that ThinLinc runs) Atom - not just an editor, but an IDE Emacs (GUI) NEdit \u201cNirvana Text Editor\u201d If you are not familiar with any of these, a good recommendation would be to use Text Editor/gedit . Text Editor/gedit Starting \u201c gedit \u201d: From the menu, choose Applications -> Accessories -> Text Editor . You then get a window that looks like this: You can open files by clicking \u201c Open \u201d in the top menu. Clicking the small file icon with a green plus will create a new document. Save by clicking \u201c Save \u201d in the menu. The menu on the top right (the three horizontal lines) gives you several other options, including \u201c Find \u201d and \u201c Find and Replace \u201d. Keypoints You can log in with ThinLinc or another SSH client ThinLinc is easiest if you need a GUI There are several command-line editors: vi/vim, nano, emacs, \u2026 And several GUI editors, which work best when using ThinLinc: gedit, pluma, atom, emacs (gui), nedit, \u2026","title":"GUI"},{"location":"modules/","text":"The Module System (Lmod) \u00b6 Objectives Learn the basics of the module system which is used to access most of the software on Kebnekaise Try some of the most used commands for the module system: find/list software modules load/unload software modules Learn about compiler toolchains Most programs are accessed by first loading them as a \u2018module\u2019. Modules are: used to set up your environment (paths to executables, libraries, etc.)
for using a particular (set of) software package(s) a tool to help users manage their Unix/Linux shell environment, allowing groups of related environment-variable settings to be made or removed dynamically a way to have multiple versions of a program or package available by just loading the proper module installed in a hierarchical layout. This means that some modules are only available after loading a specific compiler and/or MPI version. Useful commands (Lmod) \u00b6 See which modules exist: module spider or ml spider See which versions exist of a specific module: module spider MODULE or ml spider MODULE See prerequisites and how to load a specific version of a module: module spider MODULE/VERSION or ml spider MODULE/VERSION List modules depending only on what is currently loaded: module avail or ml av See which modules are currently loaded: module list or ml Loading a module: module load MODULE or ml MODULE Loading a specific version of a module: module load MODULE/VERSION or ml MODULE/VERSION Unload a module: module unload MODULE or ml -MODULE Get more information about a module: ml show MODULE or module show MODULE Unload all modules except the \u2018sticky\u2019 modules: module purge or ml purge Important! Not all the modules (and versions) are the same on the skylake/broadwell nodes and the zen3/zen4 nodes. The regular login node kebnekaise.hpc2n.umu.se has the modules available on skylake/broadwell nodes. (ThinLinc: kebnekaise-tl.hpc2n.umu.se ) In order to check if a module is available on the zen3/zen4 nodes, log in to kebnekaise-amd.hpc2n.umu.se . (ThinLinc: kebnekaise-amd-tl.hpc2n.umu.se ). Hint Code-along!
Example: checking which versions exist of the module \u2018Python\u2019 on the regular login node b-an01 [ ~ ] $ ml spider Python --------------------------------------------------------------------------------------------------------- Python: --------------------------------------------------------------------------------------------------------- Description: Python is a programming language that lets you work more quickly and integrate your systems more effectively. Versions: Python/2.7.15 Python/2.7.16 Python/2.7.18-bare Python/2.7.18 Python/3.7.2 Python/3.7.4 Python/3.8.2 Python/3.8.6 Python/3.9.5-bare Python/3.9.5 Python/3.9.6-bare Python/3.9.6 Python/3.10.4-bare Python/3.10.4 Python/3.10.8-bare Python/3.10.8 Python/3.11.3 Python/3.11.5 Other possible modules matches: Biopython Boost.Python Brotli-python GitPython IPython Python-bundle-PyPI flatbuffers-python ... --------------------------------------------------------------------------------------------------------- To find other possible module matches execute: $ module -r spider '.*Python.*' --------------------------------------------------------------------------------------------------------- For detailed information about a specific \"Python\" package ( including how to load the modules ) use the module ' s full name. Note that names that have a trailing ( E ) are extensions provided by other modules. 
For example: $ module spider Python/3.11.5 --------------------------------------------------------------------------------------------------------- b-an01 [ ~ ] $ Example: Check how to load a specific Python version (3.11.5 in this example) on the regular login node b-an01 [ ~ ] $ ml spider Python/3.11.5 --------------------------------------------------------------------------------------------------------- Python: Python/3.11.5 --------------------------------------------------------------------------------------------------------- Description: Python is a programming language that lets you work more quickly and integrate your systems more effectively. You will need to load all module ( s ) on any one of the lines below before the \"Python/3.11.5\" module is available to load. GCCcore/13.2.0 This module provides the following extensions: flit_core/3.9.0 ( E ) , packaging/23.2 ( E ) , pip/23.2.1 ( E ) , setuptools-scm/8.0.4 ( E ) , setuptools/68.2.2 ( E ) , tomli/2.0.1 ( E ) , typing_extensions/4.8.0 ( E ) , wheel/0.41.2 ( E ) Help: Description =========== Python is a programming language that lets you work more quickly and integrate your systems more effectively. More information ================ - Homepage: https://python.org/ Included extensions =================== flit_core-3.9.0, packaging-23.2, pip-23.2.1, setuptools-68.2.2, setuptools- scm-8.0.4, tomli-2.0.1, typing_extensions-4.8.0, wheel-0.41.2 b-an01 [ ~ ] $ Example: Load Python/3.11.5 and its prerequisite(s) (on the regular login node) Here we also show the loaded module before and after the load. 
For illustration, we use first ml and then module list : b-an01 [ ~ ] $ ml Currently Loaded Modules: 1 ) snicenvironment ( S ) 2 ) systemdefault ( S ) Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ module load GCCcore/13.2.0 Python/3.11.5 b-an01 [ ~ ] $ module list Currently Loaded Modules: 1 ) snicenvironment ( S ) 4 ) zlib/1.2.13 7 ) ncurses/6.4 10 ) SQLite/3.43.1 13 ) OpenSSL/1.1 2 ) systemdefault ( S ) 5 ) binutils/2.40 8 ) libreadline/8.2 11 ) XZ/5.4.4 14 ) Python/3.11.5 3 ) GCCcore/13.2.0 6 ) bzip2/1.0.8 9 ) Tcl/8.6.13 12 ) libffi/3.4.4 Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ Example: Unloading the module Python/3.11.5 (on the regular login node) In this example we unload the module Python/3.11.5 , but not the prerequisite GCCcore/13.2.0 . We also look at the output of module list before and after. b-an01 [ ~ ] $ module list Currently Loaded Modules: 1 ) snicenvironment ( S ) 4 ) zlib/1.2.13 7 ) ncurses/6.4 10 ) SQLite/3.43.1 13 ) OpenSSL/1.1 2 ) systemdefault ( S ) 5 ) binutils/2.40 8 ) libreadline/8.2 11 ) XZ/5.4.4 14 ) Python/3.11.5 3 ) GCCcore/13.2.0 6 ) bzip2/1.0.8 9 ) Tcl/8.6.13 12 ) libffi/3.4.4 Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ ml unload Python/3.11.5 b-an01 [ ~ ] $ module list Currently Loaded Modules: 1 ) snicenvironment ( S ) 2 ) systemdefault ( S ) 3 ) GCCcore/13.2.0 Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ As you can see, the prerequisite did not get unloaded. This is on purpose, because you may have other things loaded which uses the prerequisite. Example: unloading every module you have loaded, with module purge except the \u2018sticky\u2019 modules (some needed things for the environment) (on the regular login node) First we load some modules. Here Python 3.11.5, SciPy-bundle, and prerequisites for them. We also do module list after loading the modules and after using module purge . 
b-an01 [ ~ ] $ ml GCC/13.2.0 b-an01 [ ~ ] $ ml Python/3.11.5 ml SciPy-bundle/2023.11 b-an01 [ ~ ] $ ml list Currently Loaded Modules: 1 ) snicenvironment ( S ) 7 ) bzip2/1.0.8 13 ) libffi/3.4.4 19 ) cffi/1.15.1 2 ) systemdefault ( S ) 8 ) ncurses/6.4 14 ) OpenSSL/1.1 20 ) cryptography/41.0.5 3 ) GCCcore/13.2.0 9 ) libreadline/8.2 15 ) Python/3.11.5 21 ) virtualenv/20.24.6 4 ) zlib/1.2.13 10 ) Tcl/8.6.13 16 ) OpenBLAS/0.3.24 22 ) Python-bundle-PyPI/2023.10 5 ) binutils/2.40 11 ) SQLite/3.43.1 17 ) FlexiBLAS/3.3.1 23 ) pybind11/2.11.1 6 ) GCC/13.2.0 12 ) XZ/5.4.4 18 ) FFTW/3.3.10 24 ) SciPy-bundle/2023.11 Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ ml purge The following modules were not unloaded: ( Use \"module --force purge\" to unload all ) : 1 ) snicenvironment 2 ) systemdefault b-an01 [ ~ ] $ ml list Currently Loaded Modules: 1 ) snicenvironment ( S ) 2 ) systemdefault ( S ) Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ Note You can do several module load on the same line. Or you can do them one at a time, as you want. The modules have to be loaded in order! You cannot list the prerequisite after the module that needs it! One advantage to loading modules one at a time is that you can then find compatible modules that depend on that version easily. Example: you have loaded GCC/13.2.0 and Python/3.11.5 . You can now do ml av to see which versions of other modules you want to load, say SciPy-bundle, are compatible. If you know the name of the module you want, you can even start writing module load SciPy-bundle/ and press TAB - the system will then autocomplete to the compatible one(s). Exercise Login to kebnekaise-amd (can be easily done with ssh kebnekaise-amd from a terminal window on the regular login node). Check if the versions of Python available differs from on the regular login node. 
Compiler Toolchains \u00b6 Compiler toolchains load bundles of software making up a complete environment for compiling/using specific prebuilt software. Includes some/all of: compiler suite, MPI, BLAS, LAPACK, ScaLAPACK, FFTW, CUDA. Some currently available toolchains (check ml av for versions and full, updated list): GCC : GCC only gcccuda : GCC and CUDA foss : GCC, OpenMPI, OpenBLAS/LAPACK, FFTW, ScaLAPACK gompi : GCC, OpenMPI gompic : GCC, OpenMPI, CUDA gomkl : GCC, OpenMPI, MKL iccifort : icc, ifort iccifortcuda : icc, ifort, CUDA iimpi : icc, ifort, IntelMPI iimpic : iccifort, CUDA, impi intel : icc, ifort, IntelMPI, IntelMKL intel-compilers : icc, ifort (classic and oneAPI) intelcuda : intel and CUDA iompi : iccifort and OpenMPI Exercise Check which versions of the foss toolchain exist. Load one of them. Check which modules you now have loaded. Remove all the (non-sticky) modules. Keypoints The software on Kebnekaise is mostly accessed through the module system. The modules are arranged in a hierarchical layout; many modules have prerequisites that need to be loaded first.
Important commands for the module system: Loading: module load MODULE Unloading: module unload MODULE Unload all modules: module purge List all modules in the system: module spider List versions available of a specific module: module spider MODULE Show how to load a specific module and version: module spider MODULE/VERSION List the modules you have currently loaded: module list Compiler toolchains are modules containing compiler suites and various libraries More information There is more information about the module system and how to work with it in HPC2N\u2019s documentation for the modules system .","title":"The Module System"},{"location":"modules/#the__module__system__lmod","text":"Objectives Learn the basics of the module system which is used to access most of the software on Kebnekaise Try some of the most used commands for the module system: find/list software modules load/unload software modules Learn about compiler toolchains Most programs are accessed by first loading them as a \u2018module\u2019. Modules are: used to set up your environment (paths to executables, libraries, etc.) for using a particular (set of) software package(s) a tool to help users manage their Unix/Linux shell environment, allowing groups of related environment-variable settings to be made or removed dynamically a way to have multiple versions of a program or package available by just loading the proper module installed in a hierarchical layout. 
This means that some modules are only available after loading a specific compiler and/or MPI version.","title":"The Module System (Lmod)"},{"location":"modules/#useful__commands__lmod","text":"See which modules exist: module spider or ml spider See which versions exist of a specific module: module spider MODULE or ml spider MODULE See prerequisites and how to load a specific version of a module: module spider MODULE/VERSION or ml spider MODULE/VERSION List the modules that can be loaded given what is currently loaded: module avail or ml av See which modules are currently loaded: module list or ml Load a module: module load MODULE or ml MODULE Load a specific version of a module: module load MODULE/VERSION or ml MODULE/VERSION Unload a module: module unload MODULE or ml -MODULE Get more information about a module: ml show MODULE or module show MODULE Unload all modules except the \u2018sticky\u2019 modules: module purge or ml purge Important! Not all the modules (and versions) are the same on the skylake/broadwell nodes and the zen3/zen4 nodes. The regular login node kebnekaise.hpc2n.umu.se has the modules available on skylake/broadwell nodes. (ThinLinc: kebnekaise-tl.hpc2n.umu.se ) In order to check if a module is available on the zen3/zen4 nodes, log in to kebnekaise-amd.hpc2n.umu.se . (ThinLinc: kebnekaise-amd-tl.hpc2n.umu.se ). Hint Code-along! Example: checking which versions exist of the module \u2018Python\u2019 on the regular login node b-an01 [ ~ ] $ ml spider Python --------------------------------------------------------------------------------------------------------- Python: --------------------------------------------------------------------------------------------------------- Description: Python is a programming language that lets you work more quickly and integrate your systems more effectively. 
Versions: Python/2.7.15 Python/2.7.16 Python/2.7.18-bare Python/2.7.18 Python/3.7.2 Python/3.7.4 Python/3.8.2 Python/3.8.6 Python/3.9.5-bare Python/3.9.5 Python/3.9.6-bare Python/3.9.6 Python/3.10.4-bare Python/3.10.4 Python/3.10.8-bare Python/3.10.8 Python/3.11.3 Python/3.11.5 Other possible modules matches: Biopython Boost.Python Brotli-python GitPython IPython Python-bundle-PyPI flatbuffers-python ... --------------------------------------------------------------------------------------------------------- To find other possible module matches execute: $ module -r spider '.*Python.*' --------------------------------------------------------------------------------------------------------- For detailed information about a specific \"Python\" package ( including how to load the modules ) use the module ' s full name. Note that names that have a trailing ( E ) are extensions provided by other modules. For example: $ module spider Python/3.11.5 --------------------------------------------------------------------------------------------------------- b-an01 [ ~ ] $ Example: Check how to load a specific Python version (3.11.5 in this example) on the regular login node b-an01 [ ~ ] $ ml spider Python/3.11.5 --------------------------------------------------------------------------------------------------------- Python: Python/3.11.5 --------------------------------------------------------------------------------------------------------- Description: Python is a programming language that lets you work more quickly and integrate your systems more effectively. You will need to load all module ( s ) on any one of the lines below before the \"Python/3.11.5\" module is available to load. 
GCCcore/13.2.0 This module provides the following extensions: flit_core/3.9.0 ( E ) , packaging/23.2 ( E ) , pip/23.2.1 ( E ) , setuptools-scm/8.0.4 ( E ) , setuptools/68.2.2 ( E ) , tomli/2.0.1 ( E ) , typing_extensions/4.8.0 ( E ) , wheel/0.41.2 ( E ) Help: Description =========== Python is a programming language that lets you work more quickly and integrate your systems more effectively. More information ================ - Homepage: https://python.org/ Included extensions =================== flit_core-3.9.0, packaging-23.2, pip-23.2.1, setuptools-68.2.2, setuptools- scm-8.0.4, tomli-2.0.1, typing_extensions-4.8.0, wheel-0.41.2 b-an01 [ ~ ] $ Example: Load Python/3.11.5 and its prerequisite(s) (on the regular login node) Here we also show the loaded module before and after the load. For illustration, we use first ml and then module list : b-an01 [ ~ ] $ ml Currently Loaded Modules: 1 ) snicenvironment ( S ) 2 ) systemdefault ( S ) Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ module load GCCcore/13.2.0 Python/3.11.5 b-an01 [ ~ ] $ module list Currently Loaded Modules: 1 ) snicenvironment ( S ) 4 ) zlib/1.2.13 7 ) ncurses/6.4 10 ) SQLite/3.43.1 13 ) OpenSSL/1.1 2 ) systemdefault ( S ) 5 ) binutils/2.40 8 ) libreadline/8.2 11 ) XZ/5.4.4 14 ) Python/3.11.5 3 ) GCCcore/13.2.0 6 ) bzip2/1.0.8 9 ) Tcl/8.6.13 12 ) libffi/3.4.4 Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ Example: Unloading the module Python/3.11.5 (on the regular login node) In this example we unload the module Python/3.11.5 , but not the prerequisite GCCcore/13.2.0 . We also look at the output of module list before and after. 
b-an01 [ ~ ] $ module list Currently Loaded Modules: 1 ) snicenvironment ( S ) 4 ) zlib/1.2.13 7 ) ncurses/6.4 10 ) SQLite/3.43.1 13 ) OpenSSL/1.1 2 ) systemdefault ( S ) 5 ) binutils/2.40 8 ) libreadline/8.2 11 ) XZ/5.4.4 14 ) Python/3.11.5 3 ) GCCcore/13.2.0 6 ) bzip2/1.0.8 9 ) Tcl/8.6.13 12 ) libffi/3.4.4 Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ ml unload Python/3.11.5 b-an01 [ ~ ] $ module list Currently Loaded Modules: 1 ) snicenvironment ( S ) 2 ) systemdefault ( S ) 3 ) GCCcore/13.2.0 Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ As you can see, the prerequisite did not get unloaded. This is on purpose, because you may have other things loaded that use the prerequisite. Example: unloading every module you have loaded with module purge , except the \u2018sticky\u2019 modules (which the environment needs) (on the regular login node) First we load some modules. Here we load Python 3.11.5, SciPy-bundle, and their prerequisites. We also do module list after loading the modules and after using module purge . 
b-an01 [ ~ ] $ ml GCC/13.2.0 b-an01 [ ~ ] $ ml Python/3.11.5 ml SciPy-bundle/2023.11 b-an01 [ ~ ] $ ml list Currently Loaded Modules: 1 ) snicenvironment ( S ) 7 ) bzip2/1.0.8 13 ) libffi/3.4.4 19 ) cffi/1.15.1 2 ) systemdefault ( S ) 8 ) ncurses/6.4 14 ) OpenSSL/1.1 20 ) cryptography/41.0.5 3 ) GCCcore/13.2.0 9 ) libreadline/8.2 15 ) Python/3.11.5 21 ) virtualenv/20.24.6 4 ) zlib/1.2.13 10 ) Tcl/8.6.13 16 ) OpenBLAS/0.3.24 22 ) Python-bundle-PyPI/2023.10 5 ) binutils/2.40 11 ) SQLite/3.43.1 17 ) FlexiBLAS/3.3.1 23 ) pybind11/2.11.1 6 ) GCC/13.2.0 12 ) XZ/5.4.4 18 ) FFTW/3.3.10 24 ) SciPy-bundle/2023.11 Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ ml purge The following modules were not unloaded: ( Use \"module --force purge\" to unload all ) : 1 ) snicenvironment 2 ) systemdefault b-an01 [ ~ ] $ ml list Currently Loaded Modules: 1 ) snicenvironment ( S ) 2 ) systemdefault ( S ) Where: S: Module is Sticky, requires --force to unload or purge b-an01 [ ~ ] $ Note You can do several module load on the same line. Or you can do them one at a time, as you want. The modules have to be loaded in order! You cannot list the prerequisite after the module that needs it! One advantage to loading modules one at a time is that you can then find compatible modules that depend on that version easily. Example: you have loaded GCC/13.2.0 and Python/3.11.5 . You can now do ml av to see which versions of other modules you want to load, say SciPy-bundle, are compatible. If you know the name of the module you want, you can even start writing module load SciPy-bundle/ and press TAB - the system will then autocomplete to the compatible one(s). Exercise Login to kebnekaise-amd (can be easily done with ssh kebnekaise-amd from a terminal window on the regular login node). 
Check whether the available versions of Python differ from those on the regular login node.","title":"Useful commands (Lmod)"},{"location":"modules/#compiler__toolchains","text":"Compiler toolchains load bundles of software making up a complete environment for compiling/using specific prebuilt software. They include some/all of: compiler suite, MPI, BLAS, LAPACK, ScaLAPACK, FFTW, CUDA. Some currently available toolchains (check ml av for versions and full, updated list): GCC : GCC only gcccuda : GCC and CUDA foss : GCC, OpenMPI, OpenBLAS/LAPACK, FFTW, ScaLAPACK gompi : GCC, OpenMPI gompic : GCC, OpenMPI, CUDA gomkl : GCC, OpenMPI, MKL iccifort : icc, ifort iccifortcuda : icc, ifort, CUDA iimpi : icc, ifort, IntelMPI iimpic : iccifort, CUDA, impi intel : icc, ifort, IntelMPI, IntelMKL intel-compilers : icc, ifort (classic and oneAPI) intelcuda : intel and CUDA iompi : iccifort and OpenMPI Exercise Check which versions of the foss toolchain exist. Load one of them. Check which modules you now have loaded. Remove all the (non-sticky) modules. Keypoints The software on Kebnekaise is mostly accessed through the module system. The modules are arranged in a hierarchical layout; many modules have prerequisites that need to be loaded first. 
Important commands to the module system: Loading: module load MODULE Unloading: module unload MODULE Unload all modules: module purge List all modules in the system: module spider List versions available of a specific module: module spider MODULE Show how to load a specific module and version: module spider MODULE/VERSION List the modules you have currently loaded: module list Compiler toolchains are modules containing compiler suites and various libraries More information There is more information about the module system and how to work with it in HPC2N\u2019s documentation for the modules system .","title":"Compiler Toolchains"},{"location":"projectsaccounts/","text":"Projects - compute and storage \u00b6 Note In order to have an account at HPC2N, you need to be a member of a compute project. You can either join a project or apply for one yourself (if you fulfill the requirements). There are both storage projects and compute projects. The storage projects are for when the amount of storage included with the compute project is not enough. Important You cannot have a storage project without a compute project! Kebnekaise is only open for local project requests! The PI must be affiliated with UmU, LTU, IRF, MiUN, or SLU. You can still add members (join) from anywhere. Application process \u00b6 Apply for compute projects in SUPR . Login to SUPR (create SUPR account if you do not have one). Click \u201cRounds\u201d in the left menu. Pick \u201cCompute Rounds\u201d. Pick \u201cCentre Local Compute\u201d. Pick \u201cHPC2N Local Compute YYYY\u201d. Choose \u201cCreate New Proposal for HPC2N Local Compute YYYY\u201d. Create from scratch or use earlier proposal as starting point. Agree to the default storage if 500GB is enough. 
More information: https://supr.naiss.se/round/open_or_pending_type/?type=Centre+Local+Compute If the above mentioned default storage is not enough, you will need to apply for a Local storage project : https://supr.naiss.se/round/open_or_pending_type/?type=Centre+Local+Storage Info By default, you have 25GB in your home directory. If you need more, you/your PI can accept the \u201cdefault storage\u201d you will be offered after applying for compute resources. The default storage is 500GB. If you need more than that, you/your PI will have to apply for a storage project. When you have both, link them together. It is done from the storage project. This way all members of the compute project also become members of the storage project. After applying on SUPR, the project(s) will be reviewed. Linking a compute project to a storage project \u00b6 1. Before linking (SUPR): 2. Pick a compute project to link: 3. Showing linked projects: 4. Members of the storage project after linking: Accounts \u00b6 When you have a project / have become a member of a project, you can apply for an account at HPC2N. This is done in SUPR, under \u201cAccounts\u201d: https://supr.naiss.se/account/ . Your account request will be processed within a week. You will then get an email with information about logging in and links to getting-started information. More information on the account process can be found on HPC2N\u2019s documentation pages: https://www.hpc2n.umu.se/documentation/access-and-accounts/users","title":"Projects and Accounts"},{"location":"projectsaccounts/#projects__-__compute__and__storage","text":"Note In order to have an account at HPC2N, you need to be a member of a compute project. You can either join a project or apply for one yourself (if you fulfill the requirements). There are both storage projects and compute projects. The storage projects are for when the amount of storage included with the compute project is not enough. 
Important You cannot have a storage project without a compute project! Kebnekaise is only open for local project requests! The PI must be affiliated with UmU, LTU, IRF, MiUN, or SLU. You can still add members (join) from anywhere.","title":"Projects - compute and storage"},{"location":"projectsaccounts/#application__process","text":"Apply for compute projects in SUPR . Login to SUPR (create SUPR account if you do not have one). Click \u201cRounds\u201d in the left menu. Pick \u201cCompute Rounds\u201d. Pick \u201cCentre Local Compute\u201d. Pick \u201cHPC2N Local Compute YYYY\u201d. Choose \u201cCreate New Proposal for HPC2N Local Compute YYYY\u201d. Create from scratch or use earlier proposal as starting point. Agree to the default storage if 500GB is enough. More information: https://supr.naiss.se/round/open_or_pending_type/?type=Centre+Local+Compute If the above mentioned default storage is not enough, you will need to apply for a Local storage project : https://supr.naiss.se/round/open_or_pending_type/?type=Centre+Local+Storage Info As default, you have 25GB in your home directory. If you need more, you/your PI can accept the \u201cdefault storage\u201d you will be offered after applying for compute resources. The default storage is 500GB. If you need more than that, you/your PI will have to apply for a storage project. When you have both, link them together. It is done from the storage project. This way all members of the compute project also becomes members of the storage project. After applying on SUPR, the project(s) will be reviewed.","title":"Application process"},{"location":"projectsaccounts/#linking__a__compute__project__to__a__storage__project","text":"Before linking (SUPR): 2. Pick a compute project to link: 3. Showing linked projects: 4. 
Members of the storage project after linking:","title":"Linking a compute project to a storage project"},{"location":"projectsaccounts/#accounts","text":"When you have a project / have become member of a project, you can apply for an account at HPC2N. This is done in SUPR, under \u201cAccounts\u201d: https://supr.naiss.se/account/ . Your account request will be processed within a week. You will then get an email with information about logging in and links to getting started information. More information on the account process can be found on HPC2N\u2019s documentation pages: https://www.hpc2n.umu.se/documentation/access-and-accounts/users","title":"Accounts"},{"location":"simple/","text":"Simple batch script examples \u00b6 Objectives See and try out different types of simple batch script examples. Try using constraints: how to allocate specific CPUs. Try using constraints: how to allocate specific GPUs. For consistency, I have given all the example batch scripts the suffix .sh even though it is not required. Another commonly used suffix is .batch , but any or none will work. You need to compile any programs mentioned in a batch script in order to run the examples, except for compile-run.sh and the CUDA examples, which includes compilation. Important The course project has the following project ID: hpc2n2024-084 In order to use it in a batch job, add this to the batch script: #SBATCH -A hpc2n2024-084 We have a storage project linked to the compute project: intro-hpc2n . You find it in /proj/nobackup/intro-hpc2n . Remember to create your own directory under it. Hint Try to change the C programs, add different programs, and in general play around with the examples! Note For these test examples I would suggest using the foss compiler toolchain, version 2022b, unless otherwise specified. If you decide to use a different one, you will have to make changes to some of the batch scripts. 
To submit a job script, do sbatch JOBSCRIPT In most of the examples, I name the executable when I compile. The flag -o tells the compiler you want to name the executable. If you omit the flag and a name, you will get an executable named a.out . Of course, you do not have to name the executable hello . This is just an example. In general, I have named all the executables the same as the program (without the suffix). Serial batch job \u00b6 To compile a serial program, like hello.c , with gcc, do: gcc hello.c -o hello Sample batch script (hello.sh) #!/bin/bash # Project id - change to your own after the course! #SBATCH -A hpc2n2024-084 # Asking for 1 core #SBATCH -n 1 # Asking for a walltime of 1 min #SBATCH --time=00:01:00 # Purge modules before loading new ones in a script. ml purge > /dev/null 2>&1 ml foss/2022b ./hello Exercise: serial job Submit the job with sbatch . Check on it with squeue --me . Take a look at the output ( slurm-JOBID.out ) with nano or your favourite editor. MPI batch job \u00b6 To compile an MPI program, like mpi_hello.c (and create an executable named mpi_hello ) with gcc, do: mpicc mpi_hello.c -o mpi_hello Sample batch script (mpi_hello.sh) #!/bin/bash # Remember to change this to your own Project ID after the course! #SBATCH -A hpc2n2024-084 # Number of tasks - default is 1 core per task #SBATCH -n 14 #SBATCH --time=00:05:00 # It is always a good idea to do ml purge before loading other modules ml purge > /dev/null 2>&1 ml add foss/2022b # Use srun since this is an MPI program srun ./mpi_hello Exercise: MPI job Submit the job with sbatch . Check on it with squeue --me . Take a look at the output ( slurm-JOBID.out ) with nano or your favourite editor. Try running it more than once to see that the order of the tasks is random. 
OpenMP batch job \u00b6 To compile an OpenMP program, like omp_hello.c (and create an executable named omp_hello ) with gcc, do: gcc -fopenmp omp_hello.c -o omp_hello Sample batch script (omp_hello.sh) #!/bin/bash #SBATCH -A hpc2n2024-084 # Number of cores per task #SBATCH -c 28 #SBATCH --time=00:05:00 # It is always a good idea to do ml purge before loading other modules ml purge > /dev/null 2>&1 ml add foss/2022b # Set OMP_NUM_THREADS to the same value as -c with a fallback in case it isn't set. # SLURM_CPUS_PER_TASK is set to the value of -c, but only if -c is explicitly set if [ -n \"$SLURM_CPUS_PER_TASK\" ] ; then omp_threads=$SLURM_CPUS_PER_TASK else omp_threads=1 fi export OMP_NUM_THREADS=$omp_threads ./omp_hello Exercise: OpenMP job Set OMP_NUM_THREADS to some value between 1 and 28 ( export OMP_NUM_THREADS=value ). Submit the job with sbatch . Take a look at the output ( slurm-JOBID.out ) with nano or your favourite editor. Change the value of OMP_NUM_THREADS . Submit it again and check the output to see the change. Multiple serial jobs from same submit file \u00b6 This submit file shows one way of running several programs from inside the same submit file. To run this example, you need to compile the following serial C programs: hello.c Greeting.c Adding2.c Mult2.c When the C programs have been compiled, submit the multiple-serial.sh program: multiple-serial.sh All jobs run at the same time, so you need as many cores as they need combined. You also need to ask for enough time that even the longest of the jobs will finish. Note that here you start each program with srun even though these are serial jobs. You use & to send the job to the background. Also note the wait at the end. If you do not add that, the batch job can end before all the jobs inside it have finished. 
#!/bin/bash #SBATCH -A hpc2n2024-084 # Add enough cores that all jobs can run at the same time #SBATCH -n 5 # Make sure that the time is long enough that the longest job will have time to finish #SBATCH --time=00:05:00 module purge > /dev/null 2 > & 1 ml foss/2022b srun -n 1 --exclusive ./hello & srun -n 1 --exclusive ./Greeting & srun -n 1 --exclusive ./Adding2 10 20 & srun -n 1 --exclusive /bin/hostname & srun -n 1 --exclusive ./Mult2 10 2 wait Exercise: multiple serial jobs Compile the above mentioned programs. Submit the batch script with sbatch multiple-serial.sh If you run it several times you will notice that the order is random. Job arrays \u00b6 Job arrays offer a mechanism for submitting and managing collections of similar jobs. All jobs must have the same initial options (e.g. size, time limit, etc.), however it is possible to change some of these options after the job has begun execution using the scontrol command specifying the JobID of the array or individual ArrayJobID. More information here on the official Slurm documentation pages . To try an example, we have included a small Python script hello-world-array.py and a batch script hello-world-array.sh . Both can also be found in the exercises/simple directory you have cloned. hello-world-array.py # import sys library (we need this for the command line args) import sys # print task number print ( 'Hello world! from task number: ' , sys.argv [ 1 ]) hello-world-array.sh #!/bin/bash # This is a very simple example of how to run a Python script with a job array #SBATCH -A hpc2n2024-084 # Change to your own after the course! #SBATCH --time=00:05:00 # Asking for 5 minutes #SBATCH --array=1-10 # how many tasks in the array #SBATCH -c 1 # Asking for 1 core # one core per task #SBATCH -o hello-world-%j-%a.out # Load any modules you need, here for Python 3.11.3 ml GCC/12.3.0 Python/3.11.3 # Run your Python script srun python hello-world-array.py $SLURM_ARRAY_TASK_ID Exercise: job arrays Submit the batch script. 
Look at the output files. Change the number of tasks in the array. Rerun. See the change. Multiple parallel jobs sequentially \u00b6 To run this example, you need to compile the following parallel C programs: mpi_hello.c mpi_greeting.c mpi_hi.c When the MPI C programs have been compiled, submit the multiple-parallel-sequential.sh program: #!/bin/bash #SBATCH -A hpc2n2024-084 # Since the files are run sequentially I only need enough cores for the largest of them to run #SBATCH -c 28 # Remember to ask for enough time for all jobs to complete #SBATCH --time=00:10:00 module purge > /dev/null 2 > & 1 ml foss/2022b # Here 14 tasks with 2 cores per task. Output to file - not needed if your job creates output in a file directly # In this example I also copy the output somewhere else and then run another executable. srun -n 14 -c 2 ./mpi_hello > myoutput1 2 > & 1 cp myoutput1 mydatadir srun -n 14 -c 2 ./mpi_greeting > myoutput2 2 > & 1 cp myoutput2 mydatadir srun -n 14 -c 2 ./mpi_hi > myoutput3 2 > & 1 cp myoutput3 mydatadir sbatch multiple-parallel-sequential.sh Exercise: multiple parallel jobs sequentially Submit the job: sbatch multiple-parallel-sequential.sh See that output data are thrown to files and copied to the directory mydatadir . Multiple parallel jobs simultaneously \u00b6 To run this example, you need to compile the following parallel C programs: mpi_hello.c mpi_greeting.c mpi_hi.c As before, we recommend using the foss/2022b module for this. If you use a different one you need to change it in the multiple-parallel-simultaneous.sh batch script. 
When the MPI C programs have been compiled, submit the multiple-parallel-simultaneous.sh program: #!/bin/bash #SBATCH -A hpc2n2024-084 # Since the files run simultaneously I need enough cores for all of them to run #SBATCH -n 56 # Remember to ask for enough time for all jobs to complete #SBATCH --time=00:10:00 module purge > /dev/null 2 > & 1 ml foss/2022b srun -n 14 --exclusive ./mpi_hello & srun -n 14 --exclusive ./mpi_greeting & srun -n 14 --exclusive ./mpi_hi & wait Just like for the multiple serial jobs simultaneously example, you need to add wait to make sure the batch job will not finish when the first of the jobs in it finishes. Exercise: multiple parallel jobs simultaneously When you have compiled the needed programs, as mentioned above, submit with sbatch multiple-parallel-simultaneous.sh Compiling and running in the batch job \u00b6 Sometimes you have a program that takes a long time to compile, or that you need to recompile before each run. To see a simple example of compiling and running from the batch job, look at the batch script compile-run.sh . In this case it compiles and runs the mpi_hello.c program. compile-run.sh #!/bin/bash # CHANGE THE PROJECT ID TO YOUR OWN PROJECT ID AFTER THE COURSE! #SBATCH -A hpc2n2024-084 #Name the job, for easier finding in the list #SBATCH -J compiler-run #SBATCH -t 00:10:00 #SBATCH -n 12 ml purge > /dev/null 2 > & 1 ml foss/2022b mpicc mpi_hello.c -o mpi_hello mpirun ./mpi_hello Exercise: compile and run in a batch job This batch script can be submitted directly, without compiling anything first, as that happens in the batch script. Try submitting it with sbatch and see what happens. Which files are created? You could try changing the program it compiles and runs to a different one. Remember to change the compiler if you are not using an MPI program. Getting errors and outputs in separate files \u00b6 As a default, Slurm throws both errors and other output to the same file, named slurm-JOBID.out . 
If you want the errors and other output in separate files, you can do as in the example separate-err-out.sh : #!/bin/bash # Remember to change this to your own Project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH -n 8 #SBATCH --time=00:05:00 # Putting the output in a separate output file and the errors in an # error file instead of putting it all in slurm-JOBID.out # Note the filename pattern %J, which Slurm replaces with the job ID. It is handy to # avoid naming the files the same for different runs, and thus overwriting them. #SBATCH --error=job.%J.err #SBATCH --output=job.%J.out ml purge > /dev/null 2>&1 ml foss/2022b mpirun ./mpi_hello You need the mpi_hello.c file compiled (and the executable named mpi_hello ) for this to run without changes. Of course, you can also just add your own programs. Exercise: errors and outputs in separate files Compile the file mpi_hello.c after loading the module foss/2022b . Submit the job script with sbatch . See that separate output and error files are created. CUDA/GPU programs \u00b6 To run programs/software that uses GPUs, you need to allocate GPUs in the job script. They will not be allocated by your program. To compile a CUDA program, like hello-world.cu , you need to load a toolchain containing the CUDA compilers, or load the CUDA compilers directly. To run a piece of software that uses GPUs, you need to load a module version which is GPU aware. In many cases there are several versions of a module, only some of which are for running on GPUs. Important Remember to check the modules, versions, and prerequisites! Also make sure you check for the correct node type. Some of the GPUs are on Intel nodes (check modules on kebnekaise.hpc2n.umu.se ), some on AMD nodes (check modules on kebnekaise-amd.hpc2n.umu.se ). V100 - Intel Skylake \u00b6 This example runs a small CUDA code. 
We recommend fosscuda/2020b (contains GCC , OpenMPI , OpenBLAS / LAPACK , FFTW , ScaLAPACK , and CUDA ) or intelcuda/2019a (contains icc , ifort , IntelMPI , IntelMKL , and CUDA ) Sample batch script gpu-skylake.sh #!/bin/bash # This job script is for running on 1 V100 GPU. # Remember to change this to your own project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=v100:1 ml purge > /dev/null 2 > & 1 ml fosscuda/2020b nvcc hello-world.cu -o hello ./hello The batch script gpu.sh compiles and runs a small cuda program called hello-world.cu . Exercise: V100 GPU job To submit it, just do: sbatch gpu.sh Use squeue --me or scontrol show job JOBID to see that the job runs in the correct partition/node types. A100 - AMD Zen3 \u00b6 Remember, in order to find the correct modules, as well as compile a program if you need that, you must login to one of the AMD login nodes with either SSH ( kebnekaise-amd.hpc2n.umu.se ) or ThinLinc ( kebnekaise-amd-tl.kebnekaise.hpc2n.umu.se ). The job can be submitted from the regular login node, though. Exercise: login to the AMD login node and find a suitable module If you are logged in to the regular Kebnekaise login node, then you can easiest login to the AMD login node by typing this in a terminal window: ssh kebnekaise-amd.hpc2n.umu.se After that, you check for a suitable CUDA toolchain: ml spider CUDA . You can then load it (here CUDA/11.7.0 ) and use nvcc to compile the program hello-world.cu : ml CUDA/11.7.0 nvcc hello-world.cu -o hello Now logout from the AMD login node again. The batch script gpu-a100.sh compiles and runs a small cuda program called hello-world.cu . Sample A100 GPU job script: gpu-a100.sh #!/bin/bash # Remember to change this to your own project ID after the course! 
#SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=a100:1 ml purge > /dev/null 2>&1 ml CUDA/11.7.0 nvcc hello-world.cu -o hello ./hello Exercise: A100 GPU batch jobs The above script is found in the same directory as the other exercises ( intro-course/exercises/simple ). You can submit it directly: sbatch gpu-a100.sh As with the V100, you are encouraged to use squeue --me and/or scontrol show job JOBID to see that the job gets the correct partition/node type allocated. A40 - Intel broadwell \u00b6 Kebnekaise also has a few of the A40 GPUs. These are placed on Intel broadwell nodes. In order to run on these, you add this to your batch script: #SBATCH --gpus=a40:number where number is 1 or 2 (the number of GPU cards). You can find the available modules on the regular login node, kebnekaise.hpc2n.umu.se . L40s - AMD Zen4 \u00b6 Since these GPUs are located on AMD Zen4 nodes, you need to log in to kebnekaise-amd.hpc2n.umu.se to check available modules. Then, to ask for these GPUs in your batch script, you add: #SBATCH --gpus=l40s:number where number is 1 or 2 (the number of GPU cards). H100 - AMD Zen4 \u00b6 The H100 GPUs are located on AMD Zen4 nodes. You can find the available modules by logging in to kebnekaise-amd.hpc2n.umu.se . You ask for these GPUs in your batch script by adding: #SBATCH --gpus=h100:number where number is 1, 2, 3, or 4 (the number of GPU cards you want to allocate). A6000 - AMD Zen4 \u00b6 The A6000 GPUs are placed on AMD Zen4 nodes. That means you can find the available modules by logging in to kebnekaise-amd.hpc2n.umu.se . To run on these GPUs, add this to your batch script: #SBATCH --gpus=a6000:number where number is 1 or 2 (the number of GPU cards you want to allocate). MI100 - AMD Zen3 \u00b6 The MI100 GPUs are located on AMD Zen3 nodes. You can find the available modules by logging in to kebnekaise-amd.hpc2n.umu.se . 
To allocate MI100 GPUs, add this to your batch script: #SBATCH --gpus=mi100:number where number is 1 or 2 (the number of GPU cards). GPU features \u00b6 Sample batch script for allocating any AMD GPU #!/bin/bash # Remember to change this to your own project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=1 #SBATCH -C amd_gpu ml purge > /dev/null 2>&1 ml CUDA/11.7.0 ./myGPUcode Sample batch script for allocating any Nvidia GPU #!/bin/bash # Remember to change this to your own project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=1 #SBATCH -C nvidia_gpu ml purge > /dev/null 2>&1 ml CUDA/11.7.0 ./myGPUcode Sample batch script for allocating any Nvidia GPU on Intel node #!/bin/bash # Remember to change this to your own project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=1 #SBATCH -C 'nvidia_gpu&intel_cpu' ml purge > /dev/null 2>&1 ml CUDA/11.7.0 ./myGPUcode Sample batch script for allocating any GPU with AI features and on a Zen node #!/bin/bash # Remember to change this to your own project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=1 #SBATCH -C ''zen3|zen4'&GPU_AI' ml purge > /dev/null 2>&1 ml CUDA/11.7.0 ./myGPUcode Exercise: GPU features In order to run these examples, you can change ./myGPUcode to nvcc hello-world.cu -o hello ./hello or any other GPU program of your choice. The gpu-features.sh example script in the exercises/simple directory is prepared for the \u201cany GPU with AI features and on a Zen node\u201d. You can either run it as is, or make changes to it and try any of the other combinations here (or try new combinations yourself). Check with squeue --me which partition/node type the job ends up in, and that it fits. More information can be found with scontrol show job JOBID . Starting JupyterLab \u00b6 On Kebnekaise, it is possible to run JupyterLab. 
This is done through a batch job, and is described in detail on our \u201cJupyter on Kebnekaise\u201d documentation . Keypoints \u00b6 Keypoints How to run serial, MPI, OpenMP, and GPU jobs How to use GPU features How to run several jobs from inside one batch job","title":"Simple examples"},{"location":"simple/#simple__batch__script__examples","text":"Objectives See and try out different types of simple batch script examples. Try using constraints: how to allocate specific CPUs. Try using constraints: how to allocate specific GPUs. For consistency, I have given all the example batch scripts the suffix .sh even though it is not required. Another commonly used suffix is .batch , but any or none will work. You need to compile any programs mentioned in a batch script in order to run the examples, except for compile-run.sh and the CUDA examples, which include compilation. Important The course project has the following project ID: hpc2n2024-084 In order to use it in a batch job, add this to the batch script: #SBATCH -A hpc2n2024-084 We have a storage project linked to the compute project: intro-hpc2n . You find it in /proj/nobackup/intro-hpc2n . Remember to create your own directory under it. Hint Try to change the C programs, add different programs, and in general play around with the examples! Note For these test examples I would suggest using the foss compiler toolchain, version 2022b, unless otherwise specified. If you decide to use a different one, you will have to make changes to some of the batch scripts. To submit a job script, do sbatch JOBSCRIPT In most of the examples, I name the executable when I compile. The flag -o tells the compiler you want to name the executable. If you don\u2019t include that and a name, you will get an executable named a.out . Of course, you do not have to name the executable hello . This is just an example. 
In general, I have named all the executables the same as the program (without the suffix).","title":"Simple batch script examples"},{"location":"simple/#serial__batch__job","text":"To compile a serial program, like hello.c with gcc do: gcc hello.c -o hello Sample batch script (hello.sh) #!/bin/bash # Project id - change to your own after the course! #SBATCH -A hpc2n2024-084 # Asking for 1 core #SBATCH -n 1 # Asking for a walltime of 1 min #SBATCH --time=00:01:00 # Purge modules before loading new ones in a script. ml purge > /dev/null 2>&1 ml foss/2022b ./hello Exercise: serial job Submit the job with sbatch . Check on it with squeue --me . Take a look at the output ( slurm-JOBID.out ) with nano or your favourite editor.","title":"Serial batch job"},{"location":"simple/#mpi__batch__job","text":"To compile an MPI program, like mpi_hello.c (and create an executable named mpi_hello ) with gcc, do: mpicc mpi_hello.c -o mpi_hello Sample batch script (mpi_hello.sh) #!/bin/bash # Remember to change this to your own Project ID after the course! #SBATCH -A hpc2n2024-084 # Number of tasks - default is 1 core per task #SBATCH -n 14 #SBATCH --time=00:05:00 # It is always a good idea to do ml purge before loading other modules ml purge > /dev/null 2>&1 ml add foss/2022b # Use srun since this is an MPI program srun ./mpi_hello Exercise: MPI job Submit the job with sbatch . Check on it with squeue --me . Take a look at the output ( slurm-JOBID.out ) with nano or your favourite editor. 
Try running it more than once to see that the order of the tasks is random.","title":"MPI batch job"},{"location":"simple/#openmp__batch__job","text":"To compile an OpenMP program, like omp_hello.c (and create an executable named omp_hello ) with gcc, do: gcc -fopenmp omp_hello.c -o omp_hello Sample batch script (omp_hello.sh) #!/bin/bash #SBATCH -A hpc2n2024-084 # Number of cores per task #SBATCH -c 28 #SBATCH --time=00:05:00 # It is always a good idea to do ml purge before loading other modules ml purge > /dev/null 2>&1 ml add foss/2022b # Set OMP_NUM_THREADS to the same value as -c with a fallback in case it isn't set. # SLURM_CPUS_PER_TASK is set to the value of -c, but only if -c is explicitly set if [ -n \"$SLURM_CPUS_PER_TASK\" ]; then omp_threads=$SLURM_CPUS_PER_TASK else omp_threads=1 fi export OMP_NUM_THREADS=$omp_threads ./omp_hello Exercise: OpenMP job Set OMP_NUM_THREADS to some value between 1 and 28 ( export OMP_NUM_THREADS=value ). Submit the job with sbatch . Take a look at the output ( slurm-JOBID.out ) with nano or your favourite editor. Change the value of OMP_NUM_THREADS . Submit it again and check on the output to see the change.","title":"OpenMP batch job"},{"location":"simple/#multiple__serial__jobs__from__same__submit__file","text":"This submit file shows one way of running several programs from inside the same submit file. To run this example, you need to compile the following serial C programs: hello.c Greeting.c Adding2.c Mult2.c When the C programs have been compiled, submit the multiple-serial.sh program: multiple-serial.sh All jobs run at the same time, so you need as many cores as they need combined. You also need to ask for long enough time that even the longest of the jobs will finish. Note that here you submit with srun even though these are serial jobs. You use & to send the job to the background. Also note the wait at the end. If you do not add that, the whole batch job will finish when the first of the jobs inside ends. 
#!/bin/bash #SBATCH -A hpc2n2024-084 # Add enough cores that all jobs can run at the same time #SBATCH -n 5 # Make sure that the time is long enough that the longest job will have time to finish #SBATCH --time=00:05:00 module purge > /dev/null 2>&1 ml foss/2022b srun -n 1 --exclusive ./hello & srun -n 1 --exclusive ./Greeting & srun -n 1 --exclusive ./Adding2 10 20 & srun -n 1 --exclusive /bin/hostname & srun -n 1 --exclusive ./Mult2 10 2 wait Exercise: multiple serial jobs Compile the above mentioned programs. Submit the batch script with sbatch multiple-serial.sh If you run it several times you will notice that the order is random.","title":"Multiple serial jobs from same submit file"},{"location":"simple/#job__arrays","text":"Job arrays offer a mechanism for submitting and managing collections of similar jobs. All jobs must have the same initial options (e.g. size, time limit, etc.), however it is possible to change some of these options after the job has begun execution using the scontrol command specifying the JobID of the array or individual ArrayJobID. More information here on the official Slurm documentation pages . To try an example, we have included a small Python script hello-world-array.py and a batch script hello-world-array.sh . Both can also be found in the exercises/simple directory you have cloned. hello-world-array.py # import sys library (we need this for the command line args) import sys # print task number print ( 'Hello world! from task number: ' , sys.argv [ 1 ]) hello-world-array.sh #!/bin/bash # This is a very simple example of how to run a Python script with a job array #SBATCH -A hpc2n2024-084 # Change to your own after the course! 
#SBATCH --time=00:05:00 # Asking for 5 minutes #SBATCH --array=1-10 # how many tasks in the array #SBATCH -c 1 # Asking for 1 core # one core per task #SBATCH -o hello-world-%j-%a.out # Load any modules you need, here for Python 3.11.3 ml GCC/12.3.0 Python/3.11.3 # Run your Python script srun python hello-world-array.py $SLURM_ARRAY_TASK_ID Exercise: job arrays Submit the batch script. Look at the output files. Change the number of tasks in the array. Rerun. See the change.","title":"Job arrays"},{"location":"simple/#multiple__parallel__jobs__sequentially","text":"To run this example, you need to compile the following parallel C programs: mpi_hello.c mpi_greeting.c mpi_hi.c When the MPI C programs have been compiled, submit the multiple-parallel-sequential.sh program: #!/bin/bash #SBATCH -A hpc2n2024-084 # Since the files are run sequentially I only need enough cores for the largest of them to run #SBATCH -c 28 # Remember to ask for enough time for all jobs to complete #SBATCH --time=00:10:00 module purge > /dev/null 2>&1 ml foss/2022b # Here 14 tasks with 2 cores per task. Output to file - not needed if your job creates output in a file directly # In this example I also copy the output somewhere else and then run another executable. srun -n 14 -c 2 ./mpi_hello > myoutput1 2>&1 cp myoutput1 mydatadir srun -n 14 -c 2 ./mpi_greeting > myoutput2 2>&1 cp myoutput2 mydatadir srun -n 14 -c 2 ./mpi_hi > myoutput3 2>&1 cp myoutput3 mydatadir sbatch multiple-parallel-sequential.sh Exercise: multiple parallel jobs sequentially Submit the job: sbatch multiple-parallel-sequential.sh See that output data are thrown to files and copied to the directory mydatadir .","title":"Multiple parallel jobs sequentially"},{"location":"simple/#multiple__parallel__jobs__simultaneously","text":"To run this example, you need to compile the following parallel C programs: mpi_hello.c mpi_greeting.c mpi_hi.c As before, we recommend using the foss/2022b module for this. 
If you use a different one you need to change it in the multiple-parallel-simultaneous.sh batch script. When the MPI C programs have been compiled, submit the multiple-parallel-simultaneous.sh program: #!/bin/bash #SBATCH -A hpc2n2024-084 # Since the files run simultaneously I need enough cores for all of them to run #SBATCH -n 56 # Remember to ask for enough time for all jobs to complete #SBATCH --time=00:10:00 module purge > /dev/null 2>&1 ml foss/2022b srun -n 14 --exclusive ./mpi_hello & srun -n 14 --exclusive ./mpi_greeting & srun -n 14 --exclusive ./mpi_hi & wait Just like for the multiple serial jobs simultaneously example, you need to add wait to make sure the batch job will not finish when the first of the jobs in it finishes. Exercise: multiple parallel jobs simultaneously When you have compiled the needed programs, as mentioned above, submit with sbatch multiple-parallel-simultaneous.sh","title":"Multiple parallel jobs simultaneously"},{"location":"simple/#compiling__and__running__in__the__batch__job","text":"Sometimes you have a program that takes a long time to compile, or that you need to recompile before each run. To see a simple example of compiling and running from the batch job, look at the batch script compile-run.sh . In this case it compiles and runs the mpi_hello.c program. compile-run.sh #!/bin/bash # CHANGE THE PROJECT ID TO YOUR OWN PROJECT ID AFTER THE COURSE! #SBATCH -A hpc2n2024-084 #Name the job, for easier finding in the list #SBATCH -J compiler-run #SBATCH -t 00:10:00 #SBATCH -n 12 ml purge > /dev/null 2>&1 ml foss/2022b mpicc mpi_hello.c -o mpi_hello mpirun ./mpi_hello Exercise: compile and run in a batch job This batch script can be submitted directly, without compiling anything first, as that happens in the batch script. Try submitting it with sbatch and see what happens. Which files are created? You could try changing the program it compiles and runs to a different one. 
Remember to change the compiler if you are not using an MPI program.","title":"Compiling and running in the batch job"},{"location":"simple/#getting__errors__and__outputs__in__separate__files","text":"As a default, Slurm throws both errors and other output to the same file, named slurm-JOBID.out . If you want the errors and other output to separate files, you can do as in the example separate-err-out.sh : #!/bin/bash # Remember to change this to your own Project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH -n 8 #SBATCH --time=00:05:00 # Putting the output in a separate output file and the errors in an # error file instead of putting it all in slurm-JOBID.out # Note the filename pattern %J, which is replaced by the job ID. It is handy to # avoid naming the files the same for different runs, and thus overwriting them. #SBATCH --error=job.%J.err #SBATCH --output=job.%J.out ml purge > /dev/null 2>&1 ml foss/2022b mpirun ./mpi_hello You need the mpi_hello.c file compiled (and the executable named mpi_hello ) for this to run without changes. Of course, you can also just add your own programs. Exercise: errors and outputs in separate files Compile the file mpi_hello.c after loading the module foss/2022b . Submit the job script with sbatch . See that separate output and error files are created.","title":"Getting errors and outputs in separate files"},{"location":"simple/#cudagpu__programs","text":"To run programs/software that uses GPUs, you need to allocate GPUs in the job script. They will not be allocated by your program. To compile a CUDA program, like hello-world.cu you need to load a toolchain containing CUDA compilers/load CUDA compilers. To run a piece of software that uses GPUs, you need to load a module version which is GPU aware. In many cases there are several versions of a module, only some of which are for running on GPUs. Important Remember to check the modules, versions, and prerequisites! Also make sure you check for the correct node type. 
Some of the GPUs are on Intel nodes (check modules on kebnekaise.hpc2n.umu.se ), some on AMD nodes (check modules on kebnekaise-amd.hpc2n.umu.se ).","title":"CUDA/GPU programs"},{"location":"simple/#v100__-__intel__skylake","text":"This example runs a small CUDA code. We recommend fosscuda/2020b (contains GCC , OpenMPI , OpenBLAS / LAPACK , FFTW , ScaLAPACK , and CUDA ) or intelcuda/2019a (contains icc , ifort , IntelMPI , IntelMKL , and CUDA ) Sample batch script gpu-skylake.sh #!/bin/bash # This job script is for running on 1 V100 GPU. # Remember to change this to your own project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=v100:1 ml purge > /dev/null 2>&1 ml fosscuda/2020b nvcc hello-world.cu -o hello ./hello The batch script gpu.sh compiles and runs a small CUDA program called hello-world.cu . Exercise: V100 GPU job To submit it, just do: sbatch gpu.sh Use squeue --me or scontrol show job JOBID to see that the job runs in the correct partition/node types.","title":"V100 - Intel Skylake"},{"location":"simple/#a100__-__amd__zen3","text":"Remember, in order to find the correct modules, as well as compile a program if you need that, you must log in to one of the AMD login nodes with either SSH ( kebnekaise-amd.hpc2n.umu.se ) or ThinLinc ( kebnekaise-amd-tl.kebnekaise.hpc2n.umu.se ). The job can be submitted from the regular login node, though. Exercise: login to the AMD login node and find a suitable module If you are logged in to the regular Kebnekaise login node, the easiest way to log in to the AMD login node is to type this in a terminal window: ssh kebnekaise-amd.hpc2n.umu.se After that, you check for a suitable CUDA toolchain: ml spider CUDA . You can then load it (here CUDA/11.7.0 ) and use nvcc to compile the program hello-world.cu : ml CUDA/11.7.0 nvcc hello-world.cu -o hello Now log out from the AMD login node again. The batch script gpu-a100.sh compiles and runs a small CUDA program called hello-world.cu . 
Sample A100 GPU job script: gpu-a100.sh #!/bin/bash # Remember to change this to your own project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=a100:1 ml purge > /dev/null 2>&1 ml CUDA/11.7.0 nvcc hello-world.cu -o hello ./hello Exercise: A100 GPU batch jobs The above script is found in the same directory as the other exercises ( intro-course/exercises/simple ). You can submit it directly: sbatch gpu-a100.sh As with the V100 job, you are encouraged to use squeue --me and/or scontrol show job JOBID to see that the job gets the correct partition/node type allocated.","title":"A100 - AMD Zen3"},{"location":"simple/#a40__-__intel__broadwell","text":"Kebnekaise also has a few of the A40 GPUs. These are placed on Intel broadwell nodes. In order to run on these, you add this to your batch script: #SBATCH --gpus=a40:number where number is 1 or 2 (the number of GPU cards). You can find the available modules on the regular login node, kebnekaise.hpc2n.umu.se .","title":"A40 - Intel broadwell"},{"location":"simple/#l40s__-__amd__zen4","text":"Since these GPUs are located on AMD Zen4 nodes, you need to log in to kebnekaise-amd.hpc2n.umu.se to check available modules. Then, to ask for these nodes in your batch script, you add: #SBATCH --gpus=l40s:number where number is 1 or 2 (the number of GPU cards).","title":"L40s - AMD Zen4"},{"location":"simple/#h100__-__amd__zen4","text":"The H100 GPUs are located on AMD Zen4 nodes. You can find the available modules by logging in to kebnekaise-amd.hpc2n.umu.se . You ask for these GPUs in your batch script by adding: #SBATCH --gpus=h100:number where number is 1, 2, 3, or 4 (the number of GPU cards you want to allocate).","title":"H100 - AMD Zen4"},{"location":"simple/#a6000__-__amd__zen4","text":"The A6000 GPUs are placed on AMD Zen4 nodes. That means you can find the available modules by logging in to kebnekaise-amd.hpc2n.umu.se . 
To run on these GPUs, add this to your batch script: #SBATCH --gpus=a6000:number where number is 1 or 2 (the number of GPU cards you want to allocate).","title":"A6000 - AMD Zen4"},{"location":"simple/#mi100__-__amd__zen3","text":"The MI100 GPUs are located on AMD Zen3 nodes. You can find the available modules by logging in to kebnekaise-amd.hpc2n.umu.se . To allocate MI100 GPUs, add this to your batch script: #SBATCH --gpus=mi100:number where number is 1 or 2 (the number of GPU cards).","title":"MI100 - AMD Zen3"},{"location":"simple/#gpu__features","text":"Sample batch script for allocating any AMD GPU #!/bin/bash # Remember to change this to your own project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=1 #SBATCH -C amd_gpu ml purge > /dev/null 2>&1 ml CUDA/11.7.0 ./myGPUcode Sample batch script for allocating any Nvidia GPU #!/bin/bash # Remember to change this to your own project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=1 #SBATCH -C nvidia_gpu ml purge > /dev/null 2>&1 ml CUDA/11.7.0 ./myGPUcode Sample batch script for allocating any Nvidia GPU on Intel node #!/bin/bash # Remember to change this to your own project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=1 #SBATCH -C 'nvidia_gpu&intel_cpu' ml purge > /dev/null 2>&1 ml CUDA/11.7.0 ./myGPUcode Sample batch script for allocating any GPU with AI features and on a Zen node #!/bin/bash # Remember to change this to your own project ID after the course! #SBATCH -A hpc2n2024-084 #SBATCH --time=00:05:00 #SBATCH --gpus=1 #SBATCH -C ''zen3|zen4'&GPU_AI' ml purge > /dev/null 2>&1 ml CUDA/11.7.0 ./myGPUcode Exercise: GPU features In order to run these examples, you can change ./myGPUcode to nvcc hello-world.cu -o hello ./hello or any other GPU program of your choice. 
The gpu-features.sh example script in the exercises/simple directory is prepared for the \u201cany GPU with AI features and on a Zen node\u201d. You can either run it as is, or make changes to it and try any of the other combinations here (or try new combinations yourself). Check with squeue --me which partition/node type the job ends up in, and that it fits. More information can be found with scontrol show job JOBID .","title":"GPU features"},{"location":"simple/#starting__jupyterlab","text":"On Kebnekaise, it is possible to run JupyterLab. This is done through a batch job, and is described in detail on our \u201cJupyter on Kebnekaise\u201d documentation .","title":"Starting JupyterLab"},{"location":"simple/#keypoints","text":"Keypoints How to run serial, MPI, OpenMP, and GPU jobs How to use GPU features How to run several jobs from inside one batch job","title":"Keypoints"},{"location":"software/","text":"Application examples \u00b6 Create a soft-link to your storage project It will be very convenient to create a soft-link to your storage project in your home directory for faster navigation: cd $HOME ln -s /proj/nobackup/hpc2n202X-XYZ choose-a-name Monitoring the use of resources Most likely you will allocate many cores and many GPUs for your simulations. You can monitor the use of these resources with the job-usage job_ID command, where job_ID is the output number of the sbatch command. You can also see this number if you type squeue -u my-username . job-usage outputs a URL that you can copy/paste in your local browser where you can see how resources are being used: Matlab \u00b6 How to find Matlab \u00b6 Matlab is available through the Menu bar if you are using ThinLinc client (recommended). Additionally, you can load a Matlab module on a Linux terminal on Kebnekaise. Details for these two options can be found here . 
First time configuration \u00b6 The first time you access Matlab on Kebnekaise, you need to configure it by following these guidelines Configuring Matlab . After configuring the cluster, it is a good practice to validate the cluster (HOME -> Parallel -> Create and Manage Clusters): Notice that it is recommended to use a small number of workers for the validation, in this case 4. Tools for efficient simulations \u00b6 Flow chart for a more efficient Matlab code using existing tools (adapted from 1 ) MATLAB on GPUs Notice that MATLAB currently supports only NVIDIA GPUs (v100,a40,a6000,a100,l40s,h100), with v100 and l40s being the most abundant (10 nodes each). Use MATLAB for lightweight tasks on the login nodes Remember that login nodes are used by many users and if you run heavy jobs there, you will interfere with their work. Exercises \u00b6 Exercise 1: Matlab serial job The folder SERIAL contains a function funct.m which performs an FFT on a matrix. The execution time is obtained with tic/toc and written down in the output file called log.out . Run the function by using the MATLAB GUI with the help of the script submit.m . As an alternative, you can submit the job via a batch script job.sh . Here, you will need to fix the Project_ID with the one provided for the present course and the Matlab version. Exercise 2: Matlab parallel job PARFOR folder contains an example of a parallelized loop with the \u201cparfor\u201d directive. A pause() function is included in the loop to make it heavy. This function can be submitted to the queue by running the script submit.m in the MATLAB GUI. The number of workers can be set by replacing the string FIXME (in the \u201csubmit.m\u201d file) with the number you desire. Try different values for the number of workers from 1 to 10 and take a note of the simulation time output at the end of the simulation. Where does the code achieve its peak performance? 
SPMD folder presents an example of a parallelized code using SPMD paradigm. Submit this job to the queue through the MATLAB GUI. This example illustrates the use of parpool to run parallel code in a more interactive manner. Exercise 3: Matlab GPU job GPU folder contains a test case that computes a Mandelbrot set both on CPU mandelcpu.m and on GPU mandelgpu.m . You can submit the jobs through the MATLAB GUI using the submitcpu.m and submitgpu.m files. The final output if everything ran well are two .png figures which display the timings for both architectures. Use the \u201ceom\u201d command on the terminal to visualize the images (eom out-X.png) R \u00b6 How to find R \u00b6 Similar to Matlab, R is available through the Menu bar if you are using ThinLinc client (recommended). Additionally, you can load an R module on a Linux terminal on Kebnekaise. Details for these two options can be found here . First time configuration \u00b6 The first time you access R on Kebnekaise, you need to configure it by following the Preparations step. Recommendations \u00b6 Be aware of data duplication in R Some parallel functions, mclapply in this example, tend to replicate the data for the workers (cores) if the dataframe is modified by them. This can be crucial if you are working with a large data frame and you are employing several parallel functions, for instance during the training of machine learning models, because your simulation could easily exceed the available memory per node. 
library ( parallel ) library ( pryr ) prev <- mem_used () print ( paste ( \"Memory initially allocated by R:\" , prev/1e6, \"MB\" )) # Define a relatively large dataframe data_df <- data.frame ( ID = seq ( 1 , 1e7 ) , Value = runif ( 1e7 ) ) # Create a function to be applied to each row (or a subset of rows) process_function <- function ( i, df ) { # do some modification to the i-th row return ( df $Value [ i ] * 2 ) } prev <- mem_used () - prev print ( paste ( \"Memory after the serial code execution:\" , prev/1e6, \"MB\" )) # Use mclapply to process the dataframe in parallel num_cores <- 4 results <- mclapply ( 1 :nrow ( data_df ) , function ( i ) process_function ( i, data_df ) , mc.cores = num_cores ) prev <- mem_used () - prev print ( paste ( \"Memory after parallel code execution:\" , prev/1e6, \"MB\" )) In this example mem-dup.R , I used the function mem_used() provided by the pryr package to monitor the memory usage. The batch script for this example is job.sh . One possible solution for data duplication could be to use a data frame for each worker that includes only the relevant data for that particular computation. Use R for lightweight tasks on the login nodes Remember that login nodes are used by many users and if you run heavy jobs there, you will interfere with their work. Exercises \u00b6 Requirements Prior to running the examples, you will need to install several packages. Follow these instructions : The packages needed are: For this R version (check if they are not already installed) ml GCC/10.2.0 OpenMPI/4.0.5 R/4.0.4 Rmpi doParallel caret MASS klaR nnet e1071 rpart mlbench parallel Exercise 1: R serial job In the SERIAL folder, a serial script is provided. Submit the script job.sh with the command R CMD and also with Rscript . Where could it be more suitable to use Rscript over R CMD ? Why do we need the flag #SBATCH -C \u2018skylake\u2019 in the batch script? 
Exercise 2: Job Arrays JOB-ARRAYS folder shows an example for job arrays, the batch file is job.sh . Submit the script and notice what is written in the output files. Could you use job arrays in your simulations if you need to run many simulations where some parameters are changed? As an example, imagine that you need to run 28 simulations where a single parameter, such as the temperature, is changed from 2 to 56 C. Could you use the variable task_id in the previous script to get that range of temperatures so that each simulation prints out a different temperature? Exercise 3: Parallel jobs with Rmpi In the folder RMPI , you can find the R script Rmpi.R which uses 5 MPI slaves to apply the runif() function on an array \u201cc\u201d. The submit file is job_Rmpi.sh . As a result, you will see the random numbers generated by the slaves in the slurm output file Exercise 4: Parallel jobs with doParallel The folder DOPARALLEL contains two examples: doParallel.R shows how to use the foreach function in sequential mode (1 core) and the parallel mode using 4 cores. What is the difference in the usage of foreach for these two modes? Submit the job_doParallel.sh script and compare the timings of the sequential and parallel codes. How many workers are allocated for this simulation? If you want to allocate more or less, what changes must be made to these files? doParallel_ML.R presents the evaluation of several ML models in both sequential and parallel modes using the standard \u201ciris\u201d database. The difference is basically in the use of %dopar% instead of %do% function. Submit the batch script job_doParallel_ML.sh to the queue. In the output file observe the resulting elapsed times for the sequential and the 4 cores parallel simulation. Upon submitting the job to the queue you will get a number called job ID. Use the command: job-usage job_ID to obtain a URL which you can copy/paste in your local browser. Tip: refresh your browser several times to get the statistics. 
Can you see how the CPU is used? What about the memory? Note 1: In order to run this exercise, you need to have all the packages listed at the beginning of this document installed. Note 2: If you want to try a different number of cores for running the scripts, you should change that number in both the .R and .sh scripts Exercise 5: Machine Learning jobs In the folder ML we show a ML model using a sonar database and Random Forest as the training method ( Rscript.R ). The simulations are done both in serial and parallel modes. You may change the values for the number of cores (1 in the present case) to other values. Notice that the number of cores needs to be the same in the files job.sh and Rscript.R . Try a different number of cores and monitor the timings which are reported at the end of the output file. Alphafold \u00b6 How to find Alphafold \u00b6 Alphafold is installed as a module. Notice that on the Intel nodes there are more versions of Alphafold installed than on the AMD nodes. Thus, if you are targeting one version that is only installed on the Intel nodes, you will need to add the instruction #SBATCH -C skylake to your batch script, otherwise the job could end up on an AMD node that lacks that installation. Exercises \u00b6 Exercise 1: Running a monomer protein simulation In the folder ALPHAFOLD you will find a fasta sequence for a monomer and the corresponding batch file job.sh for running the simulation on GPUs. Try running the simulation with CPUs only and then with l40s, v100 and a100 GPUs. Notice that the simulation will take ~1 hour, so the purpose of this exercise is only to check that the simulation starts running correctly. CryoSPARC \u00b6 How to find CryoSPARC \u00b6 The version 4.5.3 of CryoSPARC is installed as a module. First time configuration \u00b6 One needs a license for using this software. For academic purposes a free of charge license can be requested at the website cryosparc.com (one working day for the processing). 
Once you obtain your license ID, copy it, create a file called /home/u/username/.cryosparc-license and paste it in the first line of this file. In the second line of the file write your email address. Using CryoSPARC on Kebnekaise \u00b6 Create a suitable folder in your project directory, for instance /proj/nobackup/hpc2n202X-XYZ/cryosparc and move into this folder. Download/copy the lane*tar files that are located here to the cryosparc folder and untar them here ( tar -xvf lane_CPU.tar as an example). Fix your Project_ID and time Change the string Project_ID in the file lane*/cluster_script.sh to reflect your current project. Also, the time was set to 20 min. in these files but for your realistic simulations you can change it to longer times ( -t 00:20:00 ). The lanes should be recognized by CryoSPARC when it starts running. Load the CryoSPARC modules. Start CryoSPARC and accept the request which asks about continuing using cryostart and that the folder was not used before. List the users on the server (which should be only yourself for this type of license), check the email address that is displayed for this user (it should be the one you added in the license file) and reset the password. These steps are summarized here: $cryosparc start ... Do you wish to continue starting cryosparc? [ yN ] : y ... CryoSPARC master started. From this machine, access CryoSPARC and CryoSPARC Live at http://localhost:39007 ... $cryosparc listusers cryosparc resetpassword --email \"myemail@mail.com\" --password \"choose-a-password\" Copy and paste the line which has the localhost port (notice that port number can change) to a browser on Kebnekaise: After logging in, you will be able to see the CryoSPARC dashboard: There are several tutorials at the CryoSPARC website, in the previous picture I followed the Introductory Tutorial (v4.0+) . 
Use cryosparc instead of cryosparcm On Kebnekaise the command cryosparc should be used and not the one cited in the tutorial cryosparcm Depending on the job type, CryoSPARC would suggest the hardware resources. For instance, in the tutorial above Step 4: Import Movies suggests using 1 CPU upon queueing it, but Step 5: Motion Correction suggests using 1 GPU. For CPU-only jobs you can choose the CPU lane, and if your job uses GPUs you can choose among L40s, V100, A100, and H100. Notice that the V100 and L40s are the most abundant at the moment: Keypoints The software on Kebnekaise is mostly accessed through the module system. References \u00b6 MathWorks documentation on Parallel Computing \u21a9","title":"Application examples"},{"location":"software/#application__examples","text":"Create a soft-link to your storage project It will be very convinient to create a soft-link to your storage project in your home directory for a faster navigation: cd $HOME ln -s /proj/nobackup/hpc2n202X-XYZ choose-a-name Monitoring the use of resources Most likely you will allocate many cores and many GPUs for your simulations. You can monitor the use of these resources with the job-usage job_ID command, where job_ID is the output number of the sbatch command. You can also see this number if you type squeue -u my-username . job-usage outputs a url that you can copy/paste in your local browser where you can see how resources are being used:","title":"Application examples"},{"location":"software/#matlab","text":"","title":"Matlab"},{"location":"software/#how__to__find__matlab","text":"Matlab is available through the Menu bar if you are using ThinLinc client (recommended). Additionally, you can load a Matlab module on a Linux terminal on Kebnekaise. 
Details for these two options can be found here .","title":"How to find Matlab"},{"location":"software/#first__time__configuration","text":"The first time you access Matlab on Kebnekaise, you need to configure it by following these guidelines Configuring Matlab . After configuring the cluster, it is a good practice to validate the cluster (HOME -> Parallel -> Create and Manage Clusters): Notice that it is recommended to use a small number of workers for the validation, in this case 4.","title":"First time configuration"},{"location":"software/#tools__for__efficient__simulations","text":"Chart flow for a more efficient Matlab code using existing tools (adapted from 1 ) MATLAB on GPUs Notice that MATLAB currently supports only NVIDIA GPUs (v100,a40,a6000,a100,l40s,h100), with v100 and l40s being the most abundant (10 nodes each). Use MATLAB for lightweight tasks on the login nodes Remember that login nodes are used by many users and if you run heavy jobs there, you will interfere with the workflow of them.","title":"Tools for efficient simulations"},{"location":"software/#exercises","text":"Exercise 1: Matlab serial job The folder SERIAL contains a function funct.m which performs a FFT on a matrix. The execution time is obtained with tic/toc and written down in the output file called log.out . Run the function by using the MATLAB GUI with the help of the script submit.m . As an alternative, you can submit the job via a batch script job.sh . Here, you will need to fix the Project_ID with the one provided for the present course and the Matlab version. Exercise 2: Matlab parallel job PARFOR folder contains an example of a parallelized loop with the \u201cparfor\u201d directive. A pause() function is included in the loop to make it heavy. This function can be submitted to the queue by running the script submit.m in the MATLAB GUI. The number of workers can be set by replacing the string FIXME (in the \u201csubmit.m\u201d file) with the number you desire. 
Try different values for the number of workers from 1 to 10 and take a note of the simulation time output at the end of the simulation. Where does the code achieve its peak performance? SPMD folder presents an example of a parallelized code using SPMD paradigm. Submit this job to the queue through the MATLAB GUI. This example illustrates the use of parpool to run parallel code in a more interactive manner. Exercise 3: Matlab GPU job GPU folder contains a test case that computes a Mandelbrot set both on CPU mandelcpu.m and on GPU mandelgpu.m . You can submit the jobs through the MATLAB GUI using the submitcpu.m and submitgpu.m files. The final output if everything ran well are two .png figures which display the timings for both architectures. Use the \u201ceom\u201d command on the terminal to visualize the images (eom out-X.png)","title":"Exercises"},{"location":"software/#r","text":"","title":"R"},{"location":"software/#how__to__find__r","text":"Similar to Matlab, R is available through the Menu bar if you are using ThinLinc client (recommended). Additionally, you can load a Matlab module on a Linux terminal on Kebnekaise. Details for these two options can be found here .","title":"How to find R"},{"location":"software/#first__time__configuration_1","text":"The first time you access R on Kebnekaise, you need to configure it by following the Preparations step.","title":"First time configuration"},{"location":"software/#recommendations","text":"Be aware of data duplication in R Some parallel functions mcapply in this example, tend to replicate the data for the workers (cores) if the dataframe is modified by them. This can be crucial if you are working with a large data frame and you are employing several parallel functions, for instance during the training of machine learning models because your simulation could easily exceed the available memory per node. 
library ( parallel ) library ( pryr ) prev <- mem_used () print ( paste ( \"Memory initially allocated by R:\" , prev/1e6, \"MB\" )) # Define a relatively large dataframe data_df <- data.frame ( ID = seq ( 1 , 1e7 ) , Value = runif ( 1e7 ) ) # Create a function to be applied to each row (or a subset of rows) process_function <- function ( i, df ) { # do some modification the i-th row return ( df $Value [ i ] * 2 ) } prev <- mem_used () - prev print ( paste ( \"Memory after the serial code execution:\" , prev/1e6, \"MB\" )) # Use mclapply to process the dataframe in parallel num_cores <- 4 results <- mclapply ( 1 :nrow ( data_df ) , function ( i ) process_function ( i, data_df ) , mc.cores = num_cores ) prev <- mem_used () - prev print ( paste ( \"Memory after parallel code execution:\" , prev/1e6, \"MB\" )) In this example mem-dup.R , I used the function mem_used() provided by the pryr package to monitor the memory usage. The batch script for this example is job.sh . One possible solution for data duplication could be to use use a data frame for each worker that includes only the relevant data for that particular computation. Use R for lightweight tasks on the login nodes Remember that login nodes are used by many users and if you run heavy jobs there, you will interfere with the workflow of them.","title":"Recommendations"},{"location":"software/#exercises_1","text":"Requirements Prior to running the examples, you will need to install several packages. Follow these instructions : The packages needed are: For this R version (check if they are not already installed) ml GCC/10.2.0 OpenMPI/4.0.5 R/4.0.4 Rmpi doParallel caret MASS klaR nnet e1071 rpart mlbench parallel Exercise 1: R serial job In the SERIAL folder, a serial is provided. Submit the script job.sh with the command R CMD and also with Rscript . Where could it be more suitable to use Rscript over R CMD ? Why do we need the flag #SBATCH -C \u2018skylake\u2019 in the batch script? 
Exercise 2: Job Arrays JOB-ARRAYS folder shows an example for job arrays, the batch file is job.sh . Submit the script and notice what is written in the output files. Could you use job arrays in your simulations if you need to run many simulations where some parameters are changed? As an example, imagine that you need to run 28 simulations where a single parameter, such as the temperature, is changed from 2 to 56 C. Could you use the variable task_id in the previous script to get that range of temperatures so that each simulation prints out a different temperature? Exercise 3: Parallel jobs with Rmpi In the folder RMPI , you can find the R script Rmpi.R which uses 5 MPI slaves to apply the runif() function on an array \u201cc\u201d. The submit file is job_Rmpi.sh . As a result, you will see the random numbers generated by the slaves in the slurm output file Exercise 4: Parallel jobs with doParallel The folder DOPARALLEL contains two examples: doParallel.R shows how to use the foreach function in sequential mode (1 core) and the parallel mode using 4 cores. What is the difference in the usage of foreach for these two modes? Submit the job_doParallel.sh script and compare the timings of the sequential and parallel codes. How many workers are allocated for this simulation? If you want to allocate more or less, what changes must be made to these files? doParallel_ML.R presents the evaluation of several ML models in both sequential and parallel modes using the standard \u201ciris\u201d database. The difference is basically in the use of %dopar% instead of %do% function. Submit the batch script job_doParallel_ML.sh to the queue. In the output file observe the resulting elapsed times for the sequential and the 4 cores parallel simulation. Upon submitting the job to the queue you will get a number called job ID. Use the command: job-usage job_ID to obtain a URL which you can copy/paste in your local browser. Tip: refresh your browser several times to get the statistics. 
Can you see how the CPU is used? What about the memory? Note 1: In order to run this exercise, you need to have all the packages listed at the beginning of this document installed. Note 2: If you want to try a different number of cores for running the scripts, you should change that number in both the .R and .sh scripts Exercise 5: Machine Learning jobs In the folder ML we show a ML model using a sonar database and Random Forest as the training method ( Rscript.R ). The simulations are done both in serial and parallel modes. You may change the values for the number of cores (1 in the present case) to other values. Notice that the number of cores needs to be the same in the files job.sh and Rscript.R . Try a different number of cores and monitor the timings which are reported at the end of the output file.","title":"Exercises"},{"location":"software/#alphafold","text":"","title":"Alphafold"},{"location":"software/#how__to__find__alphafold","text":"Alphafold is installed as a module. Notice that on the Intel nodes there are more versions of Alphafold installed than on the AMD nodes. Thus, if you are targeting one version that is only installed on the Intel nodes, you will need to add the instruction #SBATCH -C skylake to your batch script, otherwise the job could arrive to an AMD node that lacks that installation.","title":"How to find Alphafold"},{"location":"software/#exercises_2","text":"Exercise 1: Running a monomer protein simulation In the folder ALPHAFOLD you will find a fasta secuence for a monomer and the corresponding batch file job.sh for running the simulation on GPUs. Try running the simulation with CPUs only and then with l40s, v100 and a100 GPUs. Notice that the simulation will take ~1hrs. 
so the purpose of this exercise is to know if the simulation starts running well only.","title":"Exercises"},{"location":"software/#cryosparc","text":"","title":"CryoSPARC"},{"location":"software/#how__to__find__cryosparc","text":"The version 4.5.3 of CryoSPARC is installed as a module.","title":"How to find CryoSPARC"},{"location":"software/#first__time__configuration_2","text":"One needs a license for using this software. For academic purposes a free of charge license can be requested at the website cryosparc.com (one working day for the processing). Once you obtain your license ID copy it, create a file called /home/u/username/.cryosparc-license and paste it in the first line of this file. In the second line of the file write your email address.","title":"First time configuration"},{"location":"software/#using__cryosparc__on__kebnekaise","text":"Create a suitable folder in your project directory, for instance /proj/nobackup/hpc2n202X-XYZ/cryosparc and move into this folder. Download/copy the lane*tar files that are located here to the cryosparc folder and untar them here ( tar -xvf lane_CPU.tar as an example). Fix your Project_ID and time Change the string Project_ID in the file lane*/cluster_script.sh to reflect your current project. Also, the time was set to 20 min. in these files but for your realistic simulations you can change it to longer times ( -t 00:20:00 ). The lanes should be recognized by CryoSPARC when it starts running. Load the CryoSPARC modules. Start CryoSPARC and accept the request which asks about continuing using cryostart and that the folder was not used before. List the users on the server (which should be only yourself for this type of license), check the email address that is displayed for this user (it should be the one you added in the license file) and reset the password to. These steps are summarized here: $cryosparc start ... Do you wish to continue starting cryosparc? [ yN ] : y ... CryoSPARC master started. 
From this machine, access CryoSPARC and CryoSPARC Live at http://localhost:39007 ... $cryosparc listusers cryosparc resetpassword --email \"myemail@mail.com\" --password \"choose-a-password\" Copy and paste the line which has the localhost port (notice that port number can change) to a browser on Kebnekaise: After loging in, you will be able to see the CryoSPARC\u2019s dashboard: There are several tutorials at the CryoSPARC website, in the previous picture I followed the Introductory Tutorial (v4.0+) . Use cryosparc instead of cryosparcm On Kebnekaise the command cryosparc should be used and not the one cited in the tutorial cryosparcm Depending on the job type, CryoSPARC would suggest the hardware resources. For instance, in the tutorial above Step 4: Import Movies suggests using 1 CPU upon queueing it, but Step 5: Motion Correction suggests using 1 GPU. For CPU-only jobs you can choose the CPU lane, and if your job uses GPUs you can choose among L40s, V100, A100, and H100. Notice that the V100 and L40s are the most abundant at the moment: Keypoints The software on Kebnekaise is mostly accessed through the module system.","title":"Using CryoSPARC on Kebnekaise"},{"location":"software/#references","text":"MathWorks documentation on Parallel Computing \u21a9","title":"References"}]} \ No newline at end of file diff --git a/search/worker.js b/search/worker.js new file mode 100644 index 00000000..8628dbce --- /dev/null +++ b/search/worker.js @@ -0,0 +1,133 @@ +var base_path = 'function' === typeof importScripts ? '.' 
: '/search/'; +var allowSearch = false; +var index; +var documents = {}; +var lang = ['en']; +var data; + +function getScript(script, callback) { + console.log('Loading script: ' + script); + $.getScript(base_path + script).done(function () { + callback(); + }).fail(function (jqxhr, settings, exception) { + console.log('Error: ' + exception); + }); +} + +function getScriptsInOrder(scripts, callback) { + if (scripts.length === 0) { + callback(); + return; + } + getScript(scripts[0], function() { + getScriptsInOrder(scripts.slice(1), callback); + }); +} + +function loadScripts(urls, callback) { + if( 'function' === typeof importScripts ) { + importScripts.apply(null, urls); + callback(); + } else { + getScriptsInOrder(urls, callback); + } +} + +function onJSONLoaded () { + data = JSON.parse(this.responseText); + var scriptsToLoad = ['lunr.js']; + if (data.config && data.config.lang && data.config.lang.length) { + lang = data.config.lang; + } + if (lang.length > 1 || lang[0] !== "en") { + scriptsToLoad.push('lunr.stemmer.support.js'); + if (lang.length > 1) { + scriptsToLoad.push('lunr.multi.js'); + } + if (lang.includes("ja") || lang.includes("jp")) { + scriptsToLoad.push('tinyseg.js'); + } + for (var i=0; i < lang.length; i++) { + if (lang[i] != 'en') { + scriptsToLoad.push(['lunr', lang[i], 'js'].join('.')); + } + } + } + loadScripts(scriptsToLoad, onScriptsLoaded); +} + +function onScriptsLoaded () { + console.log('All search scripts loaded, building Lunr index...'); + if (data.config && data.config.separator && data.config.separator.length) { + lunr.tokenizer.separator = new RegExp(data.config.separator); + } + + if (data.index) { + index = lunr.Index.load(data.index); + data.docs.forEach(function (doc) { + documents[doc.location] = doc; + }); + console.log('Lunr pre-built index loaded, search ready'); + } else { + index = lunr(function () { + if (lang.length === 1 && lang[0] !== "en" && lunr[lang[0]]) { + this.use(lunr[lang[0]]); + } else if (lang.length > 1) { 
+ this.use(lunr.multiLanguage.apply(null, lang)); // spread operator not supported in all browsers: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Spread_operator#Browser_compatibility + } + this.field('title'); + this.field('text'); + this.ref('location'); + + for (var i=0; i < data.docs.length; i++) { + var doc = data.docs[i]; + this.add(doc); + documents[doc.location] = doc; + } + }); + console.log('Lunr index built, search ready'); + } + allowSearch = true; + postMessage({config: data.config}); + postMessage({allowSearch: allowSearch}); +} + +function init () { + var oReq = new XMLHttpRequest(); + oReq.addEventListener("load", onJSONLoaded); + var index_path = base_path + '/search_index.json'; + if( 'function' === typeof importScripts ){ + index_path = 'search_index.json'; + } + oReq.open("GET", index_path); + oReq.send(); +} + +function search (query) { + if (!allowSearch) { + console.error('Assets for search still loading'); + return; + } + + var resultDocuments = []; + var results = index.search(query); + for (var i=0; i < results.length; i++){ + var result = results[i]; + doc = documents[result.ref]; + doc.summary = doc.text.substring(0, 200); + resultDocuments.push(doc); + } + return resultDocuments; +} + +if( 'function' === typeof importScripts ) { + onmessage = function (e) { + if (e.data.init) { + init(); + } else if (e.data.query) { + postMessage({ results: search(e.data.query) }); + } else { + console.error("Worker - Unrecognized message: " + e); + } + }; +} diff --git a/simple/index.html b/simple/index.html new file mode 100644 index 00000000..16f55cd7 --- /dev/null +++ b/simple/index.html @@ -0,0 +1,678 @@ + + + + + + + +Objectives
+For consistency, I have given all the example batch scripts the suffix .sh
even though it is not required. Another commonly used suffix is .batch
, but any or none will work.
You need to compile any programs mentioned in a batch script in order to run the examples, except for compile-run.sh
and the CUDA
examples, which include compilation.
Important
+#SBATCH -A hpc2n2024-084
/proj/nobackup/intro-hpc2n
. Hint
+Try to change the C programs, add different programs, and in general play around with the examples!
+Note
+foss
compiler toolchain, version 2022b, unless otherwise specified. If you decide to use a different one, you will have to make changes to some of the batch scripts.
sbatch JOBSCRIPT
-o
tells the compiler you want to name the executable. If you don’t include that and a name, you will get an executable named a.out
. Of course, you do not have to name the executable hello
. This is just an example. In general, I have named all the executables the same as the program (without the suffix).
To compile a serial program, like hello.c
with gcc do:
Sample batch script (hello.sh)
+#!/bin/bash
+# Project id - change to your own after the course!
+#SBATCH -A hpc2n2024-084
+# Asking for 1 core
+#SBATCH -n 1
+# Asking for a walltime of 1 min
+#SBATCH --time=00:01:00
+
+# Purge modules before loading new ones in a script.
+ml purge > /dev/null 2>&1
+ml foss/2022b
+
+./hello
+
Exercise: serial job
+Submit the job with sbatch
. Check on it with squeue --me
. Take a look at the output (slurm-JOBID.out
) with nano
or your favourite editor.
To compile an MPI program, like mpi_hello.c
(and create an executable named mpi_hello
) with gcc, do:
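The same step for the MPI program, as a sketch. The mpicc line is the one that compile-run.sh on this page uses; the check around it is an addition, on the assumption that a missing mpicc usually means that no MPI toolchain module has been loaded.

```shell
# mpicc is the MPI compiler wrapper; it is only on PATH after an MPI
# toolchain (for example foss/2022b) has been loaded.
if ! command -v mpicc >/dev/null 2>&1; then
    echo "mpicc not found: load an MPI toolchain first, e.g. ml foss/2022b" >&2
    compile_status=127
elif mpicc mpi_hello.c -o mpi_hello; then
    compile_status=0
else
    compile_status=1
fi
echo "compile exit status: $compile_status"
```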
Sample batch script (mpi_hello.sh)
+#!/bin/bash
+# Remember to change this to your own Project ID after the course!
+#SBATCH -A hpc2n2024-084
+# Number of tasks - default is 1 core per task
+#SBATCH -n 14
+#SBATCH --time=00:05:00
+
+# It is always a good idea to do ml purge before loading other modules
+ml purge > /dev/null 2>&1
+
+ml add foss/2022b
+
+# Use srun since this is an MPI program
+srun ./mpi_hello
+
Exercise: MPI job
+Submit the job with sbatch
. Check on it with squeue --me
. Take a look at the output (slurm-JOBID.out
) with nano
or your favourite editor. Try running it more than once to see that the order of the tasks is random.
To compile an OpenMP program, like omp_hello.c
(and create an executable named omp_hello
) with gcc, do:
Sample batch script (omp_hello.sh)
+#!/bin/bash
+#SBATCH -A hpc2n2024-084
+# Number of cores per task
+#SBATCH -c 28
+#SBATCH --time=00:05:00
+
+# It is always a good idea to do ml purge before loading other modules
+ml purge > /dev/null 2>&1
+
+ml add foss/2022b
+
+# Set OMP_NUM_THREADS to the same value as -c with a fallback in case it isn't set.
+# SLURM_CPUS_PER_TASK is set to the value of -c, but only if -c is explicitly set
+if [ -n "$SLURM_CPUS_PER_TASK" ]; then
+ omp_threads=$SLURM_CPUS_PER_TASK
+else
+ omp_threads=1
+fi
+export OMP_NUM_THREADS=$omp_threads
+
+./omp_hello
+
Exercise: OpenMP job
+Set OMP_NUM_THREADS
to some value between 1 and 28 (export OMP_NUM_THREADS=value
). Submit the job with sbatch
. Take a look at the output (slurm-JOBID.out
) with nano or your favourite editor. Change the value of OMP_NUM_THREADS
. Submit it again and check the output to see the change.
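The if/else fallback in omp_hello.sh can also be written with shell parameter expansion. A minimal sketch, where pick_threads is a hypothetical helper name used only for illustration; it behaves like the original test in that an unset or empty SLURM_CPUS_PER_TASK gives 1.

```shell
# Return the number of OpenMP threads to use:
# SLURM_CPUS_PER_TASK when it is set and non-empty, otherwise 1.
pick_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

export OMP_NUM_THREADS=$(pick_threads)
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Inside a job submitted with `-c 28` this sets 28 threads; on a login node, where the Slurm variable is normally not set, it falls back to 1.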
This submit file shows one way of running several programs from inside the same submit file.
+To run this example, you need to compile the following serial C programs:
+ +When the C programs have been compiled, submit the multiple-serial.sh
program:
All jobs run at the same time, so you need as many cores as they need combined. You also need to ask for long enough time that even the longest of the jobs will finish.
+Note that here you submit with srun
even though these are serial jobs. You use &
to send the job to the background. Also note the wait
at the end. If you leave it out, the batch script can reach its end while job steps are still running in the background, and they will then be terminated.
#!/bin/bash
+#SBATCH -A hpc2n2024-084
+# Add enough cores that all jobs can run at the same time
+#SBATCH -n 5
+# Make sure that the time is long enough that the longest job will have time to finish
+#SBATCH --time=00:05:00
+
+module purge > /dev/null 2>&1
+ml foss/2022b
+
+srun -n 1 --exclusive ./hello &
+srun -n 1 --exclusive ./Greeting &
+srun -n 1 --exclusive ./Adding2 10 20 &
+srun -n 1 --exclusive /bin/hostname &
+srun -n 1 --exclusive ./Mult2 10 2
+wait
+
Exercise: multiple serial jobs
+Compile the above mentioned programs. Submit the batch script with
+ +If you run it several times you will notice that the order is random.
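The effect of & and wait can be tried without Slurm. A minimal sketch, in which short sleep commands are hypothetical stand-ins for the `srun -n 1 --exclusive ./program &` lines and the result files exist only to show that every task got to finish:

```shell
outdir=$(mktemp -d)

# Start three background tasks; each sleeps briefly and then writes a file.
for name in hello Greeting Adding2; do
    ( sleep 1; echo "$name finished" > "$outdir/$name.out" ) &
done

# Without this, the script would reach its end while the tasks still sleep.
wait

# Count the result files that exist once wait has returned.
finished=$(( $(ls "$outdir" | wc -l) ))
echo "result files after wait: $finished"
```

Removing the wait line and counting the files immediately would normally show fewer than three, because the tasks have not finished yet.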
+Job arrays offer a mechanism for submitting and managing collections of similar jobs. All jobs must have the same initial options (e.g. size, time limit, etc.), however it is possible to change some of these options after the job has begun execution using the scontrol command specifying the JobID of the array or individual ArrayJobID.
+More information here on the official Slurm documentation pages.
+To try an example, we have included a small Python script hello-world-array.py
and a batch script hello-world-array.sh
. Both can also be found in the exercises/simple
directory you have cloned.
hello-world-array.py
+ +hello-world-array.sh
+#!/bin/bash
+# This is a very simple example of how to run a Python script with a job array
+#SBATCH -A hpc2n2024-084 # Change to your own after the course!
+#SBATCH --time=00:05:00 # Asking for 5 minutes
+#SBATCH --array=1-10 # how many tasks in the array
+#SBATCH -c 1                        # Asking for 1 core per task
+#SBATCH -o hello-world-%j-%a.out
+
+# Load any modules you need, here for Python 3.11.3
+ml GCC/12.3.0 Python/3.11.3
+
+# Run your Python script
+srun python hello-world-array.py $SLURM_ARRAY_TASK_ID
+
Exercise: job arrays
+Submit the batch script. Look at the output files. Change the number of tasks in the array. Rerun. See the change.
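One common use of the array index is to derive a different input for each task. A minimal sketch, assuming a made-up mapping (parameter = 2 times the index); param_for_task is a hypothetical helper, and the default of 1 exists only so the lines also run outside a Slurm job.

```shell
# Map an array index to a parameter value (hypothetical mapping: index * 2).
param_for_task() {
    echo $(( $1 * 2 ))
}

# Slurm sets SLURM_ARRAY_TASK_ID inside each array task; fall back to 1
# when the variable is not set, e.g. when testing on a login node.
task_id=${SLURM_ARRAY_TASK_ID:-1}
param=$(param_for_task "$task_id")
echo "array task $task_id runs with parameter $param"
```

In a batch script like hello-world-array.sh, the computed value could be passed to the program in place of the raw `$SLURM_ARRAY_TASK_ID`.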
+To run this example, you need to compile the following parallel C programs:
+ +When the MPI C programs have been compiled, submit the multiple-parallel-sequential.sh
program:
#!/bin/bash
+#SBATCH -A hpc2n2024-084
+# Since the files are run sequentially I only need enough cores for the largest of them to run
+#SBATCH -c 28
+# Remember to ask for enough time for all jobs to complete
+#SBATCH --time=00:10:00
+
+module purge > /dev/null 2>&1
+ml foss/2022b
+
+# Here 14 tasks with 2 cores per task. Output to file - not needed if your job creates output in a file directly
+# In this example I also copy the output somewhere else and then run another executable.
+
+srun -n 14 -c 2 ./mpi_hello > myoutput1 2>&1
+cp myoutput1 mydatadir
+srun -n 14 -c 2 ./mpi_greeting > myoutput2 2>&1
+cp myoutput2 mydatadir
+srun -n 14 -c 2 ./mpi_hi > myoutput3 2>&1
+cp myoutput3 mydatadir
+
Exercise: multiple parallel jobs sequentially
+Submit the job:
+ +See that the output is written to files and copied to the directory mydatadir
.
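The redirection and copy steps in multiple-parallel-sequential.sh can be tried without Slurm. A minimal sketch, where fake_step is a hypothetical stand-in for the srun lines. It also creates mydatadir first, on the assumption that the directory may not exist yet; in that case cp would create a regular file with that name instead.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create the target directory before copying into it.
mkdir -p mydatadir

# Hypothetical stand-in for `srun -n 14 -c 2 ./mpi_hello`:
# one line on stdout, one on stderr.
fake_step() {
    echo "output from step $1"
    echo "warning from step $1" >&2
}

for i in 1 2 3; do
    # '> file 2>&1' sends both stdout and stderr to the same file.
    fake_step "$i" > "myoutput$i" 2>&1
    cp "myoutput$i" mydatadir/
done

ls mydatadir
```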
To run this example, you need to compile the following parallel C programs:
+ +As before, we recommend using the foss/2022b
module for this. If you use a different one you need to change it in the multiple-parallel-simultaneous.sh
batch script.
When the MPI C programs have been compiled, submit the multiple-parallel-simultaneous.sh
program:
#!/bin/bash
+#SBATCH -A hpc2n2024-084
+# Since the files run simultaneously I need enough cores for all of them to run
+#SBATCH -n 56
+# Remember to ask for enough time for all jobs to complete
+#SBATCH --time=00:10:00
+
+module purge > /dev/null 2>&1
+ml foss/2022b
+
+srun -n 14 --exclusive ./mpi_hello &
+srun -n 14 --exclusive ./mpi_greeting &
+srun -n 14 --exclusive ./mpi_hi &
+wait
+
Just like for the multiple serial jobs simultaneously example, you need to add wait
to make sure the batch script does not exit while the background job steps are still running.
Exercise: multiple parallel jobs simultaneously
+When you have compiled the needed programs, as mentioned above, submit with
+ +Sometimes you have a program that takes a long time to compile, or that you need to recompile before each run. To see a simple example of compiling and running from the batch job, look at the batch script compile-run.sh
.
In this case it compiles and runs the mpi_hello.c
program.
compile-run.sh
+#!/bin/bash
+# CHANGE THE PROJECT ID TO YOUR OWN PROJECT ID AFTER THE COURSE!
+#SBATCH -A hpc2n2024-084
+#Name the job, for easier finding in the list
+#SBATCH -J compiler-run
+#SBATCH -t 00:10:00
+#SBATCH -n 12
+
+ml purge > /dev/null 2>&1
+
+ml foss/2022b
+
+mpicc mpi_hello.c -o mpi_hello
+mpirun ./mpi_hello
+
Exercise: compile and run in a batch job
+This batch script can be submitted directly, without compiling anything first, as that happens in the batch script. Try submitting it with sbatch
and see what happens. Which files are created? You could try changing the program it compiles and runs to a different one. Remember to change the compiler if you are not using an MPI program.
As a default, Slurm throws both errors and other output to the same file, named slurm-JOBID.out
. If you want the errors and other output to separate files, you can do as in the example separate-err-out.sh
:
#!/bin/bash
+# Remember to change this to your own Project ID after the course!
+#SBATCH -A hpc2n2024-084
+#SBATCH -n 8
+#SBATCH --time=00:05:00
+
+# Putting the output in a separate output file and the errors in an
+# error file instead of putting it all in slurm-JOBID.out
+# Note the filename pattern %J, which Slurm replaces with the job ID. It is handy to
+# avoid naming the files the same for different runs, and thus overwriting them.
+#SBATCH --error=job.%J.err
+#SBATCH --output=job.%J.out
+
+ml purge > /dev/null 2>&1
+ml foss/2022b
+
+mpirun ./mpi_hello
+
You need the mpi_hello.c
file compiled (and the executable named mpi_hello
) for this to run without changes. Of course, you can also just add your own programs.
Exercise: errors and outputs in separate files
+Compile the file mpi_hello.c
after loading the module foss/2022b
. Submit the job script with sbatch
. See that separate output and error files are created.
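What ends up in which file can be seen with plain shell redirection. A minimal sketch, where demo_program and the file names are hypothetical; Slurm applies the same split to the whole job when both --output and --error are given.

```shell
# A stand-in "program" that writes one line to stdout and one to stderr.
demo_program() {
    echo "normal output"
    echo "an error message" >&2
}

logdir=$(mktemp -d)

# '>' captures stdout, '2>' captures stderr.
demo_program > "$logdir/job.out" 2> "$logdir/job.err"

cat "$logdir/job.out"   # holds only the stdout line
cat "$logdir/job.err"   # holds only the stderr line
```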
To run programs/software that uses GPUs, you need to allocate GPUs in the job script. They will not be allocated by your program.
+To compile a CUDA program, like hello-world.cu
you need to load a toolchain that contains the CUDA compilers, or load a CUDA module directly.
To run a piece of software that uses GPUs, you need to load a module version which is GPU aware. In many cases there are several versions of a module, only some of which are for running on GPUs.
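Inside a job, one quick way to see whether GPUs were actually allocated is to look at CUDA_VISIBLE_DEVICES. This is a sketch under an assumption: that Slurm on the cluster exports this variable for jobs that were given Nvidia GPUs, which is common but not confirmed on this page. gpu_report is a hypothetical helper name.

```shell
# Report whether the job sees any Nvidia GPUs.
gpu_report() {
    if [ -n "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo "GPUs visible to this job: $CUDA_VISIBLE_DEVICES"
    else
        echo "No GPU allocated: add a --gpus line to the batch script"
    fi
}

gpu_report
```

Placed before the line that starts the program in a batch script, this leaves a note in the Slurm output file when the --gpus line was forgotten.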
+Important
+Remember to check the modules, versions, and prerequisites! Also make sure you check for the correct node type. Some of the GPUs are on Intel nodes (check modules on kebnekaise.hpc2n.umu.se
), some on AMD nodes (check modules on kebnekaise-amd.hpc2n.umu.se
).
This example runs a small CUDA code.
+We recommend fosscuda/2020b
(contains GCC
, OpenMPI
, OpenBLAS
/LAPACK
, FFTW
, ScaLAPACK
, and CUDA
) or intelcuda/2019a
(contains icc
, ifort
, IntelMPI
, IntelMKL
, and CUDA
)
Sample batch script gpu-skylake.sh
#!/bin/bash
+# This job script is for running on 1 V100 GPU.
+# Remember to change this to your own project ID after the course!
+#SBATCH -A hpc2n2024-084
+#SBATCH --time=00:05:00
+#SBATCH --gpus=v100:1
+
+ml purge > /dev/null 2>&1
+ml fosscuda/2020b
+
+nvcc hello-world.cu -o hello
+./hello
+
The batch script gpu-skylake.sh compiles and runs a small CUDA program called hello-world.cu
.
Exercise: V100 GPU job
+To submit it, just do:
+ +Use squeue --me
or scontrol show job JOBID
to see that the job runs in the correct partition/node types.
Remember, in order to find the correct modules, and to compile a program if you need to, you must log in to one of the AMD login nodes with either SSH (kebnekaise-amd.hpc2n.umu.se
) or ThinLinc (kebnekaise-amd-tl.kebnekaise.hpc2n.umu.se
).
+The job can be submitted from the regular login node, though.
Exercise: login to the AMD login node and find a suitable module
+If you are logged in to the regular Kebnekaise login node, the easiest way to log in to the AMD login node is to type this in a terminal window:
+ +After that, you check for a suitable CUDA toolchain: ml spider CUDA
.
You can then load it (here CUDA/11.7.0
) and use nvcc
to compile the program hello-world.cu
:
Now log out from the AMD login node again.
+The batch script gpu-a100.sh
compiles and runs a small CUDA program called hello-world.cu
.
Sample A100 GPU job script: gpu-a100.sh
+#!/bin/bash
+# Remember to change this to your own project ID after the course!
+#SBATCH -A hpc2n2024-084
+#SBATCH --time=00:05:00
+#SBATCH --gpus=a100:1
+
+ml purge > /dev/null 2>&1
+ml CUDA/11.7.0
+
+nvcc hello-world.cu -o hello
+./hello
+
Exercise: A100 GPU batch jobs
+The above script is found in the same directory as the other exercises (intro-course/exercises/simple
). You can submit it directly:
As with the V100 job, you are encouraged to use squeue --me
and/or scontrol show job JOBID
to see that the job gets the correct partition/node type allocated.
Kebnekaise also has a few A40 GPUs. These are placed on Intel Broadwell nodes.
+In order to run on these, you add this to your batch script:
+ +where number
is 1 or 2 (the number of GPU cards).
You can find the available modules on the regular login node, kebnekaise.hpc2n.umu.se
.
Since these GPUs are located on AMD Zen4 nodes, you need to login to kebnekaise-amd.hpc2n.umu.se
to check available modules.
Then, to ask for these nodes in your batch script, you add:
+ +where number
is 1 or 2 (the number of GPU cards).
The H100 GPUs are located on AMD Zen4 nodes. You can find the available modules by logging in to kebnekaise-amd.hpc2n.umu.se
.
You ask for these GPUs in your batch script by adding:
+ +where number
is 1, 2, 3, or 4 (the number of GPU cards you want to allocate).
The A6000 GPUs are placed on AMD Zen4 nodes. That means you can find the available modules by logging in to kebnekaise-amd.hpc2n.umu.se
.
To run on these GPUs, add this to your batch script:
+ +where number
is 1 or 2 (the number of GPU cards you want to allocate).
The MI100 GPUs are located on AMD Zen3 nodes. You can find the available modules by logging in to kebnekaise-amd.hpc2n.umu.se
.
To allocate MI100 GPUs, add this to your batch script:
+ +where number
is 1 or 2 (the number of GPU cards).
Sample batch script for allocating any AMD GPU
+#!/bin/bash
+# Remember to change this to your own project ID after the course!
+#SBATCH -A hpc2n2024-084
+#SBATCH --time=00:05:00
+#SBATCH --gpus=1
+#SBATCH -C amd_gpu
+
+ml purge > /dev/null 2>&1
+ml CUDA/11.7.0
+
+./myGPUcode
+
Sample batch script for allocating any Nvidia GPU
+#!/bin/bash
+# Remember to change this to your own project ID after the course!
+#SBATCH -A hpc2n2024-084
+#SBATCH --time=00:05:00
+#SBATCH --gpus=1
+#SBATCH -C nvidia_gpu
+
+ml purge > /dev/null 2>&1
+ml CUDA/11.7.0
+
+./myGPUcode
+
Sample batch script for allocating any Nvidia GPU on Intel node
+#!/bin/bash
+# Remember to change this to your own project ID after the course!
+#SBATCH -A hpc2n2024-084
+#SBATCH --time=00:05:00
+#SBATCH --gpus=1
+#SBATCH -C 'nvidia_gpu&intel_cpu'
+
+ml purge > /dev/null 2>&1
+ml CUDA/11.7.0
+
+./myGPUcode
+
Sample batch script for allocating any GPU with AI features and on a Zen node
+#!/bin/bash
+# Remember to change this to your own project ID after the course!
+#SBATCH -A hpc2n2024-084
+#SBATCH --time=00:05:00
+#SBATCH --gpus=1
+#SBATCH -C ''zen3|zen4'&GPU_AI'
+
+ml purge > /dev/null 2>&1
+ml CUDA/11.7.0
+
+./myGPUcode
+
Exercise: GPU features
+In order to run these examples, you can change ./myGPUcode
to
or any other GPU program of your choice.
+The gpu-features.sh
example script in the exercises/simple
directory is prepared for the “any GPU with AI features and on a Zen node”. You can either run it as is, or make changes to it and try any of the other combinations here (or try new combinations yourself).
Check with squeue --me
which partition/node type the job ends up in, and that it fits. More information can be found with scontrol show job JOBID
.
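As a minimal sketch of this check, the helper below pulls the partition and node list out of `scontrol show job` output. The function name `job_placement` is hypothetical, and it only assumes the `Key=Value` layout that `scontrol` prints; on Kebnekaise you would pipe the real command into it.

```shell
# Hypothetical helper: read `scontrol show job JOBID` output on stdin and
# print only the Partition= and NodeList= fields.
job_placement() {
    # Put each Key=Value pair on its own line, then keep the two fields of
    # interest (anchored so ReqNodeList/ExcNodeList do not match).
    tr -s ' ' '\n' | grep -E '^(Partition|NodeList)='
}

# Usage on the cluster (JOBID is the ID returned by sbatch):
#   scontrol show job JOBID | job_placement
```

Comparing the printed partition with the GPU type or feature you requested is a quick way to confirm that the constraint did what you expected.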
On Kebnekaise, it is possible to run JupyterLab. This is done through a batch job, and is described in detail on our “Jupyter on Kebnekaise” documentation.
+Keypoints
+It is convenient to create a soft link to your storage project in your +home directory for faster navigation:
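A minimal sketch of such a link follows. The directory names are hypothetical stand-ins created under a temporary directory so that the commands can be tried anywhere; on Kebnekaise the target would be your own storage project directory and the link would live in your home directory.

```shell
# Stand-in directories (hypothetical); replace them with your real storage
# project path and your home directory on the cluster.
sandbox=$(mktemp -d)
project_dir="$sandbox/proj/nobackup/my-storage-project"
home_dir="$sandbox/home"
mkdir -p "$project_dir" "$home_dir"

# Create the soft link: ln -s TARGET LINK_NAME
ln -s "$project_dir" "$home_dir/my-project"

# Going through the link lands in the storage project directory
(cd "$home_dir/my-project" && pwd -P)
```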
+ +Most likely you will allocate many cores and many GPUs for your simulations. You can
+monitor the use of these resources with the job-usage job_ID
command, where job_ID
+is the job ID printed by the sbatch
command. You can also see this number if you type
+squeue -u my-username
. job-usage
outputs a URL that you can copy/paste in your
+local browser where you can see how resources are being used:
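If you prefer not to copy the job ID by hand, it can be captured when the job is submitted. The sketch below assumes the usual `Submitted batch job N` message from `sbatch`; the function name `extract_jobid` and the script name `job.sh` in the usage comment are hypothetical.

```shell
# Extract the numeric job ID from sbatch's "Submitted batch job N" message.
extract_jobid() {
    awk '/^Submitted batch job/ {print $4}'
}

# Usage on the cluster (job.sh stands for your own batch script):
#   jobid=$(sbatch job.sh | extract_jobid)
#   job-usage "$jobid"
```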
Matlab is available through the Menu bar if you are using ThinLinc client (recommended). Additionally, you can load +a Matlab module on a Linux terminal on Kebnekaise. Details for these two options can be found +here.
+The first time you access Matlab on Kebnekaise, you need to configure it by following these guidelines +Configuring Matlab. After configuring the cluster, it is a good practice to validate the +cluster (HOME -> Parallel -> Create and Manage Clusters):
+ +Notice that it is recommended to use a small number of workers for the validation, in this case 4.
+Flow chart for more efficient Matlab code using existing tools (adapted from [1])
+ +MATLAB on GPUs
+Notice that MATLAB currently supports only NVIDIA GPUs (v100,a40,a6000,a100,l40s,h100), +with v100 and l40s being the most abundant (10 nodes each).
+Use MATLAB for lightweight tasks on the login nodes
+Remember that login nodes are used by many users and if you run heavy jobs there, +you will interfere with their work.
+The folder SERIAL
contains a function funct.m
+which performs an FFT on a matrix.
+The execution time is obtained with tic/toc and written down in the output file called
+log.out. Run the function by using the MATLAB GUI with the help of the script submit.m.
As an alternative, you can submit the job via a batch script +job.sh. +Here, you will need to replace Project_ID with the project ID provided for the present course and set the Matlab version.
+PARFOR
folder contains an example of a parallelized loop with the “parfor” directive. A pause()
+function is included in the loop to make it heavy. This function can be
+submitted to the queue by running the script submit.m in the MATLAB GUI.
+The number of workers can be set by replacing the string FIXME (in the “submit.m”
+file) with the number you desire.
+ Try different values for the number of workers from 1 to 10 and take a note
+ of the simulation time output at the end of the simulation. Where does the
+ code achieve its peak performance?
SPMD
folder presents an example of parallelized code using the SPMD paradigm. Submit this job to the queue through the MATLAB GUI. This
+example illustrates the use of parpool to run parallel code in a more interactive manner.
GPU
folder contains a test case that computes a Mandelbrot set both
+on CPU mandelcpu.m
+and on GPU mandelgpu.m. You can submit the jobs through
+the MATLAB GUI using the submitcpu.m and submitgpu.m files.
If everything ran well, the final output is two .png figures +which display the timings for both architectures. Use the “eom” command on the +terminal to view the images (eom out-X.png).
+Similar to Matlab, R is available through the Menu bar if you are using ThinLinc client (recommended). Additionally, you can load +an R module on a Linux terminal on Kebnekaise. Details for these two options can be found +here.
+The first time you access R on Kebnekaise, you need to configure it by following the +Preparations step.
+Some parallel functions, mclapply
in this example, tend to replicate the data for
+the workers (cores) if the dataframe is modified by them. This can be crucial if you
+are working with a large data frame and you are employing several parallel functions,
+for instance during the training of machine learning models because your simulation could
+easily exceed the available memory per node.
library(parallel)
+ library(pryr)
+
+ prev <- mem_used()
+ print(paste("Memory initially allocated by R:", prev/1e6, "MB"))
+
+ # Define a relatively large dataframe
+ data_df <- data.frame(
+ ID = seq(1, 1e7),
+ Value = runif(1e7)
+ )
+
+ # Create a function to be applied to each row (or a subset of rows)
+ process_function <- function(i, df) {
+ # do some modification to the i-th row
+ return(df$Value[i] * 2)
+ }
+ serial_mem <- mem_used() - prev
+ print(paste("Memory after the serial code execution:", serial_mem/1e6, "MB"))
+
+ # Use mclapply to process the dataframe in parallel
+ num_cores <- 4
+ results <- mclapply(1:nrow(data_df), function(i) process_function(i, data_df), mc.cores = num_cores)
+ prev <- mem_used() - prev
+ print(paste("Memory after parallel code execution:", prev/1e6, "MB"))
+
In this example mem-dup.R, I used the function mem_used()
provided by the pryr
package
+to monitor the memory usage. The batch script for this example is job.sh.
One possible solution for data duplication could be to use a data frame for each worker that includes +only the relevant data for that particular computation.
+Use R for lightweight tasks on the login nodes
+Remember that login nodes are used by many users and if you run heavy jobs there, +you will interfere with their work.
+Prior to running the examples, you will need to install several packages. +Follow these instructions:
+The packages needed are:
+For this R version (check if they are not already installed)
+ml GCC/10.2.0 OpenMPI/4.0.5 R/4.0.4
+Rmpi
+doParallel
+caret
+MASS
+klaR
+nnet
+e1071
+rpart
+mlbench
+parallel
+In the SERIAL
folder, a serial example is provided. Submit the script
+job.sh with the command R CMD and also with Rscript. Where could
+it be more suitable to use Rscript over R CMD?
Why do we need the flag #SBATCH -C ‘skylake’ in the batch script?
+JOB-ARRAYS
folder shows an example for job arrays, the batch file is job.sh. Submit the
+script and notice what is written in the output files.
Could you use job arrays in your simulations if you need to run many simulations where some parameters are changed? As an example, imagine that you need to run 28 simulations +where a single parameter, such as the temperature, is changed from 2 to 56 C. Could you +use the variable task_id in the previous script to get that range of temperatures so +that each simulation prints out a different temperature?
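One possible way to approach the temperature question is sketched below. It assumes the 28 temperatures are evenly spaced in steps of 2 C, so that array task IDs 1 to 28 map to 2 to 56 C; the variable names are hypothetical, and the project ID and time limit lines of a complete batch script are left out.

```shell
#!/bin/bash
#SBATCH --array=1-28          # 28 array tasks with IDs 1..28

# Slurm sets SLURM_ARRAY_TASK_ID for each array task; default to 1 so the
# script can also be tried outside of a batch job.
task_id=${SLURM_ARRAY_TASK_ID:-1}

# Assumed mapping: steps of 2 C, so IDs 1..28 give temperatures 2..56 C
temperature=$((2 * task_id))

echo "Task $task_id runs the simulation at $temperature C"
```

Each array task then passes its own `temperature` value to the simulation instead of printing it.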
+In the folder RMPI
, you can find the R script Rmpi.R which uses 5
+MPI slaves to apply the runif() function on an array “c”. The submit file is
+job_Rmpi.sh. As a result, you will see the random numbers
+generated by the slaves in the Slurm output file.
The folder DOPARALLEL
contains two examples:
doParallel.R + shows how to use the foreach function in sequential mode + (1 core) and the parallel mode using 4 cores. What is the difference in the usage + of foreach for these two modes?
+Submit the job_doParallel.sh script and compare the timings of the + sequential and parallel codes.
+How many workers are allocated for this simulation? If you want to allocate + more or less, what changes must be made to these files?
+doParallel_ML.R presents the evaluation of several ML models in both + sequential and parallel modes using the standard “iris” database. The + difference is basically the use of the %dopar% operator instead of %do%.
+Submit the batch script job_doParallel_ML.sh to the queue.
+In the output file observe the resulting elapsed times for the sequential + and the 4 cores parallel simulation.
+Upon submitting the job to the queue you will get a number called job ID. + Use the command:
+job-usage job_ID
to obtain a URL which you can copy/paste in your local browser. Tip: refresh + your browser several times to get the statistics.
+Can you see how the CPU is used? What about the memory?
+Note 1: In order to run this exercise, you need to have all the packages + listed at the beginning of this document installed.
+Note 2: If you want to try a different number of cores for running the + scripts, you should change that number in both the .R and .sh scripts
+In the folder ML
we show a ML model using a sonar database
+and Random Forest as the training method (Rscript.R). The simulations are done both in serial
+and parallel modes. You may change the values for the number of cores (1 in the present case)
+to other values. Notice that the number of cores needs to be the same in the
+files job.sh and Rscript.R.
Try a different number of cores and monitor the timings which are reported at +the end of the output file.
+Alphafold is installed as a module. Notice that on the Intel nodes there are more
+versions of Alphafold installed than on the AMD nodes. Thus, if you are targeting one
+version that is only installed on the Intel nodes, you will need to add the instruction
+#SBATCH -C skylake
to your batch script; otherwise the job could land on an
+AMD node that lacks that installation.
In the folder ALPHAFOLD
you will find a FASTA sequence for a monomer and the
+corresponding batch file job.sh for running the simulation on
+GPUs. Try running the simulation with CPUs only and then with l40s, v100 and a100 GPUs.
Notice that the simulation takes about 1 hour, so the purpose of this exercise is only to check +that the simulation starts running correctly.
+Version 4.5.3 of CryoSPARC is installed as a module.
+One needs a license for using this software. For
+academic purposes a free of charge license can be requested at the website
+cryosparc.com (one working day for the processing).
+Once you obtain your license ID, copy it, create a file called /home/u/username/.cryosparc-license
and paste
+it in the first line of this file. In the second line of the file write your email address.
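The two-line layout of the license file can be sketched as follows. The license ID shown is a placeholder, not a real one, and the variable names are hypothetical; a temporary file is the default target so the sketch can be tried without touching your home directory.

```shell
# Hypothetical placeholder values; use your own license ID and email address.
license_id="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
email="myemail@mail.com"

# On Kebnekaise the target is the .cryosparc-license file in your home
# directory; set LICENSE_FILE to that path when you do this for real.
license_file=${LICENSE_FILE:-$(mktemp)}

# Line 1: license ID, line 2: email address
printf '%s\n%s\n' "$license_id" "$email" > "$license_file"
cat "$license_file"
```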
Create a suitable folder in your project directory, for instance /proj/nobackup/hpc2n202X-XYZ/cryosparc
+and move into this folder. Download/copy the lane*tar
files that are located here to the cryosparc folder and untar them here (tar -xvf lane_CPU.tar
as an example).
Change the string Project_ID in the file lane*/cluster_script.sh
to reflect your current project.
+Also, the time is set to 20 min. in these files, but for realistic simulations you can change it to
+longer times (-t 00:20:00
).
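Replacing the Project_ID string in all lane scripts at once could look like the sketch below. It assumes GNU `sed` (as found on Linux), and it builds a stand-in `lane_CPU/cluster_script.sh` in a temporary directory with made-up contents so that it runs anywhere; the project ID is a placeholder.

```shell
# Placeholder project ID; use the one for your own project.
project="hpc2n202X-XYZ"

# Stand-in for the untarred lanes (contents are invented for illustration);
# on Kebnekaise you would run the loop inside your cryosparc folder instead.
workdir=$(mktemp -d)
mkdir -p "$workdir/lane_CPU"
printf '#SBATCH -A Project_ID\n#SBATCH -t 00:20:00\n' > "$workdir/lane_CPU/cluster_script.sh"

# Replace the placeholder string in every lane's cluster script
for script in "$workdir"/lane*/cluster_script.sh; do
    sed -i "s/Project_ID/$project/g" "$script"
done

# Show the edited account line
grep -- '-A' "$workdir/lane_CPU/cluster_script.sh"
```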
The lanes should be recognized by CryoSPARC when it starts running.
+Load the CryoSPARC modules. Start CryoSPARC and accept the prompt, which asks whether to continue starting +cryosparc and mentions that the folder was not used before. List the users on the server (which should be only yourself +for this type of license), check the email address that is displayed for this user (it should be the one you +added in the license file), and reset the password. These steps are summarized here:
+$cryosparc start
+...
+Do you wish to continue starting cryosparc? [yN]: y
+...
+CryoSPARC master started.
+ From this machine, access CryoSPARC and CryoSPARC Live at
+ http://localhost:39007
+...
+
+$cryosparc listusers
+cryosparc resetpassword --email "myemail@mail.com" --password "choose-a-password"
+
Copy and paste the line which has the localhost port (notice that port number can change) to a browser on Kebnekaise:
+ +After logging in, you will be able to see the CryoSPARC dashboard:
+ +There are several tutorials at the CryoSPARC website. In the previous picture I followed the +Introductory Tutorial (v4.0+).
+Use cryosparc
instead of cryosparcm
On Kebnekaise the command cryosparc
should be used, and not the one cited in the tutorial, cryosparcm.
Depending on the job type, CryoSPARC suggests the hardware resources. For instance, in the tutorial +above, Step 4: Import Movies suggests using 1 CPU upon queueing it, but Step 5: Motion Correction suggests +using 1 GPU. For CPU-only jobs you can choose the CPU lane, and if your job uses GPUs you can choose +among L40s, V100, A100, and H100. Notice that the V100 and L40s are the most abundant at the moment:
+ +Keypoints
+