Line: 1 to 1
PALMA-NG
Line: 6 to 6
Overview
palma3 is the login node to a newer part of the PALMA system. It has various queues/partitions for different purposes:
Line: 1 to 1
PALMA-NG
Line: 93 to 93
#SBATCH --ntasks-per-node 8
# set a partition
Changed:
< < #SBATCH --partition u0dawin
> > #SBATCH --partition normal
# set max wallclock time
#SBATCH --time=24:00:00
Line: 159 to 159
Use for example the following command:
Changed:
< < srun --partition u0dawin --nodes 1 --ntasks-per-node=8 --pty bash
> > srun --partition express --nodes 1 --ntasks-per-node=8 --pty bash
This starts a job in the chosen queue/partition on one node with eight cores.
Line: 1 to 1
PALMA-NG
Line: 13 to 13
Added:
> >
There are some special queues, which are only allowed for certain groups (these are also Broadwell nodes like the ones in the normal queue):
Line: 1 to 1
PALMA-NG
Line: 184 to 184
List detailed information for a job (useful for troubleshooting):
Changed:
< < scontrol show jobid -dd <jobid>
> > scontrol show job -dd <jobid>
Once your job has completed, you can get additional information that was not available during the run. This includes run time, memory used, etc. To get statistics on completed jobs by jobID:
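For reference, the sacct call this sentence introduces (shown in full in a later revision below) is:
sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed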
Line: 1 to 1
PALMA-NG
Line: 135 to 135
srun -p normal --nodes=2 --ntasks=64 --ntasks-per-node=32 --cpus-per-task=2 --pty bash
OMP_NUM_THREADS=2 mpirun ./program
Added:
> >
Using GPU resources
The k20gpu queue features 4 nodes with 3 K20 nVidia Tesla accelerators each. To use one of them, the following option must be present in your batch script:
#SBATCH --gres=gpu:1
It is also possible to use more than one. Additionally, you can specify the type of GPU you want to work on. At the moment there are the following types:
#SBATCH --gres=gpu:kepler:1
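A minimal sketch of a batch script requesting one GPU could look like the following (the k20gpu partition name and the gres syntax are taken from the text above; the remaining job parameters are placeholders to adapt):
#!/bin/bash
# request the GPU queue/partition
#SBATCH --partition k20gpu
#SBATCH --nodes=1
#SBATCH --ntasks 8
#SBATCH --time=24:00:00
# request one Kepler (K20) accelerator on the node
#SBATCH --gres=gpu:kepler:1
# run the GPU application
./program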
Show information about the queues
scontrol show partition
Line: 1 to 1
PALMA-NG
Line: 91 to 91
#SBATCH --nodes=1
# set the number of CPU cores per node
Changed:
< < #SBATCH --ntasks 8
> > #SBATCH --ntasks-per-node 8
# set a partition
#SBATCH --partition u0dawin
Line: 1 to 1
PALMA-NG
Line: 8 to 8
palma3 is the login node to a newer part of the PALMA system. It has various queues/partitions for different purposes:
Line: 1 to 1
PALMA-NG
Line: 21 to 21
Added:
> >
Software/The module concept
The software on palma-ng can be accessed via modules. These are small scripts that set environment variables (like PATH and LD_LIBRARY_PATH) pointing to the locations where the software is installed (this is mostly on network drives, so that the software is available on every node in the cluster). The module system we use here is LMOD.
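A short, hedged example of a typical module session (the intel/2016b toolchain is the one used in the examples further down; other names will differ):
# show the toolchains/compilers visible at the top of the hierarchy
module av
# load a toolchain to make the software built with it visible
module add intel/2016b
# list what is currently loaded
module list
# search the whole hierarchy for a package
module spider <software>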
Line: 48 to 48
Deleted:
< <
module add intel/2016b
Changed:
< < module av
> > module av
and you will see the software that has been compiled with this version. Alternatively you can use the "module spider" command.
Line: 126 to 124
Starting jobs with MPI-parallel codes
mpirun will get all necessary information from SLURM if submitted appropriately. If you, for example, want to start 128 MPI ranks distributed over two nodes, you could do it the following way:
Deleted:
< <
srun -p normal --nodes=2 --ntasks=128 --ntasks-per-node=64 --pty bash
Changed:
< < mpirun ./program
> > mpirun ./program
or, for a non-interactive run, put those parameters in the batch script.
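A sketch of what such a batch script could look like (the resource parameters are copied from the srun example above; treat this as a template rather than the exact script used on PALMA):
#!/bin/bash
#SBATCH --partition normal
#SBATCH --nodes=2
#SBATCH --ntasks=128
#SBATCH --ntasks-per-node=64
# mpirun picks up the task layout from SLURM
mpirun ./program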
Line: 135 to 131
or, for a non-interactive run, put those parameters in the batch script. For starting hybrid jobs (meaning that they use MPI and OpenMP parallelization at the same time), you can use the --cpus-per-task switch.
Deleted:
< <
srun -p normal --nodes=2 --ntasks=64 --ntasks-per-node=32 --cpus-per-task=2 --pty bash
Changed:
< < OMP_NUM_THREADS=2 mpirun ./program
> > OMP_NUM_THREADS=2 mpirun ./program
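As a hedged illustration, the same hybrid job could be submitted with a batch script along these lines (parameters taken from the srun line above; assumption: one OpenMP thread per CPU assigned to each task):
#!/bin/bash
#SBATCH --partition normal
#SBATCH --nodes=2
#SBATCH --ntasks=64
#SBATCH --ntasks-per-node=32
#SBATCH --cpus-per-task=2
# match the number of OpenMP threads to --cpus-per-task
export OMP_NUM_THREADS=2
mpirun ./program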
Show information about the queues
scontrol show partition
Line: 205 to 215
scontrol requeue <jobid>
Line: 1 to 1
PALMA-NG
Line: 27 to 27
The most important difference between Palma I and palma-ng is the newly introduced hierarchical module naming scheme.
Added:
> > (1) https://www.tacc.utexas.edu/research-development/tacc-projects/lmod (2) https://hpcugent.github.io/easybuild/files/hust14_paper.pdf
Line: 37 to 41
Changed:
< < (1) https://www.tacc.utexas.edu/research-development/tacc-projects/lmod
> > Hierarchical module naming scheme means that you do not see all modules at the same time. You will have to load a toolchain or compiler first to see the software that has been compiled with it. At the moment there are the following toolchains:
Changed:
< < (2) https://hpcugent.github.io/easybuild/files/hust14_paper.pdf
> >
module add intel/2016b
module av
and you will see the software that has been compiled with this version. Alternatively you can use the "module spider" command.
Using the module command in submit scripts
Line: 1 to 1
PALMA-NG
Line: 21 to 21
Changed:
< < The module concept
> > Software/The module concept
Changed:
< < Environment variables (like PATH, LD_LIBRARY_PATH) for compilers and libraries can be set by modules:
> > The software on palma-ng can be accessed via modules. These are small scripts that set environment variables (like PATH and LD_LIBRARY_PATH) pointing to the locations where the software is installed (this is mostly on network drives, so that the software is available on every node in the cluster). The module system we use here is LMOD.
Deleted:
< < When you log in to palma3, some modules are loaded automatically.
Using the module command in submit scripts
This is only valid for the u0dawin queue.
Line: 1 to 1
PALMA-NG
Line: 75 to 75
#SBATCH --ntasks 8
# set a partition
Changed:
< < #SBATCH -p u0dawin
> > #SBATCH --partition u0dawin
# set max wallclock time
#SBATCH --time=24:00:00
Line: 87 to 87
#SBATCH --mail-type=ALL
# set an output file
Changed:
< < #SBATCH -o output.dat
> > #SBATCH --output output.dat
# send mail to this address
#SBATCH --mail-user=your_account@uni-muenster.de
Line: 102 to 102
A detailed description can be found here: http://slurm.schedmd.com/sbatch.html
Added:
> >
Starting jobs with MPI-parallel codes
mpirun will get all necessary information from SLURM if submitted appropriately. If you, for example, want to start 128 MPI ranks distributed over two nodes, you could do it the following way:
srun -p normal --nodes=2 --ntasks=128 --ntasks-per-node=64 --pty bash
mpirun ./program
or, for a non-interactive run, put those parameters in the batch script. For starting hybrid jobs (meaning that they use MPI and OpenMP parallelization at the same time), you can use the --cpus-per-task switch.
srun -p normal --nodes=2 --ntasks=64 --ntasks-per-node=32 --cpus-per-task=2 --pty bash
OMP_NUM_THREADS=2 mpirun ./program
Show information about the queues
scontrol show partition
Line: 111 to 129
Running interactive jobs with SLURM
Use for example the following command:
Changed:
< < srun -p u0dawin --nodes 1 --ntasks-per-node=8 --pty bash
> > srun --partition u0dawin --nodes 1 --ntasks-per-node=8 --pty bash
This starts a job in the u0dawin queue/partition on one node with eight cores.
Line: 1 to 1
PALMA-NG
Line: 111 to 111
Running interactive jobs with SLURM
Use for example the following command:
Changed:
< < srun -p u0dawin -N 1 --ntasks-per-node=8 --pty bash
> > srun -p u0dawin --nodes 1 --ntasks-per-node=8 --pty bash
This starts a job in the u0dawin queue/partition on one node with eight cores.
Added:
> >
Information on jobs
List all current jobs for a user:
Line: 1 to 1
PALMA-NG
Line: 72 to 72
#SBATCH --nodes=1
# set the number of CPU cores per node
Changed:
< < #SBATCH -n 8
> > #SBATCH --ntasks 8
# set a partition
#SBATCH -p u0dawin
Line: 1 to 1
PALMA-NG
Line: 9 to 9
There are some special queues, which are only allowed for certain groups (these are also Broadwell nodes like the ones in the normal queue):
Line: 1 to 1
PALMA-NG
Line: 37 to 37
When you log in to palma3, some modules are loaded automatically.
Using the module command in submit scripts
Added:
> > This is only valid for the u0dawin queue.
If you want to use the module command in submit scripts, the line
source /etc/profile.d/modules.sh; source /etc/profile.d/modules_local.sh
Line: 1 to 1
PALMA-NG
Line: 8 to 8
palma3 is the login node to a newer part of the PALMA system. It has various queues/partitions for different purposes:
The module concept
Line: 1 to 1
PALMA-NG
Line: 10 to 10
The module concept
Line: 38 to 39
Ganglia
Added:
> > If you have X forwarding enabled, you can use llview (just type "llview" at the command line).
The batch system
The batch system on PALMA3 is SLURM, but there is a wrapper for PBS installed, so most of your scripts should still work. If you want to switch to SLURM, this document might help you: https://slurm.schedmd.com/rosetta.pdf
Line: 124 to 128
Show priorities for waiting jobs:
Changed:
< < sprio
> > sprio
Controlling jobs
Line: 151 to 154
scontrol requeue <jobid>
Line: 1 to 1
PALMA-NG
Line: 12 to 12
Added:
> >
The module concept
Environment variables (like PATH, LD_LIBRARY_PATH) for compilers and libraries can be set by modules:
Using the module command in submit scripts
If you want to use the module command in submit scripts, the line
source /etc/profile.d/modules.sh; source /etc/profile.d/modules_local.sh
has to be added before. Otherwise, just put the "module add" commands in your .bashrc (which can be found in your home directory).
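Combining this with the submit script shown elsewhere on this page, a u0dawin job that needs the module command could look roughly like this (the intel/2016b module is just an example; adjust resources and program name to your job):
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks 8
#SBATCH --partition u0dawin
#SBATCH --time=24:00:00
# make the module command available inside the job (u0dawin queue)
source /etc/profile.d/modules.sh; source /etc/profile.d/modules_local.sh
# load the required software, then run it
module add intel/2016b
./program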
Monitoring
Ganglia
Line: 1 to 1
PALMA-NG
Line: 12 to 12
Added:
> >
Monitoring
Ganglia
The batch system
The batch system on PALMA3 is SLURM, but there is a wrapper for PBS installed, so most of your scripts should still work. If you want to switch to SLURM, this document might help you: https://slurm.schedmd.com/rosetta.pdf
Line: 61 to 65
You can send your submission to the batch system with the command "sbatch submit.cmd"
A detailed description can be found here: http://slurm.schedmd.com/sbatch.html
Deleted:
< <
Show running jobs
Show information about the queues
scontrol show partition
Added:
> >
Show information about the nodes
sinfo
Running interactive jobs with SLURM
Use for example the following command:
srun -p u0dawin -N 1 --ntasks-per-node=8 --pty bash
This starts a job in the u0dawin queue/partition on one node with eight cores.
Added:
> >
Information on jobs
List all current jobs for a user:
squeue -u <username>
List all running jobs for a user:
squeue -u <username> -t RUNNING
List all pending jobs for a user:
squeue -u <username> -t PENDING
List all current jobs in the normal partition for a user:
squeue -u <username> -p normal
List detailed information for a job (useful for troubleshooting):
scontrol show jobid -dd <jobid>
Once your job has completed, you can get additional information that was not available during the run. This includes run time, memory used, etc. To get statistics on completed jobs by jobID:
sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed
To view the same information for all jobs of a user:
sacct -u <username> --format=JobID,JobName,MaxRSS,Elapsed
Show priorities for waiting jobs:
sprio
Controlling jobs
To cancel one job:
scancel <jobid>
To cancel all the jobs for a user:
scancel -u <username>
To cancel all the pending jobs for a user:
scancel -t PENDING -u <username>
To cancel one or more jobs by name:
scancel --name myJobName
To pause a particular job:
scontrol hold <jobid>
To resume a particular job:
scontrol resume <jobid>
To requeue (cancel and rerun) a particular job:
scontrol requeue <jobid>
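Putting a few of these commands together, a typical check on a job might look like this (commands as listed above; <username> and <jobid> are placeholders):
# submit the script and watch your own jobs in the queue
sbatch submit.cmd
squeue -u <username>
# once the job has finished, look at run time and memory consumption
sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed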
Line: 1 to 1
PALMA-NG
Line: 52 to 52
# send mail to this address
#SBATCH --mail-user=your_account@uni-muenster.de
Added:
> >
# In the u0dawin queue, you will need the following line
source /etc/profile.d/modules.sh; source /etc/profile.d/modules_local.sh
# run the application
./program
Line: 1 to 1
Changed:
< < PALMA3
> > PALMA-NG
Content
Overview
Line: 52 to 53
#SBATCH --mail-user=your_account@uni-muenster.de
# run the application
Changed:
< < ./program
> > ./program
You can send your submission to the batch system with the command "sbatch submit.cmd"
Line: 1 to 1
PALMA3
Line: 9 to 9
The batch system
The batch system on PALMA3 is SLURM, but there is a wrapper for PBS installed, so most of your scripts should still work. If you want to switch to SLURM, this document might help you: https://slurm.schedmd.com/rosetta.pdf
Line: 22 to 23
Submit a job
Added:
> >
Create a file, for example called submit.cmd:
#!/bin/bash
# set the number of nodes
#SBATCH --nodes=1
# set the number of CPU cores per node
#SBATCH -n 8
# set a partition
#SBATCH -p u0dawin
# set max wallclock time
#SBATCH --time=24:00:00
# set name of job
#SBATCH --job-name=test123
# mail alert at start, end and abortion of execution
#SBATCH --mail-type=ALL
# set an output file
#SBATCH -o output.dat
# send mail to this address
#SBATCH --mail-user=your_account@uni-muenster.de
# run the application
./program
You can send your submission to the batch system with the command "sbatch submit.cmd"
A detailed description can be found here: http://slurm.schedmd.com/sbatch.html
Show running jobs
Line: 1 to 1
PALMA3
Changed:
< < palma3 is the login node to a newer part of the PALMA system. It has various queues for different purposes:
> > Content
Overview
palma3 is the login node to a newer part of the PALMA system. It has various queues/partitions for different purposes:
Line: 14 to 17
When using PBS scripts, there are some differences to PALMA:
Added:
> >
Submit a job
Show running jobs
Show information about the queues
scontrol show partition
Running interactive jobs with SLURM
Use for example the following command:
Deleted:
< < srun -p u0dawin -N 1 --ntasks-per-node=8 --pty bash
Changed:
< < --
> > srun -p u0dawin -N 1 --ntasks-per-node=8 --pty bash
Changed:
< < Comments
> > This starts a job in the u0dawin queue/partition on one node with eight cores.
Changed:
< <
> > --
Line: 1 to 1
PALMA3
Line: 8 to 8
Added:
> >
The batch system
The batch system on PALMA3 is SLURM, but there is a wrapper for PBS installed, so most of your scripts should still work. If you want to switch to SLURM, this document might help you: https://slurm.schedmd.com/rosetta.pdf
Running interactive jobs with SLURM
Use for example the following command:
srun -p u0dawin -N 1 --ntasks-per-node=8 --pty bash
Comments
Line: 1 to 1
PALMA3
Changed:
< < palma3 is the login node to a newer part of the PALMA system. It has various queues for different purposes:
> > palma3 is the login node to a newer part of the PALMA system. It has various queues for different purposes:
Comments
Line: 1 to 1
Added:
> >
PALMA3
palma3 is the login node to a newer part of the PALMA system. It has various queues for different purposes:
Comments