PALMA II

!!! Attention !!! A new Wiki concerning information about PALMA II and HPC in general can be found at the WWU Confluence!
Overview

Palma II is the HPC system of the Zentrum für Informationsverarbeitung. To be able to log in, you have to register for the group u0clstr in MeinZIV.

Filesystems

When you log in to the cluster for the first time, a directory in /home is created for you. Please use it only to store your programs, not your numerical results; your storage in /home is limited to 400 GB. Create a directory in /scratch/tmp to store the data you produce on the compute nodes. To enforce this, /home will be mounted read-only on the compute nodes in the future. Since /scratch is not intended as an archive, please remove your data there as soon as you no longer need it.

Software/The module concept

The software on palma-ng can be accessed via modules. These are small scripts that set environment variables (like PATH and LD_LIBRARY_PATH) pointing to the locations where the software is installed (mostly on network drives, so that the software is available on every node of the cluster). The module system we use here is LMOD.
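As an illustration, a typical LMOD session could look like the following; the toolchain name foss/2018a is only an assumed example, and the modules actually installed on PALMA II may differ:

  # list the toolchains and software currently visible
  module avail
  # load a toolchain first (example name; may differ on PALMA II)
  module load foss/2018a
  # "module avail" now also shows the software built with this toolchain
  module avail
  # search for a package across all toolchains
  module spider <name>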
After loading a toolchain module, you will see the software that has been compiled with this version. Alternatively you can use the "module spider" command.

Monitoring
The batch system

The batch system on PALMA II is SLURM. If you are used to PBS/Maui and want to switch to SLURM, this document might help you: https://slurm.schedmd.com/rosetta.pdf

The partitions
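The available partitions and their limits can be queried directly on the cluster; a minimal sketch using standard SLURM options (not PALMA-specific):

  # summary of all partitions
  sinfo -s
  # partition name, time limit, number of nodes, CPUs and memory per node
  sinfo -o "%P %l %D %c %m"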
Submit a job

Create a file, for example called submit.cmd:

  #!/bin/bash
  # set the number of nodes
  #SBATCH --nodes=1
  # set the number of CPU cores per node
  #SBATCH --ntasks-per-node 72
  # How much memory is needed (per node). Possible units: K, G, M, T
  #SBATCH --mem=64G
  # set a partition
  #SBATCH --partition normal
  # set max wallclock time
  #SBATCH --time=24:00:00
  # set name of job
  #SBATCH --job-name=test123
  # mail alert at start, end and abortion of execution
  #SBATCH --mail-type=ALL
  # set an output file
  #SBATCH --output output.dat
  # send mail to this address
  #SBATCH --mail-user=your_account@uni-muenster.de
  # run the application
  ./program

You can send your submission to the batch system with the command "sbatch submit.cmd". It is recommended to reserve complete nodes if you can use 72 threads. A detailed description can be found here: http://slurm.schedmd.com/sbatch.html

Starting jobs with MPI-parallel codes

mpirun will get all necessary information from SLURM if the job is submitted appropriately. If you want to start, for example, 144 MPI ranks distributed over two nodes, you could do it the following way:

  #!/bin/bash
  # set the number of nodes
  #SBATCH --nodes=2
  # reserve the complete nodes
  #SBATCH --exclusive
  # How much memory is needed (per node). Possible units: K, G, M, T
  #SBATCH --mem=64G
  # set a partition
  #SBATCH --partition normal
  # set max wallclock time
  #SBATCH --time=2-00:00:00
  # set name of job
  #SBATCH --job-name=test123
  # mail alert at start, end and abortion of execution
  #SBATCH --mail-type=ALL
  # set an output file
  #SBATCH --output output.dat
  # send mail to this address
  #SBATCH --mail-user=your_account@uni-muenster.de
  # run the application
  mpirun ./program

Some codes do not profit from hyperthreading, so it is better to start only 36 processes per node:

  #!/bin/bash
  # set the number of nodes
  #SBATCH --nodes=2
  # reserve the complete nodes
  #SBATCH --exclusive
  # start 36 MPI ranks per node
  #SBATCH --ntasks-per-node=36
  # How much memory is needed (per node). Possible units: K, G, M, T
  #SBATCH --mem=64G
  # set a partition
  #SBATCH --partition normal
  # set max wallclock time
  #SBATCH --time=2-00:00:00
  # set name of job
  #SBATCH --job-name=test123
  # mail alert at start, end and abortion of execution
  #SBATCH --mail-type=ALL
  # set an output file
  #SBATCH --output output.dat
  # send mail to this address
  #SBATCH --mail-user=your_account@uni-muenster.de
  # run the application
  mpirun ./program

For starting hybrid jobs (jobs that use MPI and OpenMP parallelization at the same time), you can use the --cpus-per-task switch:

  srun -p normal --nodes=2 --ntasks=72 --ntasks-per-node=36 --cpus-per-task=2 --pty bash
  OMP_NUM_THREADS=2 mpirun ./program

Using the GPU nodes

If you want to use a GPU for your computations, you have to request it in your batch script (see the sketch below).
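A hedged sketch of such a request, assuming the standard SLURM generic-resource (GRES) syntax and the GPU partition names mentioned in the Caffe section (gputitanxp, gpuv100, gpuk20); the exact options on PALMA II may differ:

  #!/bin/bash
  # request one of the GPU partitions (name taken from the Caffe section below)
  #SBATCH --partition gpuv100
  # request one GPU per node (standard SLURM GRES syntax; assumed for PALMA II)
  #SBATCH --gres=gpu:1
  #SBATCH --nodes=1
  #SBATCH --time=24:00:00
  # load a CUDA-enabled toolchain (see the Caffe section below)
  ml fosscuda/2018b
  # run the application
  ./program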
Using Caffe

Caffe 1.0 is available for Python 3 on the GPU partitions in the fosscuda/2018b toolchain. To use it, you have to load fosscuda/2018b and Caffe (ml fosscuda/2018b Caffe) and export the Caffe PYTHONPATH (the paths for the different node types are listed below).
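For illustration, the relevant fragment of a job script on the Skylake GPU nodes might look like the following; the script name my_caffe_script.py is hypothetical, and the PYTHONPATH is the Skylake path listed below:

  # load the CUDA toolchain and Caffe
  ml fosscuda/2018b Caffe
  # make the Caffe Python bindings visible to Python 3 (Skylake path, see below)
  export PYTHONPATH=/Applic.HPC/skylakegpu/software/MPI/GCC-CUDA/7.3.0-2.30-9.2.88/OpenMPI/3.1.1/Caffe/1.0-Python-3.6.6/python:$PYTHONPATH
  # run your own script (hypothetical name)
  python3 my_caffe_script.py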
On Skylake nodes (gputitanxp and gpuv100 partitions)
  PYTHONPATH=/Applic.HPC/skylakegpu/software/MPI/GCC-CUDA/7.3.0-2.30-9.2.88/OpenMPI/3.1.1/Caffe/1.0-Python-3.6.6/python:$PYTHONPATH

On Broadwell nodes (gpuk20 partition)
  PYTHONPATH=/Applic.HPC/k20gpu/software/MPI/GCC-CUDA/7.3.0-2.30-9.2.88/OpenMPI/3.1.1/Caffe/1.0-Python-3.6.6/python:$PYTHONPATH

Show information about the partitions

  scontrol show partition

Show information about the nodes

  sinfo

Running interactive jobs with SLURM

Use for example the following command:

  srun --partition express --nodes 1 --ntasks-per-node=8 --pty bash

This starts a job in the express partition on one node with eight cores.

Information on jobs

List all current jobs for a user:

  squeue -u <username>

List all running jobs for a user:

  squeue -u <username> -t RUNNING

List all pending jobs for a user:

  squeue -u <username> -t PENDING

List all current jobs in the normal partition for a user:

  squeue -u <username> -p normal

List detailed information for a job (useful for troubleshooting):

  scontrol show job -dd <jobid>

Once your job has completed, you can get additional information that was not available during the run, such as run time and memory used. To get statistics on completed jobs by job ID:

  sacct -j <jobid> --format=JobID,JobName,MaxRSS,Elapsed

To view the same information for all jobs of a user:

  sacct -u <username> --format=JobID,JobName,MaxRSS,Elapsed

Show priorities for waiting jobs:

  sprio
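As a convenience, a hedged sketch of a formatted squeue call for your own jobs; the format string is only an example built from standard squeue placeholders:

  # job ID, name, state, elapsed time, node count and reason/node list for your own jobs
  squeue -u $USER -o "%.10i %.20j %.8T %.10M %.6D %R"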
Controlling jobs

To cancel one job:

  scancel <jobid>

To cancel all the jobs for a user:

  scancel -u <username>

To cancel all the pending jobs for a user:

  scancel -t PENDING -u <username>

To cancel one or more jobs by name:

  scancel --name myJobName

To pause a particular job:

  scontrol hold <jobid>

To resume a particular job:

  scontrol resume <jobid>

To requeue (cancel and rerun) a particular job:

  scontrol requeue <jobid>

Visualization

For the visualization of bigger data sets it is impractical to copy them to your local machine, so we offer a way to do the postprocessing on Palma II. Since the CPUs are quite fast, the rendering is done in software. A helper script, vnc.sh, is provided for this.