# GPU

## Available GPUs
Several GPUs are available on the cluster:
GPU type | Node | RAM | Power consumption |
---|---|---|---|
Nvidia A30 | pbil-clouda30 | 24 GB | 165 W |
Nvidia A40 | pbil-clouda40 | 46 GB | 300 W |
Nvidia Titan X | pbil-deb33 | 12 GB | 250 W |
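
To check which GPUs Slurm actually exposes on each node, a generic `sinfo` query along these lines should work (the format string uses standard Slurm fields; the exact GRES names may differ):

```bash
# List node names together with their generic resources (GRES), e.g. gpu:a30:1
sinfo -o "%N %G"
```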
## Requesting a GPU with SLURM
You can request a compute node with any GPU by using the `--gpus` flag:
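```bash
# Request an interactive session with any available GPU
sinter --gpus=1
# Launch a slurm job with any available GPU
sbatch --gpus=1 script.sh
```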
If you want to request a specific GPU, you can use the `--gres` flag:
```bash
# Request an interactive session with an A30 GPU
sinter --gres=gpu:a30
# Launch a slurm job with an A40 GPU
sbatch --gres=gpu:a40 script.sh
```
You can create more complex requests with the `--constraint` flag:
```bash
# Request an interactive session with either an A30 or an A40 GPU
sinter --constraint="[a30|a40]" --gpus=1
# Launch a slurm job with either an A30 or an A40 GPU
sbatch --constraint="[a30|a40]" --gpus=1 script.sh
```
## Advanced usage

### Launching a job on two GPU nodes in parallel
```bash
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu:1
#SBATCH --output=nvidia-smi-all.out

# Run one task on each of the two allocated nodes, in parallel
srun --nodes=1 --ntasks=1 bash -c 'hostname; nvidia-smi' &
srun --nodes=1 --ntasks=1 bash -c 'hostname; nvidia-smi' &
wait
```
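
Assuming the script above is saved as `two_gpu_nodes.sh` (file name hypothetical), it is submitted like any other batch job; both `hostname`/`nvidia-smi` outputs end up in `nvidia-smi-all.out`:

```bash
# Submit the two-node job (the script name is just an example)
sbatch two_gpu_nodes.sh
# After the job finishes, the output should contain two different hostnames
cat nvidia-smi-all.out
```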
### Launching two jobs in parallel on the same GPU

It is possible to split a GPU by using MPS (NVIDIA's Multi-Process Service). The following Slurm script launches two tasks in parallel on one GPU by splitting it in two.
```bash
#!/bin/bash
#SBATCH --constraint=a30
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=10M
#SBATCH --gres=mps:100

# Give each of the two steps half of the MPS shares (50 out of 100)
for i in {1..2}; do
    srun --exclusive --nodes=1 --ntasks=1 --gres=mps:50 --output="out${i}.txt" bash -c 'sleep 5; hostname; nvidia-smi' &
done
wait
```
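
Each step writes to its own file; if the GPU was shared correctly through MPS, both should report the same hostname and the same GPU:

```bash
# Inspect both steps' outputs once the job has completed
cat out1.txt out2.txt
```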
**Warning:** It is not possible to use both the `--gres=gpu` and `--gres=mps` flags at the same time. When using MPS, you have to use `--nodelist` or `--constraint` to select the node with the GPU you want.
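
In practice this means selecting the node through a feature (or node name) and requesting only `mps` as a GRES, as the script above does. A minimal sketch:

```bash
# OK: pin the job to an A30 node via a constraint and request MPS shares only
sbatch --constraint=a30 --gres=mps:100 script.sh

# Not allowed: requesting gpu and mps GRES in the same job
# sbatch --gres=gpu:1,mps:100 script.sh
```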