GPU

Available GPUs

Several GPUs are available on the cluster:

GPU type         Node            GPU memory   Power consumption
Nvidia A30       pbil-clouda30   24 GB        165 W
Nvidia A40       pbil-clouda40   46 GB        300 W
Nvidia Titan X   pbil-deb33      12 GB        250 W

Requesting a GPU with SLURM

You can request a compute node with any GPU by using the --gpus flag:

# Request an interactive session
sinter --gpus 1

# Launch a slurm job
sbatch --gpus=1 script.sh
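
As an illustration, script.sh could start by reporting what the job actually received. The sketch below is only an assumption about what such a script might contain: the function name report_gpus is hypothetical, and it relies on Slurm exporting CUDA_VISIBLE_DEVICES for GPU allocations, which is the usual behaviour but depends on the cluster configuration.

```shell
#!/bin/bash
# Hypothetical contents for script.sh: report what the job was given.

# Print the node name and the GPUs visible to this job.
report_gpus() {
    echo "node: $(hostname)"
    # Slurm normally sets CUDA_VISIBLE_DEVICES for GPU allocations (assumption).
    echo "CUDA_VISIBLE_DEVICES: ${CUDA_VISIBLE_DEVICES:-unset}"
    if command -v nvidia-smi >/dev/null 2>&1; then
        # One line per visible GPU: model name and total memory.
        nvidia-smi --query-gpu=name,memory.total --format=csv,noheader \
            || echo "nvidia-smi could not query the GPU"
    else
        echo "nvidia-smi not found (not on a GPU node?)"
    fi
}

report_gpus
```

Running the same function inside a sinter session is a quick way to check which GPU model was allocated before starting a long computation.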

If you want to request a specific GPU type, use the --gres flag:

# Request an interactive session with an A30 GPU
sinter --gres=gpu:a30

# Launch a slurm job with an A40 GPU
sbatch --gres=gpu:a40 script.sh

You can create more complex requests with the --constraint flag:

# Request an interactive session with either an a30 or a40 GPU
sinter --constraint="[a30|a40]" --gpus=1

# Launch a slurm job with either an a30 or a40 GPU
sbatch --constraint="[a30|a40]" --gpus=1 script.sh
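
When the list of acceptable GPU types comes from a variable or a wrapper script, the constraint string can be assembled rather than typed by hand. The helper below is a minimal sketch: the name gpu_constraint is hypothetical, and it only reproduces the "[a30|a40]" form used above. It prints the resulting sbatch command instead of submitting it.

```shell
#!/bin/bash
# Hypothetical helper: build a constraint string such as "[a30|a40]"
# from a list of GPU type names.
gpu_constraint() {
    local joined="" t
    for t in "$@"; do
        # Add a "|" separator before every name except the first.
        joined="${joined:+$joined|}$t"
    done
    echo "[$joined]"
}

# Dry run: print the command that would be submitted.
printf 'sbatch --constraint="%s" --gpus=1 script.sh\n' "$(gpu_constraint a30 a40)"
```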

Advanced usage

Launching a job on two GPU nodes in parallel

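#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
# --gres is a per-node request: one GPU on each of the two nodes
#SBATCH --gres=gpu:1
#SBATCH --output=nvidia-smi-all.out

# Start one job step per node in the background; each prints its host name and GPU status
srun --nodes=1 --ntasks=1 bash -c 'hostname; nvidia-smi' &
srun --nodes=1 --ntasks=1 bash -c 'hostname; nvidia-smi' &
# Wait for both background steps to finish before the job exits
wait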

Launching two jobs in parallel on the same GPU

A single GPU can be shared between several processes with MPS (NVIDIA Multi-Process Service). The following Slurm script launches two tasks in parallel on one GPU, giving each task half of it.

#!/bin/bash

#SBATCH --constraint=a30
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=10M
#SBATCH --gres=mps:100

# Start two job steps in the background, each limited to half of the MPS allocation
for i in {1..2}; do
    srun --exclusive --nodes=1 --ntasks=1 --gres=mps:50 --output="out${i}.txt" bash -c 'sleep 5; hostname; nvidia-smi' &
done

wait

Warning

The --gres=gpu and --gres=mps flags cannot be used together. When using MPS, use --nodelist or --constraint to request a specific GPU.
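
The two-task MPS script can be generalised to N tasks by dividing the MPS allocation evenly. The sketch below assumes, as in that script, that the job holds 100 MPS units in total; the function names and the task.sh placeholder are hypothetical. It only prints the srun lines, so the split can be checked before adapting the batch script (where --ntasks-per-node would also have to be set to N).

```shell
#!/bin/bash
# Sketch: split 100 MPS units evenly between N tasks (assumed total, as in the script above).

# mps_share N -> integer share of the 100 MPS units for each task
mps_share() {
    echo $((100 / $1))
}

# Print (do not run) one srun line per task; task.sh is a placeholder name.
print_mps_steps() {
    local n=$1 share i=1
    share=$(mps_share "$n")
    while [ "$i" -le "$n" ]; do
        echo "srun --exclusive --nodes=1 --ntasks=1 --gres=mps:${share} --output=out${i}.txt ./task.sh &"
        i=$((i + 1))
    done
}

print_mps_steps 4
```

Integer division rounds down, so three tasks would get 33 units each and one unit would stay unused.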