# RuntimeError: CUDA device-side assert triggered (Unsloth GRPO, Multi-GPU, Qwen 2.5 14B Training)

fresh iron

I'm trying to fine-tune Qwen 2.5 14B with Unsloth + GRPO on two GPUs, since a single GPU doesn't have enough VRAM. I'm using the default Colab/Jupyter training script and modified only a few parts to fit my needs. However, training fails with the following CUDA device-side assert error:

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with 'TORCH_USE_CUDA_DSA' to enable device-side assertions.

I've attached the notebook script for you to see what I changed and how I fixed it.
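
The error text itself suggests rerunning with CUDA_LAUNCH_BLOCKING=1, which makes kernel launches synchronous so the stack trace points at the call that actually failed. Below is a minimal sketch of setting it from inside a notebook. The helper name `enable_blocking_cuda_launches` is made up for illustration and is not part of Unsloth or PyTorch. The variable has to be set before CUDA is initialized, so the safest place is the first cell, before `torch` or `unsloth` is imported.

```python
import os


def enable_blocking_cuda_launches() -> None:
    """Hypothetical helper: make CUDA kernel launches synchronous.

    Must run before CUDA is initialized, i.e. before importing
    torch / unsloth in a fresh kernel.
    """
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


enable_blocking_cuda_launches()

# Import torch / unsloth only after this point, e.g.:
# import torch
print("CUDA_LAUNCH_BLOCKING =", os.environ["CUDA_LAUNCH_BLOCKING"])
```

When launching a script from a shell, the same effect comes from prefixing the command, for example `CUDA_LAUNCH_BLOCKING=1 python train.py` (the script name is a placeholder). Training will run slower in this mode, so it is worth removing once the failing call is identified.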