I’m unable to use CUDA compute on RTX 5090 pods. This reproduces across multiple fresh pods/templates (including pods with no storage/network volume attached), so it doesn’t appear to be a SwarmUI/ComfyUI setup issue.
What happens
nvidia-smi works and shows RTX 5090 (driver 580.xx, “CUDA Version 13.0”)
But CUDA compute fails to initialize:
/dev/nvidia-caps/* is missing (the ls check under "Commands used" prints NO NVIDIA CAPS NODES)
Low-level driver init fails:
python3 -c "import ctypes; lib=ctypes.CDLL('libcuda.so.1'); print('cuInit=', lib.cuInit(0))" → cuInit=999 (999 is CUDA_ERROR_UNKNOWN in the CUDA driver API)
PyTorch then reports CUDA unavailable / “CUDA unknown error”, and ComfyUI exits on startup.
Expected
/dev/nvidia-caps/* present
cuInit returns 0 and PyTorch can use the GPU
Scope
Seen on multiple RTX 5090 pods (different templates/images), including without storage attached.
(Add datacenter IDs you tested here, if known.)
Request
Can someone from Runpod staff confirm whether there’s an issue with the RTX 5090 pool or the container runtime’s GPU passthrough (capability device nodes not being mounted), and advise which datacenter/pool is currently healthy or when a fix is expected?
Commands used
ls -l /dev/nvidia-caps/* 2>/dev/null || echo "NO NVIDIA CAPS NODES"
nvidia-smi
python3 -c "import ctypes; lib=ctypes.CDLL('libcuda.so.1'); print('cuInit=', lib.cuInit(0))"
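
For anyone who wants to reproduce this on their own pod, here is a rough sketch that bundles the three checks above into one script and prints a report that can be pasted into a ticket. The function names and the small error-code table are my own (hypothetical helpers, not part of any NVIDIA or Runpod tool), and the table only covers a few CUresult values I'm sure of. It assumes Python 3 and, for the cuInit step, that libcuda.so.1 is loadable inside the container.

```python
import ctypes
import glob
import shutil
import subprocess

# A few CUresult values from cuda.h. This is deliberately not a full list;
# any other code is reported as a bare number.
KNOWN_CU_RESULTS = {
    0: "CUDA_SUCCESS",
    100: "CUDA_ERROR_NO_DEVICE",
    999: "CUDA_ERROR_UNKNOWN",
}


def list_caps_nodes(pattern="/dev/nvidia-caps/*"):
    """Return the capability device nodes visible in the container (may be empty)."""
    return sorted(glob.glob(pattern))


def describe_cu_result(code):
    """Map a cuInit return code to a readable name where the code is known."""
    return KNOWN_CU_RESULTS.get(code, f"CUresult {code}")


def try_cu_init(lib_name="libcuda.so.1"):
    """Call cuInit(0) through ctypes. Returns (code, description); code is None if the library fails to load."""
    try:
        lib = ctypes.CDLL(lib_name)
    except OSError as exc:
        return None, f"could not load {lib_name}: {exc}"
    lib.cuInit.argtypes = [ctypes.c_uint]
    lib.cuInit.restype = ctypes.c_int
    code = lib.cuInit(0)
    return code, describe_cu_result(code)


def nvidia_smi_summary():
    """Return GPU name and driver version from nvidia-smi, or a short error string."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return "nvidia-smi not found on PATH"
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"nvidia-smi failed to run: {exc}"
    if result.returncode != 0:
        return f"nvidia-smi exited with {result.returncode}: {result.stderr.strip()}"
    return result.stdout.strip()


if __name__ == "__main__":
    nodes = list_caps_nodes()
    code, description = try_cu_init()
    print("nvidia-smi:", nvidia_smi_summary())
    print("caps nodes:", ", ".join(nodes) if nodes else "NO NVIDIA CAPS NODES")
    print("cuInit:", description if code is None else f"{code} ({description})")
```

On an affected pod I would expect the last two lines to show the missing caps nodes and cuInit 999; on a healthy one, cuInit should come back as 0.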