#RTX 5090 pods: CUDA init fails (cuInit=999) + missing /dev/nvidia-caps (PyTorch/ComfyUI can’t see GP


wintry depot
#

I’m unable to use CUDA compute on RTX 5090 pods. This reproduces across multiple fresh pods/templates (including pods with no storage/network volume attached), so it doesn’t appear to be a SwarmUI/ComfyUI setup issue.

What happens

nvidia-smi works and shows RTX 5090 (driver 580.xx, “CUDA Version 13.0”)
But CUDA compute fails to initialize:
    /dev/nvidia-caps/* is missing: the ls check prints NO NVIDIA CAPS NODES
    Low-level driver init fails:
        python3 -c "import ctypes; lib=ctypes.CDLL('libcuda.so.1'); print('cuInit=', lib.cuInit(0))" → cuInit=999
    PyTorch then reports CUDA unavailable / “CUDA unknown error”, and ComfyUI exits on startup.

Expected

/dev/nvidia-caps/* present
cuInit returns 0 and PyTorch can use the GPU
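
For reference, a minimal sketch of how the PyTorch side of this expectation could be checked on a pod. The helper name `torch_cuda_report` is made up for illustration; it only uses `torch.cuda.is_available()`, `torch.cuda.device_count()` and `torch.cuda.get_device_name()`, and it assumes nothing about which PyTorch build the template ships.

```python
import importlib.util


def torch_cuda_report():
    """Summarize whether PyTorch can see a CUDA device in this environment."""
    # Avoid an ImportError on images that do not ship PyTorch at all.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    # On the affected pods this is the branch that is hit.
    if not torch.cuda.is_available():
        return "torch sees no CUDA device"
    count = torch.cuda.device_count()
    return f"torch sees {count} device(s): {torch.cuda.get_device_name(0)}"


print(torch_cuda_report())
```

On a healthy pod this would be expected to report at least one device; on the pods described here it should fall into the "no CUDA device" branch.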

Scope

Seen on multiple RTX 5090 pods (different templates/images), including without storage attached.
(Add datacenter IDs you tested here, if known.)

Request

Can someone from Runpod staff confirm whether there’s an issue with the RTX 5090 pool/container runtime passthrough (capability device nodes not being mounted), and advise what datacenter/pool is currently healthy or when it will be fixed?

Commands used

ls -l /dev/nvidia-caps/* 2>/dev/null || echo "NO NVIDIA CAPS NODES"
nvidia-smi
python3 -c "import ctypes; lib=ctypes.CDLL('libcuda.so.1'); print('cuInit=', lib.cuInit(0))"
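
For anyone who wants to run the device-node and cuInit checks in one go, here is a rough Python sketch. The function names (`caps_nodes`, `cu_init`, `describe`) are hypothetical and only for illustration; the real interfaces it relies on are the `/dev/nvidia-caps` directory, `libcuda.so.1` and the driver API call `cuInit`. In the CUDA driver API, 0 is `CUDA_SUCCESS` and 999 is `CUDA_ERROR_UNKNOWN`.

```python
import ctypes
import glob

# Only the two result codes relevant to this thread; others are reported by number.
CU_RESULT_NAMES = {0: "CUDA_SUCCESS", 999: "CUDA_ERROR_UNKNOWN"}


def caps_nodes(pattern="/dev/nvidia-caps/*"):
    """Return the sorted capability device nodes visible in this container."""
    return sorted(glob.glob(pattern))


def cu_init(libname="libcuda.so.1"):
    """Call cuInit(0); return its result code, or None if the library cannot be loaded."""
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None
    lib.cuInit.argtypes = [ctypes.c_uint]
    lib.cuInit.restype = ctypes.c_int
    return lib.cuInit(0)


def describe(code):
    """Turn a cuInit result code into a readable label."""
    if code is None:
        return "libcuda not loadable"
    return CU_RESULT_NAMES.get(code, f"CUresult {code}")


if __name__ == "__main__":
    nodes = caps_nodes()
    print("caps nodes:", nodes if nodes else "NO NVIDIA CAPS NODES")
    print("cuInit =", describe(cu_init()))
```

Pasting the output of this alongside `nvidia-smi` into a ticket should cover the same ground as the individual commands above.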
proper obsidianBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

amber marlinBOT
grizzled mesa
#

Can you open a support ticket?

#

I think this is quite similar to what some other users are experiencing with ComfyUI.