#Debian 12 Nvidia Drivers and Cuda Installation Fails


magic comet
#

I am running a Debian 12 system. About two months ago, I successfully installed the NVIDIA drivers with CUDA support. About a week later, my computer abruptly powered off during use, and after that I could not boot to a login screen. Instead, I got a white "Something went wrong" screen on boot.

I had to purge NVIDIA and CUDA to get the system working again. I recently tried again, once following the Debian guide and once following the official NVIDIA guide. Both attempts led to the same issue.

Can anyone help me work out why this isn't working? Secure Boot is not enabled in the BIOS. When I boot into safe mode after an installation attempt, I have to run nvidia-smi with root permissions, in case that matters. Root login is disabled on this machine.

#
$ lspci | grep -e VGA
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1080] (rev a1)
$ inxi -Fxxxz
System:
  Kernel: 6.1.0-21-amd64 arch: x86_64 bits: 64 compiler: gcc v: 12.2.0
    Desktop: GNOME v: 43.9 tk: GTK v: 3.24.38 wm: gnome-shell dm: GDM3 v: 43.0
    Distro: Debian GNU/Linux 12 (bookworm)
CPU:
  Info: quad core model: Intel Core i5-6400 bits: 64 type: MCP
    smt: <unsupported> arch: Skylake-S rev: 3 cache: L1: 256 KiB L2: 1024 KiB
    L3: 6 MiB
  Speed (MHz): avg: 2175 high: 2944 min/max: 800/3300 cores: 1: 2535 2: 2944
    3: 2423 4: 800 bogomips: 21599
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Graphics:
  Device-1: NVIDIA GP104 [GeForce GTX 1080] driver: nouveau v: kernel
    arch: Pascal pcie: speed: 2.5 GT/s lanes: 16 ports: active: DP-1,HDMI-A-1
    empty: DP-2,DP-3,DVI-D-1 bus-ID: 01:00.0 chip-ID: 10de:1b80 class-ID: 0300
    temp: 43.0 C
$ sudo cat /etc/apt/sources.list
deb http://deb.debian.org/debian bookworm main contrib non-free non-free-firmware
deb-src http://deb.debian.org/debian bookworm main contrib non-free non-free-firmware
deb http://deb.debian.org/debian bookworm-updates main contrib non-free non-free-firmware
deb-src http://deb.debian.org/debian bookworm-updates main contrib non-free non-free-firmware

deb http://security.debian.org/debian-security bookworm-security main
deb-src http://security.debian.org/debian-security bookworm-security main

Installation Steps

$ wget http://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
$ sudo apt install /tmp/nvidia/cuda-keyring_1.1-1_all.deb
$ sudo apt install cuda-toolkit-12-3
$ sudo apt install nvidia-driver firmware-misc-nonfree

Adding CUDA to PATH

$ sudo nano /etc/profile.d/cuda-12.3.sh
export CUDA_VERSION="12.3"
export CUDA_HOME="/usr/local/cuda-${CUDA_VERSION}"
export PATH="${CUDA_HOME}/bin${PATH:+:${PATH}}"
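
For anyone unsure what the `${PATH:+:${PATH}}` part of that last line does, here is a minimal sketch. It appends a `:` separator plus the old value only when the old value is non-empty, so an empty PATH does not end up with a trailing colon. The helper name `prepend_path` is made up for illustration and is not part of the profile script.

```shell
# Hypothetical helper that mirrors the expansion used in the profile script.
# $1 = directory to prepend, $2 = existing PATH-like value (may be empty)
prepend_path() {
    dir="$1"
    existing="$2"
    # ${existing:+:${existing}} expands to ":<existing>" only if existing is non-empty
    printf '%s\n' "${dir}${existing:+:${existing}}"
}

prepend_path /usr/local/cuda-12.3/bin "/usr/bin:/bin"   # prints /usr/local/cuda-12.3/bin:/usr/bin:/bin
prepend_path /usr/local/cuda-12.3/bin ""                # prints /usr/local/cuda-12.3/bin
```

Since files in /etc/profile.d are read by login shells, the new PATH would normally only show up after logging out and back in.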
#

Ensuring GDM3 Uses X11 instead of Wayland

$ cat /etc/gdm3/daemon.conf
# GDM configuration storage
#
# See /usr/share/gdm/gdm.schemas for a list of available options.

[daemon]
# Uncomment the line below to force the login screen to use Xorg
 WaylandEnable=false

# Enabling automatic login
#  AutomaticLoginEnable = true
#  AutomaticLogin = user1

# Enabling timed login
#  TimedLoginEnable = true
#  TimedLogin = user1
#  TimedLoginDelay = 10

[security]

[xdmcp]

[chooser]

[debug]
# Uncomment the line below to turn on debugging
# More verbose logs
# Additionally lets the X server dump core if it crashes
#Enable=true
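
A quick way to double-check that the setting above is actually active (uncommented) might look like the sketch below. The function name `wayland_disabled` is made up for illustration, and the check only looks at the file contents, not at what GDM is doing at runtime.

```shell
# Hypothetical check: succeeds if the file contains an uncommented WaylandEnable=false line.
# $1 = path to a GDM daemon.conf
wayland_disabled() {
    grep -Eq '^[[:space:]]*WaylandEnable[[:space:]]*=[[:space:]]*false[[:space:]]*$' "$1"
}

conf=/etc/gdm3/daemon.conf
if [ -r "$conf" ] && wayland_disabled "$conf"; then
    echo "GDM is configured to use Xorg"
else
    echo "WaylandEnable=false not found (or $conf is unreadable)"
fi

# Inside a running desktop session this variable is usually "x11" or "wayland".
echo "XDG_SESSION_TYPE=${XDG_SESSION_TYPE:-unset}"
```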

Ensuring nouveau is Disabled

$ sudo cat /etc/modprobe.d/nvidia-blacklists-nouveau.conf
# You need to run "update-initramfs -u" after editing this file.

# see #580894
blacklist nouveau
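
The blacklist file only matters at boot, so it can help to check both the file and the currently loaded modules. The following is a rough sketch of such a check; `is_blacklisted` is a made-up helper name, and the script assumes /proc/modules is available, as it is on a normal Linux system.

```shell
# Hypothetical helper: succeeds if the config file blacklists the given module.
# $1 = module name, $2 = modprobe config file
is_blacklisted() {
    grep -Eq "^[[:space:]]*blacklist[[:space:]]+$1[[:space:]]*$" "$2"
}

conf=/etc/modprobe.d/nvidia-blacklists-nouveau.conf
if [ -r "$conf" ] && is_blacklisted nouveau "$conf"; then
    echo "nouveau is blacklisted in $conf"
else
    echo "no blacklist entry for nouveau found in $conf"
fi

# Check whether the module is loaded in the running kernel right now.
if [ -r /proc/modules ] && grep -q '^nouveau ' /proc/modules; then
    echo "nouveau is currently loaded"
else
    echo "nouveau is not currently loaded"
fi
```

If the file is correct but nouveau still shows as loaded, the comment in the file itself points at the likely next step: run update-initramfs -u and reboot.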
magic comet
#

This might be related to Xorg. I am troubleshooting that now, since I can log in through recovery mode and run nvidia-smi, and the drivers are detected properly.
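
As a starting point for the Xorg angle, a sketch like the one below pulls the error lines out of the Xorg logs. The two log paths are assumptions: a system-wide Xorg typically logs to /var/log/Xorg.0.log, while a rootless one typically logs under the user's home directory, and on some setups the output only goes to the journal. `xorg_errors` is a made-up helper name.

```shell
# Hypothetical helper: print timestamped lines that Xorg flagged as errors "(EE)".
# The pattern requires a leading "[ timestamp ]" so the marker legend at the top
# of the log is not reported as an error.
# $1 = path to an Xorg log file
xorg_errors() {
    grep -E '^\[[^]]*\] \(EE\)' "$1"
}

# Assumed log locations; either may be absent on a given system.
for log in /var/log/Xorg.0.log "$HOME/.local/share/xorg/Xorg.0.log"; do
    if [ -r "$log" ]; then
        echo "== $log =="
        xorg_errors "$log" || echo "no (EE) lines"
    fi
done
```

If neither file exists, `journalctl -b -p err` shows error-level messages from the current boot, which may include the display manager's.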
