#CUDA & Immich ML & WSL

grand kettle
#

I'm trying to run the Immich ML service on my desktop so I can offload some work from my server. The desktop has an NVIDIA GTX 1070, so I think I should be able to use CUDA.

This is the error I'm seeing:

2024-12-06 12:09:45 2024-12-06 12:09:45.348032539 [E:onnxruntime:Default, cuda_call.cc:118 CudaCall] CUDA failure 500: named symbol not found ; GPU=-1 ; hostname=8d937cac5a03 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider_info.cc ; line=66 ; expr=cudaGetDeviceCount(&num_devices); 
2024-12-06 12:09:45 *************** EP Error ***************
2024-12-06 12:09:45 EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider_info.cc:59 static onnxruntime::CUDAExecutionProviderInfo onnxruntime::CUDAExecutionProviderInfo::FromProviderOptions(const onnxruntime::ProviderOptions&) [ONNXRuntimeError] : 1 : FAIL : provider_options_utils.h:151 Parse Failed to parse provider option "device_id": CUDA failure 500: named symbol not found ; GPU=-1 ; hostname=8d937cac5a03 ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider_info.cc ; line=66 ; expr=cudaGetDeviceCount(&num_devices); 
2024-12-06 12:09:45  when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
2024-12-06 12:09:45 Falling back to ['CPUExecutionProvider'] and retrying.
2024-12-06 12:09:45 ****************************************
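
For anyone comparing a similar log, the interesting part is the `CUDA failure <code>: <message> ; key=value ; ...` text. Here is a minimal sketch that pulls those pieces out of one line. The function name `parse_cuda_failure` and the regular expression are illustrative assumptions based only on the log format shown above, not anything provided by ONNX Runtime or Immich.

```python
import re
from typing import Optional

# Assumed layout, taken from the log line above:
#   CUDA failure <code>: <message> ; key=value ; key=value ; ...
CUDA_FAILURE = re.compile(
    r"CUDA failure (?P<code>\d+): (?P<message>[^;]+?) ;(?P<fields>.*)"
)


def parse_cuda_failure(line: str) -> Optional[dict]:
    """Return the error code, message, and key=value fields, or None if absent."""
    match = CUDA_FAILURE.search(line)
    if match is None:
        return None
    fields = {}
    for part in match.group("fields").split(";"):
        key, sep, value = part.strip().partition("=")
        if sep:  # skip empty trailing segments
            fields[key] = value
    return {
        "code": int(match.group("code")),
        "message": match.group("message").strip(),
        "fields": fields,
    }


# Sample copied from the error in this thread.
sample = (
    "[E:onnxruntime:Default, cuda_call.cc:118 CudaCall] CUDA failure 500: "
    "named symbol not found ; GPU=-1 ; hostname=8d937cac5a03 ; "
    "file=/onnxruntime_src/onnxruntime/core/providers/cuda/"
    "cuda_execution_provider_info.cc ; line=66 ; "
    "expr=cudaGetDeviceCount(&num_devices); "
)
info = parse_cuda_failure(sample)
print(info["code"], info["message"], info["fields"]["expr"])
```

The failing expression is `cudaGetDeviceCount`, so the CUDA runtime could not enumerate any device at all, which is why `GPU=-1` is reported and the service falls back to the CPU provider.
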
timid barnBOT
#

:wave: Hey @grand kettle,

Thanks for reaching out to us. Please carefully read this message and follow the recommended actions. This will help us be more effective in our support effort and leave more time for building Immich.

Checklist

I have...

  1. :ballot_box_with_check: verified I'm on the latest release (note that mobile app releases may take some time).
  2. :ballot_box_with_check: read applicable release notes.
  3. :ballot_box_with_check: reviewed the FAQs for known issues.
  4. :ballot_box_with_check: reviewed GitHub for known issues.
  5. :ballot_box_with_check: tried accessing Immich via local IP (without a custom reverse proxy).
  6. :ballot_box_with_check: uploaded the relevant information (see below).
  7. :ballot_box_with_check: tried an incognito window, disabled extensions, cleared mobile app cache, logged out and back in, different browsers, etc. as applicable.

(an item can be marked as "complete" by reacting with the appropriate number)

Information

In order to be able to effectively help you, we need you to provide clear information to show what the problem is. The exact details needed vary per case, but here is a list of things to consider:

  • Your docker-compose.yml and .env files.
  • Logs from all the containers and their status (see above).
  • All the troubleshooting steps you've tried so far.
  • Any recent changes you've made to Immich or your system.
  • Details about your system (both software/OS and hardware).
  • Details about your storage (filesystems, type of disks, output of commands like fdisk -l and df -h).
  • The version of the Immich server, mobile app, and other relevant pieces.
  • Any other information that you think might be relevant.

Please paste files and logs with proper code formatting, and especially avoid blurry screenshots.
Without the right information we can't work out what the problem is. Help us help you ;)

If this ticket can be closed you can use the /close command, and re-open it later if needed.

grand kettle
#

This is my docker compose:

name: immich

services:
  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-accele>
      file: hwaccel.ml.yml
      service: cuda # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` vers>
    ports:
      - 3003:3003
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
    healthcheck:
      disable: false

volumes:
  model-cache:

I believe I have met all these requirements:

The GPU must have compute capability 5.2 or greater.
The server must have the official NVIDIA driver installed.
The installed driver must be >= 535 (it must support CUDA 12.2).
On Linux (except for WSL2), you also need to have NVIDIA Container Toolkit installed.
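
As a rough sanity check of the two numeric requirements, here is a small sketch. The thresholds come from the list above. The function names are made up for illustration, and the compute capability of 6.1 for the GTX 1070 is NVIDIA's published figure, not something stated in this thread. The driver version 566.36 is the one reported by nvidia-smi later in this thread.

```python
# Thresholds taken from the requirement list above.
MIN_COMPUTE_CAPABILITY = (5, 2)
MIN_DRIVER_MAJOR = 535


def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '566.36' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_cuda_requirements(compute_capability: str, driver_version: str) -> bool:
    """Check compute capability >= 5.2 and driver major version >= 535."""
    cc_ok = parse_version(compute_capability) >= MIN_COMPUTE_CAPABILITY
    driver_ok = parse_version(driver_version)[0] >= MIN_DRIVER_MAJOR
    return cc_ok and driver_ok


# Assumed values: GTX 1070 compute capability 6.1, driver 566.36.
print(meets_cuda_requirements("6.1", "566.36"))
```

This only covers the version numbers. It says nothing about whether the container can actually reach the GPU, which is a separate question.
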
#

Running nvidia-smi inside the container gives me this:

# nvidia-smi
Fri Dec  6 11:28:18 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.77.01              Driver Version: 566.36         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1070        On  |   00000000:01:00.0  On |                  N/A |
| 22%   49C    P5             20W /  180W |    1848MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
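
If it helps anyone scripting this check, the header line of that output can be read with a short sketch like the one below. The regular expression and the name `parse_smi_header` are assumptions based only on the table layout shown above, which may differ between nvidia-smi versions.

```python
import re

# Assumed header layout, copied from the nvidia-smi output above.
HEADER = re.compile(
    r"NVIDIA-SMI\s+(?P<smi>[\d.]+)\s+"
    r"Driver Version:\s+(?P<driver>[\d.]+)\s+"
    r"CUDA Version:\s+(?P<cuda>[\d.]+)"
)


def parse_smi_header(output: str) -> dict:
    """Extract the nvidia-smi, driver, and CUDA versions from the header line."""
    match = HEADER.search(output)
    if match is None:
        raise ValueError("no nvidia-smi header line found")
    return match.groupdict()


sample = (
    "| NVIDIA-SMI 565.77.01              Driver Version: 566.36"
    "         CUDA Version: 12.7     |"
)
versions = parse_smi_header(sample)
print(versions)
```

Note that in this thread nvidia-smi listed the GPU while `cudaGetDeviceCount` still failed, so a clean header here does not by itself show that the CUDA runtime inside the container is working.
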
grand kettle
#

Looks like a reboot fixed this - sorry!