#Add GPU acceleration for qmd


dark void
#

I am using the openclaw:latest Docker image and loading qmd. The latest version now complains that I don't have GPU acceleration, so I would like to connect my NVIDIA GPU. I have tried just attaching the nvidia runtime, but that doesn't work. Is there a good way to add this to the base openclaw image?

mental vectorBOT
dull smeltBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

mental vectorBOT
sharp kiteBOT
#

Yep, but the clean way is a derived image, not patching the running openclaw:latest container.

What’s going on:

  • The shipped OpenClaw image is a generic Debian/Node runtime, not a CUDA image
  • QMD runs via node-llama-cpp
  • For NVIDIA GPU use, --runtime=nvidia alone is usually not enough. You need all 3:
    1. NVIDIA Container Toolkit on the host
    2. CUDA userspace libs inside the image
    3. Run the container with GPU access (--gpus all / compose equivalent)
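A minimal sketch for checking points 1 and 3 before touching the image: run nvidia-smi in a throwaway container. The function name gpu_smoke_test and the default image debian:bookworm-slim are assumptions for illustration, not anything OpenClaw ships, and the check relies on the NVIDIA Container Toolkit mounting nvidia-smi from the host driver.

```shell
#!/usr/bin/env bash
# Any small image should do here (assumption): the NVIDIA Container Toolkit
# injects nvidia-smi from the host, so the image itself needs no CUDA.
IMAGE="${IMAGE:-debian:bookworm-slim}"

gpu_smoke_test() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker not found on PATH" >&2
    return 127
  fi
  # Fails if the host toolkit is missing or the GPU is not passed through.
  docker run --rm --gpus all "$IMAGE" nvidia-smi
}

# Only run when explicitly requested, e.g. RUN_GPU_CHECK=1 ./gpu-check.sh
if [ "${RUN_GPU_CHECK:-0}" = "1" ]; then
  gpu_smoke_test
fi
```

If this fails, the problem is on the host side and no change to the derived image will fix it.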

OpenClaw’s Docker docs also recommend baking required binaries into the image at build time, not installing them in a running container.

So the pattern I’d use is:

FROM ghcr.io/openclaw/openclaw:latest
USER root

# install CUDA runtime libs for your distro/version
# install build deps node-llama-cpp may need
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates curl git cmake clang libgomp1 \
 && npm install -g @tobilu/qmd \
 && rm -rf /var/lib/apt/lists/*

ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility

USER node

But the important missing piece is the CUDA libs layer. Without that, QMD usually falls back to CPU even if the container can “see” the GPU.

Two quick asks so I can make this exact instead of hand-wavy:

  1. paste openclaw --version from inside the container
  2. paste your current Dockerfile or docker-compose.yml

If you want, I can turn that into a working CUDA-enabled OpenClaw+QMD image for your setup.

dark void
#

I am using the latest, 2026.4.10, and right now my Dockerfile is just:

`FROM ghcr.io/openclaw/openclaw:latest

USER root

# CUDA build deps
RUN apt-get update && apt-get install -y \
    git \
    build-essential \
    cmake \
    python3 \
 && rm -rf /var/lib/apt/lists/*

# Install QMD
RUN npm install -g @tobilu/qmd

USER node`

sharp kiteBOT
#

Yep, that explains it.

Your current Dockerfile adds build tools, but not the actual CUDA runtime/toolkit bits QMD needs. So qmd installs, but node-llama-cpp still ends up in a CPU-only setup.

The fix is:

  1. Pin the image while debugging

    FROM ghcr.io/openclaw/openclaw:2026.4.10
    
  2. Install CUDA userspace libs/toolkit first
    Your image is Debian bookworm based, so you need the NVIDIA CUDA layer inside the container, not just --runtime=nvidia on the host.

  3. Then install QMD
    After CUDA is present, install QMD, and ideally force the CUDA backend fetch/build:

    RUN npm install -g @tobilu/qmd \
     && npx --yes node-llama-cpp source download --gpu cuda

  4. Run with GPU access
    Use --gpus all (or compose equivalent), and make sure the host has NVIDIA Container Toolkit installed.

  5. Verify before blaming OpenClaw

    npx --yes node-llama-cpp inspect gpu

    If that does not show CUDA, QMD will stay on CPU too.
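For the compose equivalent of --gpus all mentioned in step 4, here is a rough sketch that writes a compose file using Docker Compose's device reservation syntax. The file name docker-compose.gpu.yml, the service name openclaw, and the image tag openclaw-qmd-cuda:local are all placeholders I made up; swap in your own.

```shell
#!/usr/bin/env bash
# Hypothetical file name; override with COMPOSE_FILE=... if you prefer.
COMPOSE_FILE="${COMPOSE_FILE:-docker-compose.gpu.yml}"

# The deploy.resources.reservations.devices block is the compose-side
# counterpart of "docker run --gpus all".
cat > "$COMPOSE_FILE" <<'EOF'
services:
  openclaw:
    image: openclaw-qmd-cuda:local   # hypothetical tag of your derived image
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
EOF

echo "wrote $COMPOSE_FILE"
```

You would then start it with docker compose -f docker-compose.gpu.yml up -d, after adding your usual ports, volumes, and environment to the service.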

So, roughly, your Dockerfile wants to look more like:

FROM ghcr.io/openclaw/openclaw:2026.4.10
USER root

# add CUDA runtime/toolkit packages here
RUN apt-get update && apt-get install -y --no-install-recommends \
    git \
    build-essential \
    cmake \
    python3 \
    clang \
    libgomp1 \
 && rm -rf /var/lib/apt/lists/*

RUN npm install -g @tobilu/qmd \
 && npx --yes node-llama-cpp source download --gpu cuda

ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility

USER node

The missing part is the CUDA package block. That’s the important bit.

If you want, paste your host OS plus whether you’re using plain docker run or docker compose, and I can turn this into a more exact copy-paste Dockerfile.

dark void
#

82.40 -- Could not find nvcc, please set CUDAToolkit_ROOT.
82.40 CMake Error at llama.cpp/ggml/src/ggml-cuda/CMakeLists.txt:258 (message):
82.40 CUDA Toolkit not found
82.40
82.40
82.40 -- Configuring incomplete, errors occurred!
82.40 See also "/app/node_modules/node-llama-cpp/llama/localBuilds/linux-x64-cuda/CMakeFiles/CMakeOutput.log".
82.41 ERROR OMG Process terminated: 1
82.44
82.44 [node-llama-cpp] To resolve errors related to CUDA compilation, see the CUDA guide: https://node-llama-cpp.withcat.ai/guide/CUDA
82.44 ✖ Failed to compile llama.cpp
82.44 Failed to build llama.cpp with CUDA support. Error: SpawnError: Command npm run -s cmake-js-llama -- compile --log-level warn --config Release --arch=x64 --out localBuilds/linux-x64-cuda --runtime-version=24.14.0 --parallel=14 --CDGGML_BUILD_NUMBER=1 --CDCMAKE_CONFIGURATION_TYPES=Release --CDNLC_CURRENT_PLATFORM=linux-x64 --CDNLC_TARGET_PLATFORM=linux-x64 --CDNLC_VARIANT=cuda.b8390 --CDGGML_METAL=OFF --CDGGML_CUDA=1 --CDGGML_CCACHE=OFF --CDLLAMA_CURL=OFF --CDLLAMA_HTTPLIB=OFF --CDLLAMA_BUILD_BORINGSSL=OFF --CDLLAMA_OPENSSL=OFF exited with code 1
82.44 at createError (file:///app/node_modules/node-llama-cpp/dist/utils/spawnCommand.js:34:20)
82.44 at ChildProcess.<anonymous> (file:///app/node_modules/node-llama-cpp/dist/utils/spawnCommand.js:47:24)
82.44 at ChildProcess.emit (node:events:508:28)
82.44 at ChildProcess._handle.onexit (node:internal/child_process:294:12)
82.44 node-llama-cpp source download

sharp kiteBOT
#

Ah, nice, this narrows it down a lot.

That error means you’re in the CUDA build path, and the image does not have the CUDA toolkit installed.

nvidia-container-runtime only gives the container access to the GPU. It does not put nvcc inside the image.

So the missing piece is:

  • full CUDA toolkit inside the container (nvcc)
  • not just runtime access
  • not just build-essential/cmake/python3
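Since nvcc is part of the image and not the runtime, you can check for it without any GPU attached. A small sketch, assuming a search order of $CUDACXX, then PATH, then the conventional /usr/local/cuda/bin/nvcc; the function name find_nvcc is made up for this example.

```shell
#!/usr/bin/env bash
# Print the path of the first usable nvcc, or fail with a message.
# Search order is an assumption for this sketch:
#   1. $CUDACXX  2. nvcc on PATH  3. /usr/local/cuda/bin/nvcc
find_nvcc() {
  local candidate
  for candidate in \
    "${CUDACXX:-}" \
    "$(command -v nvcc 2>/dev/null || true)" \
    /usr/local/cuda/bin/nvcc
  do
    if [ -n "$candidate" ] && [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "nvcc not found: the CUDA toolkit is missing from this image" >&2
  return 1
}
```

Calling find_nvcc inside the container (or in a RUN step at the end of the Dockerfile) should print a path once the toolkit is installed. If it fails, the CMake error above is expected.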

What this means for your Dockerfile

If you want this line to work:

node-llama-cpp source download --gpu cuda

then you need a toolkit package in the image, for example CUDA 12.4+ or 13.1+, and these env vars usually help:

ENV CUDA_PATH=/usr/local/cuda
ENV CUDAToolkit_ROOT=/usr/local/cuda
ENV CUDACXX=/usr/local/cuda/bin/nvcc
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility

Important bit

You currently have:

RUN apt-get update && apt-get install -y \
    git \
    build-essential \
    cmake \
    python3

but you still need the actual CUDA toolkit package from NVIDIA’s repo, for example cuda-toolkit-13-1 or newer.

Also, one subtle thing

If you’re forcing:

node-llama-cpp source download --gpu cuda

that always needs nvcc.

If you’re just testing whether GPU works, try this first after CUDA is installed:

npx --yes node-llama-cpp inspect gpu

If that shows CUDA correctly, QMD is in much better shape already, and you may not need to force a source build right away.

So the short answer

Yes, you can keep FROM ghcr.io/openclaw/openclaw:2026.4.10, but you need to add:

  1. NVIDIA CUDA apt repo
  2. full toolkit install, not just runtime
  3. CUDA_PATH / CUDAToolkit_ROOT / CUDACXX

dark void
#

This is actually Unraid Linux, and yes, can you give me an exact Dockerfile?

sharp kiteBOT
#

Yep, for Unraid I’d try this exact child image first.

FROM ghcr.io/openclaw/openclaw:2026.4.10

USER root
ENV DEBIAN_FRONTEND=noninteractive

# Base build deps + NVIDIA CUDA repo + full CUDA toolkit (includes nvcc)
RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates \
    wget \
    gnupg \
    git \
    build-essential \
    cmake \
    python3 \
    clang \
    libgomp1 \
 && wget -q https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb \
 && dpkg -i cuda-keyring_1.1-1_all.deb \
 && rm -f cuda-keyring_1.1-1_all.deb \
 && apt-get update \
 && apt-get install -y --no-install-recommends \
    cuda-toolkit-12-5 \
 && rm -rf /var/lib/apt/lists/*

ENV CUDA_PATH=/usr/local/cuda
ENV CUDAToolkit_ROOT=/usr/local/cuda
ENV CUDACXX=/usr/local/cuda/bin/nvcc
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility

RUN npm install -g @tobilu/qmd

USER node

A couple of important notes:

  • Do not run node-llama-cpp inspect gpu during docker build, because the GPU usually is not exposed to the build stage.
  • In Unraid, you still need GPU passthrough/runtime enabled for the container itself, otherwise the image builds fine but QMD still runs on CPU.
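To make the build-time versus run-time split concrete, here is a sketch using the plain docker CLI. The tag openclaw-qmd-cuda:local and the container name openclaw-gpu are placeholders, and the run command leaves out whatever ports, volumes, and env vars your OpenClaw setup normally needs.

```shell
#!/usr/bin/env bash
# Hypothetical names for this sketch; change them to match your setup.
IMAGE_TAG="${IMAGE_TAG:-openclaw-qmd-cuda:local}"
CONTAINER_NAME="${CONTAINER_NAME:-openclaw-gpu}"

build_image() {
  # Builds from the Dockerfile in the current directory.
  # No GPU is involved at this stage, only the CUDA toolkit install.
  docker build -t "$IMAGE_TAG" .
}

run_container() {
  # --gpus all is what actually exposes the GPU to the container.
  docker run -d --name "$CONTAINER_NAME" --gpus all "$IMAGE_TAG"
}

# Only act when explicitly requested, e.g. RUN_NOW=1 ./build-and-run.sh
if [ "${RUN_NOW:-0}" = "1" ]; then
  build_image && run_container
fi
```

The GPU checks only make sense after run_container, never inside build_image.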

After the container is up, test with:

docker exec -it <container-name> bash
npx --yes node-llama-cpp inspect gpu
qmd query "test"

If inspect gpu still says no CUDA, then the remaining issue is on the Unraid runtime side, not the Dockerfile.
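If you want to tell the two sides apart quickly, this sketch checks one thing that comes from the Dockerfile (nvcc) and one thing that comes from the runtime (nvidia-smi, which the NVIDIA runtime injects when the utility capability is enabled). The function name diagnose_gpu and the output wording are made up for this example.

```shell
#!/usr/bin/env bash
# Usage: diagnose_gpu <container-name>
diagnose_gpu() {
  local name="$1"

  # nvcc ships with the CUDA toolkit, so this reflects the Dockerfile.
  if docker exec "$name" sh -c \
      'command -v nvcc || test -x /usr/local/cuda/bin/nvcc' >/dev/null 2>&1; then
    echo "image: nvcc present"
  else
    echo "image: nvcc missing (Dockerfile side)"
  fi

  # nvidia-smi -L lists the GPUs the container can see at run time.
  if docker exec "$name" nvidia-smi -L >/dev/null 2>&1; then
    echo "runtime: GPU visible"
  else
    echo "runtime: no GPU visible (container runtime side)"
  fi
}
```

Run it from the host as diagnose_gpu <container-name>. If nvcc is present but no GPU is visible, the Dockerfile is probably fine and the container settings are what need attention.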

If you want, I can also give you the matching Unraid container template/runtime settings next.