#Add GPU acceleration for qmd
Yep, but the clean way is a derived image, not patching the running openclaw:latest container.
What’s going on:
- The shipped OpenClaw image is a generic Debian/Node runtime, not a CUDA image
- QMD runs via node-llama-cpp
- For NVIDIA GPU use, --runtime=nvidia alone is usually not enough. You need all 3:
  - NVIDIA Container Toolkit on the host
  - CUDA userspace libs inside the image
  - Run the container with GPU access (--gpus all / compose equivalent)
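If you want a quick host-side sanity check for the first item, something like this is a rough sketch. The function names are made up for illustration, and it assumes the Docker CLI is on the host and that the toolkit registered a runtime literally named nvidia (common, but not guaranteed; --gpus all can work without it on some setups).

```shell
# Hypothetical helper names; nothing here comes from OpenClaw or QMD.

# Reads Docker's runtime map (JSON) on stdin and succeeds if a runtime
# called "nvidia" appears in it.
has_nvidia_runtime() {
  grep -q '"nvidia"'
}

# Prints a one-line verdict for the host. Assumes `docker` is installed.
check_host_gpu_runtime() {
  if docker info --format '{{json .Runtimes}}' | has_nvidia_runtime; then
    echo "nvidia runtime registered"
  else
    echo "nvidia runtime missing: check NVIDIA Container Toolkit on the host"
    return 1
  fi
}
```

Run check_host_gpu_runtime on the Docker host, not inside the container. A pass only covers the host side; the image still needs the CUDA libs.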
OpenClaw’s Docker docs also recommend baking required binaries into the image at build time, not installing them in a running container:
- https://docs.openclaw.ai/install/docker
- https://docs.openclaw.ai/install/docker-vm-runtime
- https://docs.openclaw.ai/concepts/memory-qmd
So the pattern I’d use is:
FROM ghcr.io/openclaw/openclaw:latest
USER root
# install CUDA runtime libs for your distro/version
# install build deps node-llama-cpp may need
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates curl git cmake clang libgomp1 \
&& npm install -g @tobilu/qmd \
&& rm -rf /var/lib/apt/lists/*
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
USER node
But the important missing piece is the CUDA libs layer. Without that, QMD usually falls back to CPU even if the container can “see” the GPU.
Also useful upstream refs:
Two quick asks so I can make this exact instead of hand-wavy:
- paste openclaw --version from inside the container
- paste your current Dockerfile or docker-compose.yml
If you want, I can turn that into a working CUDA-enabled OpenClaw+QMD image for your setup.
I am using the latest, 2026.4.10, and right now my Dockerfile is just
`FROM ghcr.io/openclaw/openclaw:latest
USER root
# CUDA build deps
RUN apt-get update && apt-get install -y \
git \
build-essential \
cmake \
python3 \
&& rm -rf /var/lib/apt/lists/*
# Install QMD
RUN npm install -g @tobilu/qmd
USER node`
Yep, that explains it.
Your current Dockerfile adds build tools, but not the actual CUDA runtime/toolkit bits QMD needs. So qmd installs, but node-llama-cpp still ends up in a CPU-only setup.
The fix is:
- Pin the image while debugging
  FROM ghcr.io/openclaw/openclaw:2026.4.10
- Install CUDA userspace libs/toolkit first
  Your image is Debian bookworm based, so you need the NVIDIA CUDA layer inside the container, not just --runtime=nvidia on the host.
- Then install QMD
  After CUDA is present, install QMD, and ideally force the CUDA backend fetch/build:
  RUN npm install -g @tobilu/qmd \
  && npx --yes node-llama-cpp source download --gpu cuda
- Run with GPU access
  Use --gpus all (or compose equivalent), and make sure the host has NVIDIA Container Toolkit installed.
- Verify before blaming OpenClaw
  npx --yes node-llama-cpp inspect gpu
  If that does not show CUDA, QMD will stay on CPU too.
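For the verify step, a tiny filter makes the result easier to read. This is only a sketch: cuda_lines is a name I made up, and it does not assume any particular wording in the inspect gpu output. It just keeps lines that mention CUDA, so you still need to read what they say.

```shell
# Keeps only the lines on stdin that mention CUDA (case-insensitive).
# Exits non-zero when no such line exists.
cuda_lines() {
  grep -i 'cuda'
}

# Usage inside the running container (not during docker build):
#   npx --yes node-llama-cpp inspect gpu | cuda_lines || echo "no CUDA mentioned"
```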
So, roughly, your Dockerfile wants to look more like:
FROM ghcr.io/openclaw/openclaw:2026.4.10
USER root
# add CUDA runtime/toolkit packages here
RUN apt-get update && apt-get install -y --no-install-recommends \
git \
build-essential \
cmake \
python3 \
clang \
libgomp1 \
&& rm -rf /var/lib/apt/lists/*
RUN npm install -g @tobilu/qmd \
&& npx --yes node-llama-cpp source download --gpu cuda
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
USER node
The missing part is the CUDA package block. That’s the important bit.
Useful refs:
- OpenClaw Docker: https://docs.openclaw.ai/install/docker
- OpenClaw Docker VM runtime: https://docs.openclaw.ai/install/docker-vm-runtime
- QMD: https://docs.openclaw.ai/concepts/memory-qmd
- node-llama-cpp Docker/CUDA guide: https://node-llama-cpp.withcat.ai/guide/docker and https://node-llama-cpp.withcat.ai/guide/CUDA
If you want, paste your host OS plus whether you’re using plain docker run or docker compose, and I can turn this into a more exact copy-paste Dockerfile.
82.40 -- Could not find nvcc, please set CUDAToolkit_ROOT.
82.40 CMake Error at llama.cpp/ggml/src/ggml-cuda/CMakeLists.txt:258 (message):
82.40 CUDA Toolkit not found
82.40
82.40
82.40 -- Configuring incomplete, errors occurred!
82.40 See also "/app/node_modules/node-llama-cpp/llama/localBuilds/linux-x64-cuda/CMakeFiles/CMakeOutput.log".
82.41 ERROR OMG Process terminated: 1
82.44
82.44 [node-llama-cpp] To resolve errors related to CUDA compilation, see the CUDA guide: https://node-llama-cpp.withcat.ai/guide/CUDA
82.44 ✖ Failed to compile llama.cpp
82.44 Failed to build llama.cpp with CUDA support. Error: SpawnError: Command npm run -s cmake-js-llama -- compile --log-level warn --config Release --arch=x64 --out localBuilds/linux-x64-cuda --runtime-version=24.14.0 --parallel=14 --CDGGML_BUILD_NUMBER=1 --CDCMAKE_CONFIGURATION_TYPES=Release --CDNLC_CURRENT_PLATFORM=linux-x64 --CDNLC_TARGET_PLATFORM=linux-x64 --CDNLC_VARIANT=cuda.b8390 --CDGGML_METAL=OFF --CDGGML_CUDA=1 --CDGGML_CCACHE=OFF --CDLLAMA_CURL=OFF --CDLLAMA_HTTPLIB=OFF --CDLLAMA_BUILD_BORINGSSL=OFF --CDLLAMA_OPENSSL=OFF exited with code 1
82.44 at createError (file:///app/node_modules/node-llama-cpp/dist/utils/spawnCommand.js:34:20)
82.44 at ChildProcess.<anonymous> (file:///app/node_modules/node-llama-cpp/dist/utils/spawnCommand.js:47:24)
82.44 at ChildProcess.emit (node:events:508:28)
82.44 at ChildProcess._handle.onexit (node:internal/child_process:294:12)
82.44 node-llama-cpp source download
Ah, nice, this narrows it down a lot.
That error means you’re in the CUDA build path, and the image does not have the CUDA toolkit installed.
nvidia-container-runtime only gives the container access to the GPU. It does not put nvcc inside the image.
So the missing piece is:
- full CUDA toolkit inside the container (nvcc)
- not just runtime access
- not just build-essential / cmake / python3
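A quick way to confirm whether the image actually has nvcc is a check like the one below. It's a sketch under assumptions: find_nvcc is a hypothetical helper, and the /usr/local/cuda fallback is the conventional toolkit location, which may differ in your image.

```shell
# Prints the path of the first nvcc found: PATH first, then the conventional
# toolkit location. CUDA_PATH defaulting to /usr/local/cuda is an assumption.
find_nvcc() {
  if command -v nvcc >/dev/null 2>&1; then
    command -v nvcc
    return 0
  fi
  candidate="${CUDA_PATH:-/usr/local/cuda}/bin/nvcc"
  if [ -x "$candidate" ]; then
    echo "$candidate"
    return 0
  fi
  echo "nvcc not found: the CUDA toolkit is not installed in this image" >&2
  return 1
}
```

Unlike inspect gpu, this does not need a GPU, so it is safe to call from a RUN step before the source build to fail early with a clearer message.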
What this means for your Dockerfile
If you want this line to work:
node-llama-cpp source download --gpu cuda
then you need a toolkit package in the image, for example CUDA 12.4+ or 13.1+, and these env vars usually help:
ENV CUDA_PATH=/usr/local/cuda
ENV CUDAToolkit_ROOT=/usr/local/cuda
ENV CUDACXX=/usr/local/cuda/bin/nvcc
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
Important bit
You currently have:
RUN apt-get update && apt-get install -y \
git \
build-essential \
cmake \
python3
but you still need the actual CUDA toolkit package from NVIDIA’s repo, for example cuda-toolkit-13-1 or newer.
Also, one subtle thing
If you’re forcing:
node-llama-cpp source download --gpu cuda
that always needs nvcc.
If you’re just testing whether GPU works, try this first after CUDA is installed:
npx --yes node-llama-cpp inspect gpu
If that shows CUDA correctly, QMD is in much better shape already, and you may not need to force a source build right away.
So the short answer
Yes, you can keep FROM ghcr.io/openclaw/openclaw:2026.4.10, but you need to add:
- NVIDIA CUDA apt repo
- full toolkit install, not just runtime
- CUDA_PATH / CUDAToolkit_ROOT / CUDACXX
- run container with --gpus all
Useful refs:
- https://node-llama-cpp.withcat.ai/guide/CUDA
- https://node-llama-cpp.withcat.ai/guide/docker
- https://docs.openclaw.ai/install/docker
- https://docs.openclaw.ai/concepts/memory-qmd
If you want, paste whether the host is Debian/Ubuntu and whether you use docker run or docker compose, and I can turn this into a more exact Dockerfile snippet.
This is actually Unraid Linux, and yes, can you give me an exact Dockerfile?
Yep, for Unraid I’d try this exact child image first.
FROM ghcr.io/openclaw/openclaw:2026.4.10
USER root
ENV DEBIAN_FRONTEND=noninteractive
# Base build deps + NVIDIA CUDA repo + full CUDA toolkit (includes nvcc)
RUN apt-get update && apt-get install -y --no-install-recommends \
ca-certificates \
wget \
gnupg \
git \
build-essential \
cmake \
python3 \
clang \
libgomp1 \
&& wget -q https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb \
&& dpkg -i cuda-keyring_1.1-1_all.deb \
&& rm -f cuda-keyring_1.1-1_all.deb \
&& apt-get update \
&& apt-get install -y --no-install-recommends \
cuda-toolkit-12-5 \
&& rm -rf /var/lib/apt/lists/*
ENV CUDA_PATH=/usr/local/cuda
ENV CUDAToolkit_ROOT=/usr/local/cuda
ENV CUDACXX=/usr/local/cuda/bin/nvcc
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
RUN npm install -g @tobilu/qmd
USER node
A couple of important notes:
- Do not run node-llama-cpp inspect gpu during docker build, because the GPU usually is not exposed to the build stage.
- In Unraid, you still need GPU passthrough/runtime enabled for the container itself, otherwise the image builds fine but QMD still runs on CPU.
After the container is up, test with:
docker exec -it <container-name> bash
npx --yes node-llama-cpp inspect gpu
qmd query "test"
If inspect gpu still says no CUDA, then the remaining issue is on the Unraid runtime side, not the Dockerfile.
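To narrow that down from inside the container, a rough triage like this can help. Treat it as a sketch: both function names are hypothetical, and it assumes a standard Linux setup where GPU access shows up as /dev/nvidiactl or /dev/nvidia0 and where the NVIDIA runtime injects nvidia-smi when the utility capability is enabled.

```shell
# Succeeds if NVIDIA device nodes exist under the given directory
# (default /dev). Missing nodes suggest the container has no GPU access.
gpu_devices_visible() {
  dev_dir="${1:-/dev}"
  [ -e "$dev_dir/nvidiactl" ] || [ -e "$dev_dir/nvidia0" ]
}

# Prints a rough guess at which side to investigate next.
triage_gpu() {
  if ! gpu_devices_visible "${1:-/dev}"; then
    echo "runtime side: no NVIDIA device nodes in the container"
  elif ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "runtime side: devices present but nvidia-smi was not injected"
  else
    echo "devices and nvidia-smi present: recheck the CUDA install in the image"
  fi
}
```

Paste the functions into the docker exec shell and run triage_gpu. It is a hint about where to look, not a definitive diagnosis.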
If you want, I can also give you the matching Unraid container template/runtime settings next.