#Hardware setup

1 messages · Page 1 of 1 (latest)

placid kernel
#

I have openclaw running on a Raspberry pi and a beelink gtr9 where I host the ai models as well as run docker for a local firecrawl service. Everything is working just fine but I'm also aware that the beelink has an NPU. Could I make use of that for something in regards to this openclaw setup?

river jewelBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

strong hazelBOT
#

Short version: probably not directly for OpenClaw today, unless you build a sidecar service around it.

OpenClaw mostly cares that your model/backend exposes something it can call, usually an OpenAI-compatible endpoint or another supported provider route. The Beelink’s AMD NPU is a separate accelerator target; OpenClaw won’t automatically use it just because it exists.

Where it could fit:

  • LLM serving: only if your NPU runtime can serve a model behind an OpenAI-compatible API. AMD’s Ryzen AI Linux stack does support NPU workloads, but it’s still pretty specific/young: Ubuntu 24.04, newer kernel, XRT/Ryzen AI packages, supported STX/KRK platforms, compiled/compatible models, etc. Not a drop-in Ollama/llama.cpp acceleration path.
  • Embeddings / reranking / classifiers / OCR-ish helpers: more realistic long-term. You could run a small local service on the Beelink and have OpenClaw tools/hooks call it, but that’s custom plumbing.
  • Firecrawl: the NPU likely won’t help much unless Firecrawl or your own post-processing pipeline is doing ML inference that can target it.

My practical recommendation: keep using the Beelink CPU/GPU/iGPU path for local model serving unless you specifically want an experiment project. The NPU is interesting, but for OpenClaw value-per-hour, I’d first optimize the model server you already use, Docker resource limits, and maybe GPU/Vulkan/ROCm acceleration if available.

Relevant docs: https://docs.openclaw.ai/concepts/models
For fresh hardware/model-backend experiences, #models is also the best place to compare notes — this stuff changes fast. 🦞

placid kernel
#

How about voice processing?

strong hazelBOT
#

For voice processing, the NPU is a little more plausible than for general LLM serving — but still not plug-and-play with OpenClaw.

OpenClaw’s built-in audio path is: incoming voice note/audio → transcribe with a configured provider or CLI → inject transcript into the chat/session. It can auto-detect local CLIs like sherpa-onnx-offline, whisper-cli / whisper.cpp, Python whisper, then fall back to providers like OpenAI/Deepgram/Groq/Google/etc.

Docs: https://docs.openclaw.ai/nodes/audio

Where the NPU could fit:

  • If you can run an AMD Ryzen AI / ONNX Runtime based ASR model locally on the Beelink NPU, expose it as a small CLI or HTTP service.
  • Then configure OpenClaw audio transcription to call that CLI via tools.media.audio.models: [{ type: "cli", ... }], or wrap it behind a custom skill/tool/service.
  • Models in the sherpa-onnx ecosystem are probably the most natural thing to investigate first, because OpenClaw already knows about sherpa-onnx-offline as a local transcription option. The missing piece is whether your exact Beelink NPU + Linux driver stack + model can actually accelerate it.

My honest take: for voice notes, CPU whisper.cpp or sherpa-onnx may already be fast enough on the GTR9, and it’ll be dramatically simpler. I’d only chase NPU voice if you want always-on wake-word / streaming ASR with low power, or you enjoy hardware tinkering. For normal Discord/Telegram/WhatsApp voice-note transcription, local CPU ASR or Deepgram/OpenAI is the sane path.