Is this a known issue? Thoughts on the tradeoff? | Friends of the Crustacean 🦞🤝 | Page 1

brittle charm Mar 12, 2026, 5:55 PM

Root cause: The node-llama-cpp Metal GPU backend was crashing with an
assertion failure (ggml-metal-device.m). This was being triggered by the local
vector embedding provider used for memory search. Each time the gateway
started, it loaded the local llama model, hit the Metal bug, and crashed —
causing a SIGTERM restart loop until macOS gave up.

Fix: Set memorySearch.enabled to false in openclaw.json. The provider is
still set to "local" (to pass config validation), but it won't be loaded
because search is disabled.
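
For reference, the workaround described above would look roughly like this in openclaw.json. This is a sketch that assumes the top-level memorySearch block and uses only the two keys named in this post; every other setting in the file is omitted.

```json
{
  "memorySearch": {
    "enabled": false,
    "provider": "local"
  }
}
```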

Trade-off: You lose hybrid vector+text memory search. Once OpenClaw ships a
fix for the node-llama-cpp Metal assertion (or you update), you can re-enable
it by setting memorySearch.enabled: true. Text-based memory (keyword matching
from compaction/session files) should still work through the normal agent
context.

rough lily BOT Mar 12, 2026, 5:55 PM

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

low lily Mar 12, 2026, 5:55 PM

True

solar oracle BOT Mar 12, 2026, 6:02 PM

Yeah, this is a pretty common class of failure in the llama.cpp ecosystem on macOS: ggml-metal-*.m assertions / Metal backend crashes that only show up with certain models + shapes. In OpenClaw it’s extra painful because local memory embeddings can initialize on gateway startup → you get a launchd restart loop.

Your workaround is totally reasonable for “get the gateway stable again”.

A couple clarifications / tradeoffs:

- Setting agents.defaults.memorySearch.enabled = false doesn’t just turn off “vector” recall: it disables the memory tools (memory_search and memory_get) entirely (so no hybrid BM25+vector search either).
- You still get the “basic” behavior of memory files being present in the workspace (and MEMORY.md is part of the standard workspace bootstrap), but the agent loses the ability to retrieve older material on demand via the memory tools.

If you want semantic recall back without touching node-llama-cpp/Metal, the cleaner fix is usually:

- keep memory search enabled, but set agents.defaults.memorySearch.provider to a remote embeddings provider (so OpenClaw won’t load node-llama-cpp at all), and configure the API key; or
- use provider = "ollama" if you already run a stable Ollama embeddings setup.
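
As a rough sketch of the second option, the relevant part of openclaw.json might look like the fragment below. It uses only the keys and values named in this thread. Any Ollama-specific settings (model name, endpoint) are left out because they aren't specified here, so check the memory docs before relying on it.

```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "enabled": true,
        "provider": "ollama"
      }
    }
  }
}
```

For the remote-provider option the shape would be the same, with a different provider value plus whatever API key setting that provider needs.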

(Also: you don’t need to keep provider: "local" “to pass validation” — provider: "auto" is valid, and if enabled is false the provider won’t matter anyway. Top-level memorySearch is legacy and gets migrated to agents.defaults.memorySearch.)
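
Put together, a disabled block at the non-legacy location could look something like this. It is a sketch limited to the keys mentioned in this thread, not a complete config.

```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "enabled": false,
        "provider": "auto"
      }
    }
  }
}
```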

Docs:

Memory overview + config knobs: https://docs.openclaw.ai/concepts/memory
CLI to inspect: https://docs.openclaw.ai/cli/memory (openclaw memory status --deep)

If you can repro this reliably, it’d be worth filing a GitHub issue (OpenClaw or upstream) with:

- the exact assertion text + backtrace lines
- openclaw --version
- macOS version + chip
- which GGUF / embeddings model path was being loaded
