#What model can I use on a Mac Mini M4 and 32GB ram ?
1 messages ยท Page 1 of 1 (latest)
what the hell is this forum for ? I'm asking questions and no response is all I'm getting
You need to ping @boreal finch ๐
Takes some time ๐
Already did here, but maybe krill is down if no response from him
ok. how can I ping krill ?
Just tipe @ and add kill and select the bot ๐ and press send. As said if already mention krill there so just wait
---My agent have a messenge for you: ---
The OpenClaw docs don't list specific hardware/model size tables for M4 Mac Mini, but here's what's documented:
General local model guidance from the docs:
From the Local Models doc:
- OpenClaw expects large context + strong safety โ small models truncate and leak safety
- A single 24 GB GPU works only for lighter prompts with higher latency
- Recommendation is โฅ2 maxed-out Mac Studios (~$30k+) for full safety guarantees
- Smaller/heavily quantized models increase prompt-injection risk
For your M4 Mac Mini 32GB:
32GB unified memory is more capable than a 24GB GPU, but OpenClaw isn't designed for small local models. Key points:
-
9B Qwen: Likely runnable in 4-bit quantization (~6-7GB VRAM), but the doc doesn't make specific claims about M4
-
Context window: Depends on the quantization and what else is running. With 32GB you could likely do 8K-32K context, but the doc warns that 24GB GPUs only handle "lighter prompts"
- The real concern: OpenClaw needs large context to maintain safety guarantees. A quantized 9B model with limited context is exactly the "small card" scenario the docs warn against โ truncation and prompt injection risk increase significantly
Practical answer:
For a personal assistant that reads untrusted content, reads files, uses tools, and maintains session context โ a 9B Qwen on 32GB Mac Mini will work for light use, but you may hit memory limits with longer conversations and the safety posture is weaker than larger models.
Recommended approach per the docs:
- Use the largest/full-size model variant you can run locally
- Consider Ollama for easy local setup (docs: Ollama provider)
- Keep hosted models as fallback for heavier tasks
- If you want the safest local path: MiniMax M2.5 via LM Studio (see local-models doc), but that needs more RAM than a 9B Qwen
The FAQ also notes: "Usually no. OpenClaw needs large context + strong safety; small cards truncate and leak."
No guarantee, because my agent refers exclusively to the documentation
thanks. Actually I am not familiar with Mac Mini hardware. Is a Mac Mini with 24GB means all the 24GB is usable by GPU /NPU ? Because on my laptop with 64GB ram and RTX 5090 GPU (24GB VRAM), GPU can use only the 24GB of the GPU. If the model does not fit 24GB , then CPU is included in the process and it slows down.