Quick thread summary before this gets too long:
You wanted to run local models in OpenClaw on a machine with a 4GB GPU + 32GB RAM, including separate models for planning/coding, vision, and Whisper/audio.
What we found:
OpenClaw can use separate model slots:
agents.defaults.model for the main assistant/planner/coding model
agents.defaults.imageModel for the image-capable fallback model
tools.media.audio for Whisper/audio transcription
Whisper should not be configured as a normal chat model. It belongs under tools.media.audio.
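To make the split easier to see, here is a rough Python sketch that expands those three dotted key paths into a nested skeleton. The key paths are the ones from this thread; the values are placeholders I made up, not the real shape each slot expects, so treat it as a map of where things live rather than a config to paste in.

```python
import json

# Dotted key paths come from the thread. The values are placeholders only,
# NOT the real schema for each slot.
SLOTS = {
    "agents.defaults.model": "<main planner/coding model>",
    "agents.defaults.imageModel": "<image-capable model>",
    "tools.media.audio": "<Whisper/audio transcription settings>",
}


def nest(dotted: dict) -> dict:
    """Expand dotted key paths into a nested dictionary."""
    root: dict = {}
    for path, value in dotted.items():
        node = root
        *parents, leaf = path.split(".")
        for key in parents:
            # Walk down, creating intermediate objects as needed.
            node = node.setdefault(key, {})
        node[leaf] = value
    return root


if __name__ == "__main__":
    print(json.dumps(nest(SLOTS), indent=2))
```

The point is that audio sits under tools, while the two model slots sit under agents.defaults.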
For whisper.cpp, point the command at whisper-cli.exe (or main.exe) using its full Windows path. Don't use command: "whisper" unless that is actually the name of your binary.
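If you want to confirm the binary works outside OpenClaw first, something like this Python sketch should do it. Both paths are hypothetical (I don't know where your build or model file lives), and it only relies on whisper.cpp's -m (model) and -f (input file) flags.

```python
import subprocess
from pathlib import Path

# Hypothetical paths: replace with wherever your whisper.cpp build and model live.
WHISPER_EXE = r"C:\tools\whisper.cpp\whisper-cli.exe"
WHISPER_MODEL = r"C:\tools\whisper.cpp\models\ggml-base.en.bin"


def build_whisper_command(exe: str, model: str, audio: str) -> list[str]:
    """Build an argument list using whisper.cpp's -m and -f flags."""
    return [exe, "-m", model, "-f", audio]


def transcribe(audio: str) -> str:
    """Run whisper.cpp on one audio file and return what it prints to stdout."""
    if not Path(WHISPER_EXE).is_file():
        # Fail early with the full path so a wrong location is obvious.
        raise FileNotFoundError(f"whisper.cpp binary not found: {WHISPER_EXE}")
    cmd = build_whisper_command(WHISPER_EXE, WHISPER_MODEL, audio)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

If transcribe() raises FileNotFoundError here, the same path will not work in tools.media.audio either.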
Your local model providers were configured as custom OpenAI-compatible endpoints like:
custom-localhost-8080/planner
custom-localhost-8081/vision
custom-localhost-8082/coder
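A quick way to check that all three servers are actually up is to ask each one for its model list. This is a sketch under two assumptions: the ports are read off the provider names above, and each server exposes the usual OpenAI-style /v1/models route. Adjust if your setup differs.

```python
import json
import urllib.request

# Provider names are from the thread; the ports are an assumption
# inferred from those names.
PROVIDERS = {
    "custom-localhost-8080/planner": 8080,
    "custom-localhost-8081/vision": 8081,
    "custom-localhost-8082/coder": 8082,
}


def models_url(port: int, host: str = "localhost") -> str:
    """URL of the OpenAI-style model listing route (assumed to be /v1/models)."""
    return f"http://{host}:{port}/v1/models"


def list_models(port: int, timeout: float = 5.0) -> list:
    """Return the model ids reported by the server on the given port."""
    with urllib.request.urlopen(models_url(port), timeout=timeout) as resp:
        payload = json.load(resp)
    return [entry.get("id") for entry in payload.get("data", [])]


if __name__ == "__main__":
    for name, port in PROVIDERS.items():
        try:
            print(f"{name}: OK {list_models(port)}")
        except (OSError, ValueError) as exc:
            # OSError covers refused connections and timeouts;
            # ValueError covers a response that is not valid JSON.
            print(f"{name}: FAILED ({exc})")
```

Any provider that prints FAILED here needs fixing on the server side before OpenClaw can use it.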
On OpenClaw 2026.4.23, models.providers.*.timeoutSeconds was rejected as a config key. After you updated to 2026.5.7 it was accepted, and you set the planner timeout to 900 seconds.
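If you script your setup across machines, a small version guard can save you from hitting that rejection again. A minimal sketch: it treats 2026.5.7 as the first version we know accepts the key, which is only what this thread observed. The real cutoff could be earlier, and the helper names are mine.

```python
def parse_version(version: str) -> tuple:
    """Turn '2026.5.7' into (2026, 5, 7) so parts compare as numbers."""
    return tuple(int(part) for part in version.split("."))


# First version seen accepting timeoutSeconds in this thread.
# Assumption: the actual introducing release may be earlier.
FIRST_KNOWN_GOOD = parse_version("2026.5.7")


def known_to_accept_timeout(version: str) -> bool:
    """True if the version is at or above the first known-good release."""
    return parse_version(version) >= FIRST_KNOWN_GOOD


if __name__ == "__main__":
    for v in ("2026.4.23", "2026.5.7"):
        print(v, known_to_accept_timeout(v))
```

Comparing tuples of integers rather than raw strings matters here, because as plain strings "2026.10.1" would sort before "2026.5.7".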