That error is coming from the backend, not from OpenClaw’s token counter:
Cannot truncate prompt with n_keep (16886) >= n_ctx (4096)
In other words, your server is actually running with n_ctx = 4096, while the prompt OpenClaw sends is ~16.9k tokens.
Two key points:
- OpenClaw has a hard minimum context window of 16,000 tokens. So any backend that’s really at 4k (or even 8k) is going to fail sooner or later.
- The “?/16k” you see in OpenClaw is based on your configured contextWindow, not what LM Studio truly runs. If you “tell” OpenClaw 16k but LM Studio still serves 4k, you’ll get exactly what you’re seeing.
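If you want to confirm the mismatch from a log line rather than by eye, here is a minimal Python sketch that pulls the two numbers out of the error text quoted above. The function name diagnose and the constant OPENCLAW_MIN_CONTEXT are made up for illustration (they are not part of OpenClaw or LM Studio), and the regex only assumes the message keeps the exact wording shown in your error.

```python
import re

# Matches the backend message quoted above, e.g.
# "Cannot truncate prompt with n_keep (16886) >= n_ctx (4096)"
_PATTERN = re.compile(r"n_keep \((\d+)\) >= n_ctx \((\d+)\)")

# Hypothetical constant: the 16,000-token minimum mentioned above.
OPENCLAW_MIN_CONTEXT = 16_000


def diagnose(error_text: str):
    """Return the sizes found in the error text, or None if it does not match."""
    match = _PATTERN.search(error_text)
    if match is None:
        return None
    n_keep = int(match.group(1))  # tokens the client asked the backend to keep
    n_ctx = int(match.group(2))   # context size the backend is really running
    return {
        "n_keep": n_keep,
        "n_ctx": n_ctx,
        # True when the backend is below the minimum OpenClaw needs.
        "backend_below_minimum": n_ctx < OPENCLAW_MIN_CONTEXT,
        # How many tokens the prompt exceeds the real context by.
        "overflow_tokens": n_keep - n_ctx,
    }


if __name__ == "__main__":
    print(diagnose("Cannot truncate prompt with n_keep (16886) >= n_ctx (4096)"))
```

If backend_below_minimum comes back True, the problem is on the LM Studio side, and changing contextWindow in OpenClaw alone will not help.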
Fix path A (recommended): make LM Studio actually run ≥16k (ideally 32k+)
In LM Studio, you need to increase the model/server Context Length (LM Studio’s llama.cpp backend commonly defaults to 4096). After changing it, restart the LM Studio server (and ideally reload the model).
Then make sure your OpenClaw provider model entry matches reality: