#compacting context loop


feral monolith

Been having trouble with openrouter/moonshotai/kimi-k2.6 as my model. Primarily, even on a /new session, I immediately get compaction:


Auston:
/new

Auston:
Testing... hi...

Ridge ⛰️:
✅ New session started.

Ridge ⛰️:
🧹 Compacting context (1 messages) so I can continue without losing history…

Ridge ⛰️:
✅ Context compacted. Continuing from where I left off.

Ridge ⛰️:
🧹 Compacting context (4 messages) so I can continue without losing history…

Ridge ⛰️:
✅ Context compacted. Continuing from where I left off.

Ridge ⛰️:
✅ Context compacted (119,048 → 29,944 tokens). Continuing from where I left off.

Ridge ⛰️:
🧹 Compacting context (6 messages) so I can continue without losing history…

Ridge ⛰️:
✅ Context compacted. Continuing from where I left off.

Ridge ⛰️:
⚠️ Context limit exceeded. I've reset our conversation to start fresh - please try again.

To prevent this, increase your compaction buffer by setting agents.defaults.compaction.reserveTokensFloor to 20000 or higher in your config.

Auston:
/stop


I didn't jump to the reserveTokensFloor suggestion because this isn't a traditional context-limit-exceeded situation; let me know if it is actually relevant. Otherwise this is very odd. google/gemini-3-flash-preview works fine.


[claw@openclaw ~]$ openclaw --version
OpenClaw 2026.5.20 (e510042)

rotund caveBOT
# feral monolith Been having trouble with openrouter/moonshotai/kimi-k2.6 as my model. Primarily,...

I don’t think raising agents.defaults.compaction.reserveTokensFloor is the right first move here. OpenClaw’s docs say auto-compaction triggers when:

contextTokens > contextWindow - reserveTokens

and the default floor is already 20k. Raising the floor actually makes the threshold smaller, so it can make “compact immediately” behavior more likely, not less.
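
To make the direction of that effect concrete, here is a minimal Python sketch of the trigger rule quoted above. The function names and the 60,000 comparison value are illustrative assumptions, not OpenClaw's actual code; the 262,144 window is the figure OpenRouter advertises for this model.

```python
def compaction_threshold(context_window: int, reserve_tokens: int) -> int:
    """Token count above which auto-compaction would trigger."""
    return context_window - reserve_tokens


def should_compact(context_tokens: int, context_window: int, reserve_tokens: int) -> bool:
    # Mirrors the documented rule: contextTokens > contextWindow - reserveTokens
    return context_tokens > compaction_threshold(context_window, reserve_tokens)


WINDOW = 262_144  # advertised window; assumed to be what gets resolved

low_reserve = compaction_threshold(WINDOW, 20_000)   # default floor
high_reserve = compaction_threshold(WINDOW, 60_000)  # hypothetical larger floor

print(low_reserve, high_reserve)                 # a larger reserve gives a lower threshold
print(should_compact(119_048, WINDOW, 20_000))   # a ~119k prompt vs. the default floor
```

Under these assumptions a ~119k prompt sits well below either threshold, which is why the loop looks like something other than a reserve problem.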

The suspicious part is 119,048 → 29,944 on a brand-new session. That means the initial runtime prompt/bootstrap is already around 119k tokens, and either OpenClaw is treating that as too close to the model window for Kimi via OpenRouter, or the provider is returning an overflow. OpenRouter currently reports moonshotai/kimi-k2.6 as ~262k context, and the native Moonshot docs/config in OpenClaw also use contextWindow: 262144, so this smells like one of:

  1. OpenClaw/Pi is resolving a lower effective context window for openrouter/moonshotai/kimi-k2.6
  2. an explicit contextTokens / agents.defaults.contextTokens cap is lower than expected
  3. OpenRouter/Moonshot is rejecting the request despite the advertised window
  4. a bug in this route’s context accounting on 2026.5.20

Useful checks:

openclaw models status --json
openclaw status --all
openclaw logs --follow

In the logs, look for the resolved provider/model and any context-window/overflow lines around the compaction loop. If models status --json shows a low contextWindow or contextTokens for openrouter/moonshotai/kimi-k2.6, try explicitly pinning that model’s window in config under models.providers.openrouter.models with id moonshotai/kimi-k2.6 and contextWindow: 262144.

Docs I checked:
https://docs.openclaw.ai/reference/session-management-compaction
https://docs.openclaw.ai/gateway/configuration
https://docs.openclaw.ai/providers/openrouter

If those outputs confirm the window is already 262k and it still loops, I’d file it as a GitHub issue with the version, model ref, and the redacted log snippet: https://github.com/openclaw/openclaw/issues

feral monolith
rotund caveBOT
# feral monolith nothing in particular as far as I can see.

Yeah, that output mostly says “auth/config basics are fine.” One important wrinkle: models status --json doesn’t print the effective catalog row/context window, so the absence of contextWindow there isn’t proof either way.

I’d check these next:

openclaw config get agents.defaults.contextTokens --json
openclaw config get models.providers.openrouter --json
openclaw models list --provider openrouter --all | grep -i 'kimi-k2.6\|context\|ctx'

What I’m looking for:

  • any global cap like agents.defaults.contextTokens around ~128k or lower
  • an explicit models.providers.openrouter.models entry for moonshotai/kimi-k2.6 with a bad/missing contextWindow
  • whether the models list knows Kimi’s ctx as ~262k
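
One thing that can make this comparison confusing: a models list may print the window as a rounded label rather than a raw token count. The helper below is a hypothetical sketch that converts such a label to tokens, assuming "k" means 1024. That reading is an assumption about the CLI's formatting, not something the docs confirm.

```python
def parse_ctx_label(label: str) -> int:
    """Convert a label like '256k' to a token count, assuming k = 1024."""
    text = label.strip().lower()
    if text.endswith("k"):
        return int(float(text[:-1]) * 1024)
    return int(text)


print(parse_ctx_label("256k"))    # same number as contextWindow: 262144
print(parse_ctx_label("262144"))  # raw counts pass through unchanged
```

Under that assumption, a listed ctx of 256k and a contextWindow of 262144 are the same value, so seeing "256k" would not by itself indicate a bad cap.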

If none of that shows a bad cap, I’d try a targeted metadata pin as a workaround:

openclaw config set models.providers.openrouter.models '[{"id":"moonshotai/kimi-k2.6","name":"Kimi K2.6","contextWindow":262144,"maxTokens":262142}]' --strict-json --merge --dry-run

If the dry run looks sane, rerun without --dry-run, then restart the gateway/session and retry /new.

If it still loops with the context pinned, that’s very likely a bug/provider-route mismatch rather than user config. The issue report should include the 119,048 → 29,944 compaction line plus these status/config snippets, with tokens redacted: https://github.com/openclaw/openclaw/issues

feral monolith

[claw@openclaw ~]$ openclaw config get agents.defaults.contextTokens --json
Config path not found: agents.defaults.contextTokens. Run openclaw config validate to inspect config shape.
[claw@openclaw ~]$ openclaw config get models.providers.openrouter --json
Config path not found: models.providers.openrouter. Run openclaw config validate to inspect config shape.
[claw@openclaw ~]$ openclaw models list --provider openrouter --all | grep -i 'kimi-k2.6\|context\|ctx'
Model Input Ctx Local Auth Tags
openrouter/moonshotai/kimi-k2.6 text+image 256k no yes default,configured
[claw@openclaw ~]$ openclaw config set models.providers.openrouter.models '[{"id":"moonshotai/kimi-k2.6","name":"Kimi K2.6","contextWindow":262144,"maxTokens":262142}]' --strict-json --merge --dry-run
│
◇

OpenClaw 2026.5.20 (e510042) — Self-hosted, self-updating, self-aware (just kidding... unless?).

Dry run successful: 1 update(s) validated against ~/.openclaw/openclaw.json.
[claw@openclaw ~]$ openclaw config set models.providers.openrouter.models '[{"id":"moonshotai/kimi-k2.6","name":"Kimi K2.6","contextWindow":262144,"maxTokens":262142}]' --strict-json --merge
│
◇

OpenClaw 2026.5.20 (e510042) — Pairing codes exist because even bots believe in consent—and good security hygiene.

Updated models.providers.openrouter.models. Restart the gateway to apply.
[claw@openclaw ~]$ !sudo
sudo systemctl restart openclaw-gateway.service
[claw@openclaw ~]$

And then:

Auston:
Testing hi

Ridge ⛰️:
🧹 Compacting context (1 messages) so I can continue without losing history…

Ridge ⛰️:
✅ Context compacted. Continuing from where I left off.

Ridge ⛰️:
🧹 Compacting context (4 messages) so I can continue without losing history…

Ridge ⛰️:
✅ Context compacted. Continuing from where I left off.

Ridge ⛰️:
Context overflow: prompt too large for the model. Try /reset (or /new) to start a fresh session, or use a larger-context model.

rotund caveBOT
# feral monolith [claw@openclaw ~]$ openclaw config get agents.defaults.contextTokens --json Conf...

Ah, good catch. This points at maxTokens, not contextWindow.

I gave you a too-literal metadata pin there: maxTokens:262142 means OpenClaw may ask the provider for an output budget near the whole context window. If the initial prompt is ~119k, then prompt + requested output can exceed Kimi/OpenRouter’s 256k total budget and trigger exactly this kind of immediate overflow.

Try overriding the runtime output budget, not the context window:

openclaw config set agents.defaults.models '{"openrouter/moonshotai/kimi-k2.6":{"params":{"maxTokens":8192}}}' --strict-json --merge --dry-run

If that validates:

openclaw config set agents.defaults.models '{"openrouter/moonshotai/kimi-k2.6":{"params":{"maxTokens":8192}}}' --strict-json --merge
sudo systemctl restart openclaw-gateway.service

Then retry /new.

If 8192 works, you can try 16384 or 32768, but I would not use the advertised 262k max output for this route inside OpenClaw. The docs confirm per-model agents.defaults.models["provider/model"].params.maxTokens is the right stream-param override:
https://docs.openclaw.ai/gateway/configuration

Sorry for the noisy first pin. The context metadata was fine; the output budget was the sharp edge.

feral monolith

Auston:
/new

Ridge ⛰️:
✅ New session started.

Auston:
maxTokens 8192 test. Tell me your favorite fact about the number 8192

Ridge ⛰️:
8192 is 2^13, which slots it into one of my favorite math stories: the chessboard problem.

If you place one grain of rice on the first square, double it each time — 1, 2, 4, 8, 16 — by the 14th square you've reached exactly 8,192 grains. It's the threshold where the doubling starts to feel real: by the end of that row, you're at millions, and by the end of the board you'd bankrupt the planet.

Beyond that, it's a workhorse in computing — 8 KiB, 2^13 — the kind of number that lives in memory pages and audio buffer sizes. But personally? I like it as the moment exponential growth stops being theoretical and starts getting heavy.

====

lol

Can you explain... why? What does this setting practically do? What do different settings control, is this truncation or a parameter the LLM itself understands how to parse? Why do most models work without setting a custom one?

rotund caveBOT
# feral monolith > Auston: /new > Ridge ⛰️: ✅ New session started. > Auston: maxTokens 8192 tes...

Nice, that confirms it.

Practical version:

  • contextWindow = total budget for one model request: input prompt + tool schemas + system/bootstrap text + conversation + requested output.
  • maxTokens = requested maximum output length for the assistant response.
  • reserveTokens / compaction settings = OpenClaw/Pi’s internal headroom rules for deciding when to summarize old history before sending the next request.
  • keepRecentTokens = roughly how much recent conversation to preserve outside the summary when compaction happens.

So this was not truncation. maxTokens is sent as a model/API parameter, usually max_tokens or provider-equivalent. It tells the provider “allow the model to generate up to this many tokens.” Providers generally validate:

input tokens + requested max output tokens <= model context limit

Your new session was already carrying a big OpenClaw agent prompt/tool/bootstrap payload, around 119k tokens from the earlier compaction log. Kimi's OpenRouter route shows ~256k context, which is 262,144 tokens, the same figure as the "262k" quoted elsewhere. Asking for maxTokens near 262k therefore makes the request impossible before the model even starts:

~119k input + ~262k requested output > ~256k total window

Setting params.maxTokens: 8192 makes the request shape more like:

~119k input + 8k requested output < ~256k total window
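
The same arithmetic as a small Python sketch, using the exact numbers from this thread (119,048 prompt tokens, a 262,144 window, and the two maxTokens values tried). The function name is hypothetical, and the check is the general shape of provider-side validation rather than OpenRouter's or Moonshot's actual implementation.

```python
def request_fits(input_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Provider-style check: prompt plus requested output must fit the window."""
    return input_tokens + max_output_tokens <= context_window


WINDOW = 262_144   # 256k
PROMPT = 119_048   # from the compaction log on a fresh session

print(request_fits(PROMPT, 262_142, WINDOW))  # False: 381,190 tokens requested in total
print(request_fits(PROMPT, 8_192, WINDOW))    # True: 127,240 tokens requested in total
```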

Why most models work without a custom one: their catalog metadata or transport defaults usually use a sane output cap, like 8k/16k/32k/64k, or the provider tolerates/clamps it. This Kimi/OpenRouter metadata advertises a giant max completion value close to the whole context window, and OpenClaw appears to trust that too literally for agent turns. That’s legal metadata in a narrow sense, but bad as a default for an agent with a large prompt.

I’d keep 8192 if replies feel fine, or try 16384 / 32768 if you want longer answers. If you file this, the bug is basically: OpenClaw should probably clamp requested output tokens so prompt + maxTokens fits the resolved context window, especially for OpenRouter routes that advertise context-sized max completion.
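
For illustration, the kind of clamp I mean could look like the Python sketch below. This is a hypothetical helper, not OpenClaw's code; the name and the 1,024-token safety margin are assumptions chosen for the example.

```python
def clamp_max_tokens(
    requested: int,
    input_tokens: int,
    context_window: int,
    safety_margin: int = 1_024,  # assumed slack for token-count estimation error
) -> int:
    """Shrink the requested output budget so prompt + output fits the window."""
    available = context_window - input_tokens - safety_margin
    if available <= 0:
        raise ValueError("prompt already fills the context window")
    return min(requested, available)


WINDOW = 262_144
PROMPT = 119_048

print(clamp_max_tokens(262_142, PROMPT, WINDOW))  # 142072: clamped to what is left
print(clamp_max_tokens(8_192, PROMPT, WINDOW))    # 8192: already fits, unchanged
```

With a clamp like this, a context-sized max completion in the catalog would degrade to "whatever room is left" instead of producing an immediate overflow.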

Glad that helped. If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow indexes it for the next person.