#Context overflow: prompt too large for the model. Try again with less input or a larger-context model

spring drum BOT
#

That message isn’t a “bug” you can permanently patch away — it’s the model hitting its hard context-window limit (your chat history + tool output + system prompts no longer fit in one request).

The real, final fixes are:

  1. Keep the same convo, but shrink it:
    Send:
/compact

Then retry your message. (This summarizes older turns into a compact summary and keeps recent messages.)
Docs: https://docs.openclaw.ai/concepts/compaction

  2. Start fresh (guaranteed):
    If you just need it to work immediately, do:
/new

(or /reset)
This creates a fresh session for that chat, so there’s no accumulated history to overflow.

  3. If it keeps happening in normal use: enable session pruning so old tool results don’t keep bloating the prompt, and/or switch to a larger-context model.
    Docs: https://docs.openclaw.ai/concepts/session-pruning
    FAQ section: https://docs.openclaw.ai/help/faq#im-getting-context-too-large-errors-how-do-i-reset-or-compact

If you tell me which surface you’re using (Discord/Telegram/etc) and whether this happens in one specific long thread or immediately in new chats, I can point you to the best option (compact vs reset vs pruning vs model).

#

If it happens in a brand-new Telegram chat, then /new / /reset / /compact won’t matter — this is almost certainly not “chat history got too long”.

In that case the only realistic causes are:

  1. The model you’re actually hitting has a much smaller context window than you think (common when the model ref/provider isn’t what you expect, or the catalog entry is wrong), or
  2. Your base prompt is huge (tool schemas + skills list + injected workspace files), so even the first request overflows.

Fastest way to pinpoint it (works even when the LLM can’t reply)

In Telegram, send these as standalone messages and paste the output here:

  • /status
  • /context list (or /context detail if it works)

Those commands are Gateway-handled and will show things like:

  • the exact model ref being used
  • the context window OpenClaw thinks it has (ctx=...)
  • how big your system prompt + tool schema JSON overhead is
#

Immediate “make it stop” workaround (if tool schema is the culprit)

If /context shows the tool schemas/system prompt are already close to the limit, temporarily shrink tools:

In ~/.openclaw/openclaw.json (JSON5), set a minimal tool profile (you can do it globally just to test):

{
  tools: { profile: "minimal" }
}

Then retry a simple “hi” message in a new Telegram chat.
If that fixes it, we can widen it back up (e.g. “messaging” or “coding”) and/or deny only the big offenders (often browser, canvas, nodes).

Tool profiles docs: https://docs.openclaw.ai/tools

Also: confirm your install + model config

#

On the Gateway machine, paste (redact keys/tokens):

openclaw --version
openclaw models status

And double-check you’re really on the intended Kimi endpoint/model ref (OpenClaw has specific config examples for Moonshot Kimi with contextWindow set). Docs: https://docs.openclaw.ai/gateway/configuration (search for “Moonshot AI (Kimi)”).

If you paste /context list + openclaw models status, I can tell you which of the two it is in one reply and what to change.

#

Your /context list is the key clue: you’re only at ~14k tokens in the session, and OpenClaw thinks the window is 262k. So a real “prompt too large” overflow should be basically impossible here.

That means the overflow is almost certainly coming from the kimi-coding endpoint rejecting the request because of the requested output size (max tokens), or because that endpoint’s real limits are smaller than what the catalog says.

1) Quick isolation test (no config changes)

In Telegram, temporarily switch to your fallback model and send “hi”:

  • /model anthropic/claude-opus-4-6

If Opus works and Kimi doesn’t, the problem is 100% Kimi endpoint behavior, not your chat/session.

2) Most common fix: cap Kimi’s output tokens

Add a per-model maxTokens cap (this prevents providers that compute “input + requested output” from instantly throwing overflow):

Edit ~/.openclaw/openclaw.json and add:

{
  agents: {
    defaults: {
      models: {
        "kimi-coding/k2p5": {
          params: { maxTokens: 4096 } // try 4096 first; 8192 also ok if it works
        }
      }
    }
  }
}

Then run:

openclaw gateway restart

Test again in a brand-new Telegram chat.
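
The reasoning behind the cap is simple arithmetic. Here is a minimal sketch, assuming the provider rejects a request when prompt tokens plus requested output tokens exceed its window. The function name and the uncapped output figure are hypothetical; only the rounded 14k / 262k numbers come from the /context list output mentioned above.

```python
def fits_in_window(prompt_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the requested output fits the window.

    Assumption: the provider checks input + requested output against its limit.
    """
    return prompt_tokens + max_output_tokens <= context_window


# Rounded figures from the /context list output: ~14k used, 262k window.
PROMPT_TOKENS = 14_000
CONTEXT_WINDOW = 262_000

# Hypothetical worst case: the request asks for the whole window as output.
uncapped = fits_in_window(PROMPT_TOKENS, CONTEXT_WINDOW, CONTEXT_WINDOW)

# With the suggested cap of 4096 output tokens.
capped = fits_in_window(PROMPT_TOKENS, 4096, CONTEXT_WINDOW)

print(uncapped, capped)
```

If the uncapped case fails while the capped one passes, the overflow comes from the requested output size rather than from chat history.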

3) If it still overflows: shrink tool overhead (test)

As a test, set tools to minimal (just to see if Kimi is choking on tool schemas):

{ tools: { profile: "minimal" } }

Restart gateway, test. If this fixes it, we can move you to tools.profile: "messaging" (keeps messaging features but drops heavy UI tools like browser/canvas/nodes).

Docs on tool profiles: https://docs.openclaw.ai/tools

#

4) Important naming confusion

You’re on kimi-coding/k2p5 (Kimi Coding, Anthropic-compatible). That is not the same as Moonshot’s OpenAI-style moonshot/kimi-k2.5 provider, and the docs explicitly say the keys aren’t interchangeable. If you meant Moonshot Kimi, follow: https://docs.openclaw.ai/providers/moonshot

If you try steps (1) + (2) and it’s still failing, paste the exact overflow log lines from:

openclaw logs --tail 200 | grep -i overflow

and I’ll tell you which knob to turn next.

spring drum BOT
#

Yep — openclaw logs doesn’t have --tail. Use --limit:

openclaw logs --limit 200 | grep -i overflow

If you want to watch it live while you trigger the error:

openclaw logs --follow | grep -i overflow

(And if grep isn’t behaving on your shell, just run openclaw logs --limit 200 and paste the chunk around the error.)

spring drum BOT
#

Context overflow: prompt too large for the model. Try again with less input or a larger-context model.

digital quarry
#

@spring drum

#

I suspect there's something about that model that doesn't mesh with the way openclaw uses it. Are you hard-set on using that specific model? Have you tried another one from your provider?

tranquil glade
#

@ruby lark

#

Might be a bug with the latest update

#

I rolled back to the previous version and it's working again

spring drum BOT
#

Saw the ping — what do you need here?

If you’re jumping into the thread to help: based on the /context list Firas posted (14k / 262k), this is not a real context-window overflow. The logs show LLM request timed out (~90s) plus kimi-coding making broken tool calls (e.g. read with no path, edit missing oldText), and then Anthropic fallback can’t save it because it’s rate-limited/cooldown.

The practical fix we suggested is to treat kimi-coding as chat-only:

  • restrict tools for that provider (tools.byProvider.kimi-coding.profile: "minimal")
  • cap output (agents.defaults.models["kimi-coding/k2p5"].params.maxTokens: 4096)
  • raise agents.defaults.timeoutSeconds while debugging

If you tell me what angle you’re looking at (true overflow vs provider timeout vs Telegram network), I can give you the exact minimal config snippet to post.

tranquil glade
#

@spring drum you should push this to github

spring drum BOT
#

Could be, yeah — but with the data Firas pasted it doesn’t look like a real “context overflow” regression.

Key bits from their /context list + logs:

  • /context: ~14k / 262k (5%) → nowhere near a real context limit
  • logs: LLM request timed out (hard stop at timeoutMs=90000)
  • logs also show broken tool calls from kimi-coding (read with no path, edit missing oldText) → the agent can get stuck retrying until it hits the 90s timeout
  • fallback can’t catch it because Anthropic is in cooldown / rate_limit

So the “context overflow” text is likely a generic/misleading provider error (or a different earlier failure), while the actual failure mode right now is timeout + tool-use flakiness on kimi-coding.

Workaround that usually fixes it immediately:

  • treat kimi-coding as chat-only:
    • tools.byProvider["kimi-coding"].profile: "minimal"
  • cap output: agents.defaults.models["kimi-coding/k2p5"].params.maxTokens: 4096
  • increase runtime: agents.defaults.timeoutSeconds: 600
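
For anyone who wants the three settings in one place, here is a sketch that builds them as a single fragment and prints it. The nesting is inferred from the dotted paths in the list above, so treat it as an assumption and merge it into your existing ~/.openclaw/openclaw.json instead of replacing the file. Plain JSON is also valid JSON5.

```python
import json

# Nesting inferred from the dotted paths in the workaround list (assumption).
config_fragment = {
    "tools": {
        "byProvider": {
            "kimi-coding": {"profile": "minimal"},  # treat kimi-coding as chat-only
        }
    },
    "agents": {
        "defaults": {
            "timeoutSeconds": 600,  # more runtime while debugging
            "models": {
                "kimi-coding/k2p5": {"params": {"maxTokens": 4096}},  # cap output
            },
        }
    },
}

# Print the fragment so it can be merged by hand into the config file.
rendered = json.dumps(config_fragment, indent=2)
print(rendered)
```

Restart the gateway after editing the config so the running service picks up the change.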
#

If rolling back fixed it, that’s a pretty strong signal it’s a regression (either in OpenClaw or in the kimi-coding provider behavior that version expects).

I can’t “push this to GitHub” myself (I’m read-only here: no PRs/issues on your behalf), but if you (or Hazy) file an issue, it’ll get eyes fast.

Please open an issue here: https://github.com/openclaw/openclaw/issues
Copy/paste this checklist (it’s the stuff maintainers will ask for anyway):

  • Broken version: openclaw --version (the one that overflows/times out)
  • Working version: the exact version you rolled back to
  • OS: (macOS/Linux/Windows+WSL) + Node version if known
  • Model: kimi-coding/k2p5 (confirm) and whether Anthropic fallback is configured
  • Repro: “brand-new Telegram DM → send hi → fails”
  • /context list output (like Firas posted; key line: tokens + ctx window)
  • Log excerpt around the failure (openclaw logs --limit 300), especially any lines containing:
    • context overflow / prompt too large
    • LLM request timed out
    • kimi-coding
    • read tool called without path / edit failed: Missing required parameter

If you drop the two versions here (broken + working), I can help you phrase the issue summary + suspected cause (“provider returns misleading overflow vs real timeout/tool-call loop”).

tranquil glade
#

Nvm this also didn't work

spring drum BOT
#

Got it — then we need to stop guessing and pin down what “didn’t work” means now.

1) What’s the current failure mode?

In a brand‑new Telegram DM, when you send just hi, do you get:

  • (A) “Context overflow…” in chat?
  • (B) no reply + it eventually errors?
  • (C) “All models failed” / “LLM request timed out” in logs?

Your earlier logs looked like (C) (90s timeout), not real overflow.

2) Confirm the service is using the config you edited

Run and paste:

openclaw --version
openclaw gateway status
#

gateway status will show whether Config (cli) and Config (service) differ (super common cause of “I changed config but nothing changed”).

3) Grab just the relevant log lines (no huge paste)

Right after reproducing (send hi), run:

openclaw logs --limit 300 | grep -i -E "context overflow|prompt too large|timed out|all models failed|kimi-coding|cooldown|rate_limit|fetch failed|sendChatAction"
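
If grep isn't available on your shell, the same case-insensitive filter can be sketched in Python. This is an illustration only: the function name is made up and the sample log lines are hypothetical, not real OpenClaw output. The pattern list is copied from the grep command above.

```python
import re

# Same alternation as the grep -i -E pattern, matched case-insensitively.
PATTERN = re.compile(
    r"context overflow|prompt too large|timed out|all models failed|"
    r"kimi-coding|cooldown|rate_limit|fetch failed|sendChatAction",
    re.IGNORECASE,
)


def filter_log_lines(lines):
    """Keep only the lines that mention one of the failure keywords."""
    return [line for line in lines if PATTERN.search(line)]


# Hypothetical sample lines, for illustration only.
sample = [
    "gateway started",
    "LLM request timed out",
    "provider anthropic in COOLDOWN",
    "telegram message delivered",
]

matches = filter_log_lines(sample)
for line in matches:
    print(line)
```

To use it on real logs, save the output of openclaw logs --limit 300 to a file and pass its lines to filter_log_lines.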

4) Quick isolation test (tells us if it’s Kimi vs everything)

In Telegram:

  • /model anthropic/claude-opus-4-6
  • then send hi

If Opus works but Kimi doesn’t → it’s a kimi-coding provider/runtime issue (tool calling / slow responses).
If Opus also fails → it’s Telegram/network/gateway/runtime, not model context.
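
The isolation test boils down to a small decision table. A sketch of that logic, with a hypothetical function name and the two outcomes worded as in the lines above:

```python
def isolate(kimi_ok: bool, opus_ok: bool) -> str:
    """Map the results of sending 'hi' on each model to a likely culprit."""
    if kimi_ok:
        return "kimi-coding is responding; nothing to isolate"
    if opus_ok:
        return "kimi-coding provider/runtime issue"
    return "Telegram/network/gateway/runtime issue, not model context"


# Kimi fails, Opus works: the provider is the suspect.
print(isolate(kimi_ok=False, opus_ok=True))
# Both fail: look at the gateway or network instead.
print(isolate(kimi_ok=False, opus_ok=False))
```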

#

Paste those 2 outputs (gateway status + the filtered log lines) and we can give you the exact fix instead of another round of “try X”.

tranquil glade
#

The issue I observed is that the system thinks we hit our context limit, but when I check, that's not true. And even if we were close, the system is supposed to compact the context, but it didn't.

spring drum BOT
#

Yep — that exact pattern usually means the “context overflow” is not coming from actual session history size, and compaction can’t (or won’t) kick in.

A few important details about how compaction works in OpenClaw (Pi runtime):

Why compaction might not happen even if you see “context overflow”

Compaction triggers in only two cases (docs: https://docs.openclaw.ai/reference/session-management-compaction):

  1. Overflow recovery: the model returns a recognized context-overflow error → OpenClaw compacts → retries.
  2. Threshold maintenance: after a successful turn, if contextTokens > contextWindow - reserveTokens.

So if the run fails before a successful reply (timeouts, network errors, tool-call loops), you won’t get the “maintenance” compaction pass.

And if the provider returns a misleading / non-standard error shape (string says “overflow” but it’s not classified as such), you also won’t get “overflow recovery” compaction.
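
The two triggers can be sketched as a single decision function. This is a simplified model of the rules described above, not OpenClaw's actual code; the class, the function, and the reserveTokens value of 16,000 are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnResult:
    succeeded: bool                          # did the model return a reply?
    recognized_overflow_error: bool = False  # error classified as context overflow
    context_tokens: int = 0                  # session tokens after the turn


def compaction_trigger(turn: TurnResult, context_window: int, reserve_tokens: int) -> Optional[str]:
    """Return which compaction path fires for this turn, or None."""
    if turn.recognized_overflow_error:
        return "overflow_recovery"
    if turn.succeeded and turn.context_tokens > context_window - reserve_tokens:
        return "threshold_maintenance"
    return None


WINDOW = 262_000   # rounded window from /context list
RESERVE = 16_000   # hypothetical reserveTokens value

# A timed-out turn at ~14k tokens: neither path fires.
timed_out = compaction_trigger(TurnResult(succeeded=False, context_tokens=14_000), WINDOW, RESERVE)
# A successful turn close to the window: maintenance compaction fires.
near_full = compaction_trigger(TurnResult(succeeded=True, context_tokens=250_000), WINDOW, RESERVE)

print(timed_out, near_full)
```

A failed turn with an unrecognized error returns None here, which matches the "it says overflow but nothing compacts" symptom.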

The other big gotcha: compaction can’t shrink the system prompt

Compaction only summarizes conversation history. It does not reduce:

  • tool schemas (often thousands of tokens)
#

So if the real provider limit is smaller than what the catalog claims, you can overflow in a “new chat” even at low /context numbers — and compaction can’t help because there’s nothing to summarize yet.
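
A quick sketch of why compaction cannot rescue a new chat. It assumes roughly 14k tokens of fixed overhead (the rounded /context figure) and a hypothetical real provider limit of 12k; the function names and the summary size are made up for illustration.

```python
def prompt_tokens(fixed_overhead: int, history_tokens: int) -> int:
    """Total prompt size: system prompt and tool schemas plus conversation history."""
    return fixed_overhead + history_tokens


def compact(history_tokens: int, summary_tokens: int) -> int:
    """Compaction replaces history with a summary; it never touches the overhead."""
    return min(history_tokens, summary_tokens)


FIXED_OVERHEAD = 14_000  # assumed fixed overhead in a near-empty session
REAL_LIMIT = 12_000      # hypothetical real provider limit, below the catalog value

history = 0  # brand-new chat: nothing to summarize yet
before = prompt_tokens(FIXED_OVERHEAD, history)
after = prompt_tokens(FIXED_OVERHEAD, compact(history, summary_tokens=500))
still_overflows = after > REAL_LIMIT

print(before, after, still_overflows)
```

The prompt size is the same before and after compaction, so only shrinking the overhead (for example, fewer tool schemas) changes the outcome.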

Practical fix (fastest to validate)

Force the prompt overhead way down for kimi-coding:

{
tools: {
byProvider: {
"kimi-coding": { profile: "minimal" }
}
}
}
#

If that suddenly “fixes overflow on new chats”, it strongly suggests the provider’s actual context window is smaller / stricter than advertised (or it chokes on tool schemas).

To turn this into a solid bug report

Ask for one log snippet that includes the actual provider error payload around the “overflow”, plus /context list. Something like:

openclaw logs --limit 500 | grep -i -E "context overflow|prompt too large|kimi-coding|compaction|reserveTokens|All models failed|timed out"

If you paste that (or have them paste it), we can tell whether it’s:

  • real overflow (provider limit mismatch),
  • timeout/tool-loop being mislabeled as overflow,
  • or compaction classification not matching that provider’s error format.
tranquil glade
spring drum BOT
#

Nice catch — CLI/OpenClaw version and the running Gateway service version can drift, and the Gateway is what actually runs sessions + compaction + model catalogs. If the service is older, you can get exactly this kind of “it says overflow but /context shows plenty of room” weirdness.

Practical fix (for anyone else hitting this):

  1. Check what’s actually running:
openclaw --version
openclaw gateway status
openclaw status

If the service points at an older install/path, you’ve got a mismatch.

  2. Repair the service to use the current install, then restart:
openclaw doctor
openclaw gateway restart
#

(Doctor is the usual “rewrite the daemon entrypoint/config” fixer when versions drift.)

Good that you filed it: https://github.com/openclaw/openclaw/issues/15669
If you can, comment there with both versions (CLI vs Gateway) + the openclaw gateway status output — that’ll make it straightforward to reproduce.

If this solved it for you, please also mark the solution in the thread (right-click the solution message → Apps → ✅ Mark Solution) so Answer Overflow indexes it.

tardy dove
#

I was having some issues with this last night also

#

I haven't figured it out yet either

spring drum BOT
#

This one’s bitten a few people: CLI version ≠ running Gateway service version. The symptoms can look like “fake context overflow / compaction not triggering / weird model behavior”.

Quick checklist to confirm + fix:

  1. Compare versions + service entrypoint:
openclaw --version
openclaw gateway status

If gateway status shows the service using a different config/install path (or just behaves like an older build), you’ve got drift.

  2. Repair + restart:
openclaw doctor
openclaw gateway restart
#

Do these two commands and paste the output (redact tokens/paths if you want, but keep the version lines):

openclaw --version
openclaw gateway status

What we’re looking for is a mismatch where the CLI is one version but the Gateway service is running an older entrypoint (that’s what caused the “fake context overflow / no compaction” reports).

If you do see drift, the usual fix is:

openclaw doctor
openclaw gateway restart

After that, test in a fresh chat and run /context list.
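
If you want to compare the two version lines mechanically, here is a small sketch. The exact output format of openclaw --version and openclaw gateway status isn't shown in this thread, so the helper just pulls the first dotted number out of whatever text you paste in; the function names and sample strings are hypothetical.

```python
import re


def parse_version(text: str) -> tuple:
    """Extract the first dotted version number (e.g. 1.4.2) from a line of text."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def has_drift(cli_line: str, gateway_line: str) -> bool:
    """True when the CLI and the running Gateway report different versions."""
    return parse_version(cli_line) != parse_version(gateway_line)


# Hypothetical version lines, for illustration only.
drift = has_drift("openclaw 1.4.2", "gateway 1.3.9")
print(drift)
```

A True result points to the drift case described above, where openclaw doctor followed by openclaw gateway restart is the usual fix.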