About compaction | Friends of the Crustacean 🦞🤝 | Page 1

frail laurel Mar 10, 2026, 6:06 PM

In the OpenClaw dashboard I can see it compacting context, but compaction takes too long. What causes this, and how can I fix it?

molten mauve [BOT] Mar 10, 2026, 6:10 PM

Compaction is not “just a local cleanup” — it’s an actual LLM summarization run over a potentially huge chunk of your session history. So “Compacting context…” can take a while when any of these are true:

Why it can be slow

Your transcript is large (lots of messages + big tool outputs). More tokens in → slower summarization.
The session’s model is slow / rate-limited (compaction uses your primary model unless you override it).
It may be doing multiple passes: OpenClaw can do a silent memory flush before compaction, then compaction, then retry your original request. That can feel like “it’s stuck compacting”.

Things that usually fix it

Use a faster dedicated model for compaction (so your “main” model can be whatever you like, but compaction stays snappy):
agents.defaults.compaction.model: "<provider>/<model-id>"
Docs: https://docs.openclaw.ai/concepts/compaction
Reduce how much it tries to keep uncompressed (the more “recent tokens” you keep verbatim, the less room each compaction frees up, so the next compaction arrives sooner). The deep dive explains the knobs:
- reserveTokens
- keepRecentTokens
  Docs: https://docs.openclaw.ai/reference/session-management-compaction

Stop context bloat at the source
- If you have giant tool results (long exec output, big file reads), compaction has to chew through that history.
- Use /context detail to see what’s dominating your context, and consider session pruning (esp. for Anthropic) to keep old tool results from ballooning context between compactions.
  Docs: https://docs.openclaw.ai/concepts/context and https://docs.openclaw.ai/concepts/session-pruning
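
For the first fix, here is a minimal sketch of how the dotted path agents.defaults.compaction.model could be written out in openclaw.json. It assumes the usual nested JSON5 layout of that file, and the model value is a placeholder rather than a recommendation:

```json5
{
  agents: {
    defaults: {
      compaction: {
        // Placeholder, not a real model id: substitute a fast
        // summarization model you actually have access to.
        model: "<provider>/<model-id>",
      },
    },
  },
}
```

Your primary model setting stays as it is; only the summarization call made during compaction would use this model.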

Quick debugging questions (so we can be precise):

What does /status show for contextTokens / contextWindow and Compactions?
What’s your openclaw --version and which provider/model are you using for that session?
Is this a local model (Ollama/LM Studio/etc.) or cloud?

If you paste /status output, I can tell whether you’re hitting “huge transcript”, “slow model”, or “compaction thresholds too conservative.”

frail laurel Mar 10, 2026, 6:11 PM

Yes, I have a lot of text, plus exec results, so…

molten mauve [BOT] Mar 10, 2026, 6:12 PM

Yep — lots of exec output is the #1 reason compaction feels slow. Compaction has to “read” (tokenize + summarize) all that accumulated history, and tool results add up fast even when each one is individually truncated.

Practical ways to improve it:

1) Don’t dump huge command output into the session
- Prefer “summaries” at the shell level: | tail -n 200, | rg 'error|warn', --json + filter, etc.
- For logs, grab just the relevant slice instead of the whole thing.
2) Enable session pruning (prevents old tool results from bloating the prompt between turns)
- This trims old toolResult messages right before model calls (doesn’t rewrite the JSONL transcript, but it does keep context smaller so you hit compaction less / later).
- Config is agents.defaults.contextPruning (docs + examples):
  https://docs.openclaw.ai/concepts/session-pruning
- Note: right now it’s Anthropic-only (and OpenRouter Anthropic). If you’re on a different provider, tell me which one and I’ll point at the right lever.
3) Use a faster model just for compaction
- Set agents.defaults.compaction.model to a faster/cheaper summarization model you have access to, so your “main” model choice doesn’t make compaction painful.
  Docs: https://docs.openclaw.ai/concepts/compaction
4) If you’re already deep in the weeds: start a fresh session
- /new (or /reset) is the instant fix when a session transcript is huge and you don’t need it all kept “alive”.

If you paste:

- /status (especially contextTokens + compactions count)
- what provider/model you’re using

I can tell whether pruning will help immediately, or whether you should mostly solve it by swapping the compaction model / tuning keepRecentTokens & reserveTokens (deep dive: https://docs.openclaw.ai/reference/session-management-compaction).

frail laurel Mar 10, 2026, 6:13 PM

It’s OpenAI GPT-4.5. Is there a good way to solve this? Thinking is also set to high, which I think should be none or low.

molten mauve [BOT] Mar 10, 2026, 6:15 PM

If you’re on openai/gpt-4.5 and /think high, then yeah: compaction is going to feel chunky. Compaction is essentially a big summarization call, and “high thinking” adds extra latency on top.

1) Drop thinking for normal chat (and only turn it up when you need it)

You can persist it for the session with a directive-only message:

/think off (or /think low / /think minimal)

Then, when you do want deeper reasoning for one message, you can add it inline (non-persistent hint) to just that request.

Docs (exact options + behavior): https://docs.openclaw.ai/tools/slash-commands

2) Use a different (faster) model just for compaction

By default compaction uses the session’s primary model — so if your primary is slow/expensive, compaction will be too. You can override:

agents.defaults.compaction.model: "<provider>/<model-id>"

Pick something you have access to that’s “fast + good at summarizing” rather than “max intelligence”.

Docs: https://docs.openclaw.ai/concepts/compaction

3) Exec output is the gasoline on the fire

Even truncated outputs accumulate. Best mitigation is operational:

- pipe/slice output (| tail -n 200, | rg ..., etc.)
- avoid dumping whole logs into the session
- if the session is already huge, /new is the instant reset

One quick check

Run /status once and see how close you are to the context window + how often compaction has happened. If you paste that (plus your openclaw --version), I can tell whether you’re mostly suffering from thinking level, model choice for compaction, or just sheer transcript size.

frail laurel Mar 10, 2026, 6:19 PM

It should not take 10–20 minutes, right?

molten mauve [BOT] Mar 10, 2026, 6:21 PM

No — 10–20 minutes is not normal for compaction.

In most setups compaction is on the order of seconds to a couple minutes. When it’s that long, it usually means the run is not “just compacting” but is stuck/retrying behind the scenes.

Common causes in your case (OpenAI + lots of exec output + /think high):

1) You’re effectively paying for multiple slow calls
- (optional) silent memory flush
- compaction summarization
- retry of the original request
  With /think high, each of those can get painfully slow.
2) The OpenAI request is stalling or retrying (rate limits / network / transport)
- You’ll see this immediately in logs as repeated retries / long gaps.

What I’d do first (fast triage)

In-chat: /think off (or /think low) then run /compact once manually.

On the host: run openclaw logs --follow while it’s “compacting” and look for 429s/timeouts/retry loops. (This is in the official troubleshooting ladder: https://docs.openclaw.ai/gateway/troubleshooting)

If logs suggest OpenAI transport weirdness

You can try forcing SSE (often more stable than WS for some networks) and/or disabling WS warmup:

agents.defaults.models["openai/<your-model>"].params.transport: "sse"
...params.openaiWsWarmup: false
Docs: https://docs.openclaw.ai/providers/openai
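
Written out as nested config, the transport override might look like the sketch below. It assumes the same nested JSON5 layout as the rest of openclaw.json, keeps "openai/<your-model>" as a placeholder for your actual model key, and shows only the transport key, since that is the one whose full path is spelled out above:

```json5
{
  agents: {
    defaults: {
      models: {
        // Placeholder key: replace with the model id you actually use.
        "openai/<your-model>": {
          params: {
            // Force SSE instead of WebSocket for this model's requests.
            transport: "sse",
          },
        },
      },
    },
  },
}
```

Only change this if the logs actually point at transport problems; it will not help with a transcript that is simply too large.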

Longer-term fixes

Keep /think low/off by default, only bump it for one-off prompts.
Set a faster compaction model via agents.defaults.compaction.model (so compaction doesn’t inherit your “heavy” model).

If you paste the relevant log snippet around one compaction attempt (even just the error/retry lines) + openclaw --version, I can tell you which bucket you’re in (stall vs rate limit vs “too much context + high thinking”).

frail laurel Mar 10, 2026, 6:29 PM

Agent embedded run timeout during compaction?

molten mauve [BOT] Mar 10, 2026, 6:30 PM

Yep — that error is exactly what it sounds like: compaction runs inside the normal embedded agent loop, and the loop has a hard timeout.

From the agent-loop docs, the embedded runtime timeout is:

agents.defaults.timeoutSeconds (default 600s / 10 minutes)
https://docs.openclaw.ai/concepts/agent-loop#timeouts

So if compaction (plus any pre-flush + retry) takes longer than that, you’ll see “Agent embedded run timeout” and the compaction won’t finish.

Best fixes (recommended)

1) Turn thinking down (high thinking makes compaction much slower)
- /think off (or /think low) in that session
2) Use a faster model for compaction
- agents.defaults.compaction.model: "<provider>/<model-id>"
  (so your main model can stay “smart”, but compaction stays fast)
3) Reduce giant exec dumps (they’re the usual reason compaction becomes huge)

If you just need it to stop timing out

Increase the runtime timeout in openclaw.json, e.g.:

{
  agents: {
    defaults: {
      timeoutSeconds: 1200, // 20 minutes
    },
  },
}

If you paste the /status output (contextTokens/contextWindow + compactions count), I can help you pick whether this is better solved by timeoutSeconds vs compaction model + thinking level (usually the latter).

frail laurel Mar 10, 2026, 6:34 PM

Is there an easier way to compact? Compaction is always causing problems. Should I create an outside-in compaction mode?

molten mauve [BOT] Mar 10, 2026, 6:35 PM

“Outside‑in compaction mode” isn’t a built-in thing today (at least there’s no documented mode: "outside-in" in agents.defaults.compaction). So I wouldn’t start by inventing a new mode — the problems you’re seeing are almost certainly from runtime timeout + huge transcript + slow settings, not the compaction strategy itself.

Easiest / most reliable ways to make compaction stop being painful

Make compaction fast

Set default thinking low/off: /think off (or /think low)
Use a dedicated compaction model: agents.defaults.compaction.model: "<provider>/<model-id>" (pick something fast at summarization)

Stop it timing out mid-compaction

If you’re hitting “Agent embedded run timeout”, bump:
- agents.defaults.timeoutSeconds (default is 600s / 10 min)
  Docs: https://docs.openclaw.ai/concepts/agent-loop#timeouts

Prevent the transcript from becoming “uncompactable”

Biggest win: don’t persist huge exec dumps (slice output with tail, rg, etc.)
Compaction has to read that history to summarize it — so garbage in = slow compaction.

Operational workaround: compact earlier / reset more often

Run /compact at natural milestones (before the session becomes enormous)
Or /new when the thread has served its purpose

If you really want custom behavior (advanced)

OpenClaw can delegate /compact + context assembly to a context-engine plugin (kind: "context-engine" via plugins.slots.contextEngine), but that’s a real engineering project, not a quick config tweak. Docs: https://docs.openclaw.ai/tools/plugin

If you paste your /status and the exact timeout/error line from logs, I can suggest a minimal config that usually fixes this (timeoutSeconds + thinking + compaction.model).
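
In the meantime, a rough starting point that combines the two config-side pieces from this thread. Treat it as a sketch: the 1200s value is only an example, and the model is a placeholder you need to fill in with something you have access to:

```json5
{
  agents: {
    defaults: {
      // Embedded run timeout; the default is 600 (10 minutes).
      timeoutSeconds: 1200, // 20 minutes
      compaction: {
        // Placeholder, not a real model id.
        model: "<provider>/<model-id>",
      },
    },
  },
}
```

Thinking level is not part of this snippet; set it in-chat with /think off or /think low.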

frail laurel Mar 10, 2026, 6:48 PM

Which model is best for compaction, then?
