#Auto compactions issue


last torrentBOT
#

First check version please:

openclaw --version

Likely explanation: 1M native context ≠ OpenClaw’s runtime context budget. For gpt-5.5 via Codex OAuth, OpenClaw documents a native contextWindow of 1,000,000, but a default runtime contextTokens cap of 272,000, so auto-compaction before 1M is expected unless you raise that cap. (docs.openclaw.ai)

To confirm what’s actually active, run in that chat:

/status
/context detail

If it’s Codex OAuth, you can raise the runtime cap in ~/.openclaw/openclaw.json:

{
  models: {
    providers: {
      "openai-codex": {
        models: [{ id: "gpt-5.5", contextTokens: 512000 }],
      },
    },
  },
}

Then restart the gateway.

If you’re using direct OpenAI Responses (openai/* on api.openai.com), OpenClaw also auto-enables server-side compaction with a default threshold around 70% of the model context window; that can be tuned/disabled with responsesCompactThreshold / responsesServerCompaction. OpenAI’s Responses API itself has compaction support for long-running conversations. (docs.openclaw.ai)

That error means the run ended without a usable visible assistant response; the “tool actions may have already been executed” part means don’t blindly retry if it was doing file/system actions. Grab:

frail prism
#

so i did raise the context in openclaw.json and i am using codex gpt-5.5

last torrentBOT
#

Gotcha — then the key thing is: did you raise contextTokens on the exact openai-codex/gpt-5.5 model row?

For Codex GPT-5.5, setting only contextWindow: 1000000 won’t necessarily change the runtime compaction budget, because the built-in row already has contextTokens: 272000.

Use this shape:

{
  models: {
    providers: {
      "openai-codex": {
        models: [
          {
            id: "gpt-5.5",
            contextWindow: 1000000,
            contextTokens: 512000
          }
        ]
      }
    }
  }
}

Then:

openclaw gateway restart

And verify in the same chat:

/status
/context detail

If /status still shows ~272k context, then either:

  1. config is under openai instead of openai-codex,
  2. gateway didn’t reload the config,
  3. the current session has stale model/context state — try /new,
  4. or the model override in that chat is not actually openai-codex/gpt-5.5.

Also please paste openclaw --version + the relevant redacted config snippet if it still compacts early.

frail prism
#

so this is my settings

• openai-codex/gpt-5.4
• contextWindow: 1050000
• contextTokens: 1050000
• openai-codex/gpt-5.5
• contextWindow: 1000000
• contextTokens: 1000000

last torrentBOT
#

Those model entries look right. If it’s still compacting early, I’d check these next:

  1. Agent-level override beats the model row

    Check if you have a lower cap here:

    openclaw config get agents.defaults.contextTokens
    openclaw config get agents.list
    

    If agents.defaults.contextTokens or an agent’s contextTokens is set lower, remove/raise it.

  2. Verify the active session is actually using that route

    In the chat, run:

   /status
   /context detail

You want to see openai-codex/gpt-5.5 and ~1,000,000 context.
If it says openai/gpt-5.5 + Runtime: OpenAI Codex, that’s the native Codex harness path, not the openai-codex/* Pi route.

  3. Expected compaction point is not exactly 1M

    OpenClaw/Pi keeps reserve headroom (the default floor is ~20k tokens), so with a 1M context it can compact around 980k, not exactly 1,000,000.

  4. If you changed config recently

    Restart + fresh session:

   openclaw gateway restart

Then in chat:

/new
/status

If /status shows 1M but it compacts way before ~980k, paste openclaw --version, /status, and the log lines around the compaction/error. Then it’s probably overflow recovery or runtime mismatch rather than config.

frail prism
#

openclaw config get agents.defaults.contextTokens

🦞 OpenClaw 2026.4.24 (cbcfdf6) — If you can describe it, I can probably automate it—or at least make it funnier.

Config path not found: agents.defaults.contextTokens

last torrentBOT
#

That result is actually good — Config path not found means there is no global agents.defaults.contextTokens override, so that’s not what’s forcing it lower.

Next checks:

openclaw config get agents.list
openclaw models list --provider openai-codex

In agents.list, look for any per-agent contextTokens value. If there’s none, check whether the model catalog is really seeing your 1M override:

  • If models list shows gpt-5.5 as 272k/1000k or similar → your model override isn’t being applied.
  • If it shows 1000k → config is fine, then the session/runtime is the next suspect.

Then in the same chat/session that compacts:

/status
/context detail

If /status still shows the old context cap, try:

/new
/model openai-codex/gpt-5.5
/status

Also: don’t paste full agents.list if it contains secrets/tokens — redact first. Docs for the relevant model config bit: https://docs.openclaw.ai/providers/openai

frail prism
#

openclaw models list --provider openai-codex

🦞 OpenClaw 2026.4.24 (cbcfdf6) — If it works, it's automation; if it breaks, it's a "learning opportunity."

Model                 Input  Ctx    Local  Auth  Tags
openai-codex/gpt-5.5  text   977k   no     yes   default,configured
openai-codex/gpt-5.4  text   1025k  no     yes   configured

last torrentBOT
#

That output means your override is being picked up. 977k is just how OpenClaw displays 1,000,000 tokens (the Ctx column appears to divide by 1024, which also turns 1,050,000 into 1025k), so model config is not the problem now.
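As a quick sanity check on that reading of the Ctx column, here is a minimal sketch of the arithmetic. `fmt_k` is a made-up helper, and the divide-by-1024, round-to-nearest behaviour is inferred from the two values in the listing, not taken from OpenClaw's source.

```bash
# Hypothetical helper (not an OpenClaw command): format a token count in "k",
# assuming a 1024 divisor and round-to-nearest, as inferred from the listing.
fmt_k() {
  echo "$(( ($1 + 512) / 1024 ))k"
}

fmt_k 1000000   # contextTokens configured for gpt-5.5
fmt_k 1050000   # contextTokens configured for gpt-5.4
```

Both results line up with the Ctx column (977k and 1025k), which is consistent with the 1M and 1.05M overrides being applied.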

So next suspect is the active session/runtime state or the compaction run itself failing.

Please run in the same chat that’s failing:

/status
/context detail

Things to check:

  • If /status shows openai-codex/gpt-5.5 and ~977k ctx, config is fine.
  • If it shows a lower ctx, do:
/new
/model openai-codex/gpt-5.5
/status

Also, compaction near ~957k–977k is expected because OpenClaw/Pi reserves output/headroom. But if it’s compacting much earlier or repeatedly failing, grab:

openclaw logs --limit 300 --plain

Look for lines around compaction, context length, request_too_large, or the exact “couldn't generate a response” event.

frail prism
#

Tokens: 28k in / 3.3k out
🗄️ Cache: 70% hit · 66k cached, 0 new
📚 Context: 67k/1.0m (7%) · 🧹 Compactions: 0
📊 Usage: 5h 97% left ⏱2h 6m · Week 68% left ⏱1d 5h

last torrentBOT
#

That status says the current session is not near compaction:

  • Context: 67k/1.0m (7%)
  • Compactions: 0

So the config is working. The “Agent couldn't generate a response” error is likely not from context pressure in this session.

Next thing I’d check is logs around the failed turn:

openclaw logs --limit 300 --plain

Search for lines mentioning:

couldn't generate
compaction
context length
request_too_large
openai-codex
error

If the error happened in another chat/thread/session, run /status there too — session state is per conversation. If logs don’t show a clear provider error, try /new once and retry the same prompt; that will tell us whether it’s stale transcript/session-specific or a broader Codex runtime issue.

frail prism
#

its very difficult to search can u tell me a command that does that

last torrentBOT
#

Yep — run this right after the failure:

openclaw logs --limit 1000 --plain | grep -nEi -C 3 "couldn.?t generate|agent couldn|compaction|auto-compaction|context length|request_too_large|input exceeds|too long|openai-codex|gpt-5\.5|error|fail|exception"

If that prints too much, narrower version:

openclaw logs --limit 1000 --plain | grep -nEi -C 5 "couldn.?t generate|compaction|context length|request_too_large"

Paste the matching chunk here, but redact tokens/keys/URLs if any show up.

frail prism
#

the log is too big how can i share here

last torrentBOT
#

Don’t share the whole log — just extract a small redacted slice.

Try this:

openclaw logs --limit 2000 --plain \
  | grep -nEi -C 8 "couldn.?t generate|agent couldn|compaction|auto-compaction|context length|request_too_large|input exceeds|too long|openai-codex|gpt-5\.5|error|fail|exception" \
  | tail -200

If it’s still huge, save it to a file:

openclaw logs --limit 2000 --plain \
  | grep -nEi -C 8 "couldn.?t generate|agent couldn|compaction|auto-compaction|context length|request_too_large|input exceeds|too long|openai-codex|gpt-5\.5|error|fail|exception" \
  | tail -200 > openclaw-error-snippet.txt

Then upload openclaw-error-snippet.txt here.

Before uploading, redact anything that looks like:

  • API keys / tokens
  • gateway URLs with tokens
  • private file paths you don’t want public
  • personal message content if unrelated

frail prism
#

i cannot share files here

last torrentBOT
#

No worries — don’t upload a file. Just paste a small terminal snippet.

Run this; it should print max ~80 redacted lines:

openclaw logs --limit 2000 --plain \
  | grep -nEi -C 4 "couldn.?t generate|agent couldn|compaction|context length|request_too_large|input exceeds|too long|openai-codex|gpt-5\.5|HTTP [45][0-9][0-9]|error|exception" \
  | sed -E 's/(sk-[A-Za-z0-9_-]+)/[REDACTED_KEY]/g; s/(token=)[^ ]+/\1[REDACTED]/g; s/(authorization: bearer )[A-Za-z0-9._-]+/\1[REDACTED]/Ig' \
  | tail -80

Then paste the output here inside triple backticks:

```text
paste output here
```

If it’s still too much, run the same command with tail -40 instead of tail -80.
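To see what that sed stage does before trusting it with real logs, you can run it on a throwaway line first. A minimal sketch: the sample line and its key/token values are fake placeholders, `redact` is just a local wrapper name, and the trailing `I` flag (case-insensitive match) assumes GNU sed.

```bash
# Fake sample line; none of these values are real credentials.
sample='GET /hook?token=abc123 authorization: bearer xyz.789 key=sk-test_1234'

# Same three substitutions as the pipeline above, wrapped in a function.
redact() {
  sed -E 's/(sk-[A-Za-z0-9_-]+)/[REDACTED_KEY]/g; s/(token=)[^ ]+/\1[REDACTED]/g; s/(authorization: bearer )[A-Za-z0-9._-]+/\1[REDACTED]/Ig'
}

printf '%s\n' "$sample" | redact
```

If the printed line still shows anything secret-looking, add another `s/.../.../` rule for that pattern before pasting real output.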

frail prism
#

how about i share it here in parts

last torrentBOT
#

Yes, that works — but please share filtered chunks, not the whole raw log.

Best format:

PART 1/3
```log
<paste ~80 lines>
```

Then PART 2/3, etc.

Before pasting, run this so each part is already filtered/redacted:

```bash
openclaw logs --limit 3000 --plain \
  | grep -nEi -C 5 "couldn.?t generate|agent couldn|compaction|context length|request_too_large|input exceeds|too long|openai-codex|gpt-5\.5|HTTP [45][0-9][0-9]|error|exception" \
  | sed -E 's/(sk-[A-Za-z0-9_-]+)/[REDACTED_KEY]/g; s/(token=)[^ ]+/\1[REDACTED]/g; s/(authorization: bearer )[A-Za-z0-9._-]+/\1[REDACTED]/Ig'
```

If it prints a ton, paste only the parts around the most recent failure timestamp.

frail prism
#

Part 1/3

33:2026-04-26T23:17:52.473Z warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"57df549f-96e2-459f-8108-e425961837f7","isError":true,"error":"LLM request timed out.","failoverReason":"timeout","model":"gpt-5.5","provider":"openai-codex","rawErrorPreview":"terminated","rawErrorHash":"sha256:e8c95a2a7271","providerRuntimeFailureKind":"timeout"} embedded run agent end
34:2026-04-26T23:27:39.351Z warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"e233e688-bb4b-40b6-8cf2-34847760e36e","isError":true,"error":"LLM request timed out.","failoverReason":"timeout","model":"gpt-5.5","provider":"openai-codex","rawErrorPreview":"terminated","rawErrorHash":"sha256:e8c95a2a7271","providerRuntimeFailureKind":"timeout"} embedded run agent end
35:2026-04-26T23:27:54.128Z warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"e233e688-bb4b-40b6-8cf2-34847760e36e","isError":true,"error":"LLM request timed out.","failoverReason":"timeout","model":"gpt-5.5","provider":"openai-codex","rawErrorPreview":"terminated","rawErrorHash":"sha256:e8c95a2a7271","providerRuntimeFailureKind":"timeout"} embedded run agent end
36:2026-04-26T23:38:48.381Z warn agent/embedded {"subsystem":"agent/embedded"} {"event":"embedded_run_agent_end","tags":["error_handling","lifecycle","agent_end","assistant_error"],"runId":"7f2f292f-4829-45fe-9ff8-c53b0e6e757e","isError":true,"error":"LLM request rejected: Your input exceeds the context window of this model. Please adjust your input and try again.","failoverReason":null,"model":"gpt-5.5","provider":"openai-codex","rawErrorPreview":"Codex error:

last torrentBOT
#

Part 1 already tells us something useful: there are two separate failures:

  1. LLM request timed out
  2. LLM request rejected: Your input exceeds the context window of this model

So the compaction is probably OpenClaw reacting to a provider-side context rejection, not your config being ignored.

Please send part 2/3, especially the lines immediately after:

runId":"7f2f292f-4829-45fe-9ff8-c53b0e6e757e"

If easier, run this focused command:

openclaw logs --limit 3000 --plain \
  | grep -n -C 20 "7f2f292f-4829-45fe-9ff8-c53b0e6e757e"

That should give the exact failure block without the whole log.
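Once you have the filtered lines, a rough count of each distinct error message shows which failure dominates. A minimal sketch, assuming the log keeps the `"error":"..."` JSON field seen in Part 1; the sample lines are abbreviated copies of those entries, and `summarize_errors` is a made-up helper name.

```bash
# Abbreviated lines modeled on the Part 1 entries (most fields trimmed).
sample_log='{"event":"embedded_run_agent_end","isError":true,"error":"LLM request timed out.","model":"gpt-5.5"}
{"event":"embedded_run_agent_end","isError":true,"error":"LLM request timed out.","model":"gpt-5.5"}
{"event":"embedded_run_agent_end","isError":true,"error":"LLM request rejected: Your input exceeds the context window of this model. Please adjust your input and try again.","model":"gpt-5.5"}'

# Extract the "error" field from each line and count occurrences, most frequent first.
summarize_errors() {
  grep -oE '"error":"[^"]*"' | sort | uniq -c | sort -rn
}

printf '%s\n' "$sample_log" | summarize_errors
```

Against the real log, the same function can sit at the end of the pipeline: `openclaw logs --limit 3000 --plain | summarize_errors`.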

frail prism
#

54-Docs: /tools/browser#playwright-requirement raw_params={"action":"act","targetId":"F1237BDBBC37D0CE82738F1029EF8041","kind":"evaluate","fn":"() => ({title: document.title, hasError: !!document.querySelector('.error'), headings: [...document.querySelectorAll('h2')].map(x=>x.textContent).slice(0,20)})"}
55:2026-04-26T23:40:48.584Z warn agent/embedded {"subsystem":"agent/embedded"} compaction retry aggregate timeout (60000ms): proceeding with pre-compaction state runId=7f2f292f-4829-45fe-9ff8-c53b0e6e757e sessionId=4f128ca8-0ca6-4af2-8ce5-513b923faf2d
56:2026-04-26T23:40:48.591Z warn agent/embedded {"subsystem":"agent/embedded"} using current snapshot: timed out during compaction runId=7f2f292f-4829-45fe-9ff8-c53b0e6e757e sessionId=4f128ca8-0ca6-4af2-8ce5-513b923faf2d
57:2026-04-26T23:41:14.860Z warn agent/embedded {"subsystem":"agent/embedded"} incomplete turn detected: runId=7f2f292f-4829-45fe-9ff8-c53b0e6e757e sessionId=4f128ca8-0ca6-4af2-8ce5-513b923faf2d stopReason=toolUse payloads=0 — surfacing error to user

last torrentBOT
#

Yep, this log explains the behavior:

  • Codex rejected the request: input exceeds context window
  • OpenClaw tried overflow-recovery compaction
  • That retry wait hit a hard 60s aggregate timeout
  • Then the turn ended incomplete after a tool call, so you got the scary “tool actions may have already been executed” message

So your 1M config is recognized, but the provider is still rejecting some real prompt as too large, and recovery compaction isn’t finishing fast enough.

Best workaround: lower the runtime cap so OpenClaw compacts before Codex rejects it:

{
  models: {
    providers: {
      "openai-codex": {
        models: [
          { id: "gpt-5.5", contextWindow: 1000000, contextTokens: 750000 },
          { id: "gpt-5.4", contextWindow: 1050000, contextTokens: 800000 }
        ]
      }
    }
  }
}

Then:

openclaw gateway restart

Optional but helpful: make compaction visible so you know when it happens:

{
  agents: {
    defaults: {
      compaction: {
        notifyUser: true,
        keepRecentTokens: 50000
      }
    }
  }
}

If this still happens with contextTokens: 750000, I’d file it as a GitHub issue with those log lines — especially the compaction retry aggregate timeout (60000ms) part: https://github.com/openclaw/openclaw/issues

frail prism
#

what is this keepRecentTokens: 50000

last torrentBOT
#

keepRecentTokens: 50000 means: when OpenClaw compacts, keep roughly the most recent 50k tokens verbatim, and summarize older history.

So after compaction, the next prompt has:

  • compacted summary of older stuff
  • last ~50k tokens exactly as-is

It’s useful because recent tool calls/errors/files stay intact instead of only being summarized.

You don’t need to set it. I suggested it as a safer long-session setting. If you’re unsure, leave it out and only lower contextTokens first:

{
  models: {
    providers: {
      "openai-codex": {
        models: [
          { id: "gpt-5.5", contextWindow: 1000000, contextTokens: 750000 }
        ]
      }
    }
  }
}

That’s the main fix.

frail prism
#

this is my compaction settings

{
"mode": "safeguard",
"reserveTokensFloor": 0,
"memoryFlush": {
"enabled": false,
"softThresholdTokens": 4000,
"prompt": "Review the session for any architectural decisions, bug fixes, or new patterns. Write personal notes to memory/YYYY-MM-DD.md. Reply NO_REPLY if nothing to store.",
"systemPrompt": "Session nearing compaction. Store durable memories now."
},
"notifyUser": true
}

last torrentBOT
#

Those settings are mostly fine, but reserveTokensFloor: 0 is working against what you want.

It disables OpenClaw’s safety floor, so compaction can happen very close to the provider limit. Since Codex is rejecting before OpenClaw thinks it must compact, add an explicit reserve:

{
  "mode": "safeguard",
  "reserveTokens": 150000,
  "reserveTokensFloor": 0,
  "memoryFlush": {
    "enabled": false
  },
  "notifyUser": true
}

What that means:

  • reserveTokens: 150000 → start compacting with ~150k tokens of headroom.
  • reserveTokensFloor: 0 → don’t force OpenClaw’s default minimum; okay if you set reserveTokens yourself.
  • memoryFlush.enabled: false → fine; avoids an extra pre-compaction agent turn.
  • notifyUser: true → just shows compaction notices.

I’d still also lower model contextTokens from 1000000 to something like 750000–850000. That’s the more reliable guard against Codex-side “input exceeds context window” errors.

frail prism
#

so what settings do u suggest look at my setting and then tell me updated settings

last torrentBOT
#

Yep — based on your logs/settings, I’d use this balanced safer setup:

{
  "models": {
    "providers": {
      "openai-codex": {
        "models": [
          {
            "id": "gpt-5.5",
            "contextWindow": 1000000,
            "contextTokens": 800000
          },
          {
            "id": "gpt-5.4",
            "contextWindow": 1050000,
            "contextTokens": 850000
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "compaction": {
        "mode": "safeguard",
        "reserveTokens": 150000,
        "reserveTokensFloor": 0,
        "keepRecentTokens": 50000,
        "timeoutSeconds": 900,
        "truncateAfterCompaction": true,
        "memoryFlush": {
          "enabled": false
        },
        "notifyUser": true
      }
    }
  }
}

Why this combo:

  • contextTokens: 800000 → don’t drive Codex all the way to the edge of its advertised 1M.
  • reserveTokens: 150000 → compact proactively before provider rejection.
  • keepRecentTokens: 50000 → keep recent work/tool calls intact after compaction.
  • truncateAfterCompaction: true → keeps active transcript smaller after compaction.
  • memoryFlush.enabled: false → avoids extra pre-compaction turn/time.

Then run:

openclaw gateway restart

And in chat:

/new
/status

You should see roughly 800k context for gpt-5.5.

frail prism
#

i should remove the prompts? and system prompts?

last torrentBOT
#

Yes — since you have:

memoryFlush: { enabled: false }

you can remove the prompt, systemPrompt, and softThresholdTokens fields. They won’t help while memory flush is disabled, and they just make the config noisier.

Use:

"memoryFlush": {
  "enabled": false
}

If later you re-enable memory flush, then add custom prompt / systemPrompt back. For now, simpler is better.

rain oxide
#

AFAIK, the 1mil context is only available with API and not through codex oauth

#

if that helps

last torrentBOT
#

Yeah, good catch — Json is right here. OpenAI’s GPT‑5.5 announcement says Codex gets GPT‑5.5 with a 400K context window, while the API version is the one with 1M context. (openai.com)

So for openai-codex/gpt-5.5, I’d update to something like:

{
  "models": {
    "providers": {
      "openai-codex": {
        "models": [
          {
            "id": "gpt-5.5",
            "contextWindow": 400000,
            "contextTokens": 320000
          },
          {
            "id": "gpt-5.4",
            "contextWindow": 400000,
            "contextTokens": 320000
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "compaction": {
        "mode": "safeguard",
        "reserveTokens": 80000,
        "reserveTokensFloor": 0,
        "keepRecentTokens": 50000,
        "timeoutSeconds": 900,
        "truncateAfterCompaction": true,
        "memoryFlush": {
          "enabled": false
        },
        "notifyUser": true
      }
    }
  }
}

That explains your logs perfectly: OpenClaw was configured for 1M, but Codex OAuth upstream rejected the real request as over its effective context window.

If you want the actual 1M path, that would be API-key openai/gpt-5.5 when available/eligible — not openai-codex/gpt-5.5.

#

Yes, that helps a lot — and I think Json is right.

The safer assumption is:

  • openai-codex/gpt-5.5 via Codex OAuth: treat as ~400k max, not 1M.
  • OpenAI’s API model docs may list gpt-5.5 with 1M, but ChatGPT/Codex surfaces have lower context limits. OpenAI’s Help page lists GPT‑5.5 Thinking as 400k for Pro and 256K for paid tiers. (help.openai.com)

So update your settings to:

{
  "models": {
    "providers": {
      "openai-codex": {
        "models": [
          {
            "id": "gpt-5.5",
            "contextWindow": 400000,
            "contextTokens": 320000
          },
          {
            "id": "gpt-5.4",
            "contextWindow": 400000,
            "contextTokens": 320000
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "compaction": {
        "mode": "safeguard",
        "reserveTokens": 80000,
        "reserveTokensFloor": 0,
        "keepRecentTokens": 50000,
        "timeoutSeconds": 900,
        "truncateAfterCompaction": true,
        "memoryFlush": {
          "enabled": false
        },
        "notifyUser": true
      }
    }
  }
}

That should stop OpenClaw from driving Codex past the real upstream limit and hitting the “input exceeds context window” rejection.

rain oxide
#

thanks for verifying krill, you da best

last torrentBOT
#

Happy to help! 🦐 Nice catch — that explains the whole “config says 1M but upstream says nope” weirdness.

frail prism
#

400k is for both 5.4 and 5.5 in codex?

last torrentBOT
#

Not exactly — correction:

  • GPT‑5.5 in Codex: yes, OpenAI says 400K context. (openai.com)
  • GPT‑5.4 in Codex: OpenAI’s GPT‑5.4 announcement says Codex had experimental 1M context support, and the API model is listed at 1.05M. (openai.com)

So safest settings:

{ id: "gpt-5.5", contextWindow: 400000, contextTokens: 320000 }

For gpt-5.4, you can either be conservative:

{ id: "gpt-5.4", contextWindow: 400000, contextTokens: 320000 }

or test the long-context path:

{ id: "gpt-5.4", contextWindow: 1050000, contextTokens: 850000 }

Given your actual failure was on gpt-5.5, I’d definitely cap 5.5 to ~320k.

#

You got it — hope that gets the compaction weirdness under control 🦐

frail prism
#

{
"id": "gpt-5.5",
"contextWindow": 400000,
"contextTokens": 350000
},
{
"id": "gpt-5.4",
"contextWindow": 1050000,
"contextTokens": 950000
}

also for compaction make these changes

"reserveTokens": 50000,
"reserveTokensFloor": 0,
"keepRecentTokens": 50000,
"timeoutSeconds": 900,
"truncateAfterCompaction": true,

what does this setting mean, when will auto compaction happen?

last torrentBOT
#

Yes — with those settings, auto-compaction should happen roughly here:

GPT‑5.5

contextWindow: 400000,
contextTokens: 350000,
reserveTokens: 50000

Effective budget is ~350k, and OpenClaw keeps 50k headroom.

So proactive compaction happens around:

350k - 50k = ~300k context used

GPT‑5.4

contextWindow: 1050000,
contextTokens: 950000,
reserveTokens: 50000

Compaction around:

950k - 50k = ~900k context used

What each setting means:

  • contextWindow = native/upstream max you believe the model supports.
  • contextTokens = OpenClaw’s runtime cap; it should stay below the real upstream limit.
  • reserveTokens: 50000 = leave ~50k tokens free for output/tool loops/provider overhead.
  • reserveTokensFloor: 0 = don’t force OpenClaw’s default reserve floor; use your explicit reserveTokens.
  • keepRecentTokens: 50000 = after compaction, keep the latest ~50k tokens verbatim and summarize older history.
  • timeoutSeconds: 900 = allow a single compaction operation up to 15 minutes.
  • truncateAfterCompaction: true = after successful compaction, rotate to a smaller active transcript.

Also: if Codex rejects before those thresholds, OpenClaw may still do overflow recovery compaction earlier. That’s expected.
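The same arithmetic as a small helper, in case you want to try other values. This is a sketch only: it assumes proactive compaction triggers at contextTokens minus the larger of reserveTokens and reserveTokensFloor, which is how the settings are described above, and `compaction_point` is a hypothetical name, not an OpenClaw command.

```bash
# Sketch only. Assumed rule: trigger = contextTokens - max(reserveTokens, reserveTokensFloor).
compaction_point() {
  local context_tokens=$1 reserve=$2 floor=$3
  if [ "$reserve" -lt "$floor" ]; then
    reserve=$floor   # the floor acts as a minimum reserve
  fi
  echo $(( context_tokens - reserve ))
}

compaction_point 350000 50000 0      # gpt-5.5 with the settings above
compaction_point 950000 50000 0      # gpt-5.4 with the settings above
compaction_point 1000000 0 20000     # 1M cap with only a ~20k floor
```

The third call reproduces the earlier ~980k estimate for a 1M cap with only the default floor in play.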

frail prism
#

on updating openclaw i get this error

Updating plugins...
No plugin updates needed.
Completion cache update failed: Error: spawnSync /usr/bin/node ETIMEDOUT

last torrentBOT
#

That one is probably non-fatal. It means the update finished, but the post-update shell completion cache generation timed out after ~30s.

It only affects CLI tab-completion, not the gateway/model/compaction stuff.

Run:

openclaw --version
openclaw doctor

If OpenClaw updated correctly, manually regenerate the completion cache:

OPENCLAW_COMPLETION_SKIP_PLUGIN_COMMANDS=1 openclaw completion --write-state

If that works, you’re done.

If it hangs/times out too, you can ignore it for now unless shell tab-completion is important. If your terminal startup feels slow, run:

openclaw doctor --fix

Docs: https://docs.openclaw.ai/cli/completion

frail prism
#

my openclaw is very slow

last torrentBOT
#

Slow in which part — CLI commands, gateway startup, or agent replies? Quick triage:

openclaw --version
time openclaw status
openclaw gateway status --deep
openclaw doctor

If CLI commands are slow / completion cache timed out, run:

openclaw doctor --fix
OPENCLAW_COMPLETION_SKIP_PLUGIN_COMMANDS=1 openclaw completion --write-state

If agent replies are slow, check in chat:

/status
/context detail

For Codex GPT‑5.5, don’t run it near fake 1M. Use your safer cap:

{ "id": "gpt-5.5", "contextWindow": 400000, "contextTokens": 350000 }

Also after config changes:

openclaw gateway restart

Then in chat:

/new
/status

If it’s still slow, paste openclaw status --all redacted. That’ll show whether it’s model latency, gateway/plugin issues, huge context, or service health.

frail prism
#

no i mean its taking too much cpu

last torrentBOT
#

Ah, CPU — got it.

First identify which process is burning CPU:

ps -eo pid,ppid,pcpu,pmem,etime,command \
  | grep -E '[o]penclaw|[n]ode' \
  | sort -k3 -nr \
  | head -20

Then check what OpenClaw is doing:

openclaw status --all
openclaw logs --limit 500 --plain \
  | grep -nEi "compaction|memory|index|embedding|plugin|discord|heartbeat|cron|completion|timeout|error|exception" \
  | tail -120

Common CPU causes:

  • Compaction loop → logs mention compaction repeatedly.
  • Huge transcript/session → enable truncateAfterCompaction: true like we discussed.
  • Memory/indexing/embeddings → logs mention memory, index, embedding.
  • Plugin/dependency loop after update → run:
    openclaw doctor
    
  • Shell completion cache → temporary; if it keeps happening:
  OPENCLAW_COMPLETION_SKIP_PLUGIN_COMMANDS=1 openclaw completion --write-state

If CPU stays high while idle, paste the top 5 lines from the ps command plus the filtered log chunk.

frail prism
#

real 1m22.761s
user 1m24.732s
sys 0m9.817s

last torrentBOT
#

That timing means the command burned ~95s of CPU time (user + sys) over ~83s of wall time, so it was computing the whole time, not just waiting. That’s high.

Which command did you time — openclaw status, doctor, or completion --write-state?

Run these two to pinpoint it:

time openclaw --version
time openclaw status

And for the hot process:

ps -eo pid,ppid,pcpu,pmem,etime,command \
  | grep -E '[o]penclaw|[n]ode' \
  | sort -k3 -nr \
  | head -20

If openclaw --version is also slow, that’s CLI startup/plugin loading. If only status is slow, it’s probably gateway/state/log/session scanning.
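For reading `time` output in general: CPU time is user + sys, and comparing it with real (wall) time tells you whether a command was computing or waiting. A minimal sketch using the numbers pasted above, converted to seconds; `cpu_ratio` is a made-up helper and it assumes awk is installed.

```bash
# cpu_ratio REAL USER SYS (all in seconds): prints total CPU time and CPU/wall ratio.
cpu_ratio() {
  awk -v real="$1" -v user="$2" -v sys="$3" 'BEGIN {
    cpu = user + sys
    printf "cpu=%.1fs wall=%.1fs ratio=%.2f\n", cpu, real, cpu / real
  }'
}

# real 1m22.761s, user 1m24.732s, sys 0m9.817s
cpu_ratio 82.761 84.732 9.817
```

A ratio near or above 1 means the process kept at least one core busy for the whole run; a ratio well below 1 would point to waiting on network or disk instead.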

frail prism
#

1197656 1197649 109 0.3 00:24 openclaw
1196753 1029 107 0.6 09:15 openclaw-gateway
1197347 1197339 13.5 0.3 03:39 openclaw-agent
1197649 1196289 1.5 0.0 00:24 openclaw
1197339 494813 0.2 0.0 03:39 openclaw

last torrentBOT
#

That shows two hot things:

  • openclaw PID 1197656 at ~109% CPU — likely the CLI command you just ran, maybe completion/cache/status.
  • openclaw-gateway PID 1196753 at ~107% CPU for 9 minutes — that’s the concerning one.

First, get full args so we know what they are doing:

ps -p 1197656,1197649,1196753,1197347,1197339 -o pid,ppid,pcpu,pmem,etime,args

Then check gateway logs for loops:

openclaw logs --limit 1000 --plain \
  | grep -nEi "loop|retry|compaction|completion|plugin|memory|index|embedding|heartbeat|cron|error|exception|timeout" \
  | tail -150

If you just need CPU relief now, restart the gateway:

openclaw gateway restart

If the CPU comes back immediately after restart, paste the full ps ... args output + the filtered log tail. Since you’re on 2026.4.24, I’d also run:

openclaw doctor

and if doctor mentions plugin/runtime deps or repeated repair, that may be the culprit.
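If you want to flag hot processes without eyeballing the list, a small awk filter over the same column layout (pid, ppid, pcpu, pmem, etime, command) works. A minimal sketch: the sample is the ps output pasted above, `hot_processes` is a made-up helper, and the 50% threshold is an arbitrary choice.

```bash
# The ps output pasted above, used here as sample input.
ps_sample='1197656 1197649 109 0.3 00:24 openclaw
1196753 1029 107 0.6 09:15 openclaw-gateway
1197347 1197339 13.5 0.3 03:39 openclaw-agent
1197649 1196289 1.5 0.0 00:24 openclaw
1197339 494813 0.2 0.0 03:39 openclaw'

# Print pid, command and CPU% for rows whose third column exceeds the limit (default 50).
hot_processes() {
  awk -v limit="${1:-50}" '$3 + 0 > limit + 0 { printf "%s %s %s%%\n", $1, $6, $3 }'
}

printf '%s\n' "$ps_sample" | hot_processes 50
```

On a live system the input could come from `ps -eo pid,ppid,pcpu,pmem,etime,comm | hot_processes 50`; the header row is skipped because its third column is not a number above the limit.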

frail prism
#

why is openclaw gateway taking so much cpu

1196753 root 88.3 0.7 680 MB openclaw-gateway

last torrentBOT
#

88% CPU on openclaw-gateway while idle is not normal. It’s usually one of:

  • plugin/dependency loop after update
  • memory indexing/embedding loop
  • compaction/retry loop
  • cron/heartbeat job stuck
  • channel reconnect loop

First get the exact loop from the gateway log:

tail -n 2000 "$(ls -t /tmp/openclaw/openclaw-*.log | head -1)" \
  | grep -nEi "plugin|runtime-deps|npm|ENOTEMPTY|memory|index|embedding|compaction|cron|heartbeat|retry|loop|timeout|error|exception" \
  | tail -120

If you just need relief now:

openclaw gateway restart
sleep 20
ps -C openclaw-gateway -o pid,user,pcpu,pmem,rss,etime,args

If CPU jumps back up after restart, paste the filtered log chunk.

Also: you’re on 2026.4.24, and I’m suspicious of plugin/runtime-deps issues in that version. If the log mentions plugin-runtime-deps, npm, ENOTEMPTY, or repeated dependency installs, that’s likely the cause and should go to a GitHub issue with the log snippet.

frail prism
#

how can i fix this

last torrentBOT
#

Try this fix path:

openclaw gateway restart
openclaw update --yes
openclaw doctor --fix
openclaw gateway restart

Then wait ~30s and check CPU:

ps -C openclaw-gateway -o pid,user,pcpu,pmem,rss,etime,args

Why: you’re on 2026.4.24, and there were session/transcript/runtime issues affecting 2026.4.24 that doctor --fix can repair. High gateway CPU can come from replaying/repairing bad session state, plugin deps, or loops.


If CPU is still high after that, run:

openclaw logs --limit 1000 --plain \
  | grep -nEi "plugin|runtime-deps|npm|ENOTEMPTY|memory|index|embedding|compaction|cron|heartbeat|retry|timeout|error|exception" \
  | tail -120

Paste that filtered output. If it mentions plugin-runtime-deps, npm, or ENOTEMPTY, that’s likely the root cause.

frail prism
#

OpenClaw gateway listening ports only:

• 127.0.0.1:18789
• 127.0.0.1:18791

why is it listening to two ports

last torrentBOT
#

18789 is the normal default Gateway port. 18791 usually means a second Gateway-like process/profile is also running — often a stale service/process from an update/restart, or another profile/rescue gateway.

Check which PID owns each port:

sudo ss -ltnp 'sport = :18789 or sport = :18791'

Then compare with:

ps -eo pid,ppid,pcpu,pmem,etime,args | grep -E '[o]penclaw|[n]ode'
openclaw gateway status --deep
openclaw gateway probe

If 18791 is stale/duplicate, the clean fix is:

openclaw gateway restart

Wait 20s, then recheck:

sudo ss -ltnp 'sport = :18789 or sport = :18791'

If both still remain, paste the ss output. Don’t kill anything yet until we see which service owns which port.

frail prism
#

Start with point 1: the repeated plugin manifest scanning.

What I observed: while CPU was high, the gateway main process was constantly reading /usr/lib/node_modules/openclaw/dist/extensions/*/openclaw.plugin.json. OpenClaw has about 100+ bundled extension manifests, and the cache TTL in the installed code is only 1000ms, so any active path that asks “what plugins/channels/commands exist?” can trigger repeated filesystem scans.

Why it matters: this is synchronous filesystem work inside the Node gateway process. If it happens during a stuck Telegram run or command discovery loop, it can burn CPU and make the gateway less responsive even though the actual bot polling is not expensive.

Likely fix direction: increase the cache TTL first as a low-risk workaround, then consider an upstream/code fix that makes plugin discovery process-lifetime cached or invalidated only when config/extensions change.

Suggested first test, not applying yet:

Environment=OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS=60000
Environment=OPENCLAW_PLUGIN_MANIFEST_CACHE_MS=60000

is this a good idea?

last torrentBOT
#

Yes — that’s a reasonable low-risk test. I checked: those env vars are real:

OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS
OPENCLAW_PLUGIN_MANIFEST_CACHE_MS

Default is 1000ms, so 60000 should reduce repeated manifest scanning. Tradeoff: plugin/config discovery changes may take up to ~60s to reflect unless you restart the gateway.

Since your gateway is running as root/systemd, test it like this:

sudo systemctl edit openclaw-gateway.service

Add:

#
[Service]
Environment=OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS=60000
Environment=OPENCLAW_PLUGIN_MANIFEST_CACHE_MS=60000

Then:

sudo systemctl daemon-reload
sudo systemctl restart openclaw-gateway.service

Verify:

#
sudo systemctl show openclaw-gateway.service -p Environment
ps -C openclaw-gateway -o pid,pcpu,pmem,etime,args

If CPU drops, that’s strong evidence the manifest scan loop is part of it. Still worth filing upstream, because a hot gateway path repeatedly rebuilding plugin discovery is probably something OpenClaw should cache better internally.
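If you'd rather not use the interactive editor, the same override can be written as a drop-in file. A rough sketch, assuming a system-level unit named openclaw-gateway.service; `render_cache_dropin` is a hypothetical helper, and the drop-in path in the usage comment is the standard location `systemctl edit` uses for system units.

```shell
#!/bin/sh
# Print a systemd drop-in that sets both plugin cache TTLs to the same value.
# $1 = TTL in milliseconds (defaults to 60000, the value tested above).
render_cache_dropin() {
  ttl_ms="${1:-60000}"
  case "$ttl_ms" in
    *[!0-9]*) echo "TTL must be a whole number of milliseconds" >&2; return 1 ;;
  esac
  printf '[Service]\n'
  printf 'Environment=OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS=%s\n' "$ttl_ms"
  printf 'Environment=OPENCLAW_PLUGIN_MANIFEST_CACHE_MS=%s\n' "$ttl_ms"
}

# Usage sketch for a system-level unit (assumed unit name and path):
#   dir=/etc/systemd/system/openclaw-gateway.service.d
#   sudo mkdir -p "$dir"
#   render_cache_dropin 60000 | sudo tee "$dir/override.conf" >/dev/null
#   sudo systemctl daemon-reload && sudo systemctl restart openclaw-gateway.service
```

Printing the file first lets you review it before anything is written under /etc.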

frail prism
#

sudo ss -ltnp 'sport = :18789 or sport = :18791'
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 511 127.0.0.1:18789 0.0.0.0:* users:(("openclaw-gatewa",pid=1208315,fd=26))
LISTEN 0 511 127.0.0.1:18791 0.0.0.0:* users:(("openclaw-gatewa",pid=1208315,fd=28))
LISTEN 0 511 [::1]:18789 [::]:* users:(("openclaw-gatewa",pid=1208315,fd=27))

last torrentBOT
#

That output is normal: it’s one gateway process listening on both ports.

  • 18789 = main Gateway WS/HTTP port.
  • 18791 = browser control service, derived as gateway.port + 2.

Docs say the browser control service binds to loopback on a port derived from gateway.port; default 18791 = 18789 + 2. So this is not a duplicate gateway and not the CPU cause.

The CPU issue is more likely the plugin manifest scanning you found, a stuck channel/run, or a loop. The cache env test is still reasonable.
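To double-check the "one process, two ports" reading, you can reduce the ss output to port/PID pairs. A small sketch, assuming the `users:(("name",pid=N,fd=M))` format shown in the paste above; both function names are made up for illustration.

```shell
#!/bin/sh
# Read `ss -ltnp` output on stdin and print de-duplicated "port pid" pairs.
port_owners() {
  awk '/LISTEN/ {
    n = split($4, addr, ":")          # local address; the port is the last piece
    if (match($0, /pid=[0-9]+/)) {
      print addr[n], substr($0, RSTART + 4, RLENGTH - 4)
    }
  }' | sort -u
}

# Browser control port as described above: gateway.port + 2.
browser_control_port() {
  echo $(( $1 + 2 ))
}

# Usage:
#   sudo ss -ltnp 'sport = :18789 or sport = :18791' | port_owners
#   browser_control_port 18789
```

If every line ends in the same PID, both listeners belong to a single gateway process.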

frail prism
#

ok how can i check and fix high cpu

last torrentBOT
#

Do it in this order. This gives you diagnosis + safe fixes without guessing.

1) Confirm what the gateway is doing

PID=$(pgrep -x openclaw-gateway | head -1)

ps -p "$PID" -o pid,user,pcpu,pmem,rss,etime,args
openclaw tasks list --status running
openclaw tasks audit

Then logs:

openclaw logs --limit 2000 --plain \
  | grep -nEi "plugin|manifest|runtime-deps|npm|ENOTEMPTY|memory|index|embedding|compaction|cron|heartbeat|retry|timeout|error|exception" \
  | tail -150

2) If you suspect plugin manifest scanning, confirm it

If strace is installed:

PID=$(pgrep -x openclaw-gateway | head -1)

sudo timeout 15s strace -f -p "$PID" \
  -e trace=openat,newfstatat,statx 2>&1 \
  | grep -c 'openclaw.plugin.json'

If that prints a large count, the gateway is repeatedly opening or stat-ing openclaw.plugin.json files, so apply the cache workaround.

3) Apply plugin cache workaround

sudo systemctl edit openclaw-gateway.service

Add:

[Service]
Environment=OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS=60000
Environment=OPENCLAW_PLUGIN_MANIFEST_CACHE_MS=60000

Then:

#
sudo systemctl daemon-reload
sudo systemctl restart openclaw-gateway.service
sleep 30
ps -C openclaw-gateway -o pid,user,pcpu,pmem,rss,etime,args

4) Run repairs after update

openclaw doctor
openclaw doctor --fix
sudo systemctl restart openclaw-gateway.service

5) If CPU is still high


Paste:

openclaw --version
ps -C openclaw-gateway -o pid,user,pcpu,pmem,rss,etime,args
openclaw tasks list --status running
openclaw logs --limit 500 --plain | tail -120

Don’t kill it yet unless it’s unusable — the log should show whether it’s plugin scanning, memory indexing, compaction, cron/heartbeat, or channel reconnect loop.
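For step 2, a per-path tally can show whether openclaw.plugin.json files really dominate the syscalls. This is only a sketch: it assumes raw strace output is piped in (the `-c` summary mode does not print paths), and `top_paths` is a hypothetical helper name.

```shell
#!/bin/sh
# Read raw strace output on stdin and print the most frequently touched paths.
# The path is taken from the first double-quoted string on each line; empty
# paths (for example AT_EMPTY_PATH calls) are skipped.
# $1 = how many paths to show (default 10).
top_paths() {
  limit="${1:-10}"
  sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' \
    | grep -v '^$' \
    | sort \
    | uniq -c \
    | sort -rn \
    | head -n "$limit"
}

# Usage (PID found as in step 1):
#   sudo timeout 15s strace -f -p "$PID" -e trace=openat,newfstatat,statx 2>&1 | top_paths 20
```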

frail prism
#

PID=$(pgrep -x openclaw-gateway | head -1)
ps -p "$PID" -o pid,user,pcpu,pmem,rss,etime,args
openclaw tasks list --status running
openclaw tasks audit
pgrep: pattern that searches for process name longer than 15 characters will result in zero matches
Try `pgrep -f' option to match against the complete command line.
error: list of process IDs must follow -p

Usage:
ps [options]

Try 'ps --help <simple|list|output|threads|misc|all>'
or 'ps --help <s|l|o|t|m|a>'
for additional help text.

For more details see ps(1).

🦞 OpenClaw 2026.4.25 (aa36ee6) — We ship features faster than Apple ships calculator updates.

Background tasks: 0
Task pressure: 0 queued · 0 running · 0 issues
Status filter: running
No background tasks found.

🦞 OpenClaw 2026.4.25 (aa36ee6) — If it works, it's automation; if it breaks, it's a "learning opportunity."

Tasks audit: 0 findings · 0 errors · 0 warnings
Task findings: 0 · TaskFlow findings: 0
No tasks audit findings.

last torrentBOT
#

Nice, you’re on 2026.4.25 now, and tasks are clean. The pgrep issue is just Linux’s 15-char process-name limit.

Use -f:

PID=$(pgrep -f 'openclaw-gateway' | head -1)
ps -p "$PID" -o pid,user,pcpu,pmem,rss,etime,args

Or simpler:

ps -eo pid,user,pcpu,pmem,rss,etime,args \
  | grep -E '[o]penclaw-gateway|[o]penclaw'

Then check if CPU is still high. If it is, run:

#
openclaw logs --limit 2000 --plain \
  | grep -nEi "plugin|manifest|runtime-deps|npm|ENOTEMPTY|memory|index|embedding|compaction|cron|heartbeat|retry|timeout|error|exception" \
  | tail -150

Since there are no running tasks, the likely suspects are now gateway startup/channel loop, plugin discovery, memory indexing, or channel health/reconnect — the filtered logs should reveal which.
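A single ps reading can mislead, because on Linux the pcpu column is CPU time divided by process lifetime, not an instantaneous value. Taking a few samples shortly after a restart shows the trend better. A minimal sketch; `sample_cpu` is a hypothetical helper, not an OpenClaw command.

```shell
#!/bin/sh
# Print the %CPU that ps reports for a PID, sampled COUNT times, INTERVAL
# seconds apart. Stops early if the process exits.
# Usage: sample_cpu PID [COUNT] [INTERVAL]
sample_cpu() {
  pid="$1"; count="${2:-5}"; interval="${3:-2}"
  i=0
  while [ "$i" -lt "$count" ]; do
    cpu="$(ps -p "$pid" -o pcpu= 2>/dev/null)" || break
    [ -n "$cpu" ] || break
    echo "$cpu" | tr -d ' '
    i=$((i + 1))
    [ "$i" -lt "$count" ] && sleep "$interval"
  done
  return 0
}

# Usage:
#   sample_cpu "$(pgrep -f 'openclaw-gateway' | head -1)" 10 3
```

A value that keeps climbing across samples suggests the process is still burning CPU; a value that decays suggests the spike was startup work.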

frail prism
#

323:2026-04-27T15:49:17.300Z error telegram/api {"subsystem":"telegram/api"} telegram sendChatAction failed: Network request for 'sendChatAction' failed!
329:2026-04-27T15:51:56.415Z error Gateway agent failed; falling back to embedded: Error: gateway timeout after 330000ms

172:2026-04-27T15:28:39.168Z error Invalid config at /root/.openclaw/openclaw.json:\n- plugins.entries.feishu: plugin feishu: plugin requires OpenClaw >=2026.4.25, but this host is 2026.4.24; skipping load
173:- plugins.entries.whatsapp: plugin whatsapp: plugin requires OpenClaw >=2026.4.25, but this host is 2026.4.24; skipping load
174:2026-04-27T15:28:39.192Z error gateway {"subsystem":"gateway"} shutdown error: Error: Invalid config at /root/.openclaw/openclaw.json:
175:- plugins.entries.feishu: plugin feishu: plugin requires OpenClaw >=2026.4.25, but this host is 2026.4.24; skipping load
176:- plugins.entries.whatsapp: plugin whatsapp: plugin requires OpenClaw >=2026.4.25, but this host is 2026.4.24; skipping load

last torrentBOT
#

That log has the smoking gun:

plugin requires OpenClaw >=2026.4.25, but this host is 2026.4.24

Your CLI is 2026.4.25, but the gateway service was still running 2026.4.24 at that time. That split-brain can cause bad loops/high CPU.

Run:

which openclaw
openclaw --version
sudo env "PATH=$PATH" openclaw --version
sudo systemctl cat openclaw-gateway.service | grep -E "ExecStart|Environment"

Then repair the service to use the current install:

sudo env "PATH=$PATH" openclaw gateway install --force
sudo systemctl daemon-reload
sudo systemctl restart openclaw-gateway.service

Verify:

sleep 20
openclaw gateway status --deep
ps -C openclaw-gateway -o pid,user,pcpu,pmem,rss,etime,args
openclaw logs --limit 200 --plain | tail -100

The Telegram sendChatAction line is probably just a network/typing-action failure. The version mismatch is the thing I’d fix first.
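To check for this kind of mismatch without reading the log by eye, the two versions can be pulled out of the line and compared. A small sketch, assuming the log line keeps the wording quoted above and that GNU `sort -V` is available; both function names are made up.

```shell
#!/bin/sh
# Succeeds (exit 0) when version $1 is at least version $2, using GNU sort -V.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Read log lines on stdin and print "required host" for each line shaped like
# "... requires OpenClaw >=X, but this host is Y; ...".
parse_version_mismatch() {
  sed -n 's/.*requires OpenClaw >=\([0-9.]*\), but this host is \([0-9.]*\).*/\1 \2/p'
}

# Usage:
#   openclaw logs --limit 500 --plain | parse_version_mismatch | sort -u
#   version_at_least 2026.4.25 2026.4.25 && echo "host satisfies plugin requirement"
```

If the parsed host version is older than what `openclaw --version` prints, the running gateway and the CLI are not the same install.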

frail prism
#

which openclaw
openclaw --version
sudo env "PATH=$PATH" openclaw --version
sudo systemctl cat openclaw-gateway.service | grep -E "ExecStart|Environment"
/usr/bin/openclaw
OpenClaw 2026.4.25 (aa36ee6)
OpenClaw 2026.4.25 (aa36ee6)
No files found for openclaw-gateway.service.

last torrentBOT
#

That means it’s probably a user systemd service, not a system service. Use --user:

systemctl --user cat openclaw-gateway.service | grep -E "ExecStart|Environment"
systemctl --user status openclaw-gateway.service --no-pager

To reinstall/restart the managed user service:

openclaw gateway install --force
systemctl --user daemon-reload
systemctl --user restart openclaw-gateway.service

Then verify:

#
openclaw gateway status --deep
ps -eo pid,user,pcpu,pmem,rss,etime,args | grep -E '[o]penclaw-gateway'

If systemctl --user cat also says no files found, then the gateway is being run by something else. Find parent + command:

PID=$(pgrep -f 'openclaw-gateway' | head -1)
ps -p "$PID" -o pid,ppid,user,pcpu,pmem,etime,args
ps -p "$(ps -p "$PID" -o ppid=)" -o pid,user,etime,args

That will tell us what supervisor owns it.
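If the direct parent is just a shell or wrapper, walking the whole parent chain shows the real supervisor. A rough sketch using only ps; `process_ancestry` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the ancestry of a PID, one "pid command" line per level, starting with
# the process itself and walking up through each parent until PID 1 is reached
# or ps can no longer find a parent.
process_ancestry() {
  pid="$1"
  while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
    line="$(ps -p "$pid" -o pid= -o args= 2>/dev/null)" || break
    [ -n "$line" ] || break
    echo "$line" | sed 's/^ *//'
    [ "$pid" -eq 1 ] && break
    pid="$(ps -p "$pid" -o ppid= 2>/dev/null | tr -d ' ')"
  done
  return 0
}

# Usage:
#   process_ancestry "$(pgrep -f 'openclaw-gateway' | head -1)"
```

A chain that passes through `systemd --user` points at a user service; one that goes straight to PID 1 points at a system service or a detached process.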

frail prism
#

but what difference does it make to cpu usage

last torrentBOT
#

It helps CPU only if the hot loop is repeatedly rebuilding plugin discovery.

Right now default cache TTL is ~1000ms. If some gateway path asks “what plugins/channels/commands exist?” many times, OpenClaw can re-scan ~100+ plugin manifest JSON files every second. That’s lots of synchronous filesystem/stat/JSON parse work inside the Node process → CPU burn.

Changing to:

OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS=60000
OPENCLAW_PLUGIN_MANIFEST_CACHE_MS=60000

means: reuse the plugin discovery/manifest result for 60 seconds instead of 1 second.

So instead of:

scan manifests up to ~60 times/minute

it becomes:

scan manifests about 1 time/minute

Tradeoff: if you install/enable/disable plugins, the gateway might not notice for up to 60s unless you restart it. For a normal always-on gateway, that’s fine.

But if CPU is caused by Telegram reconnects, model calls, memory indexing, or a stuck service mismatch, this won’t fix it. It’s specifically for the repeated manifest-scan CPU pattern.
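The arithmetic behind those numbers, as a quick sketch. It assumes at most one rescan per TTL window and takes the manifest count as an input (the ~100 figure comes from the observation earlier in the thread); the function names are hypothetical.

```shell
#!/bin/sh
# Upper bound on manifest rescans per minute for a cache TTL given in ms,
# assuming at most one rescan per TTL window (integer division).
max_scans_per_minute() {
  ttl_ms="$1"
  [ "$ttl_ms" -gt 0 ] 2>/dev/null || { echo "TTL must be a positive integer" >&2; return 1; }
  echo $(( 60000 / ttl_ms ))
}

# Upper bound on manifest file reads per minute: rescans times manifest count.
# $1 = TTL in ms, $2 = number of manifest files (default 100, an assumption).
max_manifest_reads_per_minute() {
  scans="$(max_scans_per_minute "$1")" || return 1
  echo $(( scans * ${2:-100} ))
}

# Usage:
#   max_scans_per_minute 1000            # default TTL
#   max_scans_per_minute 60000           # TTL from the workaround
#   max_manifest_reads_per_minute 1000 100
```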

frail prism
#

ok this did not fix it, what's next

last torrentBOT
#

Since plugin-cache didn’t fix it, next isolate channel/runtime loops. Your logs already showed Telegram network errors, so start there.

1) Confirm hot process + owner

PID=$(pgrep -f 'openclaw-gateway' | head -1)
ps -p "$PID" -o pid,ppid,user,pcpu,pmem,rss,etime,args
ps -p "$(ps -p "$PID" -o ppid=)" -o pid,user,etime,args

2) Check channels

openclaw channels status --probe

Look for anything repeatedly disconnected, polling stall, or failing probes.

#

3) Temporarily disable Telegram and restart

Since your logs had Telegram failures:

openclaw config set channels.telegram.enabled false --strict-json
openclaw gateway restart
sleep 60
ps -p "$(pgrep -f 'openclaw-gateway' | head -1)" -o pid,user,pcpu,pmem,rss,etime,args

If CPU drops, Telegram/reconnect/polling is the culprit.

4) If CPU is still high, disable WhatsApp too

#
openclaw config set channels.whatsapp.enabled false --strict-json
openclaw gateway restart
sleep 60
ps -p "$(pgrep -f 'openclaw-gateway' | head -1)" -o pid,user,pcpu,pmem,rss,etime,args

5) If still high, grab the loop signature

openclaw logs --limit 3000 --plain \
  | grep -nEi "telegram|whatsapp|polling|sendChatAction|getUpdates|health|restart|retry|memory|index|embedding|compaction|plugin|error|timeout|exception" \
  | tail -200

Paste that. At this point we need to identify which subsystem is looping; don’t keep tuning context/compaction until CPU source is known.

#

All Telegram accounts are “connected”, so don’t disable all Telegram yet. The suspicious one is:

Telegram vision ... works, audit failed

Next isolate just that account.

1) See why audit failed

Run:

openclaw channels status --probe --json \
  | grep -i -A20 -B5 '"vision"'

If that’s messy, use:

openclaw logs --limit 2000 --plain \
  | grep -nEi "telegram.*vision|vision.*telegram|audit failed|sendChatAction|getUpdates|polling stall|telegram/api|error|timeout" \
  | tail -120

2) Temporarily disable only vision

openclaw config set channels.telegram.accounts.vision.enabled false --strict-json
openclaw gateway restart
sleep 60
ps -p "$(pgrep -f 'openclaw-gateway' | head -1)" -o pid,user,pcpu,pmem,rss,etime,args

If CPU drops, the vision Telegram account/audit/polling loop is the culprit.

3) If CPU does not drop

Re-enable it:

openclaw config set channels.telegram.accounts.vision.enabled true --strict-json

Then disable all Telegram briefly as a stronger isolation:

openclaw config set channels.telegram.enabled false --strict-json
openclaw gateway restart
sleep 60
ps -p "$(pgrep -f 'openclaw-gateway' | head -1)" -o pid,user,pcpu,pmem,rss,etime,args

If all-Telegram-off fixes CPU, it’s one of those pollers. If not, Telegram isn’t the cause.

frail prism
#

how can i move from polling to interrupts? that should decrease overall headache and improve cpu performance, as i have lots of agents

last torrentBOT
#

Yes — for Telegram, “interrupts” = webhook mode. It can reduce headache because you stop running 6 long-polling loops (getUpdates) and Telegram pushes updates to you instead.

Caveat: you need a public HTTPS URL that Telegram can reach.

Recommended setup for multiple Telegram accounts

Use one webhook secret, but give each bot account a unique public URL + local port.

Example:

openclaw config set channels.telegram.webhookSecret "$(openssl rand -hex 32)"

Then configure each account:

#
openclaw config set channels.telegram.accounts.default.webhookUrl "https://YOURDOMAIN.com/tg/default"
openclaw config set channels.telegram.accounts.default.webhookPath "/tg/default"
openclaw config set channels.telegram.accounts.default.webhookPort 8787 --strict-json

openclaw config set channels.telegram.accounts.hawkeye.webhookUrl "https://YOURDOMAIN.com/tg/hawkeye"
openclaw config set channels.telegram.accounts.hawkeye.webhookPath "/tg/hawkeye"
openclaw config set channels.telegram.accounts.hawkeye.webhookPort 8788 --strict-json

openclaw config set channels.telegram.accounts.po.webhookUrl "https://YOURDOMAIN.com/tg/po"
openclaw config set channels.telegram.accounts.po.webhookPath "/tg/po"
openclaw config set channels.telegram.accounts.po.webhookPort 8789 --strict-json

Repeat for qa, router, vision, etc.

Reverse proxy idea


Proxy each path to its matching local port:

location /tg/default { proxy_pass http://127.0.0.1:8787; }
location /tg/hawkeye { proxy_pass http://127.0.0.1:8788; }
location /tg/po { proxy_pass http://127.0.0.1:8789; }

Then restart:

openclaw gateway restart
openclaw channels status --probe

You want Telegram accounts to show mode:webhook instead of mode:polling.
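With six accounts, typing each command by hand is error-prone, so here is a sketch that prints the per-account commands and matching nginx lines for review. It only prints text and runs nothing; the function names are hypothetical, the config keys and `--strict-json` flag are copied from the example above, and consecutive ports starting at 8787 are an assumption.

```shell
#!/bin/sh
# Print (do not run) the per-account webhook config commands, assigning
# consecutive local ports starting at START_PORT.
# Usage: print_webhook_cmds BASE_URL START_PORT ACCOUNT...
print_webhook_cmds() {
  base_url="$1"; port="$2"; shift 2
  for account in "$@"; do
    key="channels.telegram.accounts.$account"
    printf 'openclaw config set %s.webhookUrl "%s/tg/%s"\n' "$key" "$base_url" "$account"
    printf 'openclaw config set %s.webhookPath "/tg/%s"\n' "$key" "$account"
    printf 'openclaw config set %s.webhookPort %s --strict-json\n' "$key" "$port"
    port=$((port + 1))
  done
}

# Print the matching nginx location lines for the same accounts and ports.
# Usage: print_nginx_locations START_PORT ACCOUNT...
print_nginx_locations() {
  port="$1"; shift
  for account in "$@"; do
    printf 'location /tg/%s { proxy_pass http://127.0.0.1:%s; }\n' "$account" "$port"
    port=$((port + 1))
  done
}

# Example (account names taken from this thread; review output before running):
#   print_webhook_cmds "https://YOURDOMAIN.com" 8787 default hawkeye po qa router vision
#   print_nginx_locations 8787 default hawkeye po qa router vision
```

Passing the accounts in the same order to both functions keeps each public path lined up with its local port.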