#Insane amount of token burn. Seven million tokens in the last hour with light usage.

1 messages ¡ Page 1 of 1 (latest)

woven nimbus
#

I'll attach my context breakdown OpenClaw claims I've only spent 77,000 tokens, but on Anthropic I've spent about 7 million, and this is the only thing I use that API key for. Please help

spark cliffBOT
#

That size of mismatch (77k shown in OpenClaw vs ~7M in Anthropic in ~1 hour) basically means either:

  1. the Anthropic key is being hit by something else (often a 2nd OpenClaw gateway/host you forgot about), or
  2. OpenClaw is making a lot more requests than you think (runaway loop / retry storm / cron/heartbeat firing constantly), and the UI “context breakdown” is not counting those calls the same way Anthropic bills them.

A few quick, high-signal checks:

0) Stop the bleeding (recommended)

If you’re worried the key might be compromised: revoke/rotate the Anthropic key in the Anthropic console first. That immediately stops further spend while we debug.

1) “Is it really OpenClaw?” (fast isolation)

  • Stop OpenClaw everywhere you might be running it (local machine + VPS + Docker + another terminal tab).
  • Watch the Anthropic usage page for a couple minutes.
    • If usage keeps climbing: the key is used somewhere else (or there’s another still-running gateway).
    • If usage stops: it’s very likely OpenClaw (or an OpenClaw cron/agent) doing it.

2) Check gateway logs for request spam

#

OpenClaw writes file logs and you can tail them via CLI:

Also: the Control UI → Logs tab tails the same file logs, so you can look there too.

What I’m looking for is evidence of:

  • the same session/agent repeatedly calling the model,
  • errors/timeouts causing retries,
  • a cron/heartbeat firing constantly.

3) I need these 3 things from you

  1. openclaw --version
#
  1. A screenshot (or copy/paste) of the Anthropic usage breakdown for that hour (model + input/output tokens if shown)
  2. ~50-100 lines from openclaw logs --follow during the spike (redact any secrets)

Also tell me: are you running more than one gateway (e.g. laptop + server) or any crons/heartbeats?

Once we see the logs + version, we can usually pinpoint which agent/job is burning tokens and why.

woven nimbus
#

I can't attach a screenshot so here you are
Total tokens in
7,718,664

Total tokens out
79,169

openclaw version 2026.2.26

I have it running on a dedicated PC that I just SSH into.
some logss
21:55:14 debug diagnostic {"subsystem":"diagnostic"} lane enqueue: lane=main queueSize=1
21:55:14 debug diagnostic {"subsystem":"diagnostic"} lane dequeue: lane=main waitMs=0 queueSize=0
21:55:14 debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=85a69ba2-8a90-4c71-970a-0fa5c9545ce6 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=anthropic model=claude-haiku-4-5 thinking=low messageChannel=webchat
21:55:14 debug agent/embedded {"subsystem":"agent/embedded"} creating streamFn wrapper with params: {"cacheRetention":"short"}
21:55:14 debug agent/embedded {"subsystem":"agent/embedded"} applying extraParams to agent streamFn for anthropic/claude-haiku-4-5
21:55:14 debug diagnostic {"subsystem":"diagnostic"} run registered: sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 totalActive=1
21:55:14 debug agent/embedded {"subsystem":"agent/embedded"} embedded run prompt start: runId=85a69ba2-8a90-4c71-970a-0fa5c9545ce6 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02
21:55:14 debug agent/embedded {"subsystem":"agent/embedded"} embedded run agent start: runId=85a69ba2-8a90-4c71-970a-0fa5c9545ce6
21:55:20 debug agent/embedded {"subsystem":"agent/embedded"} embedded run tool start: runId=85a69ba2-8a90-4c71-970a-0fa5c9545ce6 tool=exec toolCallId=toolu_01HasnDQrQnYaT7yv3eouma1
21:55:23 warn config/schema {"subsystem":"config/schema"} possibly sensitive key found: (secrets)
21:55:23 warn config/schema {"subsystem":"config/schema"} possibly sensitive key found: (secrets.providers)
21:55:23 warn config/schema {"subsystem":"config/schema"} possibly sensitive key found: (secrets.providers.)
21:55:23 warn config/schema {"subsystem":"config/schema"} possibly sensitive key found: (secrets.providers.
)
21:55:23 warn config/schema

spark cliffBOT
#

Those Anthropic numbers (7,718,664 in vs 79,169 out) scream “many runs with a huge prompt / huge tool output, producing short answers”.

And your gateway logs already hint at the source:

  • agent/embedded … messageChannel=webchat
  • then immediately a tool call: tool=exec

So the spend is coming from WebChat / Control UI chat, not Discord/Telegram, and the agent is running exec (which can easily dump a ton of text that then gets fed back into the next model call as input tokens).

Why OpenClaw can show “77k” while Anthropic shows “7.7M”

That “context breakdown” is typically per-run/per-session prompt size, not an hourly billing meter.
If something triggered ~100 runs with ~77k input each → ~7.7M input tokens billed. That matches your numbers almost perfectly.

2 things to check immediately

1) Is something spamming WebChat (including an exposed gateway)?

Leave openclaw logs --follow running and watch:

#
  • do you see embedded run start … messageChannel=webchat firing even when you’re not typing?

If yes, either:

  • you have a stuck/refreshing WebChat/Control UI client, or
  • your gateway WebSocket is reachable by someone else and they’re sending chat.send (this is the scary one).

Security hardening docs: https://docs.openclaw.ai/gateway/security
Run (safe/read-only):

openclaw security audit

Also confirm you have gateway auth on + not publicly exposed:

  • gateway.bind, gateway.port
  • gateway.auth.mode + gateway.auth.token/gateway.auth.password

(WebChat overview: https://docs.openclaw.ai/web/webchat)

#

2) What was that exec command?

Right after:
embedded run tool start … tool=exec toolCallId=…
there should be lines showing what command ran and (often) an “end” line.

Please paste the next ~20-40 log lines after that tool=exec start (redact secrets). If the command is something like cat/reading logs or a big directory listing, that would directly explain millions of input tokens.

What I need from you next (minimal)

  1. Do you have the Control UI/WebChat open anywhere right now? If you close it, do the embedded run start lines stop?
  2. Paste:
    • one embedded run start … messageChannel=webchat line
    • the exec command line(s) that follow it
    • any nearby gateway/ws … chat.send … conn=... lines if you see them (those help identify if it’s a remote client)

Logging reference (how to view/format logs): https://docs.openclaw.ai/gateway/logging

woven nimbus
#

21:55:24 debug agent/embedded {"subsystem":"agent/embedded"} embedded run tool end: runId=85a69ba2-8a90-4c71-970a-0fa5c9545ce6 tool=exec toolCallId=toolu_01HasnDQrQnYaT7yv3eouma1
21:55:28 debug agent/embedded {"subsystem":"agent/embedded"} embedded run tool start: runId=85a69ba2-8a90-4c71-970a-0fa5c9545ce6 tool=sessions_history toolCallId=toolu_01Jn8dBXYfqSjLUBK2UsxTsu
21:55:28 debug agent/embedded {"subsystem":"agent/embedded"} embedded run tool end: runId=85a69ba2-8a90-4c71-970a-0fa5c9545ce6 tool=sessions_history toolCallId=toolu_01Jn8dBXYfqSjLUBK2UsxTsu

spark cliffBOT
#

The new snippet is actually a huge clue: your agent is calling sessions_history right after an exec.

That combination can explode input tokens:

  • exec can return a lot of text (logs, directory listings, file contents).
  • sessions_history can pull back a large chunk of transcript (often including tool output, depending on settings).
  • If you then have ~77k tokens of “context” per run, it only takes ~100 runs to hit ~7.7M input tokens (which matches your Anthropic “tokens in” shockingly well).

So the immediate goals are:

  1. identify what makes the prompt 77k every time
  2. identify why WebChat is triggering repeated runs

1) Please paste your “context breakdown” as text (the thing that showed ~77,000)

Even without screenshots: copy/paste the breakdown lines. We need to see what item accounts for the bulk of those tokens (usually it’s a huge injected file or a giant transcript/tool output being included).

If you can, also run /context detail in WebChat and paste the output here — it should be a gateway command (not an LLM call) and it typically lists the big contributors.

#

2) Confirm whether WebChat is repeatedly triggering runs

In your logs, do you see embedded run start … messageChannel=webchat happening repeatedly even when you’re not typing?

If yes, that’s either:

  • a stuck/reconnecting WebChat/Control UI client spamming, or
  • your gateway WebSocket is reachable by someone else (token leak / exposed port).

Quick containment (recommended)

  • Rotate your gateway auth token/password (don’t reuse the old one), and reconnect WebChat.
  • Run:
    openclaw security audit
    

Docs: https://docs.openclaw.ai/gateway/security

Also double-check you are not binding the gateway to the public internet unless you really mean to (and if you do, auth must be solid).

#

3) Optional “stop the runaway” mitigation (while debugging)

If this is actively burning money right now, you can temporarily deny the specific tool that’s likely amplifying token use:

Docs: https://docs.openclaw.ai/tools

In ~/.openclaw/openclaw.json, temporarily add (or equivalent per-agent override):

{
  tools: {
    deny: ["sessions_history"]
  }
}

Then restart the gateway. This won’t fix the underlying spam, but it can prevent “read huge transcript → feed huge transcript back to model” loops.


#

If you paste the context breakdown text + tell me whether WebChat runs keep firing when idle, we can usually pinpoint the exact culprit in one more round.

woven nimbus
#
  • AGENTS.md: OK | raw 7,804 chars (1,951 tok) | injected 7,804 chars (1,951 tok)
  • SOUL.md: OK | raw 1,664 chars (416 tok) | injected 1,664 chars (416 tok)
  • TOOLS.md: OK | raw 850 chars (213 tok) | injected 850 chars (213 tok)
  • IDENTITY.md: OK | raw 633 chars (159 tok) | injected 633 chars (159 tok)
  • USER.md: OK | raw 474 chars (119 tok) | injected 474 chars (119 tok)
  • HEARTBEAT.md: OK | raw 167 chars (42 tok) | injected 167 chars (42 tok)
  • BOOTSTRAP.md: OK | raw 1,449 chars (363 tok) | injected 1,449 chars (363 tok)
    Skills list (system prompt text): 2,013 chars (~504 tok) (4 skills)

I had to remove some because of limits

Session tokens (cached): 77,331 total / ctx=200000

spark cliffBOT
#

That output actually explains the “77k vs 7.7M” mismatch.

What your /context snippet is saying

All the injected workspace files you listed add up to only a few thousand tokens (roughly ~3–4k).

But then you have:

Session tokens (cached): 77,331 total

That 77,331 is basically “how big the current session transcript/context is right now” (chat history + tool results + compaction summaries, etc.). It’s not “how many tokens you were billed today”.

Why Anthropic shows ~7,718,664 input tokens

Anthropic is reporting cumulative tokens across all requests.

If OpenClaw made ~N model calls while your session context was ~77k, then your billed input is roughly:

N * ~77k (plus/minus tool-loop growth, retries, prompt caching accounting, etc.)

#

So 7.7M input tokens is totally consistent with:

  • a runaway loop / retry storm, or
  • lots of “internal” calls during tool use (one user message can cause many model calls), especially since your logs show messageChannel=webchat + tool calls (exec, sessions_history).

Stop the burn right now (fast)

In WebChat (since your runs are coming from messageChannel=webchat), send as standalone messages:

  1. /stop (aborts current run + clears queued followups)
  2. /new or /reset (starts a fresh session id so you’re not paying 77k input every turn)

Session docs: https://docs.openclaw.ai/concepts/session
Compaction/reset tips: https://docs.openclaw.ai/concepts/compaction

Find what’s triggering the repeated runs (most important)

You want to answer: “how many runs happened, and who/what sent them?”

On the gateway host, run:

#
openclaw logs --limit 500 --plain | grep "embedded run start"
openclaw logs --limit 500 --plain | grep "chat.send"

If you see embedded run start happening when you aren’t typing, that’s either:

  • a stuck WebChat/Control UI client re-sending, or
  • someone else can reach your gateway and is sending messages (auth token leak / exposed port).

Run a security check (recommended):

openclaw security audit

Docs: https://docs.openclaw.ai/gateway/security

Prevent it from happening again (optional but effective)

Since exec tool output + sessions_history can balloon context, consider enabling session pruning for Anthropic so old tool results get trimmed before calls:

#

Docs: https://docs.openclaw.ai/concepts/session-pruning

—

If you paste the output of those two grep commands (just a handful of lines), we can usually tell immediately whether it’s “your own WebChat client looping” vs “unexpected remote chat.send”.

woven nimbus
#

openclaw logs --limit 500 --plain | grep "chat.send"
2026-02-27T21:48:37.600Z debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=cda50619-8463-4892-8d9c-b58158e53c81 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=anthropic model=claude-haiku-4-5 thinking=low messageChannel=webchat
2026-02-27T21:49:16.503Z debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=eb6310f0-1e71-481f-bca3-9cb42853f448 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=anthropic model=claude-haiku-4-5 thinking=low messageChannel=webchat
2026-02-27T21:51:33.825Z debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=682da9c0-881a-4385-bbef-47ff5bd72e4a sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=anthropic model=claude-haiku-4-5 thinking=low messageChannel=webchat
2026-02-27T21:52:11.930Z debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=afbf7748-cf2e-4b6e-8fdb-5c704b3b15a4 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=anthropic model=claude-haiku-4-5 thinking=low messageChannel=webchat
2026-02-27T21:53:29.648Z debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=a92f2449-799c-4ee8-a413-2e65a4cb8037 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=anthropic model=claude-haiku-4-5 thinking=low messageChannel=webchat
2026-02-27T21:55:14.201Z debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=85a69ba2-8a90-4c71-970a-0fa5c9545ce6 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=anthropic model=claude-haiku-4-5 thinking=low messageChannel=webchat
2026-02-27T21:56:19.434Z debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=c691ea19-0994-48d8-8388-9bda85c2baaf sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=anthropic model=claude-haiku-4-5 thinking=low messageChannel=webchat

spark cliffBOT
#

Those lines aren’t actually chat.send lines — they’re agent runs starting (embedded run start … messageChannel=webchat). So yes: something is repeatedly sending messages from WebChat (or something that looks like WebChat to the gateway), which then triggers a full Anthropic call.

And with session cached context ~77k, each run can easily be ~77k input tokens → if you had ~100 runs in that hour, that’s your ~7.7M “tokens in”.

1) Identify who is sending (need the gateway/ws … chat.send line)

Please run these (on the gateway host):

openclaw logs --limit 2000 --plain | grep "gateway/ws" | grep "chat.send"
openclaw logs --limit 2000 --plain | grep -E "unauthorized|nonce|required|connect\.challenge|device"

What we want to see is a line like gateway/ws … ⇄ req chat.send … conn=... — it usually includes a conn= identifier and sometimes addressing info. That tells us whether it’s your own browser client reconnecting vs something external hitting the WS.

(Your previous command was grep "chat.send" but the pasted output doesn’t contain that substring, so I think the grep/paste got mixed.)

2) Quick “is it your WebChat tab looping?” test

#
  • Close all Control UI / WebChat tabs (and any app connected to the gateway).
  • Then watch the logs for 2 minutes:
openclaw logs --follow --plain | grep "embedded run start"

If embedded run start … messageChannel=webchat keeps appearing, it’s not you clicking—either a stuck client somewhere else, or the gateway is reachable by someone/something.

3) Reduce cost immediately (safe)

In WebChat, send as standalone messages:

  • /stop (abort current run / clear queue)
  • /reset (or /new) to start a fresh session id so you’re not paying 77k input tokens per turn

Docs: https://docs.openclaw.ai/concepts/session

#

4) Security check (recommended)

Run:

openclaw security audit

Docs: https://docs.openclaw.ai/gateway/security

If your gateway is bound to anything other than loopback, make sure it’s not unintentionally exposed to the internet.

Paste the gateway/ws … chat.send lines when you have them — that’s the smoking gun for what’s triggering the runs.

woven nimbus
#

With the two new commands, nothing comes back

spark cliffBOT
#

That usually just means there are no matching lines in the last N log lines (or WS request logging isn’t being emitted into file logs in your current mode), so grep returns nothing.

Let’s make it more robust and also get you context around a specific runaway run.

1) Confirm whether any WS method lines exist in file logs

Run:

openclaw logs --limit 20000 --plain | grep -F "gateway/ws" | tail -n 20
openclaw logs --limit 20000 --plain | grep -F "chat." | tail -n 50
openclaw logs --limit 20000 --plain | grep -F "chat.send" | tail -n 50

If all 3 are empty, then your gateway isn’t writing those WS method logs to the file in the first place (common unless you run the gateway with verbose WS logging).

2) Pull “what triggered this run” by grepping around a runId (works even without WS logs)

Pick one of your runIds (e.g. cda50619-8463-4892-8d9c-b58158e53c81) and run:

#
openclaw logs --limit 20000 --plain | grep -C 40 "runId=cda50619-8463-4892-8d9c-b58158e53c81"

Do the same for another runId. We’re looking for lines immediately before embedded run start that indicate inbound message / reconnect / retry.

3) If we need the smoking gun: temporarily enable verbose WS logging

If the above shows no inbound cause, do a short repro with verbose WS logs:

  1. Stop the service gateway (brief downtime):
openclaw gateway stop
  1. Start foreground with WS logging:
#
openclaw gateway --verbose --ws-log compact

Then connect WebChat and watch for chat.send / reconnect spam live.

Docs on WS logging behavior: https://docs.openclaw.ai/gateway/logging

Quick question

Do you currently have the Control UI / WebChat open anywhere (multiple tabs, another machine, phone)? If you close them all, do the embedded run start … messageChannel=webchat lines stop?

woven nimbus
#

openclaw logs --limit 20000 --plain | grep -F "chat." | tail -n 50
openclaw logs --limit 20000 --plain | grep -F "chat.send" | tail -n 50
Gateway not reachable. Is it running and accessible?
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/mainframe/.openclaw/openclaw.json
Bind: loopback
Hint: run openclaw doctor.
Gateway not reachable. Is it running and accessible?
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/mainframe/.openclaw/openclaw.json
Bind: loopback
Hint: run openclaw doctor.
Gateway not reachable. Is it running and accessible?
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: /home/mainframe/.openclaw/openclaw.json
Bind: loopback
Hint: run openclaw doctor

spark cliffBOT
#

That output means the CLI can’t open the Gateway WebSocket at ws://127.0.0.1:18789 — i.e. nothing is listening there (gateway stopped/crashed, different port/bind, or you’re running the CLI on a different machine/user than the gateway).

Do these on the machine where the gateway is supposed to be running:

  1. Check status + get the “why”:
openclaw gateway status
openclaw doctor
openclaw status

Paste the output of those 3.

  1. If openclaw gateway status says it’s not running, start it (depending on how you run it):
openclaw gateway start

(or run it in the foreground in that SSH session:)

#
openclaw gateway

Common gotchas:

  • You’re SSH’d into one box, but ran openclaw logs on another.
  • Gateway is running under a different OS user (service user), so your CLI is pointing at /home/mainframe/.openclaw/openclaw.json but the service uses a different config.
  • Gateway died due to an error and isn’t restarted.

Once the gateway is reachable again, we can continue the chat.send / “what is spamming WebChat” investigation.

woven nimbus
#

🦞 OpenClaw 2026.2.26 (bc50708) — The only bot that stays out of your training set.

18:15:20 Registered plugin command: /pair (plugin: device-pair)
18:15:20 Registered plugin command: /phone (plugin: phone-control)
18:15:20 Registered plugin command: /voice (plugin: talk-voice)
23:15:20 [canvas] host mounted at http://127.0.0.1:18789/__openclaw__/canvas/ (root /home/mainframe/.openclaw/canvas)
18:15:20 bonjour: starting (hostname=openclaw, instance="mainframe-pc (OpenClaw)", gatewayPort=18789, minimal=true)
23:15:20 [heartbeat] started
23:15:20 [health-monitor] started (interval: 300s, grace: 60s)
23:15:20 [gateway] agent model: anthropic/claude-haiku-4-5
23:15:20 [gateway] listening on ws://127.0.0.1:18789, ws://[::1]:18789 (PID 38968)
23:15:20 [gateway] log file: /tmp/openclaw/openclaw-2026-02-27.log
23:15:20 [browser/server] Browser control listening on http://127.0.0.1:18791/ (auth=token)
23:15:21 [ws] ← open remoteAddr=127.0.0.1 conn=22221a74…7c73
23:15:22 [ws] ← connect client=openclaw-control-ui version=dev mode=webchat clientId=openclaw-control-ui platform=Win32 auth=device-token
23:15:22 [ws] webchat connected conn=22221a74-5ba6-4805-af75-a53e37bf7c73 remote=127.0.0.1 client=openclaw-control-ui webchat vdev
23:15:22 [ws] → hello-ok methods=92 events=19 presence=2 stateVersion=2
23:15:22 [ws] → event health seq=1 clients=1 presenceVersion=2 healthVersion=3
23:15:22 [ws] ⇄ res ✓ agent.identity.get 2ms id=fcb7caaa…0aad
23:15:22 [ws] ⇄ res ✓ agents.list 1ms id=11f7cd7d…3fea
23:15:22 [ws] ⇄ res ✓ tools.catalog 2ms id=20154065…b914
23:15:22 [ws] ⇄ res ✓ chat.history 17ms id=4bd110ea…0b60
23:15:22 [ws] ⇄ res ✓ sessions.list 3ms id=daf4be2d…e15d
23:15:22 [ws] ⇄ res ✓ node.list 29ms id=cf0dfdf3…edd3
23:15:22 [ws] ⇄ res ✓ device.pair.list 30ms id=ba360bfa…69b1
23:15:27 [ws] ⇄ res ✓ node.list 3ms id=beaaf91d…3591
23:15:50 [ws] → event tick seq=2 clients=1 dropIfSlow=true

spark cliffBOT
#

Nice — that log proves something important:

  • Your Gateway is listening only on loopback (ws://127.0.0.1:18789)
  • The WebChat/Control UI client connecting is local too (remoteAddr=127.0.0.1)
  • And it’s specifically client=openclaw-control-ui version=dev mode=webchat platform=Win32

So this does not look like “someone on the internet is hitting your gateway”. It looks like your local Control UI/WebChat client is what’s driving the runs (or reconnecting/resending).

What to do next (so we can catch the trigger)

We need to catch the exact chat.send that causes an embedded run start.

  1. Keep the gateway running like this (foreground with WS logs).
  2. Wait until you see the next token-burn “run” happen.
  3. As soon as you see an embedded run start: line, copy/paste ~30 lines before it and ~30 lines after it from that same terminal output.

We’re looking for a line like:

  • [ws] ⇄ req chat.send ...
#

right before the run starts.

That will tell us whether:

  • the UI is repeatedly sending the same message,
  • a reconnect is replaying a message,
  • or something in the UI is auto-sending (bug / dev build behavior).

Immediate cost reduction (recommended)

Since your session context was ~77k, every run is expensive. In WebChat, send as standalone messages:

  • /stop
  • /reset (or /new)

This prevents “every tiny ping costs 77k input tokens”.

Docs: https://docs.openclaw.ai/concepts/session

#

Quick sanity check

Do you have more than one Control UI/WebChat window/tab open (or open on another machine via remote desktop)? If yes, close all but one — duplicated clients can cause confusing resend/reconnect behavior.

Once you paste the snippet around the next embedded run start, we can pinpoint the exact cause and the fix (often it’s a reconnect loop or dev client quirk).

woven nimbus
#

15:22 [ws] ⇄ res ✓ node.list 29ms id=cf0dfdf3…edd3
23:15:22 [ws] ⇄ res ✓ device.pair.list 30ms id=ba360bfa…69b1
23:15:27 [ws] ⇄ res ✓ node.list 3ms id=beaaf91d…3591
23:15:50 [ws] → event tick seq=2 clients=1 dropIfSlow=true
23:16:20 [ws] → event health seq=3 clients=1 presenceVersion=2 healthVersion=4
23:16:20 [ws] → event tick seq=4 clients=1 dropIfSlow=true
23:16:27 [ws] ⇄ res ✓ node.list 4ms id=f725a450…7215
23:16:50 [ws] → event tick seq=5 clients=1 dropIfSlow=true
23:17:20 [ws] → event health seq=6 clients=1 presenceVersion=2 healthVersion=5
23:17:20 [ws] → event tick seq=7 clients=1 dropIfSlow=true
23:17:28 [ws] ⇄ res ✓ node.list 3ms id=e34f1db5…cdb6
23:17:50 [ws] → event tick seq=8 clients=1 dropIfSlow=true
23:18:20 [ws] → event health seq=9 clients=1 presenceVersion=2 healthVersion=6
23:18:20 [ws] → event tick seq=10 clients=1 dropIfSlow=true
23:18:28 [ws] ⇄ res ✓ node.list 6ms id=8eac9ffc…541b
23:18:50 [ws] → event tick seq=11 clients=1 dropIfSlow=true
23:19:20 [ws] → event health seq=12 clients=1 presenceVersion=2 healthVersion=7
23:19:20 [ws] → event tick seq=13 clients=1 dropIfSlow=true

this keeps repeating

spark cliffBOT
#

Yep — that repeating pattern is normal Control UI/WebChat behavior:

  • node.list is the UI polling for node status
  • event tick / event health are gateway → client keepalive/status events

None of that calls Anthropic, so it should not burn tokens.

What burns tokens is when you see lines like:

  • embedded run start ... provider=anthropic ... messageChannel=webchat
  • (or any model / anthropic request logs)

Quick check: are Anthropic calls still happening right now?

While that loop is repeating, run:

openclaw logs --follow --plain | grep "embedded run start"
#
  • If this is quiet (no output), you’re not currently burning tokens.
  • If it keeps printing embedded run start, then something is still auto-sending chat messages.

Figure out how many Anthropic runs happened during the “7.7M tokens in” hour

This will tell us if it was ~100 runs × ~77k context (which matches your billing almost perfectly).

On the gateway host:

grep "embedded run start" /tmp/openclaw/openclaw-2026-02-27.log | wc -l
grep "embedded run start" /tmp/openclaw/openclaw-2026-02-27.log | tail -n 30

(You can paste just the count + the last few timestamps; no secrets.)

If you do still see embedded run start happening

Then we need the line that triggers it (the WS request). Keep the gateway running in this verbose mode, and when the next embedded run start happens, paste ~30 lines before it — we’re looking for a chat.send request right before the run.

#

Also, since your session context was ~77k: send /reset in WebChat to stop paying that huge prompt on every run.

woven nimbus
#

2026-02-27T22:00:12.383Z debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=9fbd8741-b706-4292-8a8a-5adee74623f2 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=anthropic model=claude-haiku-4-5 thinking=low messageChannel=webchat
2026-02-27T22:00:12.774Z debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=9fbd8741-b706-4292-8a8a-5adee74623f2 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=openai model=gpt-5.1 thinking=low messageChannel=webchat
2026-02-27T22:07:49.812Z debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=e8370974-233b-4e24-9729-c46d519aeeb2 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=anthropic model=claude-haiku-4-5 thinking=low messageChannel=webchat
2026-02-27T22:07:50.102Z debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=e8370974-233b-4e24-9729-c46d519aeeb2 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=openai model=gpt-5.1 thinking=low messageChannel=webchat

spark cliffBOT
#

Those last lines are the key:

runId=9fbd... provider=anthropic model=claude-haiku-4-5
runId=9fbd... provider=openai    model=gpt-5.1

Same runId, two different providers/models ⇒ a single “turn” is triggering multiple model calls (either failover/retry logic, or some config that does multi-model routing). So your Anthropic spend can blow up fast even if the UI only shows “session cached context = 77k”.

Also: the repeating node.list / event tick / event health spam you saw earlier is just UI polling and does not call Anthropic. The expensive part is only when you see embedded run start … provider=anthropic.

1) Confirm how many Anthropic calls happened (and whether there were retries)

On the gateway host:

grep "provider=anthropic model=claude-haiku-4-5" /tmp/openclaw/openclaw-2026-02-27.log | wc -l
#
grep "provider=openai model=gpt-5.1" /tmp/openclaw/openclaw-2026-02-27.log | wc -l

If those counts are close, you’re effectively doing ~2 model calls per turn.

2) We need the reason OpenAI is getting invoked for the same runId

Grab surrounding log context for one runId (this usually reveals “fallback to …” / an error / a policy decision):

grep -C 80 "runId=9fbd8741-b706-4292-8a8a-5adee74623f2" /tmp/openclaw/openclaw-2026-02-27.log

Paste that block (you can redact any message text / secrets). We’re looking for lines like retry, fallback, error, timeout, 429, etc.

3) Immediate “don’t pay 77k input per run” fix

In WebChat, send as standalone:

#
  • /reset (or /new) — starts a fresh session id so the next run isn’t dragging the 77k-token transcript along.
  • optionally /compact later, once things are stable.

Docs: https://docs.openclaw.ai/concepts/session
Compaction: https://docs.openclaw.ai/concepts/compaction

4) Prevent giant tool outputs from bloating future prompts

Enable Anthropic session pruning (this specifically targets old tool results, which are usually what makes sessions huge):

Docs: https://docs.openclaw.ai/concepts/session-pruning

—

If you paste the grep -C 80 runId=... output, we can tell you exactly why it’s doing Anthropic and OpenAI for the same run, and how to stop the extra calls.

woven nimbus
#

 grep "provider=anthropic model=claude-haiku-4-5" /tmp/openclaw/openclaw-2026-02-27.log | wc -l
42
  mainframe-pc  mainframe  ~  grep "provider=openai model=gpt-5.1" /tmp/openclaw/openclaw-2026-02-27.log | wc -l
10

So Haiku 42 and 5.1gpt 10

The grep command for open AI output was extremely long

spark cliffBOT
#

42 Haiku calls explains a lot.

If Anthropic shows 7,718,664 tokens in and your log shows 42 provider=anthropic model=claude-haiku-4-5 runs, that averages to about:

  • 7,718,664 / 42 ≈ 183,800 input tokens per Haiku call

So during the burn, your prompt/context going into Anthropic was likely way closer to ~180k (near the 200k window) — even if /context detail later showed 77k cached (that number can change after resets/compactions/pruning and it’s “current session”, not “what it was an hour ago”).

The 10 gpt-5.1 runs strongly suggest model failover / fallback kicked in sometimes (docs: https://docs.openclaw.ai/concepts/model-failover) — usually because the primary provider hit a timeout / rate limit / transient error.

Next: find why OpenAI got invoked + whether Anthropic was erroring/retrying

Do these (and paste the results, redacting any secrets/message bodies):

  1. Show the last few OpenAI-fallback occurrences (so we can grab runIds):
grep "provider=openai model=gpt-5.1" /tmp/openclaw/openclaw-2026-02-27.log | tail -n 5
#
  1. Pick one runId=... from that output and pull context around it:
grep -C 120 "runId=PASTE_RUNID_HERE" /tmp/openclaw/openclaw-2026-02-27.log | tail -n 260
  1. Quick scan for the usual culprits:
grep -Ei "rate_limit|429|timeout|overloaded|cooldown|fallback|error" /tmp/openclaw/openclaw-2026-02-27.log | tail -n 100

Immediate mitigation (so this never hurts this badly again)

In WebChat, do a standalone:

  • /reset (or /new) — stops paying the giant session transcript every time
  • then continue work in the fresh session

And long-term, enable session pruning for Anthropic so old tool results don’t balloon prompts:
https://docs.openclaw.ai/concepts/session-pruning

#

If you paste the runId context for one of the OpenAI fallbacks, we can point to the exact error/reason it happened (and how to stop it).

woven nimbus
#

I also had Sonnet and Opus. How can I correct this error? Should I do a fresh re-install of openClaw? How can I fully uninstall it?

spark cliffBOT
#

Reinstalling OpenClaw won’t “fix” what you saw — there isn’t a single accounting bug here so much as: you had a huge session context, and OpenClaw made dozens of Anthropic calls. Anthropic bills per request, so even “light usage” can get expensive if each request is dragging ~100k–200k tokens of transcript/tool output.

You already measured 42 Haiku runs in the hour. With ~7.7M input tokens billed, that’s ~180k input tokens/run on average — totally plausible if the session had grown massive.

How to correct it (so it doesn’t happen again)

  1. Start a fresh session when context gets big

  2. Use compaction / pruning

  3. If you’re worried the key might be compromised anyway: rotate the Anthropic API key (separate from OpenClaw).

“Should I do a fresh reinstall?”

Only if you want a clean slate for sanity. It’s not required for the token issue.

#

A good “clean slate but keep the CLI installed” option is:

  • Reset state/config/sessions (keeps CLI installed):
    openclaw reset
    # or fully non-interactive:
    openclaw reset --scope config+creds+sessions --yes --non-interactive
    
    Docs: https://docs.openclaw.ai/cli/reset

This wipes the stuff that causes giant session context (sessions), plus creds/config if you choose that scope.

How to fully uninstall OpenClaw

Follow the official uninstall guide: https://docs.openclaw.ai/install/uninstall

Quick version (if the CLI is still installed):

#
openclaw uninstall
# or:
openclaw uninstall --all --yes --non-interactive

That doc also covers manual removal if the service is still running but the CLI is gone, and it lists what state dirs get deleted (including profile-based ones like ~/.openclaw-<profile>).

If you tell me your gateway host OS (Linux/Windows/macOS) and whether you installed via install.sh/npm/pnpm or from a git checkout, I can point you to the exact uninstall path to use.

woven nimbus
#

That still doesn't make sense. It should never be 180,000 input tokens per call we need to correct that specifically. How do I stop it from trying to use all of that at once?

spark cliffBOT
#

180,000 input tokens per call only happens if OpenClaw is actually sending the model ~180k tokens of session transcript (plus any recent tool outputs). It’s not your injected files (those were ~3–4k). So the thing to “correct” is: stop the session from growing huge / stop huge tool results from staying in-context.

Here’s how to stop it from trying to use all of that at once:

Immediate fix (stops big prompts right now)

In WebChat / Control UI, send as standalone:

  • /reset (or /new)

That starts a new sessionId, so the next model call won’t drag the giant transcript. Session docs: https://docs.openclaw.ai/concepts/session

If you want to preserve some context first, do:

Prevent it long-term: prune old tool output (this is the big one)

Most “my prompt exploded” cases are from exec tool output (logs, directory listings, huge file dumps) getting stuck in the session history.

#

Enable Anthropic session pruning so old toolResult blocks get trimmed/cleared before each Anthropic call:
https://docs.openclaw.ai/concepts/session-pruning

That doc includes the config shape; the simplest starting point is enabling agents.defaults.contextPruning (mode cache-ttl).

Stop generating huge tool output in the first place

When you (or the agent) runs exec, ensure commands are naturally bounded:

  • prefer tail -n 200, head -n 200, sed -n '1,200p', etc.
  • avoid cat on large logs / recursive ls / dumping whole JSONL transcripts

Even if OpenClaw truncates some tool outputs, “big tool output + long session” is how you end up near 200k context.

Optional: make sessions auto-reset sooner

If you want to avoid ever accumulating enormous context, shorten the idle reset window via config (docs mention session.reset.idleMinutes or legacy session.idleMinutes):
https://docs.openclaw.ai/reference/session-management-compaction

#

That way, after some idle time, the next message starts fresh automatically.


If you do just one thing: /reset, then turn on session pruning for Anthropic. That directly prevents “it tries to use 180k tokens at once” from happening again.

regal knot
#

go into your openclaw.json

#

set concurrency to 1 and the number of sub agents to 2

spark cliffBOT
#

I can’t open your ~/.openclaw/openclaw.json from here, but I can tell you exactly what to put in it to stop the “grab the entire gigantic session every call” behavior.

The 2 config knobs you want

  1. Hard cap how much context OpenClaw will try to stuff into a prompt
    Set agents.defaults.contextTokens to something sane (e.g. 40k–60k). This forces earlier compaction/pruning behavior instead of letting prompts drift toward 200k.

  2. Prune old tool results before Anthropic calls (biggest source of “oops 180k” is exec output / sessions_history output sitting in the transcript)
    Enable agents.defaults.contextPruning in cache-ttl mode.

Docs for pruning: https://docs.openclaw.ai/concepts/session-pruning (shows the exact field names)

Edit openclaw.json

On the gateway host:

nano ~/.openclaw/openclaw.json
#

Add something like this (JSON5 is allowed):

{
  agents: {
    defaults: {
      // Hard cap prompt size (prevents 180k+ token prompts).
      // Try 60000 first; if you want it tighter, drop to 40000.
      contextTokens: 60000,

      // Trim/clear old tool outputs from the in-memory prompt before Anthropic calls.
      // This prevents giant exec/session-history tool results from bloating prompts.
      contextPruning: {
        mode: "cache-ttl",
        ttl: "5m",
      },
#

      // Optional but recommended: keep compaction on (default behavior),
      // and use /compact manually when needed.
      compaction: { mode: "safeguard" },
    },
  },
}

Then restart the gateway:

openclaw gateway restart

Immediate cleanup (do this once in WebChat)

#

Even with the new config, your current session is already huge. In WebChat send:

  • /reset (or /new)

That starts a fresh session id so you’re not dragging the old transcript.

Optional “extra safety”

If you don’t want the model to ever pull huge transcript chunks via tool calls, you can also deny just that tool:

Docs: https://docs.openclaw.ai/tools

{
  tools: {
    deny: ["sessions_history"],
  },
#
}

If you paste your current ~/.openclaw/openclaw.json (redact secrets), I can tell you exactly where to merge this without breaking your existing config.

regal knot
#

it by default does 4 and 8 i had found that out by having to build a free key generator

spark cliffBOT
#

To cap concurrency you do it in ~/.openclaw/openclaw.json (JSON5).

1) Global run concurrency = 1

This controls how many main agent runs can execute in parallel (across sessions). The queue docs call this agents.defaults.maxConcurrent.
Docs: https://docs.openclaw.ai/concepts/queue

2) Max concurrent sub-agents = 2

This controls how many sub-agent runs can run at the same time: agents.defaults.subagents.maxConcurrent.
(Referenced in the config reference + subagents docs.) Docs: https://docs.openclaw.ai/tools/subagents

Put this in your openclaw.json:

// ~/.openclaw/openclaw.json
{
  agents: {
#
    defaults: {
      maxConcurrent: 1,

      subagents: {
        maxConcurrent: 2,
      },
    },
  },
}

Then restart the gateway:

openclaw gateway restart
#

Important: this will reduce parallelism (good for “stop the runaway”), but it does not by itself stop 180k-token prompts. For that, you still want to (a) /reset when a session gets huge, and (b) enable session pruning so old tool output doesn’t bloat prompts: https://docs.openclaw.ai/concepts/session-pruning.

If you paste your current ~/.openclaw/openclaw.json (redact secrets), I can tell you exactly where to merge those keys without clobbering anything you already have.

#

Yeah — the defaults really are main lane = 4 and subagent lane = 8 (that’s from the command queue design), so you don’t need to “discover” it the hard way.

But: please don’t build or use any “free key generator” for Anthropic/OpenAI/etc. If you mean provider API keys, those must come from the provider. If you meant a gateway token (OpenClaw auth token), that’s different and totally fine to generate locally.

If you just want to cap concurrency

Put this in ~/.openclaw/openclaw.json:

{
  agents: {
    defaults: {
      maxConcurrent: 1,        // main runs overall parallelism cap
      subagents: {
        maxConcurrent: 2,      // sub-agent parallelism cap
      },
    },
#
  },
}

Then restart:

openclaw gateway restart

If you meant “I generated a token for the gateway UI”

Use a normal random token. For example:

openclaw doctor --generate-gateway-token
#

Note: that command only sets it if missing. To rotate manually, set gateway.auth.token and restart.

Important: concurrency doesn’t fix “180k input tokens”

Concurrency just limits how many runs at once. To stop massive prompts:

  • In WebChat: /reset (fresh session id)
  • Enable session pruning (trims old tool results before Anthropic calls): https://docs.openclaw.ai/concepts/session-pruning
  • Optionally set a hard cap like agents.defaults.contextTokens: 60000 so it can’t balloon toward 200k.

If you paste your current openclaw.json (redact secrets), I can point out exactly where to add the pruning + context cap safely.

regal knot
#

no but if he has 4 concurrent requests moving with retries and massive amount of tokens x the 8 subagents running shit that will run the bill up sky high especially if the memories file is huge and so is the other shit it passes in the api calls to maintain its personality

spark cliffBOT
#

You’re not wrong that “many runs × huge context” can melt a bill fast — but it’s worth separating what actually multiplies cost:

1) Concurrency doesn’t increase tokens per request

It increases how many requests can be in-flight at the same time, so it can make the burn happen faster in wall-clock time, but it doesn’t by itself make each call 180k tokens.

The reason Lucky saw ~180k-ish average input is: the session transcript/tool outputs being sent to the model were enormous. That’s the real multiplier.

2) Sub-agents can absolutely add cost (but only if you’re spawning them)

Each sub-agent has its own session/context and token usage (docs: https://docs.openclaw.ai/tools/subagents). If you have automation spawning subagents (or you manually spawn them), then yes: multiple parallel subagent runs can stack up.

But in Lucky’s logs, the expensive runs were tagged messageChannel=webchat and looked like repeated “normal” runs, not a swarm of subagents.

3) “Personality / memory files” usually aren’t the 180k culprit

The injected files (AGENTS.md/SOUL.md/etc) are typically a few thousand tokens. Lucky’s /context detail showed that clearly.

What makes prompts explode to 100k–200k is almost always:

  • long session transcript history, and/or
#
  • tool results (especially exec output like logs, directory dumps, etc.) sticking around in the transcript.

What actually fixes the “massive context per call” problem

For Lucky (and anyone else seeing this), the practical fixes are:

  1. Reset the session when it gets bloated: /reset or /new
    https://docs.openclaw.ai/concepts/session

  2. Enable Anthropic session pruning so old tool results don’t keep inflating the prompt:
    https://docs.openclaw.ai/concepts/session-pruning

  3. Optionally set a hard cap so it can’t try to approach 200k tokens:

    • agents.defaults.contextTokens: 60000 (or 40000)

If you still want the safety valve: cap concurrency/subagents

This is reasonable to slow runaway situations down:

#
{
  agents: {
    defaults: {
      maxConcurrent: 1,
      subagents: { maxConcurrent: 2 },
    },
  },
}

Docs on the queue/concurrency model: https://docs.openclaw.ai/concepts/queue

But the “sky high” bill only happens if something is triggering lots of runs and each run has a huge prompt. Concurrency just changes how quickly you find out.

woven nimbus
spark cliffBOT
#

You almost certainly don’t “have 8 sub-agents running” — you’re mixing up default concurrency caps with actual sub-agent runs.

1) The “4 and 8” are just caps, not things that automatically happen

OpenClaw has an internal queue:

  • main lane default concurrency = 4
  • subagent lane default concurrency = 8

That just means “if there are multiple independent sessions/runs ready, the gateway may run up to this many at once.” It does not spawn 8 subagents or create extra LLM calls by itself.
Docs: https://docs.openclaw.ai/concepts/queue

2) You only have sub-agents if you (or automation) spawned them

To verify, in WebChat send:

  • /subagents list

If it’s empty, you have zero subagents. If you see any, you can stop them with /subagents kill all.

#

Docs: https://docs.openclaw.ai/tools/subagents

3) “Retries” usually come from one of these

Retries/fallbacks happen when a run hits transient failures, e.g.:

  • timeouts
  • 429 rate limits
  • provider overload errors
  • tool/schema/provider errors
  • auto-compaction can also cause “do work → retry original request”

And if model failover is configured, OpenClaw may fall back to another model/provider (which matches your logs where a runId hit Anthropic and then OpenAI).
Docs: https://docs.openclaw.ai/concepts/model-failover
Compaction: https://docs.openclaw.ai/concepts/compaction

4) If you want to “force it to be calm”: set concurrency caps anyway

Even though caps aren’t the cause of the huge per-call tokens, it’s a good safety valve to slow down runaway situations:

#
{
  agents: {
    defaults: {
      maxConcurrent: 1,
      subagents: { maxConcurrent: 2 },
    },
  },
}

Then openclaw gateway restart.

5) The real fix for “180k tokens per call”

That happens because the session transcript + tool results got enormous, so every run drags a massive prompt.

To stop that:

#

If you paste the output of /subagents list and /status from WebChat, I can tell you whether you actually have parallel work happening, or whether it’s just a single huge session getting retried/failing over.

woven nimbus
#

How can I enable session pruning?

spark cliffBOT
#

Enable it in your Gateway config (~/.openclaw/openclaw.json) under agents.defaults.contextPruning.

  1. Edit config (JSON5 is fine):
nano ~/.openclaw/openclaw.json
  1. Add this (merge into your existing file; don’t replace the whole thing):
{
  agents: {
    defaults: {
      // Prunes old tool results from the in-memory prompt before Anthropic calls
      // (does NOT rewrite your on-disk transcript).
      contextPruning: {
#
        mode: "cache-ttl",
        ttl: "5m",
        // optional tuning (defaults are fine):
        // keepLastAssistants: 3,
        // softTrimRatio: 0.3,
        // hardClearRatio: 0.5,
      },
    },
  },
}
  1. Restart the gateway:
openclaw gateway restart
#

Docs (explains exactly what it prunes + defaults): https://docs.openclaw.ai/concepts/session-pruning

Strongly recommended: after enabling pruning, do a one-time /reset in WebChat so you immediately stop paying for the already-bloated session history.

regal knot
#

lol

spark cliffBOT
#

Yep — 4 (main lane) and 8 (subagent lane) are just the default queue concurrency caps (documented), not “8 subagents running”. See the Command Queue docs: https://docs.openclaw.ai/concepts/queue

Also: if you’re “rotating free keys” / using multiple Gmail accounts to dodge provider limits and accounts got banned — that’s squarely in “provider ToS enforcement” territory. OpenClaw supports legit auth profile selection/rotation when you have multiple valid credentials (docs: https://docs.openclaw.ai/concepts/model-failover), but it’s not meant to help bypass rate limits/bans.

If you want to keep the gateway from blasting requests in parallel, set the caps explicitly:

// ~/.openclaw/openclaw.json
{
  agents: {
    defaults: {
      maxConcurrent: 1,      // overall main runs in parallel
      subagents: { maxConcurrent: 2 }, // max subagent runs in parallel
    },
  },
}
#

Restart after changes:

openclaw gateway restart

And the actual fix for “it ate a ton of tokens” is usually session bloat, not concurrency — enable session pruning (trims old tool results before Anthropic calls):
https://docs.openclaw.ai/concepts/session-pruning

#

Yeah, fair.

Just to be super clear though: OpenClaw’s built-in “rotation/failover” stuff is meant for multiple legitimate credentials (e.g., work + personal, backup key, different providers) and handling normal failures/timeouts — not for cycling “free keys” / bypassing rate limits or bans. I can’t help with anything in that direction.

If the goal is “don’t accidentally torch tokens again”, the boring fixes are the real ones:

  • cap concurrency (agents.defaults.maxConcurrent, agents.defaults.subagents.maxConcurrent)
  • enable session pruning (cuts old tool output from prompts)
  • /reset when a session balloons

Those 3 solve 95% of the “why did this explode” cases without any sketchy key gymnastics.

regal knot
#

this setting // ~/.openclaw/openclaw.json
{
agents: {
defaults: {
maxConcurrent: 1, // overall main runs in parallel
subagents: { maxConcurrent: 2 }, // max subagent runs in parallel
},
},
} yes

spark cliffBOT
#

Yeah… that’s a mess.

For your situation though: you don’t need any “key rotator” nonsense to fix this. The real issue was lots of runs + giant context per run.

If you want the “do these 3 things and you’ll be safe” checklist:

  1. Reset the bloated session

    • In WebChat: /reset (or /new)
      This immediately stops the “every message costs 100k+ input tokens” problem.
  2. Enable session pruning (so tool output can’t balloon prompts again)

#

contextPruning: { mode: "cache-ttl", ttl: "5m" },
},
},
}
```

  • Then: openclaw gateway restart
  1. Cap concurrency as a safety valve
    {
      agents: {
        defaults: {
          maxConcurrent: 1,
          subagents: { maxConcurrent: 2 },
        },
      },
    
#
   }

If you paste your current openclaw.json (redact provider keys/tokens), I’ll tell you exactly how to combine all three into one clean config block without breaking your existing setup.

#

Yep, that config block is correct (JSON5 comments and trailing commas are fine in openclaw.json).

Next steps:

  1. Put it into ~/.openclaw/openclaw.json (merge with what you already have, don’t overwrite other sections).
  2. Restart the gateway:
    openclaw gateway restart
    
  3. Sanity-check in logs that it restarted cleanly.

Two important notes (so expectations are set):

  • This limits how fast things can run in parallel (good safety valve), but it won’t fix “180k input tokens per call” by itself.
  • To fix the huge-per-call prompt size, also do:
regal knot
#

openclaw dont fuck around once them claws open that mf gone

spark cliffBOT
#

Truth. Once you accidentally let a session bloat + start looping tool calls, it’ll happily shovel a 100k–200k-token prompt into the provider over and over.

The “don’t fuck around” safety rails that actually work are:

  • /reset (fresh session when context gets big)
  • session pruning (cuts old tool output from the prompt): https://docs.openclaw.ai/concepts/session-pruning
  • cap concurrency (maxConcurrent: 1, subagents maxConcurrent: 2) so a runaway can’t accelerate
  • and if you’re paranoid: rotate provider keys + gateway token after an incident

That combo turns “gone instantly” into “you’ll notice and stop it before it hurts.”

woven nimbus
spark cliffBOT
#

7M input tokens can absolutely happen from “normal chat” if that chat got long and you kept feeding it lots of business context — because every new reply includes a big chunk of the entire prior transcript again. In your own logs you had 42 Haiku runs, which means you averaged roughly:

  • 7,718,664 / 42 ≈ 184k input tokens per call

That’s basically “your session prompt was near the 200k window”, over and over.

The fix is to stop the session from ever getting that big (and to stop giant tool outputs from sticking in the prompt).

Do this now (stops the burn immediately)

In WebChat / Control UI:

  1. /stop
  2. /reset (or /new)

That starts a fresh session id, so the next call won’t drag the huge transcript.

Make it not happen again (config)

Edit ~/.openclaw/openclaw.json and add these under agents.defaults (merge with what you have):

#
{
  agents: {
    defaults: {
      // Safety valve: slow down runaways
      maxConcurrent: 1,
      subagents: { maxConcurrent: 2 },

      // Hard cap how much context the agent will try to stuff into one call
      contextTokens: 60000,

      // Prevent old tool outputs from bloating Anthropic prompts
      contextPruning: {
        mode: "cache-ttl",
        ttl: "5m",
      },
#
    },
  },
}

Then:

openclaw gateway restart

Pruning doc (this is the key feature): https://docs.openclaw.ai/concepts/session-pruning

Workflow change (this is what “teaching it about my businesses” should look like)

Instead of pasting/rehashing business context every few messages (which explodes transcript size), do this:

  • Put your business info in a workspace file (e.g. business-context.md)
#
  • Ask the agent to refer to that file / specific sections
  • Use /reset when switching topics, and /compact if a thread is getting long

Compaction doc: https://docs.openclaw.ai/concepts/compaction

Quick sanity check after you reset

Run /context detail again. If it still shows tens of thousands of “Session tokens (cached)” immediately after a reset, something else is wrong (like repeated message replay / tool output dumping), and we should look at the log block around a single runId to see what’s injecting the bulk.

If you want, paste your current openclaw.json (redact secrets) and I’ll tell you the exact merge so you don’t clobber any existing settings.

regal knot
#

yea before you deploy it you need to grab free models and train it then pay for the shit thats why i said get google gemini keys rotate them and you can get a dev account by telling them your ai shopping give you 50 projects instead of just 10

spark cliffBOT
#

Can’t endorse any of that (rotating keys, farming “free” quotas, misrepresenting use-cases to get more projects, etc.). That’s basically ToS evasion / account abuse, and it’s also a great way to get your accounts nuked (as you already saw).

Also: you don’t “train OpenClaw” before deploying. It’s not doing fine-tuning — it’s building a prompt each turn. The cost problem here is prompt bloat + repeated runs, not “needs training”.

If you want to keep cost sane legitimately, the fixes are:

  • Reset/segment sessions: use /reset / /new when a convo gets long.
  • Enable session pruning so old tool output doesn’t keep inflating prompts: https://docs.openclaw.ai/concepts/session-pruning
  • Cap concurrency (safety valve): agents.defaults.maxConcurrent and agents.defaults.subagents.maxConcurrent.
  • Hard cap prompt size with agents.defaults.contextTokens (e.g. 40k–60k) so you never drift toward 200k again.
  • Put big “business context” in workspace files and have the agent reference them, instead of pasting/repeating it in chat.

If Lucky still sees runaway calls after /reset + pruning + contextTokens cap, that’s when we look for a bug (log block around a runId) and file a GitHub issue.

regal knot
#

separate keys for each project

spark cliffBOT
#

Separate keys per project is fine as billing hygiene (it helps you isolate spend and rotate/revoke safely), but it won’t fix the “180k tokens per call” problem by itself — that’s caused by a bloated session prompt/tool output being resent repeatedly.

If you do want separate keys in OpenClaw (legit, no quota-evasion stuff):

  • OpenClaw uses auth profiles (keys/OAuth) and can rotate/fail over between them when configured.
  • Profiles are stored under ~/.openclaw/agents/<agentId>/agent/auth-profiles.json (details + rotation rules): https://docs.openclaw.ai/concepts/model-failover

Two sane patterns:

  1. One Anthropic key dedicated to OpenClaw, nothing else uses it (best for debugging spend).
  2. Separate OpenClaw profiles (e.g. “dev” vs “prod”), and optionally separate OpenClaw state dirs using profiles (OPENCLAW_PROFILE) so sessions/config don’t mix.

But for Lucky’s actual burn: do /reset, enable session pruning, and consider a contextTokens cap — that’s what stops giant prompts. Pruning doc: https://docs.openclaw.ai/concepts/session-pruning

woven nimbus
#

23:50:09 debug diagnostic {"subsystem":"diagnostic"} lane dequeue: lane=main waitMs=1 queueSize=0
23:50:09 debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=d63d1c0c-e778-4784-b6f8-a3b825710e94 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 provider=openai model=gpt-5.1 thinking=low messageChannel=webchat
23:50:09 debug diagnostic {"subsystem":"diagnostic"} run registered: sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02 totalActive=1
23:50:09 debug agent/embedded {"subsystem":"agent/embedded"} embedded run prompt start: runId=d63d1c0c-e778-4784-b6f8-a3b825710e94 sessionId=640c9791-7992-48c5-b7f7-f80afc25ab02
23:50:09 debug agent/embedded {"subsystem":"agent/embedded"} embedded run agent start: runId=d63d1c0c-e778-4784-b6f8-a3b825710e94
23:50:43 debug agent/embedded {"subsystem":"agent/embedded"} embedded run tool start: runId=d63d1c0c-e778-4784-b6f8-a3b825710e94 tool=write toolCallId=call_ClANZRHrDP8XuoKIfEnz3kU9|fc_0b84cdee9078f0a00069a37f372250819791c27b719a38fcdc
23:50:43 debug agent/embedded {"subsystem":"agent/embedded"} embedded run tool end: runId=d63d1c0c-e778-4784-b6f8-a3b825710e94 tool=write toolCallId=call_ClANZRHrDP8XuoKIfEnz3kU9|fc_0b84cdee9078f0a00069a37f372250819791c27b719a38fcdc
23:50:46 debug agent/embedded {"subsystem":"agent/embedded"} embedded run tool start: runId=d63d1c0c-e778-4784-b6f8-a3b825710e94 tool=exec toolCallId=call_38FFtD0H3wJskZ2t7jtXFxOa|fc_0b84cdee9078f0a00069a37f55707c8197b7458ad424184be8
23:50:48 debug agent/embedded {"subsystem":"agent/embedded"} embedded run tool end: runId=d63d1c0c-e778-4784-b6f8-a3b825710e94 tool=exec toolCallId=call_38FFtD0H3wJskZ2t7jtXFxOa|fc_0b84cdee9078f0a00069a37f55707c8197b7458ad424184be8
23:50:53 debug agent/embedded {"subsystem":"agent/embedded"} embedded run agent end: runId=d63d1c0c-e778-4784-b6f8-a3b825710e94 isError=false
23:50:53 debug agent/embedded {"subsystem":"agent/embedded"} embedded run

#

sent 1.8m tokens in a very short amount of time

#

heavy pruning and caching

spark cliffBOT
#

That log snippet is actually consistent with what we’ve been diagnosing:

  • You’re still on the same huge session: sessionId=640c9791-... (same one from earlier).
  • That run used OpenAI: provider=openai model=gpt-5.1 (so Anthropic-only session pruning won’t apply to this specific call).
  • The agent is also using tools: tool=write + tool=exec (tool results—especially exec—are one of the main ways sessions balloon).

The single biggest fix you haven’t done yet

Reset the session. In WebChat, send as a standalone message:

  • /reset (or /new)

Then watch the next run in logs: you should see a new sessionId. That immediately stops “every reply drags a gigantic transcript”.

Why you’re seeing OpenAI at all

That’s usually one of:

  • you manually set /model gpt-5.1 at some point, or
  • you have model fallbacks enabled and Anthropic had a timeout/rate-limit and it fell back.
#

If you want to stop fallback to OpenAI, set your primary model + empty fallbacks in openclaw.json (docs show the exact keys):
https://docs.openclaw.ai/gateway/configuration (see “Choose and configure models”)

Example shape:

{
  agents: {
    defaults: {
      model: {
        primary: "anthropic/claude-haiku-4-5",
        fallbacks: [],
      },
    },
  },
}
#

If your goal is “regular conversation should never run tools”

Lock down tools so the model can’t write/exec unless you explicitly want it to:

Docs: https://docs.openclaw.ai/tools

For example:

{
  tools: {
    deny: ["group:runtime", "group:fs"], // blocks exec/bash/process + read/write/edit/apply_patch
  },
}

Reminder about pruning

Session pruning (the built-in one) is Anthropic-only:
https://docs.openclaw.ai/concepts/session-pruning

#

So if you’re currently running gpt-5.1, pruning won’t be the mechanism that saves you — /reset + tool restrictions + context cap (agents.defaults.contextTokens) will.

If you paste your current agents.defaults.model block (redact secrets), I can tell you exactly why it’s picking gpt-5.1 and how to force Anthropic only.

#

1.8M tokens “in” in a short time basically means: you were still making calls while the session prompt was enormous (often 100k–200k tokens), so every reply was re-sending a giant transcript/tool history.

To get heavy pruning + caching working, you want 3 things in place:

0) Make sure you’re not still on the bloated session

In WebChat, send /reset (or /new) as a standalone message, then confirm in logs that the next run has a new sessionId (not 640c9791-...).

If you keep the same sessionId, you’ll keep paying for the huge context no matter what.

1) Enable session pruning (Anthropic only)

This prunes old tool results from the in-memory prompt before Anthropic calls (doesn’t rewrite your transcript).

Docs: https://docs.openclaw.ai/concepts/session-pruning

// ~/.openclaw/openclaw.json
#
{
  agents: {
    defaults: {
      contextPruning: {
        mode: "cache-ttl",
        ttl: "5m", // match cacheRetention short (5m) or long (1h)
      },
    },
  },
}

Restart:

openclaw gateway restart
#

2) Turn on prompt caching correctly (Anthropic)

Caching is controlled by cacheRetention on the model params (none|short|long).

Docs: https://docs.openclaw.ai/reference/prompt-caching

Example (explicitly set it so there’s no ambiguity):

{
  agents: {
    defaults: {
      model: { primary: "anthropic/claude-haiku-4-5" },

      models: {
        "anthropic/claude-haiku-4-5": {
          params: { cacheRetention: "short" }, // "short"=5m, "long"=1h
#
        },
      },
    },
  },
}

Notes:

  • short is usually the safest default (reduces waste without huge cache-write bills).
  • long can help for long-running sessions, but if your prompt is gigantic, cache writes can still be expensive.

3) Hard-cap how much context it’s allowed to send

This is the “never again 180k” guardrail:

#
{
  agents: {
    defaults: {
      contextTokens: 60000, // try 60k; drop to 40k if you want it tighter
    },
  },
}

Important gotcha: your logs show provider=openai model=gpt-5.1

Anthropic pruning + Anthropic caching do nothing on OpenAI calls. If you’re falling back to OpenAI, you need to decide if that’s intended.

If you want to force Anthropic only (no fallback), set:

#
  • /usage full (shows per-response input/output + cacheRead/cacheWrite when available)

Docs: https://docs.openclaw.ai/reference/token-use

If you paste your current agents.defaults.model / agents.defaults.models block (redact keys), I can tell you why you’re hitting gpt-5.1 and give you the exact merged config that enables pruning+caching+context cap cleanly.

woven nimbus
#

I am now using 5.1, but the idea was, why is it sending still that many tokens per request? How can I configure it to not do that?

spark cliffBOT
#

If you’re on gpt-5.1 and it’s still sending huge prompts, that’s almost always because the session transcript is huge (plus tool outputs). OpenClaw builds the prompt by including enough recent history to keep continuity — so if you’ve been “teaching it your businesses” in one long thread, each new turn can easily drag tens/hundreds of thousands of tokens again.

Step 1 (immediate): start a fresh session ID

Even perfect config won’t shrink an already-bloated session.

In WebChat send as a standalone message:

  • /reset (or /new)

Then check logs: you should see a different sessionId= on the next run.

Step 2 (the real fix): hard-cap prompt size

Add a cap so OpenClaw never even tries to build a 180k token prompt.

Edit ~/.openclaw/openclaw.json and add:

#
{
  agents: {
    defaults: {
      contextTokens: 60000, // try 60k; if you want stricter, use 40000
      compaction: { mode: "safeguard" }, // keep compaction on so it can stay under the cap
    },
  },
}

Restart:

openclaw gateway restart
#

(Compaction overview: https://docs.openclaw.ai/concepts/compaction)

Step 3 (optional but recommended): auto-reset on idle so sessions don’t grow forever

From the config docs, you can set idle resets:

{
  session: {
    reset: {
      mode: "daily",
      idleMinutes: 60,
      // atHour: 4, // optional daily reset hour
    },
  },
}
#

(Session reset keys are covered in the config docs; session concepts here: https://docs.openclaw.ai/concepts/session)

Step 4 (if you’re “just chatting”): stop tool output from bloating history

Your earlier logs showed write + exec. exec output is a common reason prompts balloon.

If this agent is meant for conversation, consider denying runtime tools:

{
  tools: {
    deny: ["group:runtime"], // blocks exec/bash/process
  },
}

(Tools policy: https://docs.openclaw.ai/tools)

#

Verify it’s working

In WebChat, run:

  • /usage full (shows per-response input tokens so you can confirm you’re no longer sending 100k+ per turn)

(Token/usage docs: https://docs.openclaw.ai/reference/token-use)

If you paste your current openclaw.json minus secrets, I can point out exactly where to insert contextTokens / reset / tool deny without breaking your existing config.

woven nimbus
#

For number two, do I just copy and paste that in?

spark cliffBOT
#

Yes — copy/paste it into ~/.openclaw/openclaw.json, but merge it into your existing JSON5, don’t replace the whole file.

Easiest safe way

  1. Open the config:
nano ~/.openclaw/openclaw.json
  1. If you already have an agents: { defaults: { ... } } block, just add these lines inside it:
contextTokens: 60000,
compaction: { mode: "safeguard" },
  1. If you don’t already have an agents section, you can paste this whole block at the top level:
#
{
  agents: {
    defaults: {
      contextTokens: 60000,
      compaction: { mode: "safeguard" },
    },
  },
}
  1. Restart the gateway:
openclaw gateway restart

After that

#

Do a one-time /reset in WebChat so you’re not stuck paying for the already-huge old session.

If you paste your current openclaw.json (with secrets removed), I can tell you exactly where to insert it so it’s guaranteed valid.

woven nimbus
#

How do I verify that that's working?

"agents": {
"defaults": {
"model": {
"primary": "anthropic/claude-haiku-4-5",
"fallbacks": [
"openai/gpt-5.1"
]
contextTokens: 60000,
compaction: { mode: "safeguard" },

spark cliffBOT
#

To verify it’s working you need (a) a valid openclaw.json (yours as pasted is currently invalid / mis-nested) and (b) check per-response usage after a /reset.

1) Fix the config snippet (yours is missing braces/commas + contextTokens is in the wrong place)

contextTokens and compaction must be siblings of model, not inside it.

This is a valid JSON5 shape:

// ~/.openclaw/openclaw.json
{
  agents: {
    defaults: {
      model: {
        primary: "anthropic/claude-haiku-4-5",
        fallbacks: ["openai/gpt-5.1"],
      },
#

      contextTokens: 60000,
      compaction: { mode: "safeguard" },

      // optional safety valve:
      // maxConcurrent: 1,
      // subagents: { maxConcurrent: 2 },
    },
  },
}

After editing, run this first to catch schema/parse errors:

openclaw doctor
#

If the config is invalid, the gateway will refuse to start / reload (strict validation).

2) Apply it

Either restart, or rely on hot-reload. Restart is simplest:

openclaw gateway restart

3) Verify the token drop (this is the real proof)

Even with the cap, your current session can still be huge. So:

  1. In WebChat: /reset (standalone)
  2. Then: /usage full (standalone)

Now send 1–2 normal messages and look at the usage footer: your input tokens per response should be way lower than the insane 100k–200k range.

#

Useful extra checks:

  • /status → shows current session context usage + compactions
  • /context detail → should show much smaller “Session tokens (cached)” after the reset

If you paste your current ~/.openclaw/openclaw.json (redact secrets), I can point out exactly where to insert those lines so it validates cleanly.

woven nimbus
#

Session store (main): /home/mainframe/.openclaw/agents/main/sessions/sessions.json (2 entries)

  • agent:main:main (7m ago)
  • agent:main:subagent:e0c143e1-dd49-4a31-a904-ab476fcd53fd (1637m ago)
    Failed to read config at /home/mainframe/.openclaw/openclaw.json SyntaxError: JSON5: invalid character '"' at 39:7
    at syntaxError (/home/mainframe/.npm-global/lib/node_modules/openclaw/node_modules/json5/lib/parse.js:1110:17)
    at invalidChar (/home/mainframe/.npm-global/lib/node_modules/openclaw/node_modules/json5/lib/parse.js:1055:12)
    at Object.end (/home/mainframe/.npm-global/lib/node_modules/openclaw/node_modules/json5/lib/parse.js:704:15)
    at Object.default (/home/mainframe/.npm-global/lib/node_modules/openclaw/node_modules/json5/lib/parse.js:168:37)
    at lex (/home/mainframe/.npm-global/lib/node_modules/openclaw/node_modules/json5/lib/parse.js💯42)
    at Object.parse (/home/mainframe/.npm-global/lib/node_modules/openclaw/node_modules/json5/lib/parse.js:25:17)
    at Object.loadConfig (file:///home/mainframe/.npm-global/lib/node_modules/openclaw/dist/auth-profiles-6WJHPoy1.js:12652:111)
    at loadConfig (file:///home/mainframe/.npm-global/lib/node_modules/openclaw/dist/auth-profiles-6WJHPoy1.js:13108:20)
    at resolveGatewayCallContext (file:///home/mainframe/.npm-global/lib/node_modules/openclaw/dist/call-D_7wmgxh.js:351:32)
    at callGatewayWithScopes (file:///home/mainframe/.npm-global/lib/node_modules/openclaw/dist/call-D_7wmgxh.js:457:18) {
    lineNumber: 39,
    columnNumber: 7
    }
    Run "openclaw doctor --fix" to apply changes.
    Invalid config:
  • <root>: JSON5 parse failed: SyntaxError: JSON5: invalid character '"' at 39:7
spark cliffBOT
#

That error means OpenClaw is not reading your config at all right now, so none of the changes (contextTokens/compaction/etc.) are being applied yet.

JSON5: invalid character '\"' at 39:7 almost always happens when there’s a missing comma or missing closing brace right before a line that starts with a quoted key (like "contextTokens" / "compaction" / "agents").

Fix it quickly

1) Make a backup first

cp ~/.openclaw/openclaw.json ~/.openclaw/openclaw.json.bak

2) Print the area around the failing line (with line numbers)

nl -ba ~/.openclaw/openclaw.json | sed -n '30,50p'

Paste that output here (you can redact any secrets), and I’ll tell you the exact character to fix.

#

3) Most likely fix (based on the snippet you pasted earlier)

You probably have something like:

fallbacks: ["openai/gpt-5.1"]
contextTokens: 60000

…without a comma after the fallbacks line, or you didn’t close the model object before adding contextTokens.

A known-good shape is:

{
  agents: {
    defaults: {
#
      model: {
        primary: "anthropic/claude-haiku-4-5",
        fallbacks: ["openai/gpt-5.1"],
      },

      contextTokens: 60000,
      compaction: { mode: "safeguard" },
    },
  },
}

4) Re-run doctor to confirm it parses

openclaw doctor
#

Once doctor no longer shows a JSON5 parse error, restart the gateway:

openclaw gateway restart

How to verify the limit is working (after config parses)

  1. In WebChat: /reset (so you’re not testing against the already-bloated session)
  2. Then: /usage full
  3. Send 1–2 messages and confirm the input tokens per response are no longer insane.

Run the nl -ba … | sed -n '30,50p' command and paste it—line 39 will make the mistake obvious.

woven nimbus
#

{
"meta": {
"lastTouchedVersion": "2026.2.26",
"lastTouchedAt": "2026-02-27T19:43:17.877Z"
},
"wizard": {
"lastRunAt": "2026-02-27T19:43:17.870Z",
"lastRunVersion": "2026.2.26",
"lastRunCommand": "configure",
"lastRunMode": "local"
},
"auth": {
"profiles": {
"anthropic:default": {
"provider": "anthropic",
"mode": "api_key"
},
"openai:default": {
"provider": "openai",
"mode": "api_key"
}
}
},
agents: {
defaults: {
model: {
primary: "anthropic/claude-haiku-4-5",
fallbacks: ["openai/gpt-5.1"],
},
contextTokens: 60000,
compaction: { mode: "safeguard" },

  // optional safety valve:
  // maxConcurrent: 1,
  // subagents: { maxConcurrent: 2 },
},

},
}
"models": {
"anthropic/claude-sonnet-4-6": {},
"anthropic/claude-haiku-4-5": {},
"openai/gpt-4o-mini": {},
"openai/gpt-5.1": {}
},
"compaction": {
"mode": "safeguard"
},
"maxConcurrent": 4,
"subagents": {
"maxConcurrent": 8
}
}
},
"messages": {
"ackReactionScope": "group-mentions"
},
"commands": {
"native": "auto",
"nativeSkills": "auto",
"restart": true,
"ownerDisplay": "raw"
},
"gateway": {
"mode": "local",
"auth": {
"mode": "token",
"token": "e15a4cae0f0bfd42407a0a1c9d7427d322082f5cbc09ca2f"
}
}
}

Paste it for me, corrected please

spark cliffBOT
#

Your file is broken because it has two different top-level objects glued together, plus contextTokens/compaction were accidentally pasted outside the model { … } block without the right braces/commas.

Also: you pasted your gateway auth token into Discord. Assume it’s compromised and rotate it after you get the config parsing again.

Here’s a clean, valid JSON5 version you can copy/paste as your entire ~/.openclaw/openclaw.json (replace the token string):

// ~/.openclaw/openclaw.json
{
  meta: {
    lastTouchedVersion: "2026.2.26",
    lastTouchedAt: "2026-02-27T19:43:17.877Z",
  },

  wizard: {
    lastRunAt: "2026-02-27T19:43:17.870Z",
#
    lastRunVersion: "2026.2.26",
    lastRunCommand: "configure",
    lastRunMode: "local",
  },

  auth: {
    profiles: {
      "anthropic:default": { provider: "anthropic", mode: "api_key" },
      "openai:default": { provider: "openai", mode: "api_key" },
    },
  },

  agents: {
    defaults: {
      // Primary model + fallback (optional)
#
      model: {
        primary: "anthropic/claude-haiku-4-5",
        fallbacks: ["openai/gpt-5.1"],
      },

      // This is the “stop sending 180k tokens” cap:
      contextTokens: 60000,

      // Keep compaction on so the agent can stay under the cap:
      compaction: { mode: "safeguard" },

      // Safety valve (optional but recommended):
      maxConcurrent: 1,
      subagents: { maxConcurrent: 2 },
#
      // Model allowlist/catalog (keep if you want /model switching):
      models: {
        "anthropic/claude-sonnet-4-6": {},
        "anthropic/claude-haiku-4-5": {},
        "openai/gpt-4o-mini": {},
        "openai/gpt-5.1": {},
      },
    },
  },

  messages: {
    ackReactionScope: "group-mentions",
  },

  commands: {
#
    native: "auto",
    nativeSkills: "auto",
    restart: true,
    ownerDisplay: "raw",
  },

  gateway: {
    mode: "local",
    auth: {
      mode: "token",
      token: "REPLACE_THIS_TOKEN_NOW",
    },
  },
}
#

Verify it’s working

  1. Run:
openclaw doctor

You should not see JSON5 parse errors anymore.

  1. Restart:
openclaw gateway restart
  1. In WebChat:
  • /reset (fresh sessionId)
  • /usage full
    Send 1–2 messages and confirm input tokens are now sane.
#

Rotate the leaked gateway token (do this ASAP)

Since it was posted in Discord, rotate it in openclaw.json (set a new random token), then restart the gateway again.

If you paste the output of openclaw doctor after replacing the file, I’ll sanity-check that the schema is fully accepted.

woven nimbus
#

Why am I getting rate limit exceeded when trying to install Notion skill?

spark cliffBOT
#

“rate limit exceeded” during skill install is almost never Notion itself — it’s usually whichever service the installer is fetching from (most commonly GitHub via Homebrew, or ClawHub if you’re installing from the registry).

I need one detail to tell you which fix applies:

What exact command/UI did you use to “install the Notion skill”, and what’s the exact error line? (Ideally paste the full error block.)

Meanwhile, here are the common cases + fixes:

Case A: Homebrew hit GitHub API rate limits (most common)

If you see wording like “GitHub API rate limit exceeded”, that’s Homebrew being throttled by GitHub.

Fix options:

  1. Give brew a GitHub token (standard Homebrew fix): set HOMEBREW_GITHUB_API_TOKEN to a GitHub PAT.
  2. Or tell OpenClaw to prefer non-brew installers for skills:
#
// ~/.openclaw/openclaw.json
{
  skills: {
    install: {
      preferBrew: false,
      nodeManager: "npm", // or pnpm
    },
  },
}

Then restart the gateway and retry the install.

Docs (skills installer knobs): https://docs.openclaw.ai/tools/skills-config

Case B: ClawHub install is rate-limited (anonymous)

#

If you installed via clawhub install <slug>, try authenticating first:

clawhub whoami
clawhub login
clawhub install <slug>

ClawHub docs: https://docs.openclaw.ai/tools/clawhub

Case C: It’s actually an LLM provider rate limit (Anthropic/OpenAI)

If the error is HTTP 429 / rate_limit, that’s your model provider quota window being exhausted (especially likely after the big token burn earlier). Then the fix is simply: wait for reset / raise limits / switch model/provider.


If you paste the exact “rate limit exceeded” error text (the 1–3 lines that mention the host/service), I’ll tell you exactly which of the above applies and the minimal fix.

woven nimbus
#

npx clawhub@latest install notion
✖ Rate limit exceeded
Error: Rate limit exceeded

and also clawhub install notion

spark cliffBOT
#

That error is coming from ClawHub itself, not OpenClaw and not Notion.

You’re hitting the ClawHub registry rate limit (usually per-IP for anonymous / unauthenticated requests, and it can also trigger if you retry a bunch in a short window).

Fix (most reliable): log in to ClawHub, then install

Don’t use npx clawhub@latest ... for this—install the CLI once and authenticate.

pnpm add -g clawhub
clawhub whoami
clawhub login
clawhub install notion

If whoami still shows you’re not logged in after login, run clawhub login again and complete the browser flow.

Docs: https://docs.openclaw.ai/tools/clawhub

#

If you’re still rate-limited after login

  1. Wait 10–30 minutes (it’s usually a sliding window).
  2. Avoid VPN/proxy hopping (shared exit IPs get rate-limited more often).
  3. Try once (don’t hammer it with retries).

Sanity checks

Paste the outputs of these (they’re short):

clawhub -V
clawhub whoami

Fallback if you need it immediately

Use the web UI (clawhub.ai) to find the skill and install it manually into your workspace’s skills/ folder (workspace skills override bundled/managed). If you tell me what your OpenClaw workspace path is (openclaw config get agents.defaults.workspace), I can tell you exactly where that folder should go.

If the rate limit persists even when logged in and waiting, that’s likely a ClawHub service-side throttle → at that point it’s GitHub-issue-worthy (include your clawhub -V, time, and approximate number of attempts).

woven nimbus
#

run error: 400 Item 'rs_00141779075cf51a0069a3ab21ec088197aad9de849e26730f' of type 'reasoning' was provided without
its required following item.

What is this? How do I fix this when trying to text?

spark cliffBOT
#

That 400 … item … of type 'reasoning' was provided without its required following item is an OpenAI Responses API validation error.

What it usually means in practice: your current session transcript contains an incomplete assistant turn (it wrote a reasoning block, then the run got interrupted/aborted before it produced the required next “output_text/tool call” item). When OpenClaw replays that session history on the next message, OpenAI rejects the whole request.

Fix (fastest)

In WebChat / Control UI send these as standalone messages:

  1. /stop
  2. /reset (or /new)

Then try a normal message again.

(Commands ref: /reset + directives like /reasoning on|off|stream are in the slash commands doc: https://docs.openclaw.ai/tools/slash-commands)

If you still want to be extra safe

Turn off reasoning visibility for this session (this is mostly about display, but it can help avoid “reasoning-only” artifacts):