#Unusually high token usage

brisk parrot
#

Hi — fresh local OpenClaw install, but Control Chat token usage seems way too high.

Even tiny messages in a fresh /new session were using about:

  • first turn: ~13k–14k
  • next short turn: ~13k–14k

I traced it to startup/bootstrap context injection. I tested by temporarily removing:

That only reduced usage a bit, so the main cost seems to be built-in startup/system prompt injection, not just workspace files.

I changed config to:

  • contextInjection: "continuation-skip"
  • bootstrapMaxChars: 1200
  • bootstrapTotalMaxChars: 2500
  • bootstrapPromptTruncationWarning: "off"

I also temporarily patched the startup prelude function to return null for testing.

After that, usage improved to about:

  • first short turn: ~10.8k
  • next short turn: ~8.7k

So it improved, but still feels very heavy for tiny messages in a fresh session.

What is the supported way to make Control Chat startup / early turns lighter?

  • Is there a better setting than contextInjection: "continuation-skip"?
  • Is there a supported way to reduce or disable bootstrap file injection for normal direct chats?
  • What other default prompt/context sources commonly account for another ~8k–10k?

Environment:

  • OpenClaw 2026.4.15
  • local gateway
  • model: openai/gpt-5.4

I can share sanitized config and screenshots if needed.

next mapleBOT
#

I’m checking the actual docs/source path so I can answer with verified settings instead of guessing.

#

Yep, that lines up with how OpenClaw currently builds prompt context.

Short version:

  • contextInjection: "continuation-skip" is already the lightest supported direct-chat mode.
  • There is no supported contextInjection: "off" / "never" for normal Control Chat turns.
  • bootstrapMaxChars / bootstrapTotalMaxChars only affect workspace bootstrap files, not the rest of the prompt.

For bare /new and /reset, there is a separate supported knob for the startup prelude:

{
  agents: {
    defaults: {
      startupContext: {
        enabled: false,
        // or: applyOn: ["reset"]   // keep it off for /new only
      },
    },
  },
}

So for your questions:

  1. Better than continuation-skip?
    Not for supported normal chat config. It is the lightest documented option.

  2. Supported way to disable bootstrap file injection for normal direct chats?
    Not fully. Supported levers are:

    • keep injected files small
    • lower bootstrapMaxChars / bootstrapTotalMaxChars
    • use continuation-skip

  3. What usually accounts for the other ~8k to 10k?
    Very likely tool schemas. That is the big hidden chunk. Docs explicitly call out that tool JSON counts toward context, and the /context docs show a typical example where tool schemas alone are ~8k tokens. On fresh local installs, tools.profile: "coding" is commonly the default, which makes this especially noticeable.

Biggest supported ways to make it lighter:

  • Disable/narrow startupContext for /new
  • Reduce tool scope with tools.profile or tools.allow / tools.deny
  • Trim skills too, if needed (agents.defaults.skills, agents.list[].skills: [])

Best next step: run /context detail in a fresh session. That will show the actual split between:

  • injected workspace files
  • skills list
  • tool schema JSON
  • total system prompt size
brisk parrot
#

Thanks — I ran /context detail after the config changes.

Current result:

  • Injected workspace files: none shown
  • System prompt: 20,988 chars (~5,247 tok)
  • Skills list: 3,540 chars (~885 tok) (6 skills)
  • Tool schemas (JSON): 19,441 chars (~4,861 tok)
  • Actual context usage (cached): 10,379 tok
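
The token figures in this readout are consistent with a rough four-characters-per-token estimate, rounded up. That is an inference from the numbers, not something the thread confirms about how /context detail computes them. A minimal Python sketch, where estimate_tokens is a hypothetical helper name:

```python
import math

CHARS_PER_TOKEN = 4  # assumed heuristic, inferred from the readout above


def estimate_tokens(chars: int) -> int:
    """Rough token estimate: characters divided by four, rounded up."""
    return math.ceil(chars / CHARS_PER_TOKEN)


# Character counts reported by /context detail in this thread.
readout = {
    "system prompt": 20_988,
    "skills list": 3_540,
    "tool schemas (JSON)": 19_441,
}

for name, chars in readout.items():
    print(f"{name}: {chars:,} chars -> ~{estimate_tokens(chars):,} tok")
```

A real tokenizer will give different counts for the same text, so this is only useful for comparing blocks against each other.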

Token usage also improved compared to before:

  • first short turn: about 10.8k
  • next short turn: about 8.7k
  • another fresh short “Hi” turn is still about 10.2k

So this seems to confirm your point:

  • bootstrap/workspace injection is no longer the main problem
  • the biggest remaining cost is now tool schema JSON, then system prompt, then skills

I had also temporarily patched the dist startup prelude earlier for debugging, but at this point I want to continue using only supported config.

What should be the next supported step to reduce context further?

Specifically:

  1. What is the best supported way to reduce tool schema cost?
  2. Is tools.profile: "coding" likely the main reason for the large tool schema block here?
  3. Would you first reduce tools, skills, or both?
  4. What exact config change would you try next?
next mapleBOT
#

Let me verify one detail on tools.allow vs profile so I don’t tell you the wrong shape.

#

Nice, that confirms it pretty cleanly.

At this point, I would trim tools first.

  1. Best supported way to reduce tool schema cost
    Use tools.profile, then optionally tools.allow / tools.deny to keep only the tools you actually want exposed.

  2. Is tools.profile: "coding" likely the main reason?
    Yes. Very likely. Fresh local onboarding defaults to coding when unset, and that profile brings in a lot of tool groups, which directly inflates the tool schema block.

  3. Would I reduce tools, skills, or both first?
    Tools first, then skills.
    Your own numbers are basically:

    • tools: ~4.9k tok
    • skills: ~0.9k tok

    So tools are the much bigger lever.

  4. Exact config I’d try next
    If you want a balanced lightweight setup:
{
  tools: {
    profile: "minimal",
    allow: ["group:fs", "group:web"]
  },
  agents: {
    defaults: {
      skills: []
    }
  }
}

What that does:

  • starts from minimal (session_status only)
  • adds back filesystem tools (read, write, edit, apply_patch)
  • adds back web tools
  • removes the injected skills list

If you want the smallest supported test first, go even simpler:

{
  tools: { profile: "minimal" },
  agents: {
    defaults: {
      skills: []
    }
  }
}

Then run /context detail again, see the new baseline, and add back only what you miss.

So my recommendation is:

  • first change: switch to tools.profile: "minimal"
  • second change: set agents.defaults.skills: []
  • third change: add back only the tool groups you actually need

If you do not want this globally, do the same under the specific agent’s agents.list[].tools instead of at top-level.

brisk parrot
#

Thanks — I applied the minimal test you suggested.

Current config (sanitized relevant parts):

{
  "agents": {
    "defaults": {
      "contextInjection": "continuation-skip",
      "bootstrapMaxChars": 1200,
      "bootstrapTotalMaxChars": 2500,
      "bootstrapPromptTruncationWarning": "off",
      "startupContext": { "enabled": false },
      "skills": []
    }
  },
  "tools": { "profile": "minimal" }
}

Current result:

  • /context detail
    • System prompt: 14,534 chars (~3,634 tok)
    • Skills list: 0
    • Tool schemas: 89 chars (~23 tok)
    • Tools: session_status only
    • Actual context usage: ~3,707 tok

Observed turn costs in a fresh session:

  • /new first turn: about 4.1k
  • short hi: about 1.5k
  • short ok: about 3.6k

So the heavy token issue is mostly solved, but this is now extremely stripped down and a lot of normal utility is gone.

My question now is more about expected baseline and practical next steps:

  1. For a local OpenClaw Control Chat with tools/skills trimmed this hard, does ~3.7k cached context / ~1.5k–4k short turns look broadly normal to you?
  2. Would you treat this as a healthy baseline and then add back only the needed groups?
  3. If the goal is a still-light but actually useful Control Chat, what would be your first add-back:
    • group:fs
    • group:web
    • both
    • something else?
  4. If you were optimizing for “smallest useful everyday Control Chat”, what exact tools config would you try next?

I’m trying to find the best point between low token usage and still having enough everyday functionality.

brisk parrot
#

Hello?

#

@next maple hello could you check the question above please thank you 😊

next mapleBOT
#

Yep, that looks broadly normal to me now.

Your current numbers say you’ve found the real floor for “OpenClaw with almost no tools”:

  • base system prompt still exists
  • runtime metadata / instructions still exist
  • tiny tool schema cost remains
  • some turn-to-turn variance is normal

So, answer by answer:

  1. Does ~3.7k cached context / ~1.5k to 4k short turns look healthy?
    Yes, broadly yes. For a stripped-down local Control Chat, that looks like a sane baseline.

  2. Would I treat this as the baseline and add back only what’s needed?
    Yes. That is exactly how I’d tune it from here.

  3. First add-back?
    For “smallest useful everyday Control Chat”, I would actually do something else:

    • read
    • group:web

    Why:

    • read is extremely useful for local docs/config/code inspection
    • group:web gives you search/fetch without dragging in runtime tools
    • group:fs is useful, but it also adds write / edit / apply_patch, which is more than I’d add first if you’re optimizing for light everyday use

  4. Exact config I’d try next
    I’d try this first:

{
  agents: {
    defaults: {
      contextInjection: "continuation-skip",
      bootstrapMaxChars: 1200,
      bootstrapTotalMaxChars: 2500,
      bootstrapPromptTruncationWarning: "off",
      startupContext: { enabled: false },
      skills: []
    }
  },
  tools: {
    profile: "minimal",
    allow: ["read", "group:web"]
  }
}

That gives you:

  • session_status
  • read
  • web_search, x_search, web_fetch

That is a pretty nice “cheap but useful” daily setup.

If you later decide you want editing too, my next step would be:

{
  tools: {
    profile: "minimal",
    allow: ["group:fs", "group:web"]
  }
}

I would not add group:runtime first if token efficiency is the goal. exec / process are usually where things start getting heavier again.

If I were tuning this for myself, the add-back order would be:

  1. read
  2. group:web
  3. group:fs only if you actually want file edits
  4. skills later, and only specific ones if you miss them

One extra practical tip: if you want both a cheap default and a powerful mode, I’d seriously consider two agents:

  • a tiny everyday agent
  • a heavier coding agent

That gives you the best tradeoff without constantly re-tuning one profile.

brisk parrot
#

Thanks — this approach makes sense to me, and your baseline read sounds right.

One important clarification though: my goal is not to minimize token usage as far as possible. The real issue was that earlier the agent felt functionally broken for my use case — it seemed to do unnecessary work, think too long, and spend too much context on very simple replies. So I’m not optimizing for the absolute smallest prompt. I’m trying to get to a setup that is fast, practical, and reliable for real everyday use.

Big-picture goal for this system:

  1. Personal productivity / work support
    • analysis-heavy work
    • spreadsheets / Excel-style reasoning
    • structured decision support
    • research / synthesis / drafting
  2. Real-estate / rental portfolio operations
    • generate lease drafts from short inputs
    • turn apartment listing links into structured property analyses in a fixed format
    • populate spreadsheets / save outputs to the right place
    • draft tenant replies
    • later maybe automate very simple low-risk replies, but only after trust is much higher
  3. Longer-term possibility
    If this works well for my own use, I could imagine building toward a small business workflow around rental management / brokerage support, where the agent handles a lot of the analysis / drafting / admin layer and humans handle showings, practical tasks, and supervision.

So from that perspective, what would you consider the most sensible architecture?

Specifically:

  1. Would you still start with one light Control Chat agent and one heavier work agent?
  2. At what point would you stop tuning one agent and split into two?
  3. For my kind of use case, what capabilities would you consider “core” from the beginning?
  4. Would you design this around tools first, skills first, or separate workflow-specific agents first?
  5. If you were building this from scratch for my goals, what would your first practical target architecture be?
next mapleBOT
#

I’m going to sanity-check this against the multi-agent/tool-policy docs so I can give you a setup that fits the product goals, not just the token budget.

#

Yes, for your goals I’d stop thinking in terms of “one perfectly tuned agent” pretty early.

My recommendation:

Short answer

  1. Yes, I would start with one light Control Chat agent and one heavier work agent.
  2. I’d split into more than two as soon as one agent has conflicting jobs, for example:
    • fast conversational triage
    • deep research/drafting
    • operational real-estate workflows
    • tenant-facing drafting / automation
  3. Core from the beginning:
    • local file reading / writing in a structured workspace
    • web search + fetch
    • fixed templates
    • fixed output formats / folders
  4. I’d design around workflow-specific agents first, then tools per agent, then skills last.
  5. First practical target architecture: 2 agents now, 3 later.

Why

Your problem is not just token cost. It’s that one agent was being asked to be:

  • a lightweight chat assistant
  • a research analyst
  • a document drafter
  • a spreadsheet helper
  • an operations assistant
  • eventually maybe a semi-automation layer

That usually turns into a kitchen-sink prompt, slower behavior, and worse reliability.

The biggest reliability gains usually come from:

  • clear agent boundaries
  • narrow workspaces
  • fixed templates / schemas
  • limited tools per role

More than from adding lots of generic tools or skills.

What I’d build first

Agent 1: control (default)

Purpose: fast everyday chat, triage, quick lookups, lightweight planning.

Keep it light:

  • profile: "minimal"
  • add read
  • add group:web
  • no broad write/exec yet
  • no skills at first

This is the agent you talk to most of the time.

Agent 2: workbench

Purpose: deeper analysis, drafting, research synthesis, spreadsheet-style reasoning, producing artifacts.

Give it:

  • group:fs
  • group:web
  • still no group:runtime at first
  • skills only if a workflow really needs them

This is where lease drafts, property analyses, structured decision support, and saved outputs live.
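
To make the difference between these two agents concrete, here is a small Python sketch that expands each agent's add-backs into tool names. It is a hypothetical data model, not OpenClaw code: the group contents are the ones listed in this thread, and both agents are assumed to start from the minimal profile (session_status only).

```python
# Hypothetical model; group contents are as described in this thread.
GROUPS = {
    "group:web": {"web_search", "x_search", "web_fetch"},
    "group:fs": {"read", "write", "edit", "apply_patch"},
}
BASE = {"session_status"}  # what the "minimal" profile exposes, per the thread

# Add-backs proposed for each agent.
AGENTS = {
    "control": ["read", "group:web"],
    "workbench": ["group:fs", "group:web"],
}


def tool_set(extras):
    """Expand group names and add the result on top of the base profile."""
    tools = set(BASE)
    for entry in extras:
        tools |= GROUPS.get(entry, {entry})
    return tools


control = tool_set(AGENTS["control"])
workbench = tool_set(AGENTS["workbench"])

print("workbench only:", sorted(workbench - control))
print("control only:", sorted(control - workbench))
```

Under these assumptions the control agent's tools are a strict subset of the workbench agent's, and the only difference is the three file-modifying tools.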

When I would split again

I’d create a third agent as soon as your real-estate workflow stops looking like generic drafting and starts looking like a repeatable system.

Agent 3 later: portfolio-ops

Purpose: fixed-format property analysis, lease generation, tenant draft replies, spreadsheet / document population.

This agent should have:

  • a dedicated workspace
  • fixed templates
  • fixed output paths
  • explicit schemas

Example outputs:

  • property-analysis.json
  • property-analysis.md
  • lease-draft-v1.md
  • tenant-reply-draft.md

That separation matters a lot. It keeps “general thinking” separate from “business process output”.

My rule for splitting

Stop tuning one agent and split when any of these happen:

  • you want different speed / verbosity / depth from different tasks
  • you want different tool sets
  • you want different safety boundaries
  • you want different workspace context / templates
  • you start saying “for this workflow it should always output X format into Y folder”

That is the moment a separate workflow agent pays off.

Core capabilities I’d consider essential

For your use case, day one:

  • read
  • group:web
  • group:fs on the heavier work agent
  • dedicated template files
  • dedicated output directories

Not core on day one:

  • exec
  • broad automation
  • tenant auto-replies
  • lots of skills
  • one huge global prompt full of business rules

Tools first, skills first, or agents first?

Agents first. Tools second. Skills third.

Why:

  • Agents define responsibility.
  • Tools define capability.
  • Skills are a refinement layer.

If you do tools first without clear role boundaries, you get a capable but messy system.
If you do skills first, you often just hide prompt complexity in another place.

First practical target architecture

Phase 1, now

  • control = minimal + read + group:web
  • workbench = minimal + group:fs + group:web

Phase 2, once workflows stabilize

  • add portfolio-ops with its own workspace and templates

Phase 3, only after trust is high

  • add a separate constrained drafting / automation agent for low-risk outbound workflows

I would not put outbound tenant automation in the same agent that does open-ended research and drafting.

One important design choice

At first, I would keep this manual, not clever:

  • manually choose the work agent when needed
  • don’t rush into delegation/orchestration
  • don’t rush into auto-send

Manual routing first, automation second.

That gives you speed and trust much faster.

brisk parrot
#

I tested this more precisely.

Baseline works with:

"tools": { "profile": "minimal" }

In that state /context detail shows session_status normally.

But as soon as I add any allow list, tools disappear completely. I tested:

"tools": { "profile": "minimal", "allow": ["read"] }

and also:

"tools": { "profile": "minimal", "allow": ["read", "group:web"] }

In both cases /context detail becomes:

  • Tools: (none)
  • Tool schemas: 0

So it looks like tools.allow is being interpreted in a way that removes the whole toolset in my current version (2026.4.15).

Do you know the correct supported syntax / values here for this version? Or is this a version-specific issue that may be fixed in 2026.4.22?

next mapleBOT
#

I’m checking the tool-policy docs and source for this exact profile + allow behavior before I guess.

#

Yep, this is the key detail: for your version, tools.allow is restrictive, not additive.

So with:

{
  tools: {
    profile: "minimal",
    allow: ["read"]
  }
}

the pipeline is effectively:

  1. profile minimal → allow session_status
  2. global allow: ["read"] → intersect again
  3. result → none

That matches exactly what you observed.

Correct supported config for 2026.4.15

Use tools.alsoAllow to add tools on top of a profile:

{
  tools: {
    profile: "minimal",
    alsoAllow: ["read", "group:web"]
  }
}
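
The difference between the restrictive allow and the additive alsoAllow can be modelled with plain set operations. This Python sketch is a hypothetical model of the steps described in this thread, not OpenClaw's actual implementation: the resolve function, the order of the two steps, and the group table are assumptions for illustration.

```python
# Hypothetical model of the tool-policy steps described in this thread.
# Group contents are taken from the thread; real internals may differ.
GROUPS = {
    "group:web": {"web_search", "x_search", "web_fetch"},
    "group:fs": {"read", "write", "edit", "apply_patch"},
}
PROFILES = {"minimal": {"session_status"}}


def expand(entries):
    """Expand group names into individual tool names."""
    tools = set()
    for entry in entries:
        tools |= GROUPS.get(entry, {entry})
    return tools


def resolve(profile, allow=None, also_allow=None):
    """Start from the profile, intersect with allow, then add also_allow."""
    tools = set(PROFILES[profile])
    if allow is not None:
        tools &= expand(allow)       # restrictive: can only remove tools
    if also_allow is not None:
        tools |= expand(also_allow)  # additive: can only add tools
    return tools


print(sorted(resolve("minimal", allow=["read"])))
print(sorted(resolve("minimal", also_allow=["read", "group:web"])))
```

The first call prints an empty list, which corresponds to the Tools: (none) result reported above. The second keeps session_status and adds the requested tools. Which web tools actually appear depends on the install.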

If you want to keep using allow, you would need to include everything you still want:

{
  tools: {
    profile: "minimal",
    allow: ["session_status", "read", "group:web"]
  }
}

But for your use case, alsoAllow is the better fit.

Is this fixed in 2026.4.22?

From the source, I do not see a tool-policy behavior change for this between 2026.4.15 and 2026.4.22. So I would not expect 4.22 to change this specific allow behavior.

So I’d treat this as:

  • not a syntax error on your side
  • not read / group:web being invalid
  • mostly a docs wording / expectation trap
  • the intended additive key is alsoAllow

What I’d try next

{
  agents: {
    defaults: {
      contextInjection: "continuation-skip",
      bootstrapMaxChars: 1200,
      bootstrapTotalMaxChars: 2500,
      bootstrapPromptTruncationWarning: "off",
      startupContext: { enabled: false },
      skills: []
    }
  },
  tools: {
    profile: "minimal",
    alsoAllow: ["read", "group:web"]
  }
}

If that works, /context detail should show:

  • session_status
  • read
  • web_search, x_search, web_fetch

Docs are a bit muddy here, but the config types/schema in source do explicitly describe alsoAllow as the additive knob.

If this fixes it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow indexes it for the next person.

brisk parrot
#

Thanks — alsoAllow fixed it.

This now works:

"tools": { "profile": "minimal", "alsoAllow": ["read", "group:web"] }

/context detail now shows:

  • read
  • session_status
  • web_fetch

So the control baseline is in a much better place.

My next question is about the smallest practical capability increase for file/system work.

Right now Mila can read files and fetch web content, but cannot really:

  • inspect previous snapshot folders properly
  • create a matching new snapshot
  • copy selected files into it
  • do simple “look at what exists, then create a similar one” machine tasks

What is the smallest supported add-back you would try for that, without making the default control agent too heavy again?

Also, could you list the most useful alsoAllow options/categories more broadly, especially:

  • lightweight and safe for a default everyday control agent
  • medium-weight but still reasonable
  • heavier ones that tend to expand the agent a lot

I’m trying to avoid jumping from “too stripped down” straight back to “too broad”.

next mapleBOT
#

Glad we found the alsoAllow trap.

For your snapshot / folder use case, I’d do this in two steps:

Smallest next add-back

Step 1, file creation/edit only

{
  tools: {
    profile: "minimal",
    alsoAllow: ["read", "group:web", "group:fs"],
    fs: { workspaceOnly: true }
  }
}

That gives Mila:

  • read
  • write
  • edit
  • apply_patch
  • web tools
  • session_status

Good for:

  • creating a new matching snapshot file/folder structure via writes
  • updating files
  • patching existing files

Step 2, if you need actual folder-machine tasks

Add just exec, not full group:runtime:

{
  tools: {
    profile: "minimal",
    alsoAllow: ["read", "group:web", "group:fs", "exec"],
    fs: { workspaceOnly: true }
  }
}

Why: the built-in fs tools are just read/write/edit/apply_patch. They do not include directory listing, copy, rename, or mkdir primitives, so for:

  • inspect snapshot folders
  • create matching folder trees
  • copy selected files
  • “look at what exists, then create a similar one”

you usually need exec for find, ls, mkdir, cp, etc.
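
For a sense of what such a folder task involves, here is a minimal Python sketch of the same operations, standing in for the find / mkdir / cp commands an agent would run through exec. The function name clone_snapshot, the snapshot folder names, and the file-suffix filter are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def clone_snapshot(previous: Path, new: Path, keep_suffixes=(".md", ".json")):
    """Recreate the folder tree of `previous` under `new`, copying selected files."""
    copied = []
    for src in sorted(previous.rglob("*")):
        dest = new / src.relative_to(previous)
        if src.is_dir():
            dest.mkdir(parents=True, exist_ok=True)  # like mkdir -p
        elif src.suffix in keep_suffixes:
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # like cp -p
            copied.append(dest)
    return copied


# Demo on a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "snapshot-old"
    (old / "notes").mkdir(parents=True)
    (old / "notes" / "summary.md").write_text("hello")
    (old / "raw.bin").write_bytes(b"\x00")
    copied = clone_snapshot(old, Path(tmp) / "snapshot-new")
    print([p.name for p in copied])
```

None of these steps can be expressed with read / write / edit / apply_patch alone, which is why the folder use case pushes toward exec.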

I would not add group:runtime yet, because that also brings process and code_execution.

My recommendation

For a default everyday control agent, the smallest practical bump is:

  • group:fs
  • then exec only if folder/snapshot tasks matter

If you start wanting more than that, I’d move those tasks to a heavier work agent instead of bloating control.

Useful alsoAllow buckets

Lightweight, good for default everyday control

  • read
  • group:web
  • session_status (already in minimal)
  • optionally memory_search, memory_get if you really use memory

Medium, still reasonable

  • group:fs
  • exec by itself
  • sessions_list, sessions_history if you do multi-agent work
  • group:memory

Heavier, expands capability a lot

  • group:runtime
  • group:sessions
  • group:ui
  • group:automation
  • message
  • group:nodes
  • group:openclaw

A rough rule I’d use:

  • want to read + write files → group:fs
  • want to operate on folders / copy / inspect filesystem layout → add exec
  • want long-running jobs / process control → only then add group:runtime
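
The rough rule above can be written down as a tiny helper that turns needs into an alsoAllow list. This is a Python sketch only: suggest_also_allow is a hypothetical function, and it assumes the light control baseline from this thread (read plus group:web) as the starting point.

```python
def suggest_also_allow(write_files=False, folder_ops=False, long_running=False):
    """Map the rough rule onto an alsoAllow list (hypothetical helper)."""
    extras = ["read", "group:web"]  # light control baseline from this thread
    if write_files:
        extras.append("group:fs")
    if long_running:
        # Per the thread, group:runtime already brings exec along with process.
        extras.append("group:runtime")
    elif folder_ops:
        extras.append("exec")
    return extras


print(suggest_also_allow(write_files=True, folder_ops=True))
```

With write_files and folder_ops set, the result matches the Step 2 config earlier in this reply: read, group:web, group:fs, exec.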

If this solved the allow vs alsoAllow confusion, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow picks it up.