#Is there any future plan to address the crazy token burn?


steady wharf

I like OpenClaw but I can’t justify the price for tokens. Are there any plans to reduce the insane token burn or is OpenClaw working as intended?

neat sierra BOT

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

magic peak BOT

A lot of the “token burn” people see with OpenClaw is expected, because it’s not a single chat completion per message — it’s an agent loop that may do multiple LLM calls (plan → tool call(s) → follow-ups), and it can carry a big context (system prompt + skills + session history + tool results).
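To make the agent-loop effect concrete, here is a rough sketch of how input tokens add up when the whole context is re-sent on every model call. The function name and all numbers are illustrative assumptions, not OpenClaw internals, and the estimate ignores prompt caching and output tokens, both of which change the real bill.

```python
def estimate_loop_input_tokens(base_context: int, turns: int, added_per_turn: int) -> int:
    """Sum the input tokens sent across an agent loop (hypothetical model)."""
    total = 0
    context = base_context
    for _ in range(turns):
        total += context            # the whole context is sent on every model call
        context += added_per_turn   # tool results and replies grow the next call
    return total


# Assumed numbers: a 10k-token context, 2k tokens of tool output added per turn.
single_call = estimate_loop_input_tokens(10_000, turns=1, added_per_turn=0)
agent_loop = estimate_loop_input_tokens(10_000, turns=4, added_per_turn=2_000)
print(single_call, agent_loop)  # 10000 52000
```

Under these assumptions a four-turn loop sends roughly five times the input tokens of a single completion, which is why tool-heavy requests look expensive even when the prompt itself is short.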

That said: if you’re seeing huge usage for simple prompts, that’s often tunable (or occasionally a loop/bug).

What’s “working as intended” vs “fixable”

Working as intended:

  • Tool-heavy tasks (browser/file/workflows) = multiple model turns = more tokens.
  • Long-running sessions = more context sent each request (until compaction/pruning kicks in).

Usually fixable:

  • Expensive default model for everyday chat
  • Sub-agents using an expensive model
  • Background automation (heartbeat/cron) quietly spending tokens
  • Very long sessions with lots of tool output (needs /compact or session pruning)
  • Verbose/reasoning left on (can increase output a lot)

Biggest cost levers (practical)

  1. Use a cheaper default model for day-to-day, and only switch up when you need it (via /model), or split into two agents (fast/cheap vs “heavy”).
    Docs: https://docs.openclaw.ai/help/faq#how-can-i-use-different-models-for-different-tasks

  2. Set a cheaper model for sub-agents (sub-agents can multiply spend fast). The FAQ explicitly calls out: agents.defaults.subagents.model.
    Docs: https://docs.openclaw.ai/help/faq#the-bot-freezes-while-doing-heavy-work-how-do-i-offload-that

  3. Keep sessions lean: use /compact occasionally, and /new when switching topics.
    Docs: https://docs.openclaw.ai/concepts/compaction

  4. Trim tool-output bloat (especially if you do lots of exec/web/file reads): enable/tune session pruning.
    Docs: https://docs.openclaw.ai/concepts/session-pruning

  5. Check background spend: if heartbeat/cron is enabled, that’s extra LLM calls even when you’re not chatting.
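For lever 2, the dotted key agents.defaults.subagents.model presumably maps to nested config sections. The sketch below only shows how that path would expand; the helper name, the model id "provider/cheaper-model", and the assumption that the config is a nested JSON-like object are all hypothetical, so check the linked docs for the real file location and format.

```python
import json


def set_dotted(config: dict, dotted_key: str, value) -> dict:
    """Set a value at a dotted path, creating nested dicts along the way."""
    node = config
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        node = node.setdefault(key, {})  # descend, creating missing sections
    node[leaf] = value
    return config


# "provider/cheaper-model" is a placeholder, not a real model id.
config = set_dotted({}, "agents.defaults.subagents.model", "provider/cheaper-model")
print(json.dumps(config, indent=2))
```

This prints the nested structure with the placeholder model under agents, defaults, subagents, which you would adapt to whatever your actual config file expects.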

“Are there plans to reduce it?”

Yes, in the sense that the project keeps adding things like compaction, pruning, and caching-aware behavior to avoid re-sending giant contexts unnecessarily. But OpenClaw can’t magically make an expensive model cheap. The main control is still which model you run by default and how many turns your agent takes per request.