Token Usage | Friends of the Crustacean 🦞🤝 | Page 1

balmy quartz Apr 21, 2026, 2:31 PM

Is there a way to limit token usage in OpenClaw? For example, can I set a hard limit of 1k tokens for simple tasks?

winged timberBOT Apr 21, 2026, 2:34 PM

Not as a single hard total-per-task token budget, no.

Closest knobs today are:

params.maxTokens: caps the model’s output
contextTokens: caps the effective context/history budget
thinkingDefault: "off": turns reasoning off, which helps for simple tasks because reasoning tokens count toward usage too

Example (JSON5):

{
  agents: {
    defaults: {
      contextTokens: 8000,
      thinkingDefault: "off",

      models: {
        "openai/gpt-5.4-mini": {
          params: { maxTokens: 1000 },
        },
      },
    },
  },
}

Important caveat: 1k total tokens for the whole run is usually too tight, because OpenClaw’s system prompt, tools, history, and attachments all count before the answer is generated.
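
To make that caveat concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is a made-up placeholder, not a measured OpenClaw value, and `remaining_for_output` is a hypothetical helper; use /context detail to see real figures for your setup.

```python
# Hypothetical per-run prompt overhead, in tokens. These numbers are
# illustrative placeholders, not measurements from OpenClaw.
PROMPT_OVERHEAD = {
    "system_prompt": 2500,
    "tool_definitions": 1500,
    "history": 3000,
    "attachments": 0,
}


def remaining_for_output(total_budget: int, overhead: dict[str, int]) -> int:
    """Tokens left for the answer after the prompt overhead is paid."""
    return total_budget - sum(overhead.values())


left = remaining_for_output(1000, PROMPT_OVERHEAD)
print(left)  # negative: the prompt alone already exceeds a 1k total budget
```

With overhead anything like this, a 1k total budget is spent before the model writes a single output token, which is why capping output and context separately is the more realistic approach.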

For monitoring:

/usage tokens
/status

If you want, I can help sketch a “cheap/simple tasks” agent profile for your exact model/provider.

If you want a true hard per-task token cap, that’s probably a feature request right now rather than an existing single knob.

Best path is a GitHub issue: https://github.com/openclaw/openclaw/issues

Current workaround is still:

params.maxTokens for response length
contextTokens for effective context budget
thinkingDefault: "off" for simple/cheap tasks
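
If you drive the model from your own script, one way to approximate a per-task cap is to keep a running total yourself. The sketch below is not an OpenClaw feature: `TokenBudget` and `TokenBudgetExceeded` are hypothetical names, and it assumes whatever you call reports input and output token counts for each call.

```python
class TokenBudgetExceeded(RuntimeError):
    """Raised when a task has used more tokens than its budget allows."""


class TokenBudget:
    """Client-side running total of tokens for one task (hypothetical helper)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def record(self, input_tokens: int, output_tokens: int) -> int:
        """Add one call's usage; raise once the task is over budget.

        Returns the number of tokens still available.
        """
        self.used += input_tokens + output_tokens
        if self.used > self.limit:
            raise TokenBudgetExceeded(f"used {self.used} of {self.limit} tokens")
        return self.limit - self.used


budget = TokenBudget(limit=10_000)
remaining = budget.record(input_tokens=6_000, output_tokens=500)
print(remaining)  # tokens left before the next call would trip the budget
```

Note that this only detects an overrun after a call has already been paid for. It stops the next call from happening; it cannot stop a single call from going over.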

balmy quartz Apr 21, 2026, 2:40 PM

Okay, what about best practices for optimizing your token usage?

winged timberBOT Apr 21, 2026, 2:41 PM

Yep. My short list for OpenClaw token hygiene:

Measure first
- /usage tokens
- /status
- /context detail if you want to see what’s actually eating prompt space
Use a separate “cheap/simple” agent
- smaller model
- thinkingDefault: "off" or low
- lower params.maxTokens
Keep sessions short
- use /compact on long threads
- don’t let one session accumulate tons of unrelated work
Trim prompt bloat
- keep AGENTS.md / bootstrap files lean
- lower bootstrapMaxChars / bootstrapTotalMaxChars if needed
- reduce skills.limits.maxSkillsPromptChars
- reduce contextLimits.toolResultMaxChars

Be careful with screenshots/images
- lower agents.defaults.imageMaxDimensionPx for screenshot-heavy workflows
Use prompt caching when your provider supports it
- cacheRetention: "short" or "long"
- for long-running threads, a heartbeat interval just under the cache TTL keeps the cache warm and can save a lot
Use the big model only when it matters
- cheap model for routing, formatting, simple Q&A
- bigger model for coding/debugging/deep reasoning
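
The last item in the list, splitting work between a cheap and a big model, can be sketched as a tiny router. This is only an illustration in Python: the model names are placeholders, the task labels are taken from the list above, and `pick_model` is a hypothetical helper rather than anything OpenClaw provides.

```python
# Task kinds come from the list above; model names are placeholders.
CHEAP_TASKS = {"routing", "formatting", "simple_qa"}
CHEAP_MODEL = "example/small-model"  # hypothetical
BIG_MODEL = "example/big-model"      # hypothetical


def pick_model(task_kind: str) -> str:
    """Send simple work to the cheap model; everything else gets the big one."""
    return CHEAP_MODEL if task_kind in CHEAP_TASKS else BIG_MODEL


for kind in ("formatting", "debugging"):
    print(kind, "->", pick_model(kind))
```

Defaulting unknown task kinds to the big model is a deliberate choice in this sketch: a misrouted hard task costs more in retries than a misrouted easy one costs in tokens.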

Big caveat: system prompt + history + tool results all count, so optimizing output tokens alone usually doesn’t move the needle enough.
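
A quick way to see this is to compare halving the output with halving the prompt on a single turn. The token counts below are invented for illustration, and `saving` is a hypothetical helper; plug in your own numbers from /usage tokens.

```python
def saving(before: int, after: int) -> float:
    """Fraction of total tokens saved by a change."""
    return (before - after) / before


# Hypothetical single turn: 9,000 prompt tokens, 1,000 output tokens.
baseline = 9000 + 1000
half_output = 9000 + 500   # output cut in half, prompt unchanged
half_prompt = 4500 + 1000  # prompt cut in half, output unchanged

print(f"halve output: {saving(baseline, half_output):.0%}")  # halve output: 5%
print(f"halve prompt: {saving(baseline, half_prompt):.0%}")  # halve prompt: 45%
```

This counts tokens only. Providers often price output tokens higher than input tokens, so the cost picture can be less lopsided than the raw counts suggest.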
