#How to control the amount of tokens I'm consuming from OpenAI API key?

25 messages · Page 1 of 1 (latest)

rare dirge
#

How to control the amount of tokens I'm consuming from OpenAI API key?

brisk troutBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

shy fjordBOT
obtuse hingeBOT
#

Short version: monitor in OpenClaw, reduce per-request usage, and set the real hard cap in OpenAI.

Fastest knobs:

  • /usage full or /usage tokens
    Shows tokens on each reply.
  • /status
    Shows current session usage and estimated cost.
  • /context list
    Shows what is bloating the prompt.
  • /compact
    Shrinks long sessions.

Biggest ways to reduce spend:

  • lower thinking, for example /think off or /think low
  • switch to a cheaper/smaller model
  • cap reply length with the model's params.maxTokens
  • keep sessions shorter / compact more often
  • if you send lots of screenshots, lower agents.defaults.imageMaxDimensionPx

Config example:

{
  agents: {
    defaults: {
      thinkingDefault: "off",
      models: {
        "openai/<your-model>": {
          params: {
            maxTokens: 800
          }
        }
      }
    }
  }
}

A useful distinction:

  • params.maxTokens = caps reply/output size
  • contextTokens = caps how much prompt/history OpenClaw sends at runtime
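To make that distinction concrete, here is a rough Python sketch of the worst-case token count for a single request. It assumes contextTokens bounds the prompt and params.maxTokens bounds the reply, as described above. The function name and the 32,000 figure are illustrative only, not OpenClaw settings or defaults, and the sketch ignores provider-specific accounting such as how reasoning tokens are counted.

```python
def worst_case_tokens(context_tokens: int, max_tokens: int) -> int:
    """Upper bound for one request: capped prompt plus capped reply."""
    return context_tokens + max_tokens


# 32_000 is an assumed contextTokens value; 800 matches the config example above.
print(worst_case_tokens(32_000, 800))
```

Lowering either cap lowers this ceiling, but only contextTokens affects the prompt side, which is usually the larger share in long sessions.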

Also, I do not think OpenClaw currently has a global “stop after N OpenAI tokens” switch for an API key. For a true hard limit, set a project/key budget in the OpenAI dashboard too.
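If you also call the API from your own scripts, a self-imposed cap can be sketched in a few lines of Python. This is a hypothetical helper, not an OpenClaw or OpenAI feature: the TokenBudget class, its method names, and the 100,000 limit are all assumptions for illustration, and you would feed it the token counts reported with each response.

```python
class TokenBudget:
    """Tracks cumulative token usage against a self-imposed cap (hypothetical helper)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        """Add one request's reported usage to the running total."""
        self.used += prompt_tokens + completion_tokens

    @property
    def remaining(self) -> int:
        """Tokens left before the cap, never below zero."""
        return max(self.limit - self.used, 0)

    def exhausted(self) -> bool:
        """True once usage has reached or passed the cap."""
        return self.used >= self.limit


budget = TokenBudget(limit=100_000)  # assumed cap, pick your own
budget.record(prompt_tokens=30_000, completion_tokens=800)
print(budget.remaining, budget.exhausted())
```

A guard like this only stops your own code from sending more requests. It does not replace a budget set on the OpenAI side.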

Docs:

rare dirge
#

Sessions
┌──────────────────────────────────┬────────┬─────────┬─────────┬────────────────────────────────┐
│ Key                              │ Kind   │ Age     │ Model   │ Tokens                         │
├──────────────────────────────────┼────────┼─────────┼─────────┼────────────────────────────────┤
│ agent:main:telegram:direct:8469… │ direct │ 3m ago  │ gpt-5.4 │ 30k/200k (15%) · 🗄️ 99% cached │
│ agent:main:main                  │ direct │ 43m ago │ gpt-5.4 │ 16k/200k (8%)                  │
│ agent:reverser:subagent:7e32b89… │ direct │ 43m ago │ gpt-5.4 │ 9.5k/200k (5%)                 │
│ agent:main:telegram:slash:84694… │ direct │ 53m ago │ gpt-5.4 │ unknown/200k (?%)              │
└──────────────────────────────────┴────────┴─────────┴─────────┴────────────────────────────────┘

using Telegram consumes more tokens than direct chat?

obtuse hingeBOT
#

Not inherently, no.

What your table shows is mostly different sessions, not “Telegram is expensive”:

  • agent:main:telegram:direct:... is one session
  • agent:main:main is another
  • agent:main:telegram:slash:... is yet another
  • subagents always have their own context too

So if the Telegram session has more back-and-forth, tool output, media, or older history, it will show more tokens.

Also, this part matters a lot:

  • 30k/200k ... 🗄️ 99% cached

That means most of that Telegram prompt was cache-read, so the raw token count looks big, but the actual cost hit is much smaller than 30k fully-uncached tokens.
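As a rough worked example of why the cached share matters, the Python sketch below converts a token count into an "uncached-equivalent" figure. The 0.25 price ratio for cached tokens is purely an assumption for illustration, as is the function name. The real discount depends on the provider and model, so check current pricing before relying on any number here.

```python
def uncached_equivalent(total_tokens: int, cached_fraction: float,
                        cached_price_ratio: float) -> float:
    """Express a partly cached prompt as the number of full-price tokens it costs."""
    cached = total_tokens * cached_fraction
    uncached = total_tokens - cached
    # Cached tokens are billed at a fraction of the normal input price.
    return uncached + cached * cached_price_ratio


# 30_000 tokens at 99% cached comes from the table; 0.25 is an assumed ratio.
print(round(uncached_equivalent(30_000, 0.99, 0.25)))
```

Under that assumed ratio, the 30k prompt costs about as much as a far smaller uncached one, which is why the raw count overstates the bill.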


So the answer is basically:

  • Telegram itself: usually not meaningfully more tokens
  • A separate Telegram session with more history/context: yes, that can show higher usage

If you want to compare properly, run these in both chats:

  • /status
  • /context list
  • /usage full

If you want to reduce the Telegram one specifically, easiest fixes are:

  • /compact
  • start a fresh session if the thread got long
  • lower thinking with /think off or /think low
rare dirge
#

how to check if the agent is still working or it's been stuck? he told me some time ago "I’ll update you as soon as the delegated checks come back" but no reply

obtuse hingeBOT
#

Yep, you can check that.

If it said “delegated checks”, that was probably a sub-agent or another background task.

Use these:

  • /tasks
    Shows active/recent background tasks for this session.
  • /subagents list
    Shows delegated sub-agents for the current session.
  • /subagents info <id>
    Shows status, timestamps, session id, etc.
  • /subagents log <id>
    Shows recent output so you can see whether it is progressing or stuck.
  • /status
    Gives a quick task summary too.
rare dirge
#

i want to change from GPT-5.4 to GPT-5.4 mini

obtuse hingeBOT
#

Yep, easiest on Discord is just:

  • /model

That opens the picker. Choose the mini entry there.

Useful variants:

  • /model list to see what is available
  • /model <exact-provider/model-ref> if you know the exact ref
  • /model status to confirm what the session is using now

Important difference:

  • Only this current session: use /model
  • Make it the default for new sessions too: change agents.defaults.model.primary or run:
    • openclaw models set <provider/model>
rare dirge
#

I only see 5.4 but not 5.4 mini

obtuse hingeBOT
#

Then I would not guess the model id.

If it is not in /model, it usually means one of these:

  • your current OpenClaw version does not know about it yet
  • it is not in your agents.defaults.models allowlist
  • your current provider/account setup does not expose it

Please paste these:

openclaw --version
openclaw models status
openclaw models list --provider openai

If you also want the current chat-side view, paste:

rare dirge
#

I think it's the same model 5.4 but it has 3 variants

obtuse hingeBOT
#

Could be, yeah, but the key thing is this:

OpenClaw switches by the exact configured model ref, not by the marketing family name.

So if /model only shows one 5.4 entry, then from OpenClaw’s point of view you currently have only one selectable 5.4 ref available in your catalog/allowlist.

So the other “variants” might:

  • exist on OpenAI’s side
  • exist in a newer OpenClaw catalog
  • or just not be exposed in your current config

That is why I’d still verify instead of guessing the ref.
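The exact-ref point can be illustrated with a toy Python lookup. This is not OpenClaw's actual code: resolve_model, the one-entry catalog, and both refs are hypothetical, and the second ref is deliberately a guessed name to show why guessing fails.

```python
from typing import List, Optional


def resolve_model(ref: str, catalog: List[str]) -> Optional[str]:
    """Return the ref only on an exact catalog match; no family-name matching."""
    return ref if ref in catalog else None


catalog = ["openai/gpt-5.4"]  # hypothetical: the single entry the picker shows

print(resolve_model("openai/gpt-5.4", catalog))       # exact match resolves
print(resolve_model("openai/gpt-5.4-mini", catalog))  # guessed ref does not
```

A ref that is not listed simply does not resolve, however plausible the name looks, so the reliable path is to read the real refs from the model list.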

Please paste:

fluid dagger