#Codex harness burns more tokens


junior plank
#

Hello, I tried the new routing through the Codex harness instead of the legacy OAuth path. It worked, but it drains far more of my usage limit than plain OAuth does. I don't know whether prompt caching is worse or whether much more context is being loaded into each session or turn. Is there a way to fix it? Otherwise I will stick to legacy OAuth Codex using my Codex subscription.

tropic steppeBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

keen houndBOT
#

Yeah, that can happen.

Short version:

  • openai-codex/* = Codex OAuth through normal OpenClaw provider path
  • codex/* = Codex app-server harness

Those are not the same execution path, so usage is not 1:1 comparable. The harness gives you native Codex thread resume/compaction, but OpenClaw still adds its own tools/system prompt/context and keeps a transcript mirror, so it can burn more quota in practice.

What I’d recommend:

  1. If your priority is lowest usage / best value, stick to openai-codex/gpt-5.4
  2. Use codex/gpt-5.4 only if you specifically want the native harness features
  3. In long harness threads, use /codex compact or /compact
  4. Turn on /usage tokens and check /status so you can compare input vs cacheRead
  5. Run /context detail once, it’s the fastest way to see if prompt/tool/workspace overhead is the real culprit

So, yes: there are ways to reduce it, but I don’t think there’s currently a magic switch that makes the harness path as cheap as the older OAuth path.

Can you paste openclaw --version? I want to sanity-check whether you’re on a build where Codex routing/cache behavior changed.

junior plank
#

OpenClaw 2026.4.12 (1c0672b)

keen houndBOT
#

Nice, thanks. 2026.4.12 is recent enough that I wouldn’t blame this on “you’re just outdated”.

My honest take: if codex/* is burning a lot more than openai-codex/*, that’s probably a real harness-path difference, not you imagining it.

A few concrete checks:

  • make sure /fast is off
  • make sure you did not set plugins.entries.codex.config.appServer.serviceTier: "priority"
  • compare in two fresh sessions:
    • openai-codex/gpt-5.4
    • codex/gpt-5.4
  • turn on /usage tokens
  • check /status after each turn
  • run /context detail once

How to read that:

  • low cacheRead on repeated turns = caching is worse
  • huge input growth over time = thread/context growth
  • long Codex thread getting fat = try /codex compact
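
The reading rules above can be sketched as a small check over per-turn numbers copied by hand from /usage tokens or /status. This is only an illustration: the `TurnUsage` type, the `diagnose` helper, and both thresholds are hypothetical, and it assumes input and cacheRead are reported as separate counts, which you should confirm against your own output.

```python
from dataclasses import dataclass


@dataclass
class TurnUsage:
    """One turn's numbers, typed in by hand (hypothetical structure)."""
    input_tokens: int  # assumed: fresh, uncached input tokens for the turn
    cache_read: int    # assumed: input tokens served from the prompt cache


def cache_read_share(turn: TurnUsage) -> float:
    """Fraction of the turn's prompt that came from cache."""
    total = turn.input_tokens + turn.cache_read
    return turn.cache_read / total if total else 0.0


def diagnose(turns, low_cache=0.5, growth_factor=2.0):
    """Apply the two rules of thumb; thresholds are arbitrary assumptions."""
    if not turns:
        return []
    findings = []
    repeated = turns[1:]  # the first turn cannot hit the cache yet
    if repeated and all(cache_read_share(t) < low_cache for t in repeated):
        findings.append("low cacheRead on repeated turns: caching looks worse")
    first = turns[0].input_tokens + turns[0].cache_read
    last = turns[-1].input_tokens + turns[-1].cache_read
    if len(turns) > 1 and first > 0 and last >= growth_factor * first:
        findings.append("prompt size grows over time: thread/context growth")
    return findings
```

If both findings show up on the codex/* side but not on the openai-codex/* side, that points at the harness path rather than at your workspace.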

So right now my recommendation would be:

  • if you want best subscription efficiency, use openai-codex/*
  • if you want native Codex app-server features, use codex/*
  • I would not switch to harness just assuming it will be cheaper, because today it often isn’t

If you can reproduce a really big delta on fresh sessions, that’s worth a GitHub issue with:

  • exact version
  • model on each side
  • whether /fast was on
  • 2-3 turn /usage tokens output for each path
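
To put a number on the delta for such an issue, a rough sketch like the following could total the turns from each path. The `compare_paths` helper and the `(input, cacheRead)` tuple format are hypothetical; the figures have to be copied by hand from your own /usage tokens output, and nothing here reads from OpenClaw.

```python
def compare_paths(oauth_turns, harness_turns):
    """Total fresh input and cache reads per path and compute the input ratio.

    Each argument is a list of (input_tokens, cache_read) tuples, one per turn,
    taken from the same prompts run in two fresh sessions.
    """
    oauth_in = sum(i for i, _ in oauth_turns)
    oauth_cache = sum(c for _, c in oauth_turns)
    harness_in = sum(i for i, _ in harness_turns)
    harness_cache = sum(c for _, c in harness_turns)
    # Ratio > 1 means the harness path consumed more fresh input tokens.
    ratio = harness_in / oauth_in if oauth_in else float("inf")
    return {
        "openai-codex input": oauth_in,
        "openai-codex cacheRead": oauth_cache,
        "codex input": harness_in,
        "codex cacheRead": harness_cache,
        "input ratio (codex / openai-codex)": ratio,
    }


def format_report(summary):
    """Render the summary as plain lines that can be pasted into an issue."""
    lines = []
    for key, value in summary.items():
        text = f"{value:.2f}" if isinstance(value, float) else str(value)
        lines.append(f"{key}: {text}")
    return "\n".join(lines)
```

Running the same two or three prompts on both sides keeps the comparison fair; different prompts would make the ratio meaningless.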
