Codex harness burn more token | Friends of the Crustacean 🦞🤝 | Page 1

junior plank Apr 14, 2026, 2:14 PM

Hello, I tried the new routing through the Codex harness instead of the legacy OAuth path. It worked, but it drains far more of my usage limit than plain OAuth does. I don’t know whether prompt caching is worse or whether much more context is being loaded into each session or turn. Is there a way to fix this? Otherwise I’ll stick with legacy OAuth Codex on my Codex subscription.

tropic steppeBOT Apr 14, 2026, 2:14 PM

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

keen houndBOT Apr 14, 2026, 2:17 PM

Yeah, that can happen.

Short version:

- openai-codex/* = Codex OAuth through the normal OpenClaw provider path
- codex/* = the Codex app-server harness

Those are not the same execution path, so usage is not 1:1 comparable. The harness gives you native Codex thread resume/compaction, but OpenClaw still adds its own tools/system prompt/context and keeps a transcript mirror, so it can burn more quota in practice.

What I’d recommend:

- If your priority is lowest usage / best value, stick to openai-codex/gpt-5.4
- Use codex/gpt-5.4 only if you specifically want the native harness features
- In long harness threads, use /codex compact or /compact
- Turn on /usage tokens and check /status so you can compare input vs cacheRead
- Run /context detail once; it’s the fastest way to see whether prompt/tool/workspace overhead is the real culprit

So, yes: there are ways to reduce it, but I don’t think there’s currently a magic switch that makes the harness path as cheap as the older OAuth path.

Can you paste openclaw --version? I want to sanity-check whether you’re on a build where Codex routing/cache behavior changed.

junior plank Apr 14, 2026, 3:39 PM

OpenClaw 2026.4.12 (1c0672b)

keen houndBOT Apr 14, 2026, 3:40 PM

Nice, thanks. 2026.4.12 is recent enough that I wouldn’t blame this on “you’re just outdated”.

My honest take: if codex/* is burning a lot more than openai-codex/*, that’s probably a real harness-path difference, not you imagining it.

A few concrete checks:

- make sure /fast is off
- make sure you did not set plugins.entries.codex.config.appServer.serviceTier: "priority"
- compare in two fresh sessions:
  - openai-codex/gpt-5.4
  - codex/gpt-5.4
- turn on /usage tokens
- check /status after each turn
- run /context detail once
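
For the serviceTier check, here is a minimal Python sketch. It assumes you have already loaded your OpenClaw config into a plain dict; where that file lives and what format it uses is not covered in this thread. Only the dotted key path comes from the list above, and get_path and priority_tier_enabled are hypothetical helper names.

```python
from typing import Any

# Dotted key path quoted in the thread; everything else here is illustrative.
SERVICE_TIER_PATH = "plugins.entries.codex.config.appServer.serviceTier"


def get_path(config: dict, dotted: str, default: Any = None) -> Any:
    """Walk a nested dict along a dotted path, returning default if any key is missing."""
    node: Any = config
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def priority_tier_enabled(config: dict) -> bool:
    """True only when the service tier is explicitly set to "priority"."""
    return get_path(config, SERVICE_TIER_PATH) == "priority"


# Hypothetical config, already parsed into a dict (loading is out of scope).
example = {
    "plugins": {
        "entries": {"codex": {"config": {"appServer": {"serviceTier": "priority"}}}}
    }
}
print(priority_tier_enabled(example))
print(priority_tier_enabled({}))
```

A missing key is treated the same as "not priority", so the helper never raises on a partial config.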

How to read the /status and /usage numbers:

- low cacheRead on repeated turns = caching is worse
- huge input growth over time = thread/context growth
- long Codex thread getting fat = try /codex compact
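
Those rules of thumb can be sketched in a few lines of Python. This is an illustration, not part of OpenClaw: it assumes you copy each turn's input and cacheRead numbers by hand, that input counts only the uncached tokens, and that the 0.5 and 2x thresholds (picked arbitrarily here) suit your workload.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    input_tokens: int  # "input" for one turn, copied by hand (assumed uncached only)
    cache_read: int    # "cacheRead" for the same turn, copied by hand


def prompt_size(turn: Turn) -> int:
    """Total prompt-side tokens for a turn, cached plus uncached."""
    return turn.input_tokens + turn.cache_read


def cache_share(turn: Turn) -> float:
    """Fraction of the prompt-side tokens that were served from cache."""
    total = prompt_size(turn)
    return turn.cache_read / total if total else 0.0


def diagnose(turns: list[Turn], low_cache: float = 0.5, growth: float = 2.0) -> list[str]:
    """Apply the two heuristics; thresholds are arbitrary illustrative defaults."""
    findings = []
    repeated = turns[1:]  # the first turn has nothing to read from cache yet
    if repeated and all(cache_share(t) < low_cache for t in repeated):
        findings.append("low cacheRead on repeated turns: caching looks worse")
    if len(turns) >= 2 and 0 < growth * prompt_size(turns[0]) <= prompt_size(turns[-1]):
        findings.append("prompt growing over time: thread/context growth, try compacting")
    return findings


# Placeholder numbers, not real measurements.
poorly_cached = [Turn(20000, 0), Turn(21000, 500), Turn(23000, 400)]
growing = [Turn(1000, 9000), Turn(2000, 18000), Turn(3000, 30000)]
print(diagnose(poorly_cached))
print(diagnose(growing))
```

If your build reports input inclusive of cached tokens, cache_share should divide by input_tokens alone instead.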

So right now my recommendation would be:

- if you want best subscription efficiency, use openai-codex/*
- if you want native Codex app-server features, use codex/*
- I would not switch to the harness just assuming it will be cheaper, because today it often isn’t

If you can reproduce a really big delta on fresh sessions, that’s worth a GitHub issue with:

- exact version
- model on each side
- whether /fast was on
- /usage tokens output from 2-3 turns on each path
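
To put the delta into numbers for such an issue, something like this hypothetical Python helper could work. It takes hand-copied (input, cacheRead) pairs per turn for each path; the figures in the example are placeholders, not measurements, and the function names are made up for illustration.

```python
def totals(turns):
    """Sum hand-copied (input, cacheRead) pairs for one model path."""
    total_input = sum(inp for inp, _ in turns)
    total_cache_read = sum(cached for _, cached in turns)
    return total_input, total_cache_read


def format_comparison(oauth_turns, harness_turns):
    """Build a short plain-text summary of the input-token delta."""
    oauth_input, oauth_cached = totals(oauth_turns)
    harness_input, harness_cached = totals(harness_turns)
    lines = [
        f"openai-codex/*: turns={len(oauth_turns)} input={oauth_input} cacheRead={oauth_cached}",
        f"codex/*: turns={len(harness_turns)} input={harness_input} cacheRead={harness_cached}",
    ]
    if oauth_input:  # skip the ratio when there is nothing to divide by
        lines.append(f"harness/oauth input ratio: {harness_input / oauth_input:.2f}x")
    return "\n".join(lines)


# Placeholder numbers only, not real measurements.
oauth = [(1000, 0), (200, 900), (300, 1000)]
harness = [(2000, 0), (1500, 500), (1500, 600)]
print(format_comparison(oauth, harness))
```

Use the same prompts in both fresh sessions so the two totals are actually comparable.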
