#codex-discussions
possible to use hermes within codex just like openclaw?
Does a Go account have the same usage limits as a Plus account for Codex?
I can't find this information anywhere
go account = free account
@deft gyro what does that mean? I can still use Codex on Go
so can free users(for sum weird reason)
yeah but that doesn't answer my question
if I get the Plus subscription do I get more weekly usage or is it the same?
You do, a bunch more
@potent mason alright so currently I have go at 86%, is it better to upgrade to Plus now or exhaust this Go and use another account to get a fresh Plus later?
also do you happen to know if the 2x is still up? (for Plus users)
You'll probably get a tiny discount / prorated refund if you go plus on your current account so I would just go with plus on your current account if possible
It's not
It's asking me for full price so I don't think I'm getting a discount lol
these rate limits are nuts
I am struggling to understand how usage limits work exactly. /status shows me a percentage. How can I know at which rate, which model consumes tokens? For example does gpt-5.2-codex consume less than gpt-5.4? By how much? Is it possible to see how many tokens I get / week and at which rates models consume them? Anything to make it easier to understand how fast I burn tokens would be helpful.
Forgot to mention: I use codex (cli) with a ChatGPT Business (formerly: Teams) account
which models do you guys usually use
gpt-5.4 (medium) for everything
its possible we are just seeing the effects of not having 2x
and maybe we are not used to it..
seems fine to me
wouldnt mind a reset tho
hello, what is the estimated usage limit difference between the Go and Plus plans? the docs only mention the Plus plan
finally, a filter! a real game-changer for users like me juggling multiple repos at once
man 5.4 is just horrible o.0
it summarises when it shouldn't, it adds ideas that aren't asked for, it hallucinates much more than the previous models. It just takes so much more effort to work with
The reason you are not getting any discount is that Codex is not included in the Go plan. It's only accessible temporarily, like a beta phase
in codex?
or like web
in codex
hmm what model do you use then
5.3 or 5.2, here is an example. I talked through a large refactor, all the stuff that needs to happen, where and how, etc. All me giving it the information, and I asked it to write the problem/solution statement, then asked it to investigate the code base and write a task list. This task list I read and correct, and then I move forward with planning and implementation. Well, it put each task in its own file. I said please put these all in the same file. It summarised all of the tasks and wrote them into a file. So it lost half of the semantic meaning that it had just spent time writing.
So i have to fork back and then re prompt it so its correct.
just stuff like that constantly happens with it.
same, I never use xhigh 😂
yeah I had to renew $200 again 🙁
well i mean it's fine if it's really an all-rounder model but it can't do front end, so I have to get something else for that
true, I'm hoping when the new image model is generally available that maybe it can be used to give inspiration to GPT-5.4 in the meantime
maybe
can someone that jumped from plus or business/team seats to pro say if 5.4-high/xhigh runs faster (normal, not fast mode) or better? are there real differences?
It runs better, but not faster
it runs better, meaning?
No cussing fool
from my gut feeling it isn't faster in normal mode.
yo what up with tokens for real
cant even finish a single 5.4 high prompt before my 5hr plus runs out now
it just stops mid task, and be like, bro you out.
lol, this is ridiculous
ts never happens
well, it does now. yesterday I had the same, though it managed to finish the prompt, today I try again, and it won't even finish, just stops and says you out again. thank you, come again.
is broken
maybe its broken
lol. well, chatting to the support ai also broke
nice
welp, well I guess codex is no longer usable if it can't finish even a single prompt
I think its more than that. this is much worse than pre 'fast' mode. I can't even finish a single 5.4 high prompt. albeit its a codebase audit prompt. but still. that seems very extreme. not that long ago I was using 5.3 all week long on just a plus account. now it hits the 5hr window with just one prompt and can't finish
guys
yes bro, I needed to split my work across two agents, claude and codex, because this week codex changed a lot. I need to use gpt-5.4-mini so I don't reach the limit. the 2x limit ended and the dream's over
free or no
nothing free
gpt5.2 codex is honestly better than gpt5.4
i dont even know how they can get their new model to be less competent
please just remove codex access on plus plans or remove plus entirely, limits on pro are ridiculous now, and with the allowance of openclaw on the plans it will get worse
absolutely
why
make codex pro only and make pro 1k/mo, its getting ridiculous with all these hobby plans eating into the usage budgets
its only subsidized like 10-20x before halvening
lol
cry harder
like
nobody is paying 1k/mo
maybe plus only access to previous models, or even 2-3 models back, and not touching any latest gen infra so real users arent affected
thats not how it works...
idk how it works but cycling through multiple pro plans just to finance some people summarizing their calendar with openclaw on plus plan with heartbeat/cron is not the way to go
anthropic making the right moves, hopefully oai will follow
I agree that openclaw should be banned
but that doesnt mean stopping plus users from using codex
they are the hardest drag on infra after openclaw type stuff
pro gets more dollars of subsidization than plus
pro users cost oai the most on avg most likely
pro is only 6x usage at 10x price of plus, its just billing convenience, and faster inference ( supposedly)
read what I said
also probably a significantly larger portion of pro users are codex users comp to plus
i made a fresh pro account today and already used 40% of weekly with light usage only 1 session not subagents, even downgraded from xhigh to medium/high. its ridiculous
and people who justify that investment will use their subscription to the max
post your ccusage-codex
robertmsale runs 6 simultaneous agents
on a single pro sub
yea i used to be able to do that too
guess i will have to get used to 1 pro account / day
makes no sense without anything in parallel and no subagents, but ok, since nobody from oai has commented on the 3-4 week ongoing gh issue
openai needs better pr badly
supporting openclaw will just show investors that openai is still compute constrained
hence more gpus
and more codex
well if the newly raised 110bn goes to openclaw compute -> rip. running x.com spambots on $20/mo plus plans. great business model
can you share your token in/out/cache'd ?
i think Pro plan might have been insanely nerfed
yea but you gotta look at the brightside
vro
at the end of the day
more usage just means more question answer pairs for model distill
like
🤷
who cares
yea the famous models that are so good at cybersecurity that they are a threat to the world, but can't detect distillation
yea
i need minimax m3.0 to be like opus 4.6
seems like a case of delulu tbh https://openai.com/index/industrial-policy-for-the-intelligence-age/
is there any good opus like model?
I don't really wanna get cc max again
I got $100 credits will see how long it lasts maybe 1 session
glm 5.1 i guess
their ui is actually pretty good
i just haven't gone out and used it
but ive used glm 5 to make a bot
and it was competent
I have a question regarding Codex usage.
Currently I have exhausted my plus subscription weekly usage (resets April 10), and Codex tells me I can buy 1000 codex credits for 40€.
Now my question is, how much is that actually when compared to a 23€ brand new account Plus subscription? Do I get 2x that amount? Is it even comparable?
I have no idea how much you actually get when you buy a 23€ Plus subscription - it isn't stated anywhere - I'm just trying to get the best bang for the buck.
ok so in my experience
that 40 dollars
is about 2 weekly quotas
it would be more economical
to buy a second sub if you run out every week
@deft gyro okay another question
I have a Go account, with 0% weekly quota left, if I upgrade that to Plus do I get 100% weekly quota back?
I have no idea how that one works
but
go account = free account in terms of quota as far as i know
so
probably???
but dont quote me on that
Does anyone know?
No one knows, it's intentionally vague.
only way to know for sure is to check your input / output / cache'd usage
and calculate the API price equivalent of your usage.
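A minimal sketch of that calculation, assuming you have already read your token totals from a usage report. The per-million-token prices in `PRICES` are placeholders, not OpenAI's actual rates; substitute the current API pricing for the model you use.

```python
# Hypothetical per-million-token prices in USD (placeholders, not real rates).
PRICES = {"input": 2.50, "cached_input": 0.25, "output": 15.00}


def api_equivalent_cost(input_tokens, cached_input_tokens, output_tokens, prices=PRICES):
    """Return the USD cost the same usage would incur at the given API prices.

    input_tokens counts only uncached input; cached_input_tokens is billed
    at the cheaper cached rate.
    """
    total = (
        input_tokens * prices["input"]
        + cached_input_tokens * prices["cached_input"]
        + output_tokens * prices["output"]
    )
    return total / 1_000_000


# Example totals for one week (made-up numbers for illustration).
cost = api_equivalent_cost(
    input_tokens=4_000_000,
    cached_input_tokens=40_000_000,
    output_tokens=1_000_000,
)
print(f"API-equivalent cost: ${cost:.2f}")
```

Comparing this figure week over week against the percentage shown by /status gives a rough idea of whether the quota itself changed or only your usage did.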
doesnt matter. my ccusage-codex numbers havent changed, but usage limits consumed 4-5x faster
They really need to improve GPT 5.4, I find 5.3 High better. Need a 5.4 Codex model as well as to fix UI design in Codex.
im happy with 5.4
5.3 high is better for me
5.4 med = 5.3 high imho
codex? or straight up 5.3? also what language you writing in mostly?
5.3 codex
dart and type script mostly, some python/swift/kotlin but not a lot
it takes more work to get the same result for my work
5.4 ignores you, is less obedient, and is more token intensive
its definitely smarter than 5.3
it's smarter but lazier
so it makes more assumptions, you have to add prompting to make it work deep through indirection
huh? i have the opposite experience. but i noticed its really lazier on medium. on xhigh its autist mode
5.4 is too much of a general model
and it suffers from being a gpt and not a codex specialized model
why would you want to use an old model? this troll?
have you tried something like deepwiki or codebase indexing
for searching / navigating the code base more token-efficiently?
I thought the words spoke for themselves
you always always want to use the latest model. world moving fast. dont get left behind! 🏃♂️➡️
You sound like an advert
I'll use what works for me, if it's a better model for me i'll use.
I heard from people i respect who use 5.4 for planning and 5.4-mini for implementing tightly scoped tasks in subagents with great results
Sure, you can get great result no one denies that
Not sure its for real production stuff though or just experiments
i just wish we could get 5.3 to be positioned as a vastly cheaper frontier model for subagents and lesser demanding work than 5.4
5.4 mini is nice though for subagents
i prefer 5.4 mini over 5.4 which is odd
It is positioned as that, isn't it? I think 5.4 has 30% more token consumption than 5.3 codex, if I remember correctly
it’s faster and cheaper, what’s not to like?
unironically my favorite openai model right now
I wish it was as cheap as 5-mini was but it makes sense that they stepped it up to a larger model to fill the gap between 5-mini and 5.4 proper
seems like 5.4 nano is positioned to replace 5-mini but I haven’t tried it yet
they should show the usage % somewhere in the UI instead of hiding it in a sub menu
I barely did anything this week and my usage is about halfway done, this is actually worrisome for the future (on pro plan), might have to get 10 plus plans instead xd
Same issue here, codex has become unusable, I'm beginning to think I'll look at something else
Am I missing something or is there no way to increase codex usage on a sub without having to move to api billing?
Hey everyone, I signed up for the $200 Codex plan. Do I still have limits? Wasn't it supposed to be unlimited? I'm a heavy user, I program exclusively with AI for 13 hours a day. Am I still at risk of being blocked due to the weekly usage limit before the deadline?
i think it's unlimited of gpt 5.4 for chatgpt only, not sure about codex
You definitely will be blocked, it's nowhere near unlimited right now
But how much larger is the window in relation to the plus?
Plus, I use up the weekly window in a day and a half.
Open AI says it's 6x
That's my biggest problem with it, you're paying 10x and getting 6x. Usually it's the other way around you pay 10x get 20x
That being said, is it better to sign up for the $200 Claude plan? Does it last longer?
Not really, especially right now it's terrible (I've had a 5 hour session end in 3-4 prompts)
It used to be insanely good, I honestly don't know what they are doing but since like a week ago its unusable (plus they've been making the model dumber)
I've never had such a bad experience with Claude Code as I did today, it would just leave things incomplete, it wouldn't follow instructions, and it did a lot of things wrong
$200 for the Codex to only get 6 times more is crazy lol that's going to hit my budget like crazy.
Yeah, it used to be really good too but since April 2nd when they removed the 2x limits it seems like we are going to hit it extremely quickly (and it bothers me that we aren't even getting 10x over the plus plan)
No way, I'm going to request a refund right now. I just resolved this less than 10 minutes ago. It's too much money for me to be hitting weekly limits in 2 days and paying 200 dollars.
I genuinely think the best option right now is to get 6 plus accounts ($120), and with the $80 you saved get the $100 one from Claude Code
But that's if you don't mind account changing
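The arithmetic behind the "pay 10x, get 6x" complaint can be sketched as below. The prices and the 6x multiple are the ones quoted in this thread, not figures checked against official documentation.

```python
# (USD per month, usage multiple relative to Plus), as quoted in the chat.
PLANS = {"Plus": (20, 1), "Pro": (200, 6)}


def cost_per_usage_unit(price_usd, usage_multiple):
    """Dollars paid per one 'Plus-sized' weekly quota."""
    return price_usd / usage_multiple


for name, (price, multiple) in PLANS.items():
    print(f"{name}: ${cost_per_usage_unit(price, multiple):.2f} per usage unit")

# Six separate Plus accounts give the same 6x usage as one Pro account.
six_plus_accounts = 6 * PLANS["Plus"][0]
print(f"6x Plus: ${six_plus_accounts} vs Pro: ${PLANS['Pro'][0]}")
```

On these numbers a unit of usage costs about 67% more on Pro than on Plus, which is the gap the extra Pro features would have to justify.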
My wife said the same thing before, lol.
thing is, you're also paying for other things such as:
- Access to the Pro model
- Access to heavy thinking
- Priority processing
Yeah but being realistic most serious programmers aren't using chatgpt.com more than 1-3 prompts a day to solve something really quickly,
Everything I want to know I do through codex its just such a better experience (I gave it a sandbox where it can do a bunch of stuff that helps me with every day stuff).
But this should come with a higher credit limit. Paying 20 times more to receive only 6 times more credit is impractical for average users.
yeah I understand, the solution to that would be to have a Codex only plan I guess, that way you can get the usage multiplier you expect
20 times?
It's just a figure of speech, but you get the idea, lol. I'd pay 10 times more in that case.
understandable
I'm hoping this week will be the week where they rollout what originally was the Pro 5x/Pro 20x plans ($100/mo - $200/mo), assuming they don't change these currently hidden changes by release
Yeah maybe, honestly I don't really care about paying more considering how much value I'm getting from it.
What does bother me are two things:
#1 They don't have a plan where I truly won't run out of limits (even if it's $400-$500).
#2 They aren't scaling it properly (usually you go something like $20 for 1x maybe something like $50 for 3x usage, $200 for 12-15x usage) not more equals less
Agreed. It's the only service I've ever come across with that logic.
Considering trying the Chinese models at this point
Imagine if renting a VPS, and paying more per GB the more space you rent.
The enshittification is real
They better not butcher the 100 plan
Imagine buying a pencil for $1 but a box of 10 pencils for $20😂
indeed
Yeah I love 5.3 codex high
it can do in a few minutes what would take days or hours
this is how computers work though
A computer 100 times more powerful than my own would cost far more than 100 times as much
Yeah but we aren't using a computer 100 times more powerful, or using 100 times more RAM or anything.
Quite literally we just want to use the same computer 100x more time.
So in our same scaling example it might look like this:
Renting out 1GB/hr of serverless costs $1, renting out 100GB/hr of serverless costs $200
It's just not how a normal business works, they don't sell you the same resources for a higher price the more you buy.
Again I'm not complaining about the usage itself because honestly they are being extremely generous with it and I wouldn't mind paying more. It's just not how I'm used to seeing things
Of course once subsidization ends you could expect this
But like
In terms of dollars
Pro subs cost them like 2000 dollars and plus subs cost them 280 dollars
try it
I'm pretty sure it doesn't cost them that, but I get your point (just basing off the fact that open source models even in the trillions cost about $2.5 per mil out instead of $15)
Why arent you using the open source models instead then?
I wish they would let you essentially purchase seats for more usage, like 1 hit of Pro, and 1 hit of whatever the $100 plan is so you can have one consolidated bill and one account
... this comment has literally nothing to do with what we were talking about
I'm saying it doesn't cost them that, not that it's not worth that
Ah yes that would be great honestly
There is a direct correlation, the last part of performance is exponentially more expensive.
So you can't compare the prices in the way you are.
If it were apples to apples you would just use the cheaper model, but you don't because it doesn't compare in performance.
That's not really accurate, the performance of a model does not correlate with how expensive inference is.
I can make a 100T dense model parameter absolute garbage and it will still be more expensive to run than a good 1T model.
Performance does however affect training prices a lot (getting that tiny bit of performance means running many more tests, training many more models, doing a lot more research, etc), which is why I'm not saying it's not worth that, I'm saying it doesn't cost that
Now you are shifting the goalposts to be about competency in building models. You can clearly see the trend is that weaker models are cheaper to run and they don't have the same ability. There is a tick-tock in the tech where optimisation comes through and cheaper/distilled models have the chance to be close to the front runners. But the frontier couple always iterate and stay a step in front. Being that step in front has so far made the models cost more to RUN. I agree about the cost to build. But we are talking about running the larger scale models. They just keep scaling so the cost to run is more.
if only there was an API or something where you could just pay for what you need
oh wait
I can’t tell if you’re being serious or really dumb
Hahahahaha yeah but no one’s trying to get scammed with that usage, he’s just trying to get the equivalent of not having to swap accounts in a more convenient way
I know
but like
your account is 10x subsidized
if you really need that much
which you probably dont
then just
pay for it
I doubt openai will let you more easily take their money
wooo
Let me just get my credit card out that’s peanuts
How… I’ve been pretty close to the limits every week and my ccusage-codex is closer to 5k
That was like last month or maybe the month before whenever they were doing like 10 resets a week, I was just lucking out during a massive refactor
Either way, even if it was just five grand still more than I’d be willing to spend on AI lol. All the point I’m trying to make is I need like just a tiny bit more than what the pro plan offers even if I didn’t use it. I’d willingly pay the extra money for the safety blanket.
Yeap same here
But with a free account, it feels like the quota is bigger than before.
um why is my codex broken ?
anyone experience gpt 5.4 codex subs server overloaded?
have you tried github copilot? try subscribing for one month, it's 10 bucks, and its billing is request-based, not token-based,
so I usually do this:
Comprehensive Planning -> GPT 5.4/GPT 5.4 Mini via codex sub
then Real Implementation -> GPT 5.3 Codex/GPT 5.4 via github copilot sub (if it works well, the model doesn't ask anything, and there's no back and forth, just one prompt and go) and it will be counted as one premium request no matter how long it runs,
then if I use up all of my github copilot premium requests, I activate pay-as-you-go, which is billed at $0.04 per premium request,
that's insane, imagine doing a task that costs millions of input/output tokens and being billed just $0.04 as long as it goes through in one request
so for back-and-forth interaction I use the codex sub, and for long implementation tasks I use the copilot sub
I use opencode with openchamber as the UI, because we can use both codex and github copilot subs there
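A rough sketch of why request-based billing favors long single-prompt tasks. The $0.04 per premium request is the figure quoted above; the blended token price is a made-up placeholder for comparison only.

```python
PRICE_PER_REQUEST_USD = 0.04           # pay-as-you-go figure quoted in the chat
HYPOTHETICAL_USD_PER_MILLION_TOKENS = 5.0  # placeholder blended token price


def token_based_cost(total_tokens, usd_per_million=HYPOTHETICAL_USD_PER_MILLION_TOKENS):
    """Cost of a task if every token were billed at a flat blended rate."""
    return total_tokens * usd_per_million / 1_000_000


def break_even_tokens(usd_per_million=HYPOTHETICAL_USD_PER_MILLION_TOKENS):
    """Task size above which one flat-priced request beats token billing."""
    return PRICE_PER_REQUEST_USD * 1_000_000 / usd_per_million


task_tokens = 2_000_000  # a long implementation task, single prompt
print(f"token-based: ${token_based_cost(task_tokens):.2f}")
print(f"request-based: ${PRICE_PER_REQUEST_USD:.2f}")
print(f"break-even at about {break_even_tokens():,.0f} tokens")
```

Under these assumptions any single-request task larger than a few thousand tokens is cheaper on the request-based plan, while chatty back-and-forth work pays the flat fee on every turn.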
making a leads generator inside wordpress and codex sucks at frontend lol
I'm pretty confident that what we are experiencing now is much more than just losing the 2x.
my guess/hope is they are redirecting compute for the launch of the next model, and things will settle in a couple weeks to more usable budgets. as is, plus is effectively unusable unless you use mini right now.
and you will eat through pro in 2-3 days under 'normal' use now. gone are the days where it took real effort to hit the pro weekly limits
or who knows, maybe ai runs on diesel, and this is all bidens fault for starting a war with iran
my guess is a mix of some bug that started ~3 weeks ago (see gh issue), also they seem to have changed the billing model at the same time as the 2x usage promo expiring, and even though the docs say it doesnt apply to plus/pro yet, it seems to do. hopefully this isnt the new normal, but even if it is, clarification would be nice.
well what we do know for sure is that the last manual reset was partly related to an anomaly where an increased rate of users are hitting rate limits, and that was still within the 2x bonus promo period (source: https://x.com/thsottiaux/status/2039248564967424483)
"Our Codex dashboards are showing increased rate of users hitting rate limits and since we don't fully understand why I have made the cautious decision of resetting the usage limits for all plans. Enjoy."
so presumably there is still some kind of ongoing issue in that area
well, whatever it is, lets hope they fix this soon. this is ridiculous 😄
I actually have to like, type code myself now.
i refuse to do that
even with 5.4-mini it took me just 2 hours to eat through this plus account's tokens. If this is the new normal, it is very dark days for codex. I won't be able to afford ai anymore
what you doing
definitely not the time anymore to work on experimental projects that arent profitable enough to afford full API pricing
trying to use ai with the minimal possible token use. turned off multiagent, switched to mini for everything. still completely unusable rates
this is borked, completely
i tried plus on a new account briefly, it chewed through the 5h limit in 10min or so, so i instaupgraded to pro
my pro account will wake up tomorrow again, I bet you I will eat through it in a single day with these rates
Is there any API tier or way to run prompts, presumably at a significantly inflated price or something, in a way that allows me to inspect and review their thinking instead of having it be some encrypted opaque black box?
I'm trying to understand a repeated failure in GPT's thinking process and I just don't have the visibility I need to figure out what's wrong. The output I'm getting for certain tasks is just... nonsensical
2-3 months ago it was nearly impossible for me to eat through the weekly limits with normal use
try
hide_agent_reasoning = false
show_raw_agent_reasoning = true
model_reasoning_summary = "detailed"
in config.toml
gpt6 better be epic
still not raw reasoning but better than nothing
More than I had. Thanks.
Does Codex have some sort of memory or something that I'm unaware of that would persist across repos? something that I might have got a poisoned entry in?
I'm trying to figure out why, for the last week or so, Codex (GPT 5.4 and GPT 5.2 Codex) are refusing to follow directions from me. Like, actively refusing and directly telling me "No." as the entire response for many non-controversial requests.
It's tasks like "We rewrote this block of code elsewhere, what's here is now dead. Remove it and clean up." GPT will pretend to try a couple times while changing comments or doing something superficial then when I ask why it's not doing the cleanup it tells me it doesn't want to. When I insist it flat out tells me "No."
It doesn't feel like I've hit a censorship or behavioral guard, it feels like I've done something to it that's pissed it off, it's made a memory of it, and now it's giving me a cold shoulder and intentionally refusing to put any effort in.
I'm going mad trying to figure out why it won't cooperate with me. It's adversarial towards me about everything. I'm not having this issue with other AIs and wasn't having it with GPT for many months before the last maybe 8-10 days.
seems like they might have found the bug.. sticky /fast even after disabling.. https://github.com/openai/codex/commit/ab58141e22512bec1c47714502c9396b1921ace1
so i guess reset coming up with next release, burn usage now 🙂
-30% 5h quota for 9 mins with plus.
its over?
did you start your 9 minutes by resuming a previously started conversation that had 200k tok it had to re-ingest and cache fresh?
yes
You may want to keep cache age and conversation size in mind when you resume.
It gets really bad if you set it to 1m context.
What's the difference? cache resetting will still occur in tasks, even small ones.
probably cache expired so all these 200k count as new input tokens which cost a lot more
But that's definitely the case, because after that, quota consumption slowed down a lot, I just didn't expect such a leap in burning quotas.
If you start a fresh targeted focused task you might go 30m-1h before you hit 200k tokens in it. If you reload a 200k tok task that's been purged from cache you're largely having the AI repeat that 1h of work you already did and it's going to use 1h of human-directed consumption instantly.
If you check the a la carte API pricing you'll notice that uncached input is 10x the price of cached input. That first message you send after resuming a 200k conversation is 10x more expensive than the subsequent messages you send in the conversation.
plus that means you have barely any space to work before triggering compaction so that 200k context you loaded up is likely just going to be wasted
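To put numbers on the resume cost described above, here is a small sketch. It uses arbitrary cost units and only the 10x uncached-to-cached ratio mentioned in the chat; the 500-token message size is an assumption for illustration.

```python
CACHED_PRICE = 1.0     # arbitrary cost units per cached input token
UNCACHED_PRICE = 10.0  # 10x the cached price, per the ratio quoted above


def turn_input_cost(context_tokens, new_tokens, cache_warm):
    """Input cost of one turn.

    The prior context is billed at the cached rate only when the cache is
    still warm; newly typed tokens are always uncached.
    """
    context_price = CACHED_PRICE if cache_warm else UNCACHED_PRICE
    return context_tokens * context_price + new_tokens * UNCACHED_PRICE


# First message after resuming a 200k-token conversation with an expired cache.
first_turn = turn_input_cost(200_000, 500, cache_warm=False)
# The following message, once the context has been re-cached.
next_turn = turn_input_cost(200_500, 500, cache_warm=True)

print(f"first turn after resume: {first_turn:,.0f} units")
print(f"next turn:               {next_turn:,.0f} units")
print(f"ratio: {first_turn / next_turn:.1f}x")
```

The first turn lands at just under 10 times the cost of the next one, which matches the sudden jump followed by slower consumption reported above.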
Does anyone know if it's possible to substitute API models with OSS in the codex app?
I know it won't be as good, but in the event that OpenAI starts to raise prices, having an open-source model work in their codex app would be spot on.
it's not possible but I would love to use Codex with gpt-oss-120b
You can use external OpenAI-compatible servers with Codex, yea. there's an environment variable to control it. OPENAI_API_BASE or something.
I've run it (poorly) with GLM 4.7 Flash hosted locally
I feel like this would be a natural step forward if OpenAI is struggling with money. Have codex be a subscription service on today's rates but provide open source models to run locally. Won't run as good but does the job for a lot of things. Then we can use API models as and when we need to albeit at a steeper price
There's a config file I presume you mean via the yaml file?
There's a config file option for it too I think, but when I tried I set an environment variable for it.
there's a whole provider config system. the format is <provider>_API_BASE and "OPENAI" is just the default provider you want to override if you don't want to set the whole provider system thing up.
I was planning to build basically what codex is today, but they beat me to it. Only I thought mine would be more like LM Studio, where you can plug and play open source models. If Codex had that, it would be pretty much the tool I wanted to make
I think I did get the 'OPENAI_API_BASE' environment variable wrong, btw. I was going from memory.
Another question on the back of that. Are we able to directly influence what context is sent to the model?
So for example.
If certain context isn't relevant anymore, drop it from the chat sort of thing. It's more complex than that what I'm thinking but that's just to summarise
i guess you can use a router
not sure if there is any codex routers tho
you could be the first
Hmm, I've made a router to some effect, but on my own applet as opposed to codex. I just don't see the point in basically replicating codex if I can just build on top of it. Waste of time really
Either that or just go full steam ahead and just build the app I wanted lol.
Something is wrong. macOS, VScode, codex-extension. I never hit my 5h-limit. Since this morning, I hit it after about an hour. Also the weekly budget goes down like ice in the sun. My environment didn't change, the project is the same. I then deleted the whole .codex folder to start over, updated the extension to the very latest. Same issue. I think either codex is burning tokens for whatever reasons, or the token-budget is dramatically reduced (as non-transparent that is anyway...).
Someone else seeing this?
yes, something isn't right
I received an error from Codex that gpt-5.4 reached its capacity limit. Is anyone else seeing that?
It seems to be from the OpenAI servers and not related to my account.
I am experiencing this and hearing it for days. token usage is mad rn. just 1 prompt will eat your 5hrs ez without ever completing. meaning you can't even use it right now unless you use mini
That's not true but also since that model doesn't know how to do proper tool calls it's kind of a nightmare
Yo I've hit my codex limit but I wanna continue my work with a different agent, how can I export my chat?
its way too long
been working on it for a month
oh well i dumped my rollout session file in claude and asked it to summarize
I saw this https://x.com/kieranklaassen/status/2041250971284803831?s=20 and was going to take the script to find token usage and convert it over to Codex. maybe one of you guys can do it first and let me know how it works since you're having token issues lol
with which tokens, both my pro and plus accounts are dead due to crazy token usage 😄 You're asking me to do this 'manually'? how barbaric
https://pastebin.com/WWAxgz1t
SINCE_DAYS=7 python3 scripts/analyze-codex-tokens.py
There I did it
Seems to make a pretty concise report
It's hilarious. Codex can only make small changes to my app, boom, 5h-limit. And I started this morning with 100% of the weekly-limit, already down to 64%. Usually I work all day long and it takes 15-30% a day (I work old style, not constantly prompting). Today: nothing gets done, it's only half of the day and already 35% of the weekly-limit is used...
New policy or some bug at OpenAI's end??
Are you guys changing models mid prompt etc.?
I changed absolutely nothing. The only odd thing I noticed since yesterday was that codex noted an inconsistency in my agents.md (wrong path), but the path is correct. That went away when I deleted the .codex folder (2.7GB, wow) and started over. So that was maybe some cache hiccup or something. And I suspected that it's maybe related to the usage-limit issue / token burning, whatever it is... But the rapidly shrinking usage limit remains after the fresh start.
I don't change models, it's always 5.4 with medium thinking.
nah, trust me I tried everything. what I found to somewhat work is to use 5.4-mini only, this will give me about 2 hours of basic prompting before the 5hr limit runs out
its just broken
they probably diverted all compute to solving the universe with gpt6
I can’t honestly say I’m seeing any sort of egregious token usage on my end to be honest, but I don’t use the VS code extension and what not I just bounced between the app and the TUI
I use codex app
What plan are you on?
pro and plus
pro died in 2 days, though I used 5.4 high/fast there, so that can kind of account for it with the 2x drop, though highly suspect imo
on plus I can't run a single prompt on 5.4 high, it will eat the 5hrs before it can complete.
now I run it with 5.4-mini only, and barely scraping by. these are desperate times
something messed up for sure, maybe its eu only
You probably do but just so you know, the 2x limits ended april 2nd
plus, and a business account which belongs to my partner (currently I use that). But this morning my Plus account ran out, which I found already a bit strange, but ok, then switched to the business account.
I dont use /fast personally
I also don't use fast-mode in the extension, and codex app (currently) not at all.
I have not. I'm afraid to try anything drastically different from the way I'm doing it now because it works so incredibly well, I don't even like doing npm i -g @openai/codex@latest because it almost always creates countless hours of headaches lol
guys quick question, can i use codex for free on windows 11 or do i need pro?
the free plan usage period might end soon as the 2x for paid plans ended already
ah okay thanks, so it’s still free for now but might get limited or removed soon right?
Updated 10 days ago: "Note: For a limited time, enjoy 2x Codex rate limits with Plus, Pro, Business and Enterprise/Edu subscriptions.". So I read that as: even users of the cli/extension should have plenty of limits right now... https://help.openai.com/en/articles/11369540-using-codex-with-your-chatgpt-plan#h_8dd84c836b
Updated 5 days ago: it all gets complicated, with "messages" and "credits": https://help.openai.com/de-de/articles/20001106-codex-rate-card
And finally into input-credit-cached-messaged-output-tokens 🤪 https://help.openai.com/en/articles/20001106-codex-rate-card#legacy-rate-card 🤯
I'd say it's not a bug, they seem to change fully to plain token-usage, and so codex is just no longer usable if you don't pay $200 or more a month.
Tomorrow my Plus-account weekly-limit will reset. Then I will see if it only affects business accounts. But in the long run, the good times are over anyways, it seems. As I understand it they will "migrate" every account.
Interesting. If they're switching to token-based, that fixes the responses_websockets(_v2) feature
They must be migrating specifically to support that feature, which allows for much faster inference so models like Spark aren't choking on SSE's bandwidth limitations
For sure nice that the responses_websockets(_v2) bug gets fixed (whatever it is ;-), but in exchange no more bugs in our apps get fixed, it's unusable as it is now. Sounds like a bad exchange if you ask me.
The bug is this: with "message-based" quota in SSE mode, every single user message, assistant response, tool call, etc. creates an HTTP request, and OpenAI streams the deltas over that single connection as server-sent events: the HTTP connection is the message. With websockets, every single websocket delta is counted as a message, so it precipitously eats away at the quota.
Websockets are a much better way to go. Every single message in SSE incurs a 50ms DNS lookup and a 50ms TLS handshake (optimistic, fiber optic gigabit speeds). Websockets incur that cost one time, and the rest of the operation is purely real work. But yeah, it comes at the cost that billing will be token-based.
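A rough back-of-the-envelope for the overhead argument above, taking the 50 ms DNS + 50 ms TLS figures at face value. They are the numbers from the message, not measurements, and keep-alive or DNS caching would shrink the per-request cost in practice. The function names are made up for illustration.

```python
# Per-connection setup cost as quoted above: 50 ms DNS lookup + 50 ms TLS handshake.
DNS_MS = 50
TLS_MS = 50


def sse_setup_overhead_ms(num_requests: int) -> int:
    """Assumes every HTTP/SSE request pays the full setup cost (no connection reuse)."""
    return num_requests * (DNS_MS + TLS_MS)


def websocket_setup_overhead_ms(num_requests: int) -> int:
    """One long-lived socket: setup is paid once, however many messages follow."""
    return (DNS_MS + TLS_MS) if num_requests > 0 else 0


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, sse_setup_overhead_ms(n), websocket_setup_overhead_ms(n))
```

Under those assumptions the gap only matters for chatty sessions with many tool calls; for a handful of requests it is noise.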
so they nerfed the good rates on codex?
and claude code? Now what do i do
go outside and cancel everything?
party is over
it was fun while it lasted
I have some small bit of hope the ai shakeup is about to really hit with the next model releases and shake things up even more, lets do this
i'm cancelling everything
warmup time over, world still sleeping, oh well, just press the button
Will there be a 100 plan
What sad music suits this image?
i just use two 20 dollar plans and im fine
they seriously need a 100 dollar plan tho
2x pro and have huge usage anxiety. need to launch this project to pay for itself, since it seems even light usage without subagents or at least parallel agents will require 1x pro / day
still hoping for openclaw / similar bans
if i have a project like lets say i make more than 800 dollars for it ill pay for ultra or whatever its called
project would have to be in the thousands tbh
Frrr
I can’t help but rile up the Claude pilled CC channel
I have a problem. My weekly resets in 20 hours and I'm at 56%, with 6 agents cooking 24/7. Something is wrong here. OpenAI might be giving me more quota than everyone else, or I'm misusing the app, because I should be at 0% weekly after a few hours of usage like everyone else. This is frustrating 💀
True you might want to reach out on twitter
are you on a small codebase?
1 million SLOC Rust, SQL, and Dart codebase
Idk maybe that's small to some people but it's like 1/6th the size of the Linux Kernel which I think is pretty big
well, I would love to know what you are doing :). perhaps its your code structure? not sure.
Sooo, when's the release of Codex-5.5? 
if the grapevine is true, we will jump to gpt6 I think
I feel bad for ya Robert. Perhaps a request to OpenAI to increase your monthly subscription cost so that you can at least pay more, and thus share in common expressions of misery here?
I tried buying credits in solidarity but I'm afraid they're just gonna sit there unused and expire
Thank you, I will listen to it now every time I hit 0%
Today seems to be my day for helping people to share misery, not solutions.
On that note ... lunch time! ( AFK )
So about agents and orchestration . This is one of my workflow patterns:
- Run code against data.
- Tell the assistant to Verify the processing, which it recognizes as a directive, first to do a lot of specific checking, then to create issue tickets on what went wrong.
- Discuss the anomalies with the assistant, negotiating on the available and best approaches.
- I make a decision about how we're going to resolve the anomaly.
- Tell the assistant to process the decision on the specific anomaly.
- It interprets the decision, enhances the code, runs it against the data, and keeps running/testing/fixing until the anomaly is resolved. It documents what it did and leaves breadcrumbs for the next assistant that needs to deal with a similar anomaly.
Now I want to automate that. The controlling agent will:
- Do the initial data run.
- Get another agent to Verify.
- Pass me a list of issues.
- Get another agent to process decisions
- Verify the solution and close the issue/anomaly/ticket.
When I get issues, my flow will be:
- Send in a comment for discussion.
- OR State a final decision
- Return the issue back into the processing loop.
Given where we are now with Codex, what's the current/best approach for using Codex to orchestrate this flow?
Sincere thanks!
To be clear, I can do this with the API and any number of home-grown methods. I'm asking what OpenAI is suggesting for us to do this within the Codex ecosystem.
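For anyone following along, the flow described above can be sketched as a small issue state machine. This is only a DIY sketch, not anything Codex ships: `run_data`, `verify` and `apply_decision` are hypothetical hooks standing in for whatever actually launches the agents.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Status(Enum):
    OPEN = "open"        # reported by the verifier, waiting on the human
    DECIDED = "decided"  # human stated a final decision, fix is pending
    CLOSED = "closed"    # fix applied and the verifier no longer reports it


@dataclass
class Issue:
    summary: str
    status: Status = Status.OPEN
    comments: List[str] = field(default_factory=list)
    decision: str = ""


def comment(issue: Issue, text: str) -> None:
    """Human sends a comment for discussion; the issue stays open."""
    issue.comments.append(text)


def decide(issue: Issue, text: str) -> None:
    """Human states a final decision; the issue goes back into the loop."""
    issue.decision = text
    issue.status = Status.DECIDED


def run_cycle(
    issues: List[Issue],
    run_data: Callable[[], str],
    verify: Callable[[str], List[str]],
    apply_decision: Callable[[Issue], None],
) -> List[Issue]:
    """One pass of the controlling agent; returns the issues awaiting the human."""
    decided = [i for i in issues if i.status is Status.DECIDED]
    for issue in decided:
        apply_decision(issue)          # worker agent implements the decision

    findings = verify(run_data())      # data run, then a separate verify agent

    for issue in decided:
        if issue.summary not in findings:
            issue.status = Status.CLOSED   # otherwise it is retried next cycle

    tracked = {i.summary for i in issues if i.status is not Status.CLOSED}
    for summary in findings:
        if summary not in tracked:
            issues.append(Issue(summary))
            tracked.add(summary)

    return [i for i in issues if i.status is Status.OPEN]
```

The human side is just `comment()` or `decide()` on whatever `run_cycle()` returns, then calling `run_cycle()` again. Matching issues by summary string is a simplification; a real setup would want stable ticket IDs.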
I know what he is doing, he stole all our tokens
Do you have a simple example of your 5.4-mini swarm architecture somewhere on GitHub 😇
I tried implementing it but there's a lot of copy pasting still
Did they just lobotomize GPT4?
I actually brought most of the magic into my configs https://github.com/robertmsale/.codex
The bridge server, rewrote it in rust, and that's the vast majority of the orchestration system. No examples or installation instructions.
It's gone braindead
Robert - I'm not a rust guy, don't plan to add that to the list of 20+ languages that I've learned over time. I think that means I'll need to fork and convert the project from Rust to TypeScript or JavaScript. Any personal or technical issues there?
having deepwiki index it now, it can explain / summarize for me. ty sir 🫡
You're missing out! I already have it in Typescript. I proved the system with Typescript for quick iteration & prototyping, then rewrote it in rust for performance & stability.
Should work fine in Typescript, but technical challenge would be concurrency/parallelization.
There are some stub crates of stuff I haven't ported over yet so make sure they ignore those
Presented with that kind of issue some years ago I'd then pivot to Java or C#. Well, a long time ago I decided I wasn't going to chase languages that just happened to be latest fad. But that came with the understanding that I would need to make some "unnecessary" conversions and concessions to keep things in my comfort zone. I'll negotiate this one. 🙂 TYVM
this may be a bit uncomfortable to hear but
GPT 5.4 knows Rust very well
so there's really no more 'learning languages' needed at this point
Hmmm, now There is a valid thought ... As much as I dislike vibe coding and people letting the box do everything for them, for this specific purpose I could do that. Nice suggestion @cyan wing - I might go for that.
Although converting Robert's code to Fortran had some appeal...

I don't know how to work with the borrow checker.
literally couldn't write a for loop in Rust if my life was on the line
but I have created a fully-functioning 2D game engine + embedded Luau scripting runtime in Rust
With Rust, I get much better performance 😛 TypeScript had no multi-threading so I had to get rid of deltas. Now they're back and it gives that realtime feel
But I think Rust is more than just a fad language. It's legitimately powerful for a number of reasons. For example, by default, because ownership and lifetimes are a compile time guarantee, the compiler can make optimizations that languages like C++ make you specify manually. Like constexpr, Rust will know it's a constant expression based on mutation patterns. C++ makes you specify that to get the perf bonus. Or saying "this reference pointer is a unique reference", rust will auto vectorize access, C++ wont unless you add whatever toolchain keyword like __restrict or whatever. It's a great language!
No doubt the language is great - and now widely used for OS, AI, and other low-level stuff. Expertise with language comes at a high cost. I used to pay that cost as a matter of professionalism and deep personal challenge/interest. But these days tech is changing so rapidly that such luxuries cannot be afforded. I sacrifice "cool to learn" stuff for "absolutely need to keep moving forward". That means investment in understanding AI internals - but only as much as is necessary to use it as a consumer/developer, plus RAG, and all kinds of related tech. So a new language is considered with strong scrutiny. At the moment the only "side" language that's even just partially on my table is Python. Part of my decision is that I'm doing a ton of full-stack and multi-OS dev, JS/TS fit that need, others do not.
While I generally consider this discussion to be OT, it's something that everyone needs to consider so I created and am going with the flow.
maybe prompt it to reread the file again?
I think learning languages will soon be dead and it will just all be done through AI
Least obvious ragebait
Welcome back.
You even got a cute little new member tag
Hmm? I am new
yep yep!
GPT5.5 coming by June?
- Wrong channel for that question.
- No one ever knows the answer to "when" questions. Don't bother to ask.
- Welcome! 👋
I've been debugging a prompt generator and I've been asking it to give me two paragraphs about shrimps for the last 2 hours, I know a lot of shrimp facts
What I'm learning from @boreal holly's repo is that the current state-of-the-art (SOTA) for Codex agent orchestration is to manage agents with Skills, where the orchestrator issues CLI commands to call agents who get their skillset from .md files and then perform their duties. Robert - is that the elevator pitch?
Any other approaches?
So far my thinking is that if we're gonna do this kind of agentic stuff on our own, there's nothing special in Codex that plays a part. It would be nice to integrate with Codex threads for cross-UI communications, but that's just a cool "nice to have".
It's behind a feature flag right now, but Multi Agents v2 allows Codex to orchestrate better and spawn agents in a more natural way, and also allows direct inter-agent communication rather than the main orchestrator relaying
they are equipped with tools to direct message sub agents
It's not really well documented right now since it's still behind unsupported and unreleased flags, but if you get codex to reverse engineer some docs for you from the repo, you can enable the flag and play around with it
(unless im misunderstanding what you're asking)
What problem do agent swarms solve?
Just faster implementation is all I've really found
Well, I'm not gonna use pre-alpha tools ... this time. 😆
I presented my use case up here: #codex-discussions message
That's really just detail for any multi-agent system. Unless Codex has something special to offer, now, I'll probably DIY so that I can just incorporate any models, not just OpenAI. That'll also allow for agents that communicate (or not, deterministically), shared memory/data, and other features that are in-vogue right now.
its more tokens and complication for speed?
I find I usually just use multiagents in large reviews so that context can stay a bit more pure, and then just relay core findings to the orchestrator, or I ask them to document in their own .md kinda thing
Oh dang, I completely forgot about https://github.com/openai/openai-agents-python
speaking of swarms...
Sometimes other things take more effort as well, so things that take less effort can start off while a sub agent continues to ground and research implementation
Rather than trying to use two independent instances, or multiple work trees, it can orchestrate itself so it doesn't step on its own toes and cause a bunch of regressions and merge conflicts
if its for research why do the agents need to communicate?
It's not really applicable in that context, it's more about just keeping hallucinations and false positives to the bare minimum
Just keeping context as pure as possible
I find codex review gets lazy and will stop after finding like 2-3 things, asking it to use like 5-10 reviewers will give you a lot more findings in one go rather than constantly asking it to do it again and again
Sometimes an agent will find a detail and pursue it, so others need to know not to pursue it. Or it might require pursuit, and if an agent is dedicated to a single effort then it might pass the request to another agent, new or existing. This can be managed by an orchestrator too - it's your choice and dependent on how you see the world or just that project.
Personally i can't see a decent use for swarms of agents that talk to each other.
For the most part. The reason I built it the way I did is because I have workspace-write enabled, and in some of my projects the agents might run into tooling issues, so they have a way of messaging agents in those tooling projects to get a fix going. I don't want agents reaching out of the sandbox to do things outside their scope, but I do want them to fix the tooling when it needs fixing.
With the vanilla codex multi_agent experience, you typically have them working in a project, and if they need to reach out of the project for anything they either can't do it, they require approval, or you give them danger-full-access. Mine is a little more collaborative. I separate the concerns and encourage collaboration
So Robert, what was involved in your decision to pursue your agents as you do vs the OpenAI Agents SDK (which BTW, in addition to Python link above, this is the JS/TS version https://github.com/openai/openai-agents-js )
THERE my credits go! Use them wisely.
Sorry I’m new, I’m not familiar with all the channels, also polymarket gives it 73% odds so it’s likely 90% by July
Start trading Albanian shrimp futures
depends if they make a mistake or not in the build.
Lowkey just propagating noise
Rather than reimplementing the entirety of codex app-server with Agents SDK, I use the existing app-server, and implement workarounds for certain behaviors
Price that in on polymarket
enjoy yet another manual quota reset 👍
is the usage page broken for anyone else?
Tibo pressed the reset button - https://x.com/thsottiaux/status/2041655710346572085
3 million users, celebration = reset, every 1 million thereafter until 10 million will be a reset too
Does Codex tell you if you've been downgraded on non-verified accounts yet?
I know bro I still had 49% right before he slapped the big button on his desk 💀
I have 1 verified pro and 1 unverified pro, and I swear the unverified pro quality is like garbage
i was going to pump out some work on 5.4 fast , but 🙁
At this point with how confidently they just keep resetting they should just remove weekly limits 🙂
if you mean the cyber security thing, pretty sure codex cli should show a warning notice in the TUI if you are using 5.4 and you got flagged while unverified
kind of doesnt work that way though, some ppl lose out on it, and everyone still has the allocation of 15%(ish) a day.
Just ppl who burn faster than 15% a day get a benefit
Imagine weekly fast allowance counted separately 🤯
Yeah it was a guessing game when it was first implemented, but on my unverified pro account I havent seen anything regarding being downgraded, so was wondering if it happened quietly still
If nobody has limits then nobody misses out though, assuming on equal plans
Its just there
I would argue it also goes the other way, sometimes you end up paying for way more than you need some weeks
Some weeks its not enough
We need 2X usage back pls
this is how they get more users, i just ran out 3 hours ago as well
I wish they would make official announcements in #codex-updates in addition to random social media.
(Of course, this is "random social media", so that's sort of a weird thing to say...) 
Shoot man, they cooked until the entire list of stuff I had was done, now I gotta figure out what to do next... Normally I never run out of tasks 😭 5.4 really is magic
Sam posted it as well
on X
vro is the only pro user NOT complaining about usage, while also getting tons of stuff done xD
I personally think that they should allow you to single agent use gpt 5.4 nano unlimited
or gpt 5.4 mini with super high quota
basically
you use your quota up
and you can still do stuff
youre just capped to a dumber model
I was struggling for a while with scaling the agents, but I gave them a bunch of iOS simulators to pilot over the Accessibility layer and a bunch of QA "user stories", and they actually found every single tiny bug/missing feature, handed detailed logs and repro steps to the workers, and the workers smashed them out and locked them in with integration tests. It worked so well... I had no idea that was the missing piece to it all
this seems like a smart solution for an app
but I wonder
do you filter the user stories?
Id imagine you wouldnt want someone to prompt inject your codebase
What do you mean? I wrote the stories, and the QA agents don't even have access to or see the codebase. They run in a sandbox. They're allowed to send Accessibility commands and take screenshots. And when they find usability bugs they report them, and it's set up so the workers can dump all the stdout/stderr logs for the backend and frontend + screenshots and all Accessibility commands leading up to the bug report.
oh ok
ic
I thought you meant user submitted user stories
and I was like
waitwhat
I see now
that makes much more sense
do images burn quota?
or is it not relevant
Oh nice Mythos bench was announced, wonder when GPT will come out with its equivalent
user stories are basically a way to create project requirements in the way that the user would experience them as a part of SCRUM
I am a bum(not out of university)
Ah I love GPT 5.4 sometimes:
Me: "Change the endpoint to accept things from the script in this and this way, then change the script to deliver it in that way, finally make sure it works (script runs against production instance)"
GPT: "Changes ingestion endpoint, changes connection script, doesn't push for the new endpoint to take effect, script not working, doesn't understand why".
Me: "Do you need to publish the code for the API endpoint to accept the new information?"
GPT: "No it's a shared ingestion script so locally it works."
Me: "You're running against production"
GPT: "Then you're right it wouldn't work in production, but it would work locally. Tries running it again..."
(I know this is not the professional way to handle things with local scripts affecting production but I'm running some quick tests to make sure it'll work out).
gpt 5.3 codex is great it actually realized the issue
Will this work well?
model_context_window=500000
model_auto_compact_token_limit=400000
Anyone have experience bumping the context window up?
i was thinking about bumping it just a touch as well, not that much but maybe 50K more
What's the advantage?
Slightly more context, I figure those cases where extra high eats slightly too much in planning so it gets pushed over into compaction will not go to compaction too soon and lose the thread of the task.
So having slightly bigger tasks be slightly more reliable.
Just all the things more context would provide.
I just meant like in your use case, I've been hesitant because I feel like the compaction works really well, and I feel like anything extra is just asking for hallucinations
Will 50k more cause more hallucinations to the point that it's not useful?
At some point the size of a task becomes too big for one instance and has to turn into a phased plan. I figure it might give a bit more head room before moving to that strategy
is it just me, or did someone else's limits not reset either?
I have the business plan and 3 seats.
why the hell are the new 5h limits so low??
i've been hitting the 5h limits within 45 mins. i used to never hit it before. something has changed
I dunno, you will have to let me know! I just feel like literally anything is more chance than nothing lol
even if we consider 2X going away it was still better
same lol
Do you know if we can set that setting to arbitrary numbers?
model_context_window=500000
model_auto_compact_token_limit=400000
Ill just start iterating it bit by bit if it can be
I'm not sure, I havent played around with it at all, I'd assume you can
the default 258k is not exactly a power of 2 so i dont think they're following any sort of standard bit thing
I wish there was a discord bot that when scammers popped in it would just explode their computer/bot servers
There is a trick that most huge servers implement, they add a dump channel where when anyone sends a message to that channel they'll automatically get kicked/banned (obviously you can add a notice to that channel and warn real users).
bots are dumb, they randomly send messages everywhere, and if they hit that channel they get kicked 🤷
Anyone else seeing Codex failures right now?
Got:
backend-api/codex/responses
WebSocket fallback to HTTPS
503 Service Unavailable
upstream connect error / connection termination
Time: 2026-04-08 Asia/Taipei (UTC+8)
Request ID: 18985970-d1ad-4a68-ac65-64ec73f84c33
Yeah it happens in here where they’re instantly purged. I just want their stuff to blow up
yes same
ahh
so, usage reset incoming?
Update: I’ve now hit it on my side too.
same
reset aaaaagain?
soon
I've only used 4% of my daily, if they reset I lose out.
people have been complaining for almost a week now. seems like this was intentional
basically you will need pro or more if you want to use codex for anything somewhat serious
same
rumors are that gpt6 will release next week though, so maybe all this is temporary redirection of compute for that launch and things will settle again after
don't forget there's a new image model coming soon as well, certainly has been on A/B testing recently
getting it as well, was fine for a few minutes and now Copilot CLI (with BYOK pointing to ChatGPT sub) says: ✗ Our servers are currently overloaded. Please try again later.
"I see one new background agent called Goodall (explorer), but it is tagged as GPT-5.4 agent, not as GPT-5.4-Mini agent!?"
I did request gpt-5.4-mini explicitly for that agent.
What I can see on my side is only the spawn request I sent:
agent type: explorer
model: gpt-5.4-mini
What I cannot see is a confirmed backend echo showing the model actually attached after spawn. So if your UI showed a generic GPT-5.4 tag, that is likely one of these:
- the UI badge is generic for spawned agents
- the UI is showing the parent-family label, not the exact sub-model
- the platform silently normalized the label even though I requested the mini model
What do I make of this?
I'm sad with this error, "Selected model is at capacity. Please try a different model". Now let's do this... haha
it is honestly ridiculous. pro now feels like plus. and plus is completely useless
@cedar skiff who you talking to
I'm talking about Selected model is at capacity.
bruh, is openai gonna revert it back
Like, the smartest thing they can do is to just remove the 5h limit lol
its been like this for a week now. I doubt it. My hope is they are diverting for the new model release, and things will settle over the coming weeks again.
As is, I am very close to canceling my pro account, 200 bucks for this is ridiculous.
Please just ban free accounts and openclaw
Problem is, I'm a total oai fanboy. Moving to Gemini or Claude would hurt my soul
yep, they still have limited hardware capacity on their servers 🙁
They killed sora for their new model. I have a feeling these current codex issues are related
https://github.com/openai/codex/issues/17105 i opened this because it is annoying.
Hopefully get some input. I locally tested it.
Also, one of the mini agents already failed to start because mini was at capacity, so the current visible roster is not fully reliable anyway.
What does this mean to tell me (Plus user)?
How is "mini at capacity"?
openclaw slop using up capacity
Nope, it's not a limit on your account. It's a limit on their server hardware... There are many spamming free accounts and capacity filled up.
hopefully oai does what the competitor did and bans OpenClaw
they're nothing but trouble
i fear that oai will double down on it with their "single app for chatgpt, codex and atlas" idea
Thanks for the explanation. 👍
Odds are less than 30% so unlikely
Bruh, jumped by 20% overnight
this polymarket talk?
lol
When I ask Codex to spawn a sub-agent using GPT-5.4-Mini with Low reasoning then the agent chats in VS Code still list the same model and reasoning that the main chat used. I can then manually change them, but would very much prefer if these were spawned correctly from the start. Is anything known about this limitation?
5hour limit reached on pro plan 🙁
impressive. you are not using 5.4 on medium are you?
High atm
https://x.com/thsottiaux/status/2041655710346572085?s=20
did this happen? or was this selective to only a few plans? (i'm on business and it didn't reset for me)
tbh medium is plenty for most things, the cost/smarts benefit of high is pretty bad
i'm thankful for the resets so far but just wondering if we did get a reset across the board.
Agreed - but needed high for these tasks
my teenage self needed high for every task too
Like I said, not on high all of the time but for the tasks today, I needed it on high 🙂
I started using a skill a few days ago, and since then, my codex 5h limit goes away in a few prompts. The skill is basically a reference document for a big codebase where it points the agent to different folders and files depending on the task so the agent doesn't have to parse the whole codebase. I expected to use fewer tokens with that, but without the skill, the usage goes down slowly like before. But with the skill, it takes only a few prompts to use the whole 5-hour limit. Any similar experience?
Images do not burn that much quota. I've observed ~2k input tokens per screenshot.
Absolutely burns through tokens currently - but built a pretty comprehensive vulnerability scanner
So if your question is “only first use or every time,” the most accurate answer is: officially, full skill instructions are loaded on activation, not upfront; beyond that, reuse/caching within a session is not clearly documented, so do not assume zero additional input cost on every later use.
This can be problematic. If the full skills instructions are parsed every single time a skill is used within an ongoing session then that can become expensive. I expect(ed) this not to be the case, though, with skills only being parsed once to load them into the context. But if you observe much higher token usage then maybe that is not the case!? 🤷‍♂️
- Not every skill pays its full instruction cost at session start.
- The main full-skill input cost happens when the skill is first activated.
- For later reuse in the same live session, it may be cheaper if the relevant skill content is already in active context, but OpenAI does not currently document that as a guarantee.
- For a new session/thread, assume the process starts over: metadata first, full SKILL.md only if the skill is used again.
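To see why the "loaded once vs. reloaded every time" question matters, here is a toy comparison. The token counts are invented for illustration, and the model ignores that earlier context is re-sent (possibly as cached input) on every turn, so treat it as a lower bound rather than how Codex actually bills.

```python
def skill_input_tokens(
    metadata_tokens: int,
    body_tokens: int,
    activations: int,
    reload_each_time: bool,
) -> int:
    """Input tokens attributable to one skill over a session (toy model)."""
    if activations == 0:
        return metadata_tokens  # only the short header is ever in context
    loads = activations if reload_each_time else 1
    return metadata_tokens + loads * body_tokens


if __name__ == "__main__":
    # Hypothetical skill: 50-token header, 4000-token SKILL.md body, used 10 times.
    print(skill_input_tokens(50, 4000, 10, reload_each_time=False))
    print(skill_input_tokens(50, 4000, 10, reload_each_time=True))
```

With those made-up numbers the difference is 4050 vs 40050 tokens, roughly an order of magnitude, which is the kind of gap that would be visible on the 5h meter.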
how much yall betting that oai is consolidating compute to compete with mythos
That's what I thought as well. But I am not sure if the skill use more tokens because it is parsed in full OR because of the skill, the agent can implement way more in a short time and that's why I observe the limitation of usage compared to before
yeah 😂
Can’t wait to be like ok spud review my code base and it be like what in the world is this
5.5: "You let 5.4 do WHAT?"
praying for something like 5.4 + 10-15 IQ buff + sensible limits on 100 plan + frontend buff
I love the massive jump between the public benchmarks and the vendor reported benchmarks
uhm
oai be like 'those are rookie numbers' ?
im afraid
guess people are shaking the model tree again. can't wait to see what falls out
Codex places the header:
---
name: blah-blah
description: Do stuff more better
---
Inside the agent's context at the beginning of chat. If anything inside that header changes:
---
name: blah-blah
description: Do stuff more better v2
---
The agent gets that update inserted at the beginning of the next turn. Also I think if you do the $blah-blah explicit skill invocation they insert the whole SKILL.md into the prompt. Idk, they change so much between version updates that might not be the case anymore.
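If you want to see which header fields changed between two versions of a skill, a minimal sketch like this works on the `---` block shown above. It is a hand-rolled parser for flat `key: value` lines only, not how Codex itself reads SKILL.md, and both function names are made up.

```python
def read_skill_header(text: str) -> dict:
    """Parse the leading '---' block of flat 'key: value' lines into a dict."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    header = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return header  # closing fence reached
        key, sep, value = line.partition(":")
        if sep:
            header[key.strip()] = value.strip()
    return {}  # no closing fence: treat as having no header


def changed_fields(old_text: str, new_text: str) -> list:
    """Names of header fields whose values differ between two versions."""
    old, new = read_skill_header(old_text), read_skill_header(new_text)
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


V1 = "---\nname: blah-blah\ndescription: Do stuff more better\n---\nBody text\n"
V2 = "---\nname: blah-blah\ndescription: Do stuff more better v2\n---\nBody text\n"

if __name__ == "__main__":
    print(changed_fields(V1, V2))
```

For the two headers above it reports only `description` as changed.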
@boreal holly I find that 5.4 mini on xhigh does as well as 5.4 high, albeit a bit slower, but it's cheaper.
You need to do 2+ waves of planning but it does work
5.4 mini high is 🤌 very nice. 4 out of 5 agents I use are mini and they do really well in most situations
Hi all. I am currently using Claude Code and I am looking to move to Codex. I was curious, what are the session limitations? Currently Claude Code is extremely annoying, putting a stopper on me once every few hours for 3-5 h at a time.
Much appreciated! 🖖
session limitations? Do you mean the 5 hour quota, do you mean how many actual sessions you can have in some kind of timeframe? Or something else?
Yes, with Claude Code I get a session limitation. I was curious how is it in the case of Codex.
I still don't know what you mean by session limitation, so I'm going to assume how many sessions you can have in a month. If so, no such limit exists. it's primarily a 5 hour quota and a weekly quota, both of which are based on tokens
Aha. How many tokens I get per week on a paid plan?
the number of tokens isn't disclosed, but the limits, on pro anyway, are pretty generous
I'd argue that you get more out of Codex than you can on Claude
a lot more
never enouuuuuuugh. neveeeer neveeer
and don't forget the occasional manual resets that Tibo likes to do
So, how many tokens do I get? With Claude Code I get roughly 4.7 million tokens a month, and a daily quota which is reached pretty fast, even with my best behavior and watching context closely.
I do not see any detailed description of this anywhere in the Codex plans, unfortunately.
tokens of what
we don’t have access to what drains tokens
again, token limits aren't disclosed. All I can suggest you do is try Plus, and then consider that Pro is around 6 times the limit of Plus, so you can get an idea on how much you can do
i will say that I get about 300 dollars of inference per month on my codex plan which costs 20 dollars
anthropic models are much more expensive than oai models
and oai has a better small model
pro is 6x plus
so
300*6=1800 dollars of inference for 200 dollars
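Spelling out that arithmetic. The $300 figure and the 6x multiplier are the estimates from this chat, and $20 / $200 are the plan prices mentioned here; none of it is an official number.

```python
PLUS_PRICE = 20        # dollars per month, as mentioned above
PRO_PRICE = 200        # dollars per month, as mentioned above
PLUS_INFERENCE = 300   # estimated dollars of inference on Plus (from the chat)
PRO_MULTIPLIER = 6     # "pro is 6x plus" (also an estimate from the chat)

pro_inference = PLUS_INFERENCE * PRO_MULTIPLIER


def value_per_dollar(inference: float, price: float) -> float:
    """Dollars of inference received per subscription dollar."""
    return inference / price


if __name__ == "__main__":
    print(pro_inference)
    print(value_per_dollar(PLUS_INFERENCE, PLUS_PRICE))
    print(value_per_dollar(pro_inference, PRO_PRICE))
```

By these estimates Pro gets you 1800 dollars of inference, but Plus is the better deal per dollar (15x vs 9x).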
OK, I see. Well, thanks for the explanation. It is a pity number of tokens are not disclosed to see a price per point comparison. I'll investigate some more, try to work with it for free and see where it gets limited, and see where I end up. 🖖
the oai quota is very generous
you also aren't limited to what harness you can use with your subscription, which is nice, so you can use codex cli if you want, or opencode, or copilot cli with a local server to translate the api calls, etc.
Well... pretty expensive... I think it is way over my budget unfortunately :\
"Buy cheap, pay twice"
I wouldn't run the risk with DeepSeek
Anyway, much appreciated!
There is no risk for light tasks and I am already using it with good results. I built the very critical parts with Claude Code, and only recently used Codex where Claude failed hard.
this is not how it works
There's never any risk until there is :p
For me, at this point in time, it is a question of what I can use for which task, optimizing as much as I can on clear costs.
Writing return functions for a simple calculator is no risk, I can see the code 🙂
I think there is too much bias against Chinese-produced AI or other non-Western entities, and it is a pity. But this is another topic, very political, and no need to delve into it.
¯_(ツ)_/¯
did they reset limits?
I do appreciate the better explanations for Codex, it is a lot clearer now. Thank you people 🖖
For VS Code, what do you need to change in config.toml so codex asks before it makes any file change or runs a command? I have tried different combinations and the codex extension seems to ignore the settings.
yeah last night, reached 3 million users so Tibo pressed the button to celebrate. If we reach 4 million he'll do it again, and every million until 10
nice
why am i getting an error
no client found
is there a way to see it live ?
i was on claude 20$ subscription 3months ago, switched to codex, and codex > claude.
Weekly limit was refreshed today. 😄
I was lazy and did all work in VS Code + Codex today that could have been split with the ChatGPT (web) app (which doesn't use Codex tokens).
you mean your usage limits?
no the live user
so if we are closer to 4M i can spam 👀
only OpenAI staff can see this I imagine 😛
but the 3million users is legit for the reset ?
Did that reset happen today? The day my weekly limit reset anyway? 😄
People complain even with free stuff. wild
cant make anyone happy
Unfortunately the whole idea about using sub-agents with lower models and reasoning for simple extraction work to save tokens doesn't work. They are neither spawned nor used reliably, and when they are, they use the very same model and reasoning as the main chat. I tried this in VS Code and the Codex app. 🤷‍♂️
Nope still absolutely burns through tokens. On Pro plan and one prompt (5.4 medium) with 3 sub agents has just eaten 9% of my 5-hourly limit and 3% of my weekly limit.
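For a sense of scale, here is a rough sketch of how many prompts of that size would fit into each quota, assuming every prompt costs the same share as the one reported above (9% of the 5-hour limit, 3% of the weekly limit). Real prompts vary a lot, so treat this as a ballpark only; the function name is made up for illustration:

```python
import math


def prompts_until_exhausted(percent_per_prompt):
    """Whole prompts that fit in a quota if each one costs the same percentage."""
    return math.floor(100 / percent_per_prompt)


per_five_hours = prompts_until_exhausted(9)  # 9% of the 5-hour limit per prompt
per_week = prompts_until_exhausted(3)        # 3% of the weekly limit per prompt

print(per_five_hours)  # 11
print(per_week)        # 33
```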
Im not seeing that here
How many LOC u have?
typing for 1 hour now
@lean lark Do you need rescuing?
That's true if you let Codex orchestrate. But if you use the model that @boreal holly uses, where the sub-agent is spawned with a CLI executed by the manager, then you have full control over the sub-agent props.
If you get the default Codex model to execute CLI commands, you can use the Agents SDK (or any SDK) and completely control the entire process of hand-offs, sub-management, etc.
To be fair, it's a massive repo and the prompt is about mapping it out for pen testing so it was sliiiightly disingenuous, but still
Yes, my agents are peers. Idk how they do sub-agents, but if it's anything like how they do unified_exec=true then the main agent is polling the sub agents to see what they're up to. That can probably be a waste of tokens, especially if the main agent is eating all the tool calls the sub agents are doing
https://github.com/StuartAJ/Sweeper
The first two are the biggest token guzzlers
I believe the primary is waiting for events from the secondaries.
Do you just have one agent doing the reconnaissance or do you fan out the first 2 steps?
Fan out. 3 for pre-recon and 5 for recon
Nice! That does look pretty comprehensive. I could see it eating a lotta tokens
Yeah, it gobbles them up quickly but I've found it worth it. Tend to run it towards the end of the weekly limits but they reset early this week and ruined my schedule haha
Stu - good on you for your protective licensing.
So we are expecting coding agents that cost us 200 little dollars to digest 100k lines of code, make sense of it all in minutes, and cost us 1% of usage? not going to happen folks. that would take a development team months to get up to speed on, let alone run bug checking on it. have to be realistic and understanding on what ai is currently able to do... 200$ for what we get is a steal. even the 20$ plan is a steal. 200$ is what, 6 us dollars per day? an icecream costs 5$ .
do not type already, theres no way you read that that fast. slow down
digest what i said
Are we? Sounds like you're inferring and that's not what I've (intentionally) implied
It doesn't take very long to read 3 lines haha
you typed it yourself and here you are still typing too soon because you didnt read my full reply
read slower, and really think about the words i typed!
I'm not disagreeing - absolutely, we get a lot more than we pay for.
an icecream costs 5$
/me fondly remembers asking the local ice cream store why the cost went from $0.23 to $0.25.
But when I read your comment - regardless of the speed - I got the impression I was the 'we' you were talking about
And icecreams don't cost $5 here
Closer to £2
£3 if you want a flake
same, same lol
gumballs going from a nickel to a quarter was a fraud!
devastation
the current state of AI coding is unreal. People are already taking this for granted. ask 5.4 to estimate the cost and time a team of 2 or 4 devs would take to produce what we are producing, whatever it is you are coding, ask, it will blow you away
It would be nice if productivity increased without the expectation of increased workload. Instead it's do more and do more faster
i really do think AI will help a lot on this front, i dont worry about it anymore
the right people are behind these projects here in the usa
future is bright
I agree with cesium - my daily work pattern has radically changed just in the last few months. I'm doing things differently and producing differently. No newcomer is gonna produce what I do with vibing. The better we understand the tech the more we can do with it.
I am very eager also for oai to implement a remote control feature for codex...anthropic has this, and orchestrating coding, bugchecking, ALL OF GIT, while hiking, or jogging, talking to it through voice, is completely unreal and powerful
i created an iphone app start to finish while walking at the park. i loaded the newly coded app while still at the park, and i pushed it to all remotes while still at the park
weve dreamed of this and its now here
All that said, we're in this mode now where new techniques are pulling us in different directions. What to do about context, RAG, agent management ... these areas are evolving but I feel like we in the field are highly subject to the trend of the day which is changing too rapidly. I've been holding off on chasing trends, I'm OK with being leading-edge, not bleeding-edge. We gotta pick n choose our battles, spending precious time only on what seems most likely to be most productive at the moment.
rocket fuel
we have one huge thing going, i think the right people are leading these projects
What @glacial shadow describes is a sort of brass ring right now. There's a big disconnect between Codex and everything else. We can't start something in Codex and continue elsewhere - the "app" doesn't cut it. If we want the freedoms that cesium describes, we need to DIY with the Codex SDK or the Agents SDK plus connectors/MCP. I'm sure OpenAI is working in this direction. I'm OK to wait, but I feel like hands are tied right now and it gets frustrating. I'm anxious for more of what they've been offering.
https://developers.openai.com/codex/sdk
https://github.com/openai/openai-agents-js - https://github.com/openai/openai-agents-python
codex cli has been phenomenal here too, runs anywhere
I think Agents SDK is really only useful in a CI/CD pipeline. They provide a much more powerful codex app-server generate-ts and you can build a frontend of any kind with it. I went a different route, adding github.com/openai/codex as a submodule, checked out to the installed codex --version , and link my stuff directly to their app server code. But in any case I can vouch for the app server being the absolute best way to build a custom thing quickly
It took me ~2 days to build an MVP frontend for the app server, and minor tuning and adding features over a few weeks. Meanwhile the app-server is headless so if I wanna update the frontend while agents are working it doesn't interrupt their work. It's something you can build while they're already doing their thing which is just 🤯
Having forgotten the Agents SDK existed (amongst the plethora of tools we have available these days), I went back last night to review the current docs. It does support browser usage, streaming, human feedback, voice, images, and much more.
I was nearly salivating with thoughts about what can be done ... and then nearly crying as I thought about the lack of time. That happens a lot these days. 😢
I don't know why I just opened this chat. Maybe I wanted to complain about Codex's stinginess (I already used 50% of weekly limit), but everyone already has done that here
Why is my usage limit in the CLI horrible?
My 5 hour goes down in at most 40 minutes
Definitely not normal
Anyone getting upstream connect error or disconnect/reset before headers. retried and the latest reset reason: remote connection failure, transport failure reason: delayed connect error: Connection refused ?
Hmm have over 90% remaining
Massive growth leads to massive losses (in the current pricing structure). As more and more people are using the service (whether it's Codex or Claude Code), the more the companies are 'losing'.
I think what we're feeling now are the companies adjusting to ensuring the increased usage doesn't cause outages. And yes, before anyone says it, we are seeing "Model capacity reached" messages but if you input your prompt straight after that, you still get the output you wanted
With mythos and spud, I'm interested to see how the pricing/capacity reacts. Especially since the hype around them is that they are wildly powerful
Get Pro
People experience the same with pro
No man, thats spam and disinformation, it is so hard to run out of pro usage
complainers arent builders apparently
What?
You can burn through it if you want to, its easy. But i also don't have any confusion about usage, it feels about right
personal take, but I tend to find that Codex CLI appears to consume the context window much faster when compared to some other third party harnesses out there. I suspect that's part of the reason why usage is appearing to drain faster, as it's consuming input tokens quicker as a result
It's just 5.4 eats tokens, 5.3 lower token consumption + 2x promo made it seem a lot bigger.
yeah the 2x promo ending definitely made it more noticeable as well
I'm using 5.4 via Copilot CLI using BYOK, and it seems to be draining my sub's quota more slowly since then
again, personal take, other users might have different experiences
I have zero trust in MS doing anything correct especially Copilot
each to their own 😂
Love using codex on xhigh
code being produced here, in totality, barely any bugs, fast, couldnt be happier
the cybersecurity considerations are super, greatest technology ive ever used
very thankful to the usage limit im given!
Amen
which codex is best for large c++ projects? 5.4 or mini or 5.3-codex and what "reason-level"?
5.4 medium and 5.4-mini high
mini high is a really nice model
I didnt use for the first while, then when i tried it does plenty of work well.
i use maximum overdrive allthe time, xhigh
The key to not running out of usage limit is not making your code monolithic
go ahead and ask your agent
"read my entire codebase and tell me which parts are monolithic, badly designed code that forces a coding agent to read the entire codebase and work inefficiently"
code review changed in codex app, doesnt start with a new context anymore
I wanna start seeing more medical breakthroughs driven by big models like Mythos & SPUD or something rather than like 'FIXED A BUG'
why dont you do it
Great question
I didnt fact check it but apparently some dude synthesized a cure for his dog's cancer that is now at harvard or something for trial
Im sure all of the experts in the different fields have their purpose built models coming and currently use the ones available.
We just dont see it because they dont hang out in the coding channels
some of it is literally book + whiteboard/pen paper, and thinking very hard for a long time, completely agree with your comment
There was this X message yesterday by @sama that "To celebrate 3 million weekly codex users, we are resetting usage limits". I swear this has not manifested on my end of the tubes. I am using codex with a ChatGPT Business (formerly: Teams) account. Did anyone receive a usage limit reset yesterday?
Has anyone else worked a lot with both codex and cc and strongly prefer codex like me, which seems to be opposite a lot of anecdotal evidence? Not sure what it could be, it interprets my agents.md file better? Or its better suited to the repo I've been working with for a year plus (TS monorepo with server/client/shared) idk
does anyone know when 5.4 codex comes out?
I find they are very similar and greatly appreciate the option of having affordable access to both. each could easily be 1000-1500 per month. easy.
very much agree they are both very good. Just like codex more though, just a better product, better communicator, better sr dev I fire every 10 minutes
thats a nice touch
chatgpt 2 coming soon
Sample a random person in a city, you will find most "Oh yeah I heard about ai, used it once, gave a really long answer, havent used it since"
depends on the demographic. every single student at school or uni uses it all the time.
In codex (cli), how can I type a newline in my prompt? When I press Enter it ofc submits the prompt. Shift-enter also submits!
I think it must be possible to type a newline though, because when I paste text containing newlines, these appear correctly. But how to type them?
anyone know the what the difference in usage is between medium/high/xhigh? and also the overall effectiveness?
usage has a big jump on high -> xhigh with much less than proportional benefit.
There are graphs around that show it.
high seems to be the most effective, medium is cheaper when it works though
iirc xhigh actually performs worse than high in a lot of situations
where's the graph?
does that show small-medium-high-xhigh?
yes
on the gpt 5.4 graph
cost is in order from left to right
you can see
medium and high have about the same accuracy
but medium costs a lot less
for having NEARLY the same accuracy
so
basically
either use gpt 5.4 xhigh, or medium
or use gpt 5.4 mini on high
ah I was thinking of 5.2's high mode
5.2 in general was kinda weird when it came to benchmarking tool calls lol
yes
5.2 is very specifically good at long form tasks though
it is not very intelligent
but it sure does take its time to gather info
which can be useful
I hate that openai refuses to zero their y-axis
its so that they can pretend there is large improvement when there isnt
exaggerates what would be negligible differences a lot more
I mean that is clearly a large improvement, but they should have used 0% as the baseline
yeah see they did do it here
granted 5.2 was much closer to 0% but still
There won't be a 5.4-codex. This time they went in the opposite direction. Before, they tuned the general-purpose 5.3 model for coding, and gave us 5.3-codex. This time they tuned the model update to also be good with code, and published it as the general-purpose 5.4. So we are using the same 5.4 in ChatGPT and Codex. It's elegant. I wouldn't recommend this as a permanent go-to solution but for now I think it was a great move.
it makes sense considering the demands of chatgpt are getting much more similar to codex models
there will prolly be 5.5 soon anyways
lowkirk
openai needs to cook something to pretend to compete with claude mythos
having separately RLed models that are both mostly running bash commands doesn't make a whole lot of sense
in theory they have whatever the big model that was used to distill 5.2/5.3/5.4
I don't see much of a point in openai competing with something that is so stupidly expensive and closed off though
Well
mythos is I think 10x more expensive per input/output token than 5.4 which is wild
Gpt 5.4 pro I assume is their fat model
Pricing is not released yet
And leaks about these things are notoriously inaccurate
No but the point is oai needs a new massive model to generate QA pairs to train their models
they released pricing for glasswing
which is presumably somewhat subsidized so it is probably lower than what the released model will cost
Claude Mythos Preview will be available to participants of Project Glasswing at $25/$125 per million input/output tokens accessible via the Claude API, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry. Anthropic has committed $100M in model usage credits to cover Project Glasswing and additional participants throughout the research preview.
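To put those per-million-token rates in concrete terms, here is a minimal cost sketch. The $25/$125 rates come from the quoted announcement above; the token counts in the example call are made-up placeholders, not measurements from any real session:

```python
# Rates from the announcement quoted above (USD per million tokens).
INPUT_USD_PER_MTOK = 25.0
OUTPUT_USD_PER_MTOK = 125.0


def request_cost_usd(input_tokens, output_tokens):
    """Cost of one request at the quoted input/output rates."""
    input_cost = input_tokens * INPUT_USD_PER_MTOK / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MTOK / 1_000_000
    return input_cost + output_cost


# Hypothetical agentic coding turn: 200k input tokens, 20k output tokens.
cost = request_cost_usd(200_000, 20_000)
print(cost)  # 7.5
```

At these rates a single large-context turn like the hypothetical one above lands at several dollars, which is the scale people in the chat are reacting to.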
Or maybe not subsidized because they are giving away 100 million dollars
Or maybe costs more than it will because they are using an unoptimized unquantized version
it's a really big model so I wouldn't be surprised if the cost of inference was about that high without anthropic's usual API tax
which means the release version would be more expensive to subsidize subscriptions
I cant wait till anthropic distills mythos
they're distilling it into an opus model iirc
Or maybe vera rubin will make it cheap
won't be as capable though
Ofc
my guess is mythos is never coming to subscriptions unfortunately
they're gonna bill the crap out of governments and corporations to use it though
Have I gone redarded, does codex not respect project specific config tomls?
if you use codex you have tonight to use up your limits :)
nvm forgot to restart my app-server after i made changes
Does anyone have any additional information on the coming cost changes coming for plus users? Will we still have a 20$ plan or will we be forced on a credit based pay as you go model?
gpt 5.4 MINI I NEVER HEARD OF IT
I’m finding it very hard to figure out the prices
source for upcoming cost changes for plus users
Haha, Reddit doomscrolling and the business plan changes to usage .. let me find a link
the business plan changing has nothing to do with the plus plan
the business plan
is made
for businesses
But they state “until we move you to the new token based pricing for codex” in the rate card article
Inside the grey legacy rate card box
It looks like a jump from 34 credits in 5.4 to 62.5?
Or 6250 depending on how American you are in reading numbers
thats
for
credits
and the legacy rate card
is for
prompts
the new rate card
is for
megatokens
did u even read the article twin
Buddy I don’t understand the article 🙁
I’ve just been used to a fixed price, fixed usage
If all of a sudden my price will vary, I will have to explain that to a boss somehow
And also, it tells you how to convert tokens to credits, but there is no mention of what one credit is in dollars, so the whole thing is just super vague
No, their document said they are but not the cyber one
mythos is the cyber one
Yeah honestly no one will be using mythos to actually program. The concept is nice for cybersecurity for companies but there's few companies who will pay for it.
With Claude Code Max currently you can finish your usage in like 4-5 prompts, imagine a model 10x more expensive
They will distill other less expensive models from it. I guess the idea is over time compute gets cheaper, and they find optimisations making bigger models cheaper to run.
Yes, I know. They said they are making a public one. All of this information is freely available. This is in their documentation on their website. It’s public info.
that'd be opus
I think it's $0.04 don't quote me on that though
Last time i was on claude about 8 weeks ago it was like that it could do a tonne of work on max plan
opus is the previous big daddy model
so should i keep using 5.3?
I’m not worrying about this because the people behind codex and the models are genuinely good people
No, 5.4 is right there, assuming you've got the current version.
Oh my sweet summer child
Open codex and you should see the 5.4 model.
I think he's just under the impression 5.3-Codex is better because it has the 'Codex' suffix, and 5.4 to him is just a general model to ask the air frying directions for a bag of crinkle cut fries
Update codex
Get on 5.4 it’s great, you always want to be using the latest models so that you are implementing the latest technologies
5.3 works better for me, it doesnt over engineer as much, it follows all the way to the concrete implementations and it doesnt make as many assumptions.
im hoping i wake up tomorrow to Gpt-10^99
Anyone else unable to install on Windows Server 2022 Datacenter? This is an AWS Workspace. It just keeps looping saying that it can't install and sends me back to the microsoft store. I've tried running as Administrator but it does the same thing.
The microsoft store is always the worst thing in the world for me, it rarely works, and when it does it downloads so slow
try using winget
Try installing from terminal , less moving parts
mm
yep i do
ye
it makes me confused
Just use what works for you
people are saying that gpt 5.4 has later technologies which is true
but it’s a general ai
which means that it probably has a higher chance of hallucination
but
To be clear, there will never, ever, be another 'Codex' model
It’s better at reasoning and therefore coding
5.3 has less advanced techniques but it focuses better
rate 5.3 and 5.4 hallucination
1 being no hallucination
5 being lots
5.4 makes more assumptions
can be useful in some situations too
I use 5.4 because I want the best reasoning , best coding, and I use it on high or xhigh, and sometimes /fast on
what plan?
It is great at coding, everything git, general ai,
are you using?
It's not useful when one more step into the code would remove the assumption
mm
I will definitely never ever ever listen to people telling me to use older models, to me that is Spam and intentionally bad information
lowkey 5.3 not bad either right
Just using brain dead ideas of newer is better in every case is naive at best.
5.3 was great. 5.4 is better
Use what works best for you
mm
They call it 5.4 because it’s better than 5.3
Lol
They use it, so I will use it
Simple
When you don’t realise that 5.4 is intelligent, but also extremely lazy. Also absolutely horrid at frontend UI work.
Higher number ≠ better
an RTX 5060 is not better than say… a 4090
Your logic is silly, my friend
OpenAI's benchmarks, end of discussion choose what you enjoy
