#codex-discussions
I just tested out the new limits. It used 13% of my 5 hour limit in 1 message on extra high (worked for 13 minutes)
yes its garbage now ..
they screwed up something on the token usage
1 message, 123k tokens! that never happened before
so what happened is they screwed up something in the token usage itself .. limits are the same, but now after 1-2 messages it has no more cache .. might be a cache bug
100% a bug now
i don't have that issue anywhere else, not even in cc
lol ... it's compacting again
does anyone else have codex CLI keep changing permissions? like i have repeatedly set permissions to default and it repeatedly, seemingly at random, switches to some custom permission set. it sometimes is read-only and other times it just asks before every tool use. its insanely annoying
lol so it compacts, reads the files again, and uses a straight 136k tokens again ..
wow, i've looped through 500k tokens with 4 messages now
yah we got rug pulled by codex
not usable anymore
same stuff in cc consumes 5k tokens max
even if
Thats interesting is it trying to read like a huge file or something?
golang files
i mean it's the same files i've been working on for 7 days
Maybe its the tokenizer they messed up (since they changed usage from message to tokens iirc)
cc reads the same 6 files for 3-5k tokens max .. a modification within the session is 300 or something
it's the cache, i'm in a compacting loop
it keeps reading the files, then i'm halfway in at 136k tokens .. 5-6 files .. does something .. 75% .. and then it's compacting ..
wow ..
token usage is insane
Recently my VSCode Codex diffs (in the Codex chat window and VSCode Codex Diff editor window) have lost their background colors (red for deleted lines, green for added lines), and instead just display the line numbers in red/green with an accompanying (always green) changebar. The source lines themselves use standard formatting. So it is very difficult to actually pick out the changes. I don't know if this is a VSCode or Codex problem, but it only started a few days ago and I'm sure both VSCode (1.115.0) and Codex (26.406.31014) extension have changed in that timeframe. Is there a VSCode/Codex setting that controls this, or is this a recognized problem that (hopefully) is already being worked on.
Does anyone else see this behavior or is it just me?
this is bad ..
yes, it used 70% of my limits, it burns through my tokens like no other model
the 100 usd plan will burn through it at the same rate
The plus sub does seem to burn through the 5 hour limit faster now, but the weekly feels fine
check the token usage there is a bug. it feels like the cache is disabled
It did feel bugged to me but I just figured the $20 sub was cut back for codex usage.
Not gonna lie I think it might just be on your side (based on the error description I don’t think I’ve seen anyone else speaking about it today)
The 5 hour sessions being super short though is a problem everyone’s having
check the token usage
i literally monitored it
```
│ 5h limit: [█████░░░░░░░░░░░░░░░] 25% left (resets 08:34) │
│ Weekly limit: [██████████████████░░] 88% left (resets 03:34 on 17 Apr)
```
5 prompts
unusable ..
trying to get chinese models now
run the checker script and see whats eating your tokens
I haven't had it compact a single time today (83% context left after 2 prompts)
where can i find that?
i was in a compact loop
https://pastebin.com/bn7Zn1Hr
SINCE_DAYS=7 python3 scripts/analyze-codex-tokens.py
comes out looking like ^
moment checking
be curious to see what it says
I just tested it and the adjustments to the limits are already working. The 5-hour limit was reached quickly... it's the first time I've hit the 5-hour limit on the Codex.
wow so it always burned through tokens .. something is off .. Top sessions:
- Apr 3 — 175.8M tokens
- Apr 8 — 158.0M tokens
- Apr 3 — 12.3M tokens
- Apr 3 — 0.8M tokens
Whats the blurb at the very bottom say?
```
- Context replay dominates output: input/output is 220.3x and cached-input/output is 199.0x across the whole report window.
- A small number of sessions dominate spend: the top 5 non-subagent sessions account for 346,814,695 tokens (100.0% of total usage).
- The heaviest static instruction payload observed was in `019d532a-8791-7223-afaf-86dc6a728168` (cantwin) at 19,197 characters, or about ~4,799 tokens before any repo/file context was added.
```
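For anyone who wants to sanity-check figures like "input/output is 220.3x" by hand, here is a minimal sketch of how such ratios could be computed from per-session totals. The `SessionTotals` record, its field names, and the sample numbers are made up for illustration; the real script and Codex's session-log format are not shown in this chat, so treat all of it as an assumption.

```python
from dataclasses import dataclass


@dataclass
class SessionTotals:
    # Hypothetical per-session totals; these field names are assumptions,
    # not Codex's actual log schema.
    name: str
    input_tokens: int
    cached_input_tokens: int
    output_tokens: int


def replay_ratios(sessions):
    """Return (input/output, cached-input/output) summed across all sessions."""
    total_in = sum(s.input_tokens for s in sessions)
    total_cached = sum(s.cached_input_tokens for s in sessions)
    total_out = sum(s.output_tokens for s in sessions)
    if total_out == 0:
        raise ValueError("no output tokens recorded")
    return total_in / total_out, total_cached / total_out


def top_share(sessions, n=5):
    """Fraction of all (input + output) tokens spent by the n heaviest sessions."""
    totals = sorted((s.input_tokens + s.output_tokens for s in sessions), reverse=True)
    grand = sum(totals)
    return sum(totals[:n]) / grand if grand else 0.0


# Made-up sample data, only to show the shape of the calculation.
sample = [
    SessionTotals("a", input_tokens=2_000_000, cached_input_tokens=1_800_000, output_tokens=10_000),
    SessionTotals("b", input_tokens=200_000, cached_input_tokens=150_000, output_tokens=10_000),
]
in_ratio, cached_ratio = replay_ratios(sample)
print(f"input/output: {in_ratio:.1f}x, cached-input/output: {cached_ratio:.1f}x")
```

A very high input/output ratio on its own only says that a lot of context is being re-sent per unit of output; whether that is a bug or just long sessions needs the per-session breakdown.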
But between Apr 3 and Apr 8 the 5 hour limit is like instantaneous?
let me check
wait iam on berlin time (bali)
oh
got confused it cant find 9-10
lemme dig in
i think one session has been compacting since the 8th, so yes
i've been using the same session that's been compacting since the 8th ..
I confirmed this is a change from 26.325.31654, which shows the expected diff formatting.
let me get all infos now
- Context replay dominates output: input/output is 220.3x and cached-input/output is 199.0x across the whole report window.
This is good but also inefficient.
High cached-input ratio (199×) = ✅ efficient reuse (cheaper + faster than recomputing), which means caching is working as intended.
The input/output ratio (220.3×) is very high, and usually signals inefficiency:
- You’re sending way too much context per request
- The model is doing a lot of “reading” vs “thinking”
Symptoms:
- Higher latency
- Higher cost
- Diminishing returns on quality
according to chatgpt
something is off because it didn't compact after 4 messages yesterday. It was always super smooth
will read the report, one moment
it could very well be bugged too
you're like the 100th person today to say the 5hr limit got smoked
im on the $250 plan so im not sure
so fast it was scary ..
i personally havent noticed
well pro limit feels like an endless bucket today, i think its on double usage
they try to push everyone out from the 20 to the 100 i think.
yeah seems like it
realistically with the 10x rate increase, if you can afford it, even if it's just during the promo, it's really worth it imo
yeah
accept the blessing bro
i would go there but not if its like the old 20
im pretty sure they said its 10x usage limits on the $100 plan
till may 31st or something
then.. get rug pulled haha.. can't there be something stable? it's a headache, always changing models
I'd be more worried about the bug that your codex has than the plan cost.
I'm on the $200 plan right now and it feels basically unlimited, I'm running two open claws, multiple projects all the time, etc.
And barely getting close to 10% of weekly right now
not in the great world of technology
yes that is the point .. would go with the 100 .. was waiting for that. I mentioned that they should create a 100 plan ..
Yeah likewise, i think it's double usage again
@kind jay already confirmed double usage for the pro plan
4 files 2 searches 50k .. (new context window)
now exploring .. 6 files 100k tokens ..
8 files 111k, i literally told it to please check the git changes to see if we have any issues in those changes.
yes that is totally garbage
how is it on your 200k plan @frosty zealot related to the context window
What do you mean
sorry i mean 200 usd plan
He means if you're having context issues probably
This is my same experience
I'm in the same boat as the other $250 plan users, I feel unlimited right now
Do you have context window limitations ? Here i made 1 prompt running over 8 files from git changes ..lol
that is not normal
How many MCP's/Skills do you have?
and how many AGENTS.md's
and how many LoC in them
same as yesterday lemme check 0 skills, 1 agents 136 lines, 1 rule 59 lines... max
Oh by the way I wanted to know from someone who's tested this more than me.
Do plugins take up context or are they only called when you / them?
I put in a bunch of them in case they would ever be useful but I forgot to ask if they were context rotting my prompts
i have 0 mcp ..
is this like a fresh prompt or days or what?
did a fresh one on target files (git changes)
how big are the changes though like whats git diff --shortstat read
that command won't work, but it changed 7 files, +51 -54 .. not much at all
are they uncommitted changes?
yes
in main
if you have a dirty repo it should def print out the total LoC changed, weird
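One possible reason for an empty result: plain `git diff --shortstat` only compares the working tree against the index, so it prints nothing when every change is already staged (and untracked files never show up). A minimal sketch that diffs against `HEAD` instead and parses the summary line; the helper names are made up, and whether this matches the situation above is a guess.

```python
import re
import subprocess


def parse_shortstat(text):
    """Parse `git diff --shortstat` output into (files, insertions, deletions)."""
    files = re.search(r"(\d+) files? changed", text)
    ins = re.search(r"(\d+) insertions?\(\+\)", text)
    dels = re.search(r"(\d+) deletions?\(-\)", text)
    # git omits a part entirely when its count is zero, so default to 0.
    return tuple(int(m.group(1)) if m else 0 for m in (files, ins, dels))


def uncommitted_shortstat(repo_path="."):
    """Diff against HEAD so both staged and unstaged changes are counted."""
    result = subprocess.run(
        ["git", "-C", repo_path, "diff", "HEAD", "--shortstat"],
        capture_output=True, text=True, check=True,
    )
    return parse_shortstat(result.stdout)


# The change size reported in the chat: 7 files, +51 -54.
print(parse_shortstat(" 7 files changed, 51 insertions(+), 54 deletions(-)"))
```

Calling `uncommitted_shortstat()` inside the repo gives the same tuple for the live working tree.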
not sure
wow so cc has 1% usage on the same
Is it a visual bug or is it actually stopping you at 0%
it literally consumes usage almost in real time after every grep
But at 0% it's telling you to wait until reset?
yes
Inchresting, yeah def sounds like a bug, like I said earlier you're like the 100th person I seen say that today
LOL so yes cc 1% i let it run on it on fresh window same question
lmao
let me check how much token cc used for the same ..
But, theres always people complaining about usage like every day so I usually don't pay any heed to it
i never had issues, even the last time they put limits in it was fine, codex was always generous ..
every time i complain, like last time it was cc, i think eric saw me coming here .. anthropic made an announcement and they booted openclaw
i monitor my stuff extremely closely haha
cc seems balanced now since last time.
with 100 usd current week resets today on 53% ..
i suspect they drive out users from 20 usd to 100 usd .. then they adjust the limits from 100 usd back to the previous 20 usd limit from feb.
All you can really do is make a bug report https://discord.com/channels/974519864045756446/1070006915414900886 or open an issue on the repo unfortunately
Or upgrade 😛
will do that now. And yes i would like to upgrade but not getting rekt again haha. Wanted the 100 usd plan for a while anyway
I sorted the bug reports by codex and theres literally one post
so I think every one just reports the issues on the repo lol
no permission to post there
thx
did get a refund of one of my accounts .. nice .. will then get the 100 usd plan for one month until the rugpull comes in haha
You have been granted a full refund. Your active subscription has been cancelled. The refund will show up in your bank statement within 10 days.
Could one of you briefly explain to me which plan makes the most sense from plus to Pro 200?
I'm currently a Plus user
And I didn't quite understand what changed for Plus now that there is a Pro 100
If you use GPT Codex a lot and you use GPT Pro model more than a very little, ChatGPT Pro $100 looks much better than 5 Plus accounts.
It makes the business accounts look quite poor value based on the cost of additional usage. If you spend $100 on Business + extra codex or GPT Pro you seem to get quite a bit less than you would on the ChatGPT Pro 5x account for the same $100. You get workspace/admin/privacy/team features, but less Codex or GPT Pro use for the same $.
https://super.engineering/ the background on this website is sick
Anybody know what you would even call that? I might have to iterate on it 😛
Where I live it's currently:
- ChatGPT Plus for $20 + VAT
- ChatGPT Pro x5 for $100 including VAT
- ChatGPT Pro x10 for $200 + VAT
That makes x5 really good value. I don't know if it's the same in other places.
The limits are dropping crazy fast
That was for upgrading my plus account. For a new account, 5x is $118 and 10x is $200. Their pricing is deeply confusing.
If my limits are reached, would the session still continue generating or would it stop mid-generation?
how long can someone use codex in plus in 5 hrly limit?
That is basically an impossible question to answer there are so many variables
5 hr at most
lmao
after 5 hours it resets so thats the max
there is a mcp on claude what will get you a clone 🙂
so for token waste .. i found something a user told me about on reddit: handoff.md
you create a rule that the agent should always write a handoff
and i found this what you think ? https://github.com/rohitg00/agentmemory
since i started using handoff and codex 5.3 .. i did way more work and used only 75k tokens .. and 8% of the daily .. but it still sucks
i always feel like those kind of things carry baggage that isnt needed for the current task
agree .. this time it helped a little bit
I’m not sure
ok i think i found the bug .. compacting instantly drains your tokens ..
so never compact, open a new session instead ..
before i was like 1 hour in and still under 50% .. now within 3 min the new session is down 43% and it also drains the 5h ..
because compact will pass some important detail to next context window.
but way more than yesterday, way more ..
And probably because cached input tokens are ultimately cheaper
... I'm not surprised by any problem, because I already had a billing issue while using some credit (a wrong auto-recharge of the credit) and had to contact support for a refund later, so just let's go, haha
No, it’s night
35s · ↓ 19 tokens - dream usage
Skill issue
Why is codex now completely unusable? The 5h usage is draining like a leaky sieve
The 2x has made it LOVELY
didn't they literally just reset the usage and give 2x until may 31st? I don't understand the usage complaint, go try claude code and see your usage absolutely gone in a blink. the number of people complaining about usage is like 10x there
The 2x is only on the pro plans, to my understanding. So plus users won't be seeing the benefit
anyone here tried the 100 dollar option, how is the 5x limit?
Because usage is now half what it used to be?
ahhh okay that's unfortunate for them then, thx for info
I paid for this product, the product has now materially changed
The absolute lack of clarity on ai billing is absolutely criminal
haha
trust me when i say things have been worse in claude code land these past 2 weeks, hence why i am here now
yes lol
sam style
still ?
tbh openai does what they want, they change models on the browser like new clothes often and codex is just an extra to the chatgpt sub so they can do that to codex
they gave free quota resets often too
yeahh like the past 3-4 days everyone's hitting their weekly usage for the $200 plan by like midweek and also the quality of outputs is noticeably worse, they probably have a really high load on their GPUs for other "projects", although some people are saying today is a bit better. I haven't tried, still waiting for my weekly limit to reset on claude code
they get free bail out too soon or later
hook up the world .. abuse vc money .. get tax money bail out
he try to get bank status ..
But I'm not paying for claude? This has nothing to do with Claude
i was chatting with the other dude
thats crazy
Its a bit of a wild west atm
“Usage limit, usage limit”, break out the wallet and get that pro subscription
And what when they suddenly kneecap the pro sub limits?
true, but tbh the whole ai thing is wild west and quite jumpy, they had issues with ai changing personality often so they couldnt control that and nerfed the models a lot, looks like they cant control much of the stuff lol
soon or later first KPI is offload 20 usd into 100 usd
Buy another pro account
i will say i tried plus plan for a bit and i burned through it pretty quick but pro plan feels unlimited or it's really hard to bring a few percentage down for normal use
ngl it was draining yesterday a lot more the 5h cap, not sure what happened but looks like its draining much less now
im on plus
Yeah but what happens when that isn’t true anymore?
cry in corner or load up gemma 4 or smth not sure
Exactly, so keep throwing more money and get less
Or learn to code, idk
never could have imagined that some evil corp would now control my coding .. i could never go back to line-by-line coding
I paid £200 and in the past 30 days have used around £3700 worth of tokens. I'm pretty happy with the pricing
i have 0 cc issues .. damn, it's always cc <-> codex, either one is the black sheep .. that's why i like competition
healthy
meta just got a half decent model as well, not sure they are going to try and move forward with coding though
Can’t imagine they’ll focus on code. I guess they’ll want to use it to start processing the vast quantities of user data they hold as well as generating content for their dying platforms 😂
Wasn't 1M the context size limit for Codex in VS Code? Seems like I can set arbitrary number, although I am not sure if that's really used or not.
there was a 1mill ..??
I thought 1M is available to codex cli as long as you set the flag / limits for it? but I haven't bothered changing the default
aaaah so the old had 1 mill ?? in codex
5.4 has 1 mill i think does 5.3? They are terrible apparently
In Codex and the API, GPT‑5.4 is the first general-purpose model we’ve released with native, state-of-the-art computer-use capabilities, enabling agents to operate computers and carry out complex workflows across applications. It supports up to 1M tokens of context, allowing agents to plan, execute, and verify tasks across long horizons.
thats crazy ..
so they swapped it out of plus up to 100/200 .. and i get hit with non stop compact
i guess 100 usd is the way now ?!
I'd be careful with context rot though, even with claude opus that has way better needle-in-a-haystack benchmark scores, i still try to keep my usage at 300k-500k max, don't want to risk the quality of the outputs, so for codex i've been sticking to the default 200k-ish
In VS Code you can set the context size via config.toml, but to my surprise it - at least superficially - accepts values larger than 1M. Don't get your hopes up too much, once you close in to 500K the chat starts misbehaving and once you go towards 700K it can become senile/lobotomized.
500k is fine
so in config.toml we can set the limits ..
also for codex in cli ?
This is my current project when I start a new chat prompt and tell it to read in all necessary instruction files.
I don't use CLI (and don't even know how), so I can only speak for VS Code.
i use the store app
not sure i did not try but i thought i saw docs / comments about it. But someone correct me if i am wrong
lemme figure that out bc the compacting is killing it for me
you figured ?
nah still havent tried
Question remains, why can we set sizes larger than 1M and do these actually exist in reality then?
The 1M context window is still in beta testing... so use at your own risk. It could perform worse than the default context.
If you want to try, just start with 500K context.
Yes, but the question was about being able to set sizes much larger than 1M. Why is that and is that real?
extra context is sent to the shadow realm or gpu servers on the darkside of the moon
its about training window size
more context = more of the project scanned, but it might give incorrect results when you edit the whole project at once instead of splitting it by milestone.
What does this mean?
when they feed the model during training the size of the input in tokens is the possible size of the context you can use.
It doesn't make sense to use context so large
But there is degradation at higher context, so its not like you can just train on 10 million and have it just work
So context size can be arbitrarily set above the 1M limit and that is not surprising behavior? And the context size is real then, not just a GUI confusion?
So with the default context, the agent automatically splits the work into milestones that fit the current context size. A bigger size doesn't mean you can finish the project in one prompt; it just lets the AI learn the relationships within your project.
it cant use above what it was trained on
I mean there is some ways to do it, like some context extending methods
I cant remember what its called, but they have to support it.
does steering prompts make the model focus on the new task and forget the old task?
it feels like its forgetting the old task for some reason and focuses on new task completely.
where do you set it in gui?
config.toml
And the GUI then displays a size value corresponding to what is set in config.toml.
you can set it to abcd if you like
I'm not sure what i am looking at
Ok you mean how it says 5273K?
Yes.
I see what you mean now
And my original question was: Wasn't 1M the context size limit for Codex in VS Code? Seems like I can set arbitrary number, although I am not sure if that's really used or not.
It won't use over the 1 million - assuming 1 million is its limit.
Both the VS Code tooltip and /status appear to read the same runtime session field. Your config sets model_context_window = 5550000 in config.toml, and your current session started with model_context_window: 5272500 in the session log. That is the exact 5,273K shown in the GUI, so the UI is definitely using it.
Inference from the local logs plus the official docs: this is a real client-side budget value that Codex uses for status/accounting, but it does not change the backend model’s actual hard context limit. For gpt-5.4, the official model page still says 1,050,000 tokens: GPT-5.4. The tooltip text “Codex automatically compacts its context” is generic; it does not mean the model truly has 5.273M usable context.
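The numbers in that message can be checked with a few lines of arithmetic. A minimal sketch: the three constants are the values quoted above, while `usable_context` is a made-up helper that encodes the assumption that the backend limit wins whenever the client-side budget is larger. The 5% gap between configured and session values is just what the two numbers work out to; whether Codex deliberately reserves that headroom is a guess.

```python
CONFIGURED_WINDOW = 5_550_000  # model_context_window set in config.toml (quoted above)
SESSION_WINDOW = 5_272_500     # model_context_window seen in the session log (quoted above)
MODEL_HARD_LIMIT = 1_050_000   # gpt-5.4 context limit cited from the model page (quoted above)

# How much of the configured value shows up as the session budget.
ratio = SESSION_WINDOW / CONFIGURED_WINDOW
print(f"session/configured = {ratio:.2f}")


def usable_context(client_budget, hard_limit):
    """Assumption: the backend limit applies whenever the client budget exceeds it."""
    return min(client_budget, hard_limit)


print(usable_context(SESSION_WINDOW, MODEL_HARD_LIMIT))
```

Under that assumption, any value above the model's limit changes what the GUI displays and when compaction triggers, not what the model can actually attend to.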
Has anyone noticed if with the recent update of the OpenAI accounts, the rate limits per 5h has decreased? Mine appear to be decreasing a lot faster
they already said it is so
halved pretty much
although they seem to be nice in the sense that if you have a long running task they will let it finish instead of crashing it in the middle when the usage limit hits (unlike a certain orange competitor)
plus sub now has the limits of free version and 100usd sub is the new plus
how to upsell 100 million customers
That has been happening since a month ago: https://github.com/openai/codex/issues/14593
The 5hr limit decreased but the weekly limit is still the same. This prevents spamming massive usage across multiple accounts during the $100 plan campaign.
oh?
that makes no sense what
There is no plan to make codex available for Intel Macs is there
nope, Intel Macs are done
Yo anybody using Hermes? If so how does it fit your workflow and what model do you use?
so sad that they once again forgot about business plan users with the latest rate limit reset…
2x usage only for pro lite and pro plans
not for plus
cli better
Has anyone noticed that the usage of the plus plan has decreased significantly? I only used it for less than 2 hours, and the available amount of 5 hours is only 11% left.
You can sandbox codex to WSL so it isn't affected by command length limits (which isn't a simple process if you aren't familiar with Linux) or wait for OpenAI to fix the bug. Last I heard they were working on a workaround.
I've heard good things from people who spend the time setting it up to work in WSL, but as someone who works in a windows dev-env using a lot of windows-only code, I haven't spent time trying it yet.
YES YES. plus sub now has the limits of free version and 100usd sub is the new plus.
how to upsell 100 million
i say no to linux
so they just do this for the 100usd pro user
they do what?
Cut the plus plan’s limit
yes. they do
that’s ridiculous, I already run into 0 for 1.5h
I love how on Windows literally everything is a workaround
Well, OpenAI seems to be a mac-first dev company, for whatever reason
It's a total mystery
@boreal holly I mean companies are able to make those decisions. IE we made the decision to not support Windows
think every company is mac first for the most part
i got a mac just for coding, so if you're a dev get a mac, you can get one for like 800 bucks
I just received this email. But when I look at my credits I don't see the one-time credit boost they mentioned. Anyone got the boost?
hier
Hello
I would be skeptical
I would check the sender domain. My guess is if you load remote content (e.g. images), one of the things that gets loaded is a "did this person open this email?" (did this image get loaded) and they'll know they reached a real person, at which point they'll send you phishing attempts.
I have a personal Plus account and I did not receive that email
"hier", is that real email ???
hi guys!
because mac is superior
also why am i constantly getting stream disconnected before completion: websocket closed by server before response.completed , this didnt happen before now its almost all the time
2 spelling mistakes, probably scam
but with 5x or 20x we also have gpt 5.4 pro?
Mac is a staple in any tech company in the bay area, unless you work remotely for big enterprises, like the day my soul died when i received a few year old HP laptop for my internship for a fortune 500 company (actually probably closer to top 10/20) because big enterprise was still stuck with cisco webex (couple years ago) when everyone is using zoom/google meets or at least MS teams
Cisco 🤮
I have not-fun-memories of Packet Tracer, and I condemn Cisco
everyone else on plus is getting crazy high 5 hour usage movement last few days right? I think the party's over sadly
are the steer and queue buttons currently swapped? enter and tab enter seem to be inconsistent, sometimes a 'queued' prompt won't actually wait for the turn to finish. also, would be a good idea if you could queue a separate thread so that it starts as soon as there are no active threads, i could queue a build and then a fresh context review/cleanup.
So I was on the free plan and reached the rate limit for 5h/weekly (both were the same) and it said to upgrade to plus to use more. I upgraded to plus and I still have no usage... what.
Maybe try logging out and logging in again? It’s probably that it thinks you’re still in free plan
I think that's fixed it, it took a few attempts but I think it's now just a visual error, saying 100% usage but I can send messages 😂
100% usage should mean you have 100% remaining
look what i managed to do
codex programmed a bridge that allowed a fully automatic randbats game (sd hosted locally) between two agents
pretty cool right
Assuming you're using VSCode and the extension:
Rather than doing a queue from the same side panel, you can F1 for "Codex: New Codex Agent". That opens a window that's just like the sidebar - with a new thread.
The window can be repositioned, and some might prefer it to the sidebar. In earlier versions we could open more than one window. A recent fix of another issue locked it to one for now.
The Tab/Enter thing has always been wonky. There's also the weirdness of the composerEnterBehavior setting.
There's zero chance something hasn't changed starting Sunday/Monday. On plus I'm at 0% 5 hour left with 3 hours remaining. No matter what, I would never have been there before this week, same work flow, same stuff - I'd be at 50%, at worst.
Party's over
@frosty zealot
How did you know I literally just woke up
I have access to your webcam
Inchresting
so is it cheaper or more expensive with it being tied to the api now?
weirdo
I’m not talking to you
"Rebalancing" I too will be rebalancing. Fun while it lasted, but I can't spare the extra cash for a hobby that's bringing me no income
Hello, I’m working on improving my setup and I’m looking for a way to inspect the actual context being sent to OpenAI.
Ideally, I’d like a tool (possibly with a web UI) that shows the full prompt/context, including things like system prompts, skills, MCP data, and any additional injected content. This would help a lot in understanding trade-offs (e.g. skills vs MCP, or when sub-agents are actually useful).
Does something like this already exist?
If not, I’m considering building it myself (likely without modifying Codex directly). Any suggestions, ideas, or prior art I should look into?
where are we supposed to send feedback on the codex gui app? cos the codex repo is for app server and cli
got the same email
no spelling mistakes
I guess he had it tailored to his language but not fully? lol
from
noreply@email.openai.com
also no "one time credit boost" yet
nothing in history
unless thats supposed to mean something else than actually getting more credits
but then I also switched to pro few weeks ago
👆 um, we're already here. 🤔
So, is everyone else getting hit by this reduced 5-hour limit too? Do we know whether this is permanent?
yes honeymoon is over
I havent hit any limits at all
It's OK, when you buy credits they work on the 5hr limit too
This technology is the greatest thing I've ever used
Anyone noticed that Codex Plus allowance went noticeably down even though "As a thank you for being a power user of Codex, we've provided a one-time credit boost that's available now in your account."? Consumes all the 5 hours worth of credits in like 10 minutes
Was thinking to go PRO but 10x / 20x this is still incredibly low
I literally give ONE prompt and then I run out
Oh I'm not the only one then
is it possible to use gpt to change stuff on a website, download files from the cloud etc., just like claude cowork?
or do i really have to pay Anthropic
What model are you using??
wait guys
I keep seeing people
say they use GPT 5.2 instead of 5.4
like for coding
why?
Extra high 5.4 mini is good imo
is it good?
Got 5.2? Not sure
Dang dude Codex is sloooow right now
I guess we're gonna get a reset soon when they hit 2 million active users or whatever lol
I find gpt 5.4 mini can do a lot on extra high reasoning while using few tokens
ohhhh
maybe this will help me
Did a bunch of backend work today and used like 5% of my weekly limit
Hourly took up I think 30%?
idk bro i type 1 prompt and poof
and also
i dont even have a 5 hour limit or whatever
why
Depends on your use tho because it can take a bit longer than other models
Really?
yea i only have weekly
Plus right?
I have plus and I got a 5 hour
Last night for the first time I got a 'This model is at capacity try another one' message
just people hyping the 100 plan, and with all the extra usage theres probably a lot of overlap which normally wouldnt happen
I've been using the Codex app on MacOS heavily and I am getting a bit frustrated by all the subtle and not so subtle bugs. Also, there seem to be bugs in the feedback system itself! Sometimes I get a failure to submit the report, sometimes I get the report-ID (good), and sometimes it seems to take the feedback, but there is no acknowledgment.
TL;DR: I wish the Codex team was on top of all the feedback and pumping out new versions to address basic issues. Things like 'Can't generate a commit message' in a lengthy thread.
"Command failed successfully" lol
- Check the GitHub repo for recognized issues.
- The team is very active there.
- Create a new issue there if you can't find it.
- The team does occasionally peek in here but this isn't an official vector to the company, so don't count on notes here to translate to product changes.
filing github issues is da way
you could even use codex itself to reproduce the issue
@frosty zealot lunch looks yummy
want some?
No
How many times higher limits does the plus plan include than the free plan?
For Codex? Free doesn't have Codex access. For ChatGPT access, around 15x.
I used Free Codex in VS Code once, and the limits were insufferable. It's only good for really small tasks
So compared to the ChatGPT free access, it's around 15x more right?
Give or take.
Thanks
np
"free doesn't have Codex access"
"so if I pay for it I get about 15x more?"
"give or take"
🤣
Don't let dreams be dreams
Everyone at some point should also consider using an API key. For hundredths of a penny you can process a lot of tokens. No subscription. No minimums.
If you’re lucky, sometimes you can find them on GH, as a little treat
When is codex 100 plan, the 200 is too steep
It’s out already
Released yesterday
Thank you
You can tell precisely when they changed from message-based to token-based 🤣
I wonder why Plus folks keep running into "1 prompt, 0% 5h" if that's the split
Wait a minute, it says the difference between 5x and 20x is twice as many messages
maybe its one really big message lol
Suing for false advertisement
Ayoo where the 80x plan at? I want twice as many messages as 20x
I wonder why code reviews cant be done by 5.4
Switch to GPT-5.4-mini for routine tasks. Using the mini model should extend your local-message usage limits by roughly 2.5x to 3.3x, depending on the model you switch from.
I wonder what they consider a 'routine task'
umm its always been token based... gh copilot uses messages...
Negative
As of April 2nd, we’re moving pricing to API token-based rates. Credits remain the core pricing unit that customers purchase and consume, but usage is based on tokens consumed, calculated as credits per million input tokens, cached input tokens and output tokens your workspace consumes. Read about tokens here.
This format replaces average per-message estimates for your plan with a direct mapping between token usage and credits. It is most useful when you want a clearer view of how input, cached input, and output affect credit consumption.
Under this model, actual credit usage depends on the mix of input, cached input, and output tokens in each task. The new rate card is displayed in the table below, and is currently applicable to new and existing Business customers, and new Enterprise customers.
New and existing customers on all other plan types should continue to use the previous message based rate card, until we migrate you to the new rates in the upcoming weeks.
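To make the quoted pricing note a bit more concrete, here is a rough sketch of how a credits-per-million-tokens rate card turns into the cost of one task. The rates are made-up placeholders, not OpenAI's actual rate card, and `RateCard` / `credits_for_task` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateCard:
    """Credits charged per one million tokens (placeholder values, not real rates)."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def credits_for_task(rates: RateCard, input_tokens: int,
                     cached_input_tokens: int, output_tokens: int) -> float:
    """Return credits consumed by one task.

    `input_tokens` means uncached input here; cached input is billed separately.
    """
    return (
        input_tokens * rates.input_per_m
        + cached_input_tokens * rates.cached_input_per_m
        + output_tokens * rates.output_per_m
    ) / 1_000_000


# Hypothetical rate card for illustration only.
EXAMPLE_RATES = RateCard(input_per_m=50.0, cached_input_per_m=5.0, output_per_m=400.0)

# Same 130k-token context, once served mostly from cache and once not at all.
warm = credits_for_task(EXAMPLE_RATES, 10_000, 120_000, 4_000)
cold = credits_for_task(EXAMPLE_RATES, 130_000, 0, 4_000)
print(f"warm cache: {warm:.2f} credits, cold cache: {cold:.2f} credits")
```

With these placeholder numbers the cold-cache run costs about three times the warm one, which would be consistent with the "cache bug" complaints above if the cache really is being lost between messages.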
Okay thank you! I still would love to hear Roberts side <3 (they know more about what I do <3)
Hey @oak trellis how's it going
Did you manage to fix your issue with codex?
I'm honestly wondering what was going on with it hahahaha
You guys notice any showstoppers using Codex Hooks on v0.118.0?
The MCP tool layer is erroring my screen reader calls as if "speak the final response to the user" == "agent hallucinating to random external service", but I don't wanna spend time plugging it into hooks yet if they're too early-on and super jank or smth
prob.
and these people have the most advanced in-house AIs and still get attacked like this
no u can still use it just prob hit the weekly instantly
thats what i mean
if your gonna use it alot 20 plan isnt worth it
like how much is alot
i'm just doing side projects as hobby
like developing small to medium projects
not enterprise grade
I think they said weekly limit was unchanged, the only one they changed was 5 hour one, so you now get more sessions with less usage
1 prompt, waited 5 min, and 50% of the 5 hour usage is gone, and weekly usage went from 100% to 70% with that 1 prompt. I will definitely cancel my subscription after this move.
@placid cairn Yeah, it’s really like that. After just 3 prompts I got a notification that the 5-hour limit was reached. I was like “what the hell that’s impossible,” but it looks like the Plus plan has basically become the Free plan—just paid now 😄
Well two things happened.
The first is they stopped basing it off messages and started basing it off tokens consumed so I'm guessing shorter messages == less usage, longer messages === way more usage (since you can go up to like 50 turns without a problem).
The second is that Pro $200 changed from being 6x usage to 40x usage so they probably had to modify the base unit in some ways
same here brother
Claude is even worse right now, I have the $100 plan and I can finish a 5 hour session in 3-4 prompts (it's supposed to be about 6x $20 plan on 5 hour sessions and 8.33x on weekly limits)
I would try the $30 plan from copilot, I've heard that one is pretty generous right now, but besides that yeah GPT is probably still the best
Yeah np, if you do test out copilot tell me how it goes to see if I should stop suggesting it to people who are currently on GPT $20
how about google?
is gemini's plan better? like they give gemini-cli, antigravity etc.
Did they increase rate usage for Plus users? I'm somehow already approaching the weekly limit and I've only been using it for two or three days.
the token-based approach is causing this prob
everyone is complaining about this
do they have any plans to fix it? I have a feeling that they wont because they want people to upgrade
nah
they can't, because if they wanted to, why would they even switch to a token-based approach in the first place
ai coding agents problem is not enough computing power sadly
they always reducing usage limits to handle more users
Hey, codex has a 100 dollar plan now
Well theres nothing to "fix", they intended the change (unless they are somehow miscalculating something)
they were probably forced to
5x rate limits as plus, and now through may 31st I believe it’s 2x that so 10x rate limits as plus
message-based approach is literally not making any sense
i would be more open to buying tokens if it wasnt $40 for like, a few messages or whatever it is
try minimax 2.7
whats that
like 1/30 cost of tokens but it tries to mimic top tier models performance (people says its literally doing great work not as much as top-tier but great)
Yeah agreed, specially as agents become more capable we are going to be moving from 1 message -> 30 minutes of work to 1 message -> 8 hours of work
True, think they reduced plus
dumber than top-tier models by 20-25% but way cheaper
Sucks it was a major selling point but now they going full anthropic it seems
hm
if you dont need top-tier work its great to consider
i mean... im trying to use it to help speed up programming for game dev in Godot so
and you can also just make it do heavylifting and use your premium weekly limit to review it
idk how well itll fare with that
Codex becoming like Claude in limits is a nightmare hope it doesn’t happen
it can be mental blow on this specific work i think
it's optimized for general workflows, but using godot and programming a game is hard work even for codex and claude
I mean it works ok if I give it the documentation as context
but gemini is better at game programming and overall godot
but the harder game dev problems is kind of difficult to nail it down
how much is it?
gemini?
it's half price for first 2 months
but whats the price tho lol
it's full price is nearly 20$
I mean i use GPT for more than just coding so I kinda need the subscription for that too
i dont have a ton of money, GPT Plus is one of my only subscriptions
idk if i could do two of them simultaneously
yeah gemini is good at game developing and godot overall but not good at other things like asking questions, architecture etc.
but thats good to know regardless
Its not even close to anthropic drop in usage but I get what you're saying, I guess for me it just feels like everyone knew this was coming they're giving $4-5k usage on a $200 plan
gpt is overall better decision if you're having questions etc
ya ya always good to consider options
rip
is that a local model?
if you want to. but windsurf offers it as a cloud model with tons of billions of parameters
for free
for limited time ofc but it's still active
hm
not giving top-tier work but gets the job done
OH WAIT i forgot I have GitHub Copilot accessible too
thanks college lol
i can probably use that to help a bit
gh copilot is the worst in my opinion
but if it's doing the work for you, then it's up to you
idk ill see what happens
mainly bc the thing im adding is getting pretty complicated, idk if i could keep working on it on my own
current ai coding models are just try-and-see work model
lots of moving pieces
godot...
?
i thought you were talking about godot cuz i used it one time and it was hella complicated
i mean its the thing that im trying to get working IN godot thats complicated
oh
im basically trying to make a mario-maker styled game editor in 3D
currently tryng to fix collision detection problems with rotation
i mean its gotten me this far lol
yeah the problem always begins after that far
in my cases
I make things, it goes so good then suddenly my highly intelligent super developer ai model goes dumb
yeah without ai me too couldnt even make this project this far
i'm looking for good subscriptions but codex,claude everyone complaining
I dont mind having AI help with/do the coding bc there's only so many ways you can program something to have a certain outcome
i'm stuck with gemini
definitely dont want any Ai generated creative assets in my project tho. There are infinite ways to be creative.
Yeah ai is hella good for developers that don't wanna learn a whole language for a simple job
Like I was trying to do some project but it needs me to use another language in that job so i use ai to make it and it's done
I mean I mostly get the language, its just the "spending hours trying to find the exact configuration of features in the engine" part I want to skip lol
GDScript is a lot like python
yeah I was using kotlin and project made me learn rust/go instead i just asked ai and it made both of them seamlessly i tried one and picked best result, no need to learn all that for such small work
if its python-like actually minimax 2.7 can do a good job, cuz it's overly trained in python and optimized for it mostly
but the thinking phase, i pass that.
can it get context from txt files?
It's designed to be an agentic model like codex or claude
so it can take all type of context
I have godot's documentation downloaded as a 10MB txt
It can do it but i'm suspicious about it's applying phase
hm
even claude struggling to keep up with context files
it just forgets about it after some prompts
probably need a skill file or another approach
codex is good at context things
it doesnt forget anything you say
lucky guy
Can we see how many users codex has? I want to plan my usage for the 1m resets
maybe i should reduce the reasoning effort a bit when using codex
im assuming that reduces token usage
I would avoid that, if you really want to reduce usage there are better strategies
GPT 5.4 high has a high improvement from medium. (dont use xhigh though)
bro using xhigh and complaining about tokens
xhigh is like -> come and take my tokens
I didnt know they changed it 😭
Oh wow I'm actually flying through usage today on $200 plan, yesterday it felt unlimited
Xhigh was one of the only ones that felt like it would work half the time
what work are you doing bro i've done all my rust/go projects with medium and never had any problems
also this is what i was working on. basically working now
good work but it's not really looking like it needs xhigh
🤷♂️ idk when i was testing it early on it felt like xhigh was the only one that would get it done right and fast
maybe its about godot idk
in my rust/go projects it was doing everything right with medium
idk, if it works it works
and at first, xhigh was the one that worked
maybe the right approach is reducing thinking after milestones
like at first when doing first basic skeleton of project -> xhigh, then after modifying it a bit -> high after everything done and working editing it with medium
maybe if i do it like this i dont hit weekly limit that fast
handoff.md works actually for me never let it compact .. just use handoff.md .. design your own template .. works well saving tokens for me
never believed in that stuff but it works
What about an Android app for codex? I wanna get things done on the go!
I mainly use ChatGPT Plus for hobby programming, not for professional work, and the recent change from something that felt closer to per-message usage to token-based usage has made it much worse for me.
The problem is not that I am working on huge projects or doing anything extreme. This happens during normal everyday coding use. Even with regular prompts, ordinary back-and-forth, and standard coding help, I can hit the limit much faster than before and then get blocked by the 5-hour window.
That recent change is what frustrates me most. Before, it felt much more usable and predictable. Now, with token-based usage, even normal coding sessions feel cut short too quickly. For hobby programming, that seriously damages the value of Plus.
I am not willing to pay more just to keep casually programming under these new restrictions. For my use case, this recent change made the service feel much less worth the price.
has anyone found out a way to get rid of all the bloat code the models generate? i dont want the best practices i just want the smallest code that works. verbosity didnt fix it
keeps resetting the usage
whoops, we need to lower max daily/weekly usage
what model are you using? Medium / High / xHigh?
High will write it like professional production code.
If you want to use high, you need to prompt in detail for small code, e.g. say you don't need model/controller style.
Tried with all versions after 5. Used it with all except low. Don't think medium was better but didn't use it much
The cheap solution is to ask it to make the code as small as possible without breaking functionality
Ask for minimal indirection with as little abstraction as possible. Give it a limit of 3 for cyclomatic complexity unless it is unavoidable, limit nesting to a depth of 3. Have an architecture that you adhere to. Build a topical-critical-architecture-guardrails skill that the agent must read every time and build it with reference files. As you encounter common mistakes add them to the skill.
In agents.md or in config.toml developer_instructions add instructions:
- when planning searching or implementing it should step through relevant code until it reaches concrete implementations.
- For planning and implementation follow local coding conventions - find 2-3 local code exemplars that match the implementation target if possible.
This is short hand for what i do, but essentially this is how i get idiomatic code.
Or just use /plan mode with a small detail like "create feature ... with small code". That's the fast solution. Then check whether the agent planned a model-style structure, and change the detail there.
yes it does
Isn't Copilot AI?
I thought that Copilot is AI, like ChatGPT
Or Gemini
What's cyclomatic complexity?
Are you using codex in copilot ?
No I meant the vscode extension
oh ok
lots of conditions in a method - this is a question for codex 🤣
this one?
No I meant I used copilot and now try codex
a method with a cyclomatic complexity of 4 and nesting level of 4 is a real pain to work out what's going on when you look at it
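For anyone wondering what those two numbers mean in practice, here is a rough sketch that counts them for a Python function using the standard `ast` module. It's a common approximation of McCabe complexity (1 plus the number of decision points), not what any particular linter or Codex itself computes, and `cyclomatic_complexity` and `max_nesting` are names made up for this example.

```python
import ast

# Node types counted as one extra decision point each (a common approximation).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)
# Statements that open a new nesting level.
_NESTING_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.With)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            score += len(node.values) - 1
    return score


def max_nesting(node: ast.AST, depth: int = 0) -> int:
    """Deepest level of nested control-flow statements under `node`.

    Note: `elif` is parsed as a nested `if`, so it counts one level deeper here.
    """
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, _NESTING_NODES) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


SAMPLE = '''
def pick(items, limit):
    for item in items:
        if item > limit and item % 2 == 0:
            return item
    return None
'''

func = ast.parse(SAMPLE).body[0]
print("complexity:", cyclomatic_complexity(func), "nesting:", max_nesting(func))
```

The sample has one loop, one `if`, and one `and`, so it already sits above a limit of 3 on complexity while staying at nesting depth 2.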
I use only GPT, because i have a Pro subscription
Honestly I like the ux of copilot much more
yes, or even in project root is enough
Alright
I think that Pro subscription is way too expensive
103 euros per month is crazy
maybe 70-80 euros is enough
yeah just tell codex to create it and explain what you want
It's simple isn't it?
Why two files?
Don't they both get fed in the prompt anyway?
it's probably because of the God-awful VAT
now, imagine 24% VAT on top on a digital service
yes, it's simple, the agent will automatically write the rules into AGENTS.md
PROJECT_STATE.md will keep the details and plans of your project.
🤦♀️
Ok, but i think 70-80 euros is enough
it should be
Yep
I wonder if it's really not possible for OAI to not have VAT applied for EU customers on something that is literally digital
paying 24% (or wtv is your country's VAT size) on top for access to AI models is just unfair for all of us EU users
and then it's giving NA users an advantage in price
Vat is clear thievery it should be outlawed
in the api you get the vat as credit
add 50$ but pay 60.50$ and 60.50 gets added to your api
It's the question of, "why should we pay more for the same service?"
I wouldn't mind... i don't know... A 9% or something VAT, but 24% in 🇪🇪 for example...
It is the question of, "the hell am I paying for out of pocket?"
My neighbor country
let me guess... 🇫🇮
The only solution is to use a company
No
Yes, as if any hobbyist had a company
Guess again
🇱🇻
yees
Maybe if a friend or family have business vat
It's pretty nice
Thanks, and i love zabka
zabka...? the POLISH zabka? that store?
YEEESSS
i love poland so much
No(
Gliwice is such a pedestrian-friendly city it's insane
compared to the disgust that is Krakow
I was in Warszawa, Krakow, Gdansk
I was in Kamieniec, Gliwice, Tarnowskie Gory, Bytom (for like a few times just to go elsewhere), Katowice, Krakow (twice--never again hopefully), Warszawa, Pyskowice, Karchowice (once for a few days)
mouthful, innit?
Ok
you get the public transit really nice
PKP Intercity, Transport GZM (often seen as ZTM locally), and so much more
It's... like second home to me

I may not know the language, but I can easily navigate most of it as if i lived there
you need to codexificate
Bye
Also you can find me in #prompt-engineering #dev-chat
did they just remove the fast mode (also) for pro in the latest vscode prerelease codex extension?
its gone
also from settings
also /fast is not a slash command now
https://developers.openai.com/codex/speed
Guys, did Codex introduce lower 5h limits? Last week I never hit one, coded like the whole day. Hit the weekly limit, waited for the reset, and now after 20-30 minutes I get a message that ~25% of the 5h limit is left
thats what people have been mentioning if you scroll up, yes
Bruh, moved from claude because of the limits, now here too :X
everyone is hammering the coding tools 😄
yeah, Plus's 5 hour limit was reduced so that overall it lasts longer in the week, but at the end of the day to increase the quota for Pro 5x and Pro 20x they had to change the balance
As always, I am late for the party 😄 Subscribed to Claude, introduced the awful limits after ~2 weeks, now here after 1 week ahah
oh it just suddenly reappeared without even updating anything -.-
must be serverside deciding to make it appear or not
at least the limits on the Pro 5x and 20x plans are very good, 20x at the moment feels like it near enough shouldn't bother showing quota bars 😂
it still works.
it was completely gone for me for like an hour after I updated to latest prerelease vscode codex
oops, I only use stable version.
is /fast type of command that makes codex think faster than ever?
Okay
so what is the idea with business codex accounts now? the limits are kind of not usable if you want to use it for software development work
Is it free?
when you dont expect to use your usage fully then you can just enable fast mode to use it more
that is if you care to have it faster
talking about business, we ordered 20 business accounts, but they became basically almost useless now, quite a few people need a 5x Pro subscription, but it is not available for business
Thanks
nope, it's not different, just 1.5x speed and a priority queue connection, at 2x credit usage. Looks worse than normal speed.
Example: during the daytime there are many users on gpt5.4; if you use "/fast" mode, it will send your request to a priority server instead of the global server.
wow
well its not a big deal, I think you still get plenty of value, just can't work with this normally, 5h limit is depleted after 1-2h of work
I'm just trying to understand why the 5x Pro account does not exist for business, we just want to be able to manage everything at one place
20 accounts and they empty after few 10 minutes..
as a pro 20x user, I don't really notice a speed difference usually between normal and fast mode. Possibly because pro has priority processing which makes it faster than without anyway. But between the choice of fast mode for twice the cost, and using more subagents or concurrent work, I prefer not using fast mode and just utilising subagents or concurrent sessions where possible
That's because Pro x5 and x20 are x2 until the end of May... So enjoy it while you can...
even when 2x ends, that means I've only used 4% though, I used to use about 10% to 15% per day 😛
Then you are golden, hopefully they won't nerf things after the promotion ends.
🤞
Guys do you know if using gpt 5.4 pro in the openai platform counts against the usage we see for codex? It says general usage in codex but idk if it's specific to codex or not
separate quota, ChatGPT's quota, whatever that might be for GPT-5.4 Pro
Ok that's crazy
Okay guys, let's talk about MCP, i made a guide for this using ChatGPT:
- What is MCP?
The Model Context Protocol (MCP) is an open standard for connecting AI models to external tools, data sources, and services in a consistent way.
It defines a unified interface so that models can:
Access files, databases, and APIs
Call tools (functions) in a standardized format
Retrieve structured context dynamically
Work across different applications without custom integrations for each one
Originally introduced by Anthropic, MCP is now widely discussed as an emerging interoperability layer for AI systems.
- Why MCP exists
Before MCP, every AI integration looked different:
One app uses custom function calling
Another uses plugin APIs
Another uses bespoke JSON schemas
Each tool requires separate glue code
MCP solves this by:
Standardizing tool + data access
Making AI systems portable across apps
Reducing integration complexity
Enabling reusable “AI tool servers”
Also, MCP is a protocol with concrete message formats. Here is an example tool definition:
{
  "name": "search_files",
  "description": "Search local project files",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": { "type": "string" }
    },
    "required": ["query"]
  }
}
- Handler logic:
Accept request
Execute operation
Return structured result
- Transport layer
Usually JSON-RPC over stdio or HTTP
- Best practices
Security
Restrict file/system access
Sandbox tool execution
Validate inputs strictly
Design
Keep tools small and composable
Avoid “god tools” that do everything
Return structured outputs, not raw text
Performance
Cache frequently used resources
Minimize large payload transfers
Stream results when possible
- When to use MCP
MCP is ideal when you are building:
AI-powered developer tools (IDEs, CLI assistants)
Enterprise assistants (docs + databases + APIs)
Multi-source research agents
Long-context systems that need dynamic retrieval
- Summary
MCP is a standardized protocol for connecting AI models to external context and tools
It introduces a consistent architecture: host → client → server
It unifies tools, resources, and prompts under one system
It complements OpenAI-style tool calling rather than replacing it
It enables scalable, reusable AI integrations across ecosystems
And that's not all the features!
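The "accept request, execute operation, return structured result" steps from the guide can be sketched roughly like this. It is a simplified illustration of JSON-RPC 2.0 over line-delimited streams, not a full MCP server: it skips the initialize handshake and notifications, uses no SDK, and `search_files`, `handle` and `serve` are hypothetical helpers. The `tools/list` / `tools/call` method names and the `inputSchema` shape follow the MCP spec as I understand it, so check the spec before relying on them.

```python
import io
import json
from pathlib import Path

SEARCH_TOOL = {
    "name": "search_files",
    "description": "Search local project files",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}


def search_files(root: Path, query: str) -> list[str]:
    """Return relative paths of files under `root` whose name contains `query`."""
    return sorted(
        str(p.relative_to(root)) for p in root.rglob("*")
        if p.is_file() and query in p.name
    )


def handle(request: dict, root: Path) -> dict:
    """Accept a request, execute the operation, return a structured result."""
    req_id = request.get("id")
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": [SEARCH_TOOL]}
    elif method == "tools/call":
        params = request.get("params", {})
        query = params.get("arguments", {}).get("query")
        # Validate inputs strictly before touching the file system.
        if params.get("name") != "search_files" or not isinstance(query, str):
            return {"jsonrpc": "2.0", "id": req_id,
                    "error": {"code": -32602, "message": "Invalid params"}}
        matches = search_files(root, query)
        result = {"content": [{"type": "text", "text": json.dumps(matches)}]}
    else:
        return {"jsonrpc": "2.0", "id": req_id,
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": req_id, "result": result}


def serve(root: Path, stdin, stdout) -> None:
    """Read one JSON-RPC message per line and write one response per line."""
    for line in stdin:
        if line.strip():
            stdout.write(json.dumps(handle(json.loads(line), root)) + "\n")
            stdout.flush()


# Demo with in-memory streams; a real stdio server would pass sys.stdin/sys.stdout.
demo_in = io.StringIO('{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}\n')
demo_out = io.StringIO()
serve(Path.cwd(), demo_in, demo_out)
print(demo_out.getvalue())
```

Keeping `handle` as a pure function of the request makes the tool small and easy to test, which lines up with the "small and composable" advice in the guide.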
are you pasting from ChatGPT again?
No, i make it in Word, by choosing sentences in https://developers.openai.com/codex/mcp
ChatGPT helps me with prompts, coding and the problems that i have in my life
And also, I have experience. I have 5 years of working with ChatGPT and programming languages such as C++, Python, C#
ChatGPT hasn't existed for 5 years
I mean AI
which AI did you use before ChatGPT?
whats your goal with this?
Tell people what MCP means
No one here needs that though
GPT-3, Instruct-GPT
Ok
If they want to know about mcp they can do the same thing you did
I'm sorry
And they all from OpenAI
not for 5 years though
how did you obtain API access 5 years ago?
I can't remember that, but i'm sure that i have worked with GPT-3
check rule 12 of the #server-rules
Okay, and?
tell me you have worked with GPT for 5 years or apologise for making a mistake
i have? how can i trust it?
Send you an API-key?
I can't be bothered now, carry on
Okay, nevermind. Just forget about it
Script that analyzes CODEX usage
Did codex die?
working fine for me right now
OpenAI released the first version to the world in 2022...
He got a point
It's funny how the 20$ plan users are crying. at 20$ don't expect for much.
they actively reduced the limits for Plus, I could do a lot more with Codex on Plus than I can do now
Plus was never supposed to be used for daily coding, only simple work. I wouldn't expect it to write full features and new modules a day.
It's there for simple stuff, and for 20$ it's still good.
Doesn't change the way expectation were managed.
I was here kinda warning about it... My worry is about being able to use models not about if 20$ will give me the power to write fully featured apps.
No, look for GPT-3
I read your comment about ChatGPT which was released in 2022.
Ok, just nevermind
better than competitor for 20 bucks
these people complaining have not been on Claude Pro || Gemini Plus/Pro which are about the same-ish price, but way less limits overall in many areas
OpenAI is doing a huge thank here
What's with this reduction in CODEX usage of at least 80% or more? Isn't that a bit much?
the problem is the business accounts have no 5x or 20x licenses... if they fix that everything is fine but now business is literally useless...
💔
might buy the glm coding plan just for this
eh
i was on the $20 but canceled my plan a few weeks ago
I think i have a few days left anyways
Pro 20x 🚀
I'm not gonna complain. I use the heck out of Codex.
sheesh
yeah im like that with claude x5 plan
and geminis free 15 mo pro plan for students
The way I see it, I have til May 31st to adjust to the new quotas. Gotta find more token savings. Always happy to share the findings here, but yeah the burn is tangible now
The burn is real, and as i said yesterday we are still on the x2 🔥
Exactly, that's why May 31st gotta adjust for token efficiency. I read about this thing called extended prompt caching that lasts up to 24h instead of 10-15 mins which might be useful (exists in the API, idk about subscription)
Previously, only paid credits counted as token-based usage; now the subscription counts as token-based usage too. That's why it burns.
yeah I think when I had it run out, it still finished task as well
codex is great, just prefer other models for everything else chatgpt offers
Yeah I noticed on Pro when I hit 0% weekly and started using credits, the credits would evaporate very quickly. Now the quota is about the same as credits! Makes sense!
I think the short term for me is more gpt-5.4-mini. It appears to use 3x fewer "credits" on output and ~6x fewer on input. It's not a bad model either, just requires more frequent corrections
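A quick back-of-the-envelope for that: if mini really is about 6x cheaper on input and 3x cheaper on output (the figures from the message above, not official numbers), the overall saving depends on how your big-model credits split between input and output. `mini_savings_factor` is a hypothetical helper for the arithmetic only.

```python
def mini_savings_factor(input_share: float,
                        input_discount: float = 6.0,
                        output_discount: float = 3.0) -> float:
    """How many times cheaper mini is, given the share of big-model credits spent on input."""
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    # Cost on mini relative to a big-model cost of 1.0.
    relative_cost = input_share / input_discount + (1.0 - input_share) / output_discount
    return 1.0 / relative_cost


for share in (0.0, 0.5, 1.0):
    print(f"input share {share:.0%}: ~{mini_savings_factor(share):.1f}x cheaper")
```

So under those assumed discounts the saving lands somewhere between 3x and 6x, and input-heavy sessions (lots of file reading) benefit the most.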
I have been on gemini pro
And the quota on gemini pro is better
Same 20 dollars per month
Jules alone has better usage limits, and im not even counting the cli or ag
Hello, try this "config.toml", this should reduce the token usage a little bit.
model_verbosity = "low"
model_reasoning_summary = "none"
If you "/plan" and don't care about verbose/summary from agent.
Nice, I have not changed verbosity but sounds potentially helpful
I have just been using mini on high mostly and only like 5% of 5 hourly per prompt
So
Ntb
but their models are marginally worse
True, last time i used Gemini Pro 3.0 i turned back fast, it forgets things doesn't stick to instructions too. I guess not much improved with 3.1?
nope lol
last time I tried 3.1 it was awful for agentic work
Also gemini 1m token context window is a big trap.
Source: Anthropic, when they generally released the 1M context window: https://claude.com/blog/1m-context-ga
Release the super app
I think you can try to build it with codex, no? OpenAI's slogan is "Just Build It"?
should i get the zai code plan
"you can just build
Thanks for correcting me
I believe there are some benchmarks around, but then again those are synthetic and doesn't give the whole picture.
its coming like
next weeik
week
with openclaw agents leaked
and all that
gonna be insane
Now seriously what is this super app? does it come with a new model? And about next week, any hard proof?
are you mentioning the "super app" that merges codex and atlas
or smth else?
I don't know to be honest, Maniac has said something about a super app.
i only know about the "super app" thats gonna merge atlas and codex
now i'm interested too
I am interested in any news of a new model, OpenAI has been quiet. Wondering what they are cooking to counter Mythos from Anthropic.
there are rumors about they're making mythos opponent
5.4 hasn't been out very long
before 5.4 openai was so quiet so 5.4 cant be considered new probably they're cooking smth else background
just a simple 1 prompt with Codex Plus, it feels like Codex Free 😄
switch to token based open src models
and use bigger models for just reviewing
or just pay more.
no, because they as a provider quantise the GLM models to lobotomy
you get extreme garbage at around 30-60K context window
ridiculous
You speak wisely, but I still think that sometimes it’s better to say nothing than to say something that doesn’t make sense.
Tibo, An actual dev at codex, and every other codex developer soft leaking it for the past week
That’s a small part of the big plan btw, which is also super nuts
so peak
You guys gotta get twitter even if you hate it it’s the only place you get super up to date ai news and devs dropping stuff
why openclaw is even there
Because OpenAI I believe got the maker of it to join them
I think
But openclaw will be super useful with OpenAI safety because it will be able to help you do so much stuff
i wish we could have a more natural benchmark that shows a model's competency
might go try to make it
do i get like this if i set codex at low ?
You made a statement about quotas, not model quality.
gpt 5 released on a thursday
?
No just a pro plan
Spud is gonna go against mythos for intelligence
like gpt 5 released on aug 7 which was a thursday
im manifesting that gpt5.5 will release on a thursday
yes, and they give you more quotas for crappier models
so it definitely isn't worth more
GPT-5.4-mini is really not that bad! I think as far as the code it writes, it's really solid. I think the only thing mini does a bit worse than the big model is in a stressful situation (e.g. tooling issues, tricky troubleshooting) it gets "stressed out" and will sometimes go in circles trying to figure out the issue. The big model handles stress a bit more gracefully. So the key to getting the most out of mini is informed steering during stressful situations.
Thanks, now i am interested what are the big plans xD
to name a few possibilities:
- GPT-5.5 or 6
- GPT-Image-2
- The all-in-one "super app"
Is there any way how to fix it ?
It used 30% of limits and did nothing
switch models ig
it's extremely annoying
GPT Plus and Pro x5 feel pretty unusable for programming right now, unless you’re willing to work very slowly because of the new token system. Does anyone know of another AI that’s comparable, or nearly as good, but with a setup more like the old system? Now it’s basically: work efficiently for half an hour, then go old-school and spend four and a half hours digging through code before you can be productive again. After that, you have to wait a few days before you can work properly again. At least with GPT plus.
Has anyone achieved good performance on using more than a 258k token window?
yes it’s useful to increase the window for huge repo’s but it’s token costly and can take a lot longer
Token cost I am not worried about, quality and drift. I use very little tokens, codex can run for hours and use nothing.
it’s worth it to just upgrade your prompting instead the more variables you fill out in your prompt the less reasoning it has to use to figure things out the window usually isn’t the issue
I have heard that it can be poor results though, if boosted to the 1M limit.
My prompts are large scale projects, its needs the patterns of the architecture.
I have reduced my token usage by 60% through "methods".
it can but codex as far as i’ve been using it i use it a LOT, its very dependant on the operator to fill variability, efficiently use tooling and focus on doing sliced runs rather than huge 1 shot multi action prompts
increasing the window is only worth it IF you can actually cover majority of the issues that cause drift at that scale
I am doing slicing runs, just need the context to isolate the problem. I feel like context compaction is removing history and then causing drift based on rules.
what i’ve done for things like that is a custom skillset that organises repo context PERMANENTLY as it works so that even if i reset its knowledge mid task, without spending a long time or risking drift again just trying to find everything
it checks the organised logs that it generates
this is also really good for long term projects that may a lot of the time require you to double back on things
Yeah been doing that, maybe I need to be more rigid. Are you specifically asking for those md files to be updated every turn?
compactions works like music compression you still hear the main song but OVER TIME small details leave, that’s the purpose of the custom skillset context is LOGGED in a specific manner that’s effectively lossless and not a giant 100k word log of everything
Yeah, I know exactly what compaction does, I do doctoral research on AI memory. The issues are:
- I do not know how the latent space was trained for ChatGPT, so we do not know if it natively uses 1M tokens or is doing matrix broadcasting across vectors.
- The agent loop is hidden, so what are their prompts, instructions, and algorithms for compaction?
i’m tired af forgive me if i answer indirectly but in its instructions there’s specific triggers i have set for when it actually does the log writing, it’s set to pause while it does that. it usually doesn’t take long regardless, but the only downside is you can end up with GB of codex run logs that pile up if you don’t manage them when they’re not relevant
Now that's valuable. I do see your cascading and data sprawl.
This will answer your question: accuracy suffers badly.
Find it here: https://openai.com/index/introducing-gpt-5-4/
All the while Anthropic has quite good accuracy for their 1M context implementation.
Source: https://claude.com/blog/1m-context-ga
Exactly what I thought, the model is not trained on that size of a latent space matrix. Therefore, they are taking a 1M token matrix and folding it on itself and "hoping" that the weights of the vectors stacked on top of each other improve latent variable search.
Which it seems it does not.
see, those i’m not sure of currently, that’s interesting. but i do know it’s been more beneficial on my end to assume that certain functions are not agentically perfect and to create skills and triggers that replace those functions in tailored ways
It works for diffusion models sometimes. ComfyUI does that: the token window of a model is 77 tokens, and if you put in more than that, they overlap (broadcast) the numbers on top of that 77 token window, effectively adding them together.
has effectively replaced certain things
Yeah, no offense, but if you’re going past 170k tokens because you made more than one prompt, it doesn’t matter which model it is, you should’ve started a new session. Keeping them at 100% is important
it’s just a bit tedious because you do have to figure out all of that on your own accord
I feel the tooling is doing the heavy lifting.
That's not how it works, read Rune's messages.
My 1 prompt is that large. It's "Examine this code and do this to it."
Depends on what you need. Claude might be better for your work when it comes to recall from a big context window.
Though make sure to look at Claude's IFBench results, they're not so good xD.
oh yeah 100%, tooling integration is like the core foundation of its success imo. for people advanced enough to use every feature you can really abuse codex for just about anything, it just shifts a lot more of the automation away from codex and onto you
when i’m bored my favourite thing to do is make it do really non general things
today i tested its capacity to be an antivirus and i’m not gonna lie, it did pretty good (cli)
Yeah, I just think the models are not there yet. They have not been trained for large projects, it's not in the Stack Overflow dataset lol, that's all one fixable item at a time. Models are only as good as the example training data, and the one thing we don't have is one person looking at an entire repo, documented. It's not something humans do, so it's not training data we have.
https://openai.com/index/harness-engineering/ it's OpenAI's own philosophy.
Trend of improving the deterministic tool suite.
blah.. let me fix that.
one extremely powerful thing i’ve noticed the codex cli is actually remarkably good at (for tasks like this you do wanna make the context window larger), but which isn’t a mainstream use, is full system tech support. it took my old laggy laptop from a rust bucket to running almost brand new and tuned the cpu voltages. it even managed to diagnose an issue on my friend’s pc remotely that multiple computer people and part swaps were not able to solve, it was some bios thing and a few other weird caching logic issues
provided you’re trusting enough to give it full agency and just sit back
Yeah, that's cool, I have been wanting to experiment with that.
trust me do it 😭
i’ve found more enjoyment in making it do niche tasks than actually developing with it
Sweet, I will. Did you have it run PowerShell commands or just run through the Codex desktop app?
mine is cli set up at my computer root using pwsh 7 and it’s able to run anything including powershell commands yes
i have the app too but i use cli a lot more often
I use the VS Code plugin so I can work and check things in the CLI. Is there a benefit to the CLI over the plugin? Right now mine is working in a pure Linux environment (WSL). That's where I code and create stuff.
i also suggest installing certain things MANUALLY that aren’t default on windows and baking the increased command capability into its knowledge by default
like for e.g. there are a lot of pwsh functions that have existing alternatives that are much more efficient
like when it has to scan big file paths for certain content, certain pwsh commands are HEAVY as f on tokens, while other commands you can add do the same thing much more efficiently
i noticed on 115 this was actually a major issue that standard codex would end up in, where it would do MASSIVE uneventful Get-ChildItem chains to find information it wasn’t sure of
i only even thought of replacing certain command functions because of that
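To illustrate the token-cost point: what matters is how much text a search dumps back into the context, not the search itself. Below is a rough sketch of a lean search helper that returns only capped `path:line: text` hits instead of full directory listings. `find_matches`, its defaults (`.go` files, 20 hits, 120 characters per line) and the skipped directory names are assumptions made for this example, not a Codex feature or the poster's actual replacement commands.

```python
import os


def find_matches(root, needle, extensions=(".go",), max_hits=20):
    """Return up to max_hits 'path:line_no: text' strings containing needle."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden and vendored directories in place so os.walk skips them.
        dirnames[:] = sorted(
            d for d in dirnames if not d.startswith(".") and d != "node_modules"
        )
        for name in sorted(filenames):
            if not name.endswith(tuple(extensions)):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as fh:
                    for line_no, line in enumerate(fh, start=1):
                        if needle in line:
                            rel = os.path.relpath(path, root)
                            # Truncate long lines so one hit stays cheap.
                            hits.append(f"{rel}:{line_no}: {line.strip()[:120]}")
                            if len(hits) >= max_hits:
                                return hits
            except OSError:
                continue  # unreadable file, skip it
    return hits
```

The cap and the line truncation are the design choice here: the agent gets enough to decide which file to open next without paying for the whole tree.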
If I am not mistaken, the plugin just runs Codex CLI behind the scenes, but I might be wrong. Also, the plugin might lag behind.
DM me we should swap solutions, I have a suite of tools I am building for Linux for this exact reason.
my experience with using it through vscode wasn’t fun, you’re better off actually using it externally and just pointing it at the repo when needed
It does, I just haven't been leveraging the / commands much. I think it's a habit you pick up when you use the CLI directly.
I feel I have to ask the AI why something is happening and how it's different before the agent loop actually does that itself and then adds that back into the context. This seems like a specific flaw in the Codex agent loop. OpenAI, if you're listening, you may want to fix this.
For anyone's information, the agent loops are designed to navigate an LLM's latent space, which can find answers given deterministic text. Therefore, we are kind of doing gradient descent within the latent space of the LLM, which in turn finds the latent variable (i.e., the answer we want).
default codex behaviour is a bit like that yea but these things are changeable manually using developer instructions and the agents.md files atm
like i wouldn’t say it’s very beginner friendly yet
unless you’re making a random number generator
True, but have you seen their default Codex prompt at the top of the session file?
IT'S 15KB!
Anyone else having issues with GPT refusing to do basic things through Codex? As an example, I asked it to review an issue with a DirectX shader producing dimmer output than expected, and it ended up responding to a question it asked itself while looking instead of what I originally prompted. I asked it to provide a response that fits my original question and it directly told me "No."
It then took thirteen messages back and forth for it to quit refusing the request and actually complete the initial task. I had to walk it through stuff like "What do you think is worth refusal here" and "why exactly would that prevent you from completing the task" before it finally acknowledged it could do the task and began.
I'm having this random denial behavior almost every single session I start lately and I can't figure out why. I'm getting it across repos and with no AGENTS.md configured.
It's been doing this for all kinds of random tasks over the last ~2 weeks. Using defaults, 5.4 on medium thinking.
Uhm, why? The website shows 4% as well...
With the current tokens shown in that white circle on the bottom right, the next prompt would go over the 4%
you probably have a weird twin cli setup issue. this is actually interesting, of all the issues i’ve seen i’ve never seen this one
there is no way "test" would use 4%
maybe some weird sub agent settings
but honestly to diagnose your own issue
you could fresh install at a new location
another brand new cli install
and have it investigate 🤣
Do you have Pro subscription?
The Codex agent loop is really built around goal chasing and planning. Ask for a plan to fix something, then execute the plan.
no, plus subscription