#codex-discussions
this issue is on openais end right
"stream disconnected before completion: An error occurred while processing your request."
?
dark clouds gather....
focus on coding. I doubt its a bug
Yes
A 100$ plan would be great. Might have to go 200$ plan soon
🙁
this cant just be the 2x promo ending...
it is a bug
3 prompts is literally the 5 hr limit
4.2 bln... Nice!
1896 dollars lmfao
and its just 2 sessions without subagents, something definitely wrong lol
or it built something really epic
What are you guys building
same as everybody, a system to build the system.. 😉
the buildings get so good, it all becomes secret
yes
next gpt model will be insane, hoping also remote control cli
New model would be useless if 1 prompt drains your entire 5hr and weekly limit
please dont start on the usage thing. the claude discord has been overrun by spam about complaints
Are you on Plus or Pro200 plan here?
pro ofc
3 business seats
usage limits should be fine in those modes
A lot more than pro apparently
and generous
4.2B tokens
Not at all
Is there no way to buy more?
?
is there a way to purchase extra usage if you run out
ok thats good
ai is greatest tech world has ever seen. buying a slice of it for 200$ is a steal
cost to hire developer to do the same is hundreds of thousands
i wouldnt have an issue paying for 10x pro if they made clear tos for it
so im not complaining
or reasonably credits pricing
so would be priced over 1000/month?
Right
2000
its worth it if youre making serious money using it for your things
Enterprise
isnt enterprise api pricing?
Not sure
Probably is doe
It’s customizable so probably better than paying for credits
they changed the pricing model after april 2. credits are now exactly same price as API. soon for plus/pro too. https://developers.openai.com/codex/pricing?codex-usage-limits=plus#how-do-credits-work
how fast do credits deplete when in the middle of a project
this is probably too individualistic
fast
i wish i could afford gpt-5.4-xhigh-fast at API rates. no more worries about limits, the dream
bring back 2x 🙏🙏
this was always the case. its for credits you buy after you burn through the limits
its much better to get a second, third, fourth sub (cough) than pay for credits/api
not worth potential ban in my opinion. ai coding too valuable
do things the right way and sleep soundly at night
if you're sleeping at night you're not coding enough 🙂
not a service you want a ban from
i hope they fix this entire thing not just reset the rate limits
Best thing most peeps can do to start is to just disable Fast mode - that nearly doubles token usage ... so there's your 2x back. You trade time for money. Want to use less of your time, pay money. Don't want to pay money, it will cost you some time.
Frankly I don't even notice the difference between Fast and Standard mode anyway.
not even using fast mode
i never use fast mode
literally
but 5.4 xhigh
i use medium and it drains
As I said "most peeps" ... that eliminates many here ... but perhaps not "most".
its so good lol
yea even going down to high i often notice a big difference
i dont use anything other than maximum overdrive
same.. "update the readme -> 5.4 xhigh"
U use claude too? or only codex
Devs here need to recognize the Good/Fast/Cheap triangle. https://www.google.com/search?q=triangle+good+fast+cheap
Pick Two.
i downgraded claude max 200 actually after 8 months or so yday. had gpt pro + claude max in parallel. but i just cant deal with claude anymore
max5 plan has generous limits imo. what couldnt you deal with anymore, its replies?
which produces better code in your opinion
also much better at following instructions, no "production ready" BS, and most importantly, the codex cli auto compaction is just magic
codex no doubt
Another way to look at this is to back away from this new code fad where everything needs to be done by some third-party provider. Do your own coding, the way it's been done for decades, and rely on the new tooling for details, higher quality, and tougher challenges. Balance tasks more equitably between human and bot and you'll save all of those precious tokens.
Im here to spend token like a flamethrower in vietnam
i'd be like a beached whale without AI tbh. mostly using rust and ts now and never wrote those by hand
I had to tell claude to not conserve tokens for the sake of efficiency, because i noticed they literally had claude back off on token usage from the spam feedback they were receiving about free plan/20$ plan usage limits
it was giving me half-thought through replies and crap depth before i instructed that
it would take hours to write a single script that connects to everything correctly while an AI would be able to do the same task in 5 minutes
if the script is like 800 lines long
AI is kinda like a gateway pill. If you let yourself get addicted then you will forever rely on other people for your own products. That's NOT good for any of us. It's like forgetting how to make fire ... which most people can't do. Yes, at some point AI will write all the code and we won't need to see it, and coding languages won't use syntax that's meaningful to humans - it will be purely functional "machine code". But we aren't there yet. YOU must understand code. YOU must write some code. So the more token costs increase the more you should realize that maybe you shouldn't depend on it so much.
-- Patronizing lesson of the day 🤣
i think of AI as a translator
like telling me to speak another language when i could be using google translate
or ai
whatever
funny thing about claude is.. i use super strict linting settings, and have those in my posttooluse hooks for fast feedback loop (in my own codex fork because for some reason they are a bit slow to release). and after a while claude just stops. not even a response like "this is too hard", just stops. and codex powers through.
Learning and using other languages is good for the brain.
I have noticed the laziness too, imo its directly caused by the spam feedback they get about complaints
it also takes a lot of time
yea probably it thinks the write tool is broken because it constantly gets error
its fun too
Um, time for a better brain? Sign me up! ( I speak about five languages, mostly pretty bad, but my brain is pretty healthy ... except when I open the refrigerator and look for the morning coffee, or I look around the house for my glasses which are already on my face. But those are exceptions really! My brain is healthy I tell ya! I aced an IQ test! I correctly identified the giraffe!) 🙄
only negative thing about codex is that it seems very focused on its last response. you cant iterate as well as with claude to build an understanding over multiple turns. but i think this is what actually makes it so good in long running tasks on the other hand. so just have to understand how to talk to it. i also miss doubled ESC in codex to quickly jump back.
I think this will improve wrt time as well. The pace of full codebase generation is unreal in its current state, so next models and codex updates are going to be insane
They* went from autocompletion in vsc using ai to now thousands of lines in minutes
i missed the autocomplete era and went straight from writing everything by hand to not reading most code anymore lol
I did not use it during autocomplete either
i speak english and mandarin both fluently and again its not about the brain its about the time it takes
You have to remember that on every turn it needs to look at the entire project like it's never seen it before. If your project isn't organized or documented, then it's duplicating wasted effort in every turn -- consuming time and tokens. For way too long the FOSS world has taught newcomers that comments don't matter. But they always have, and now even more so. Craft AGENTS.md to instruct the assistant to always document its code and to update project development docs, and then always to refer to those docs for details. The project is so much better for it, and you also save a ton of tokens.
same thought lol 😄
No
his profile says hes been coding since 1979 and what hes saying makes sense
i try to turn every line in agents.md into automatic enforcements/lints and make it as small as possible
Have you thoroughly tested Claude Code and Codex?
i'd say so yea, using both all day for 6+ months now, but just downgraded claude code to only pro from max 200, used it less and less
Back then, I used Codex and then Claude, and I was amazed, but what’s happening with Claude right now like the usage limits, the Claude code leak, and the somewhat poorer performance is making me skeptical.
oh god please not here, that unofficial claude server is overrun by "usage limit when" spam
I think I'm going to switch back to Codex, but I want to make sure it's really worth it or if I should just stick with Claude after all
you mean like telling everybody how your new model is too dangerous to release because of its cybersecurity capabilities, and then having your flagship product source code leaked... yea 🐑
sub to both and make determination while using
You mean Mythical?
What do you think about that?
Not understanding the reply to my statement
What do you think about that?
I dont think it requires much forethought. Just start using both right away
youll know which one you prefer from using them
just jump in
Codex has way more tooling than i realized
and you can build anything you want on top of it.. i'm trying to refrain from building an orchestration system though 😄
do it gif not allowed here
i'm working on my pre-AI business, expand it a bit, legacy code work. the bigger project is sth new in data analytics space
Does anyone know if it is possible to run Codex on your VPS and message it from the Desktop app?
If you or a loved one have been affected by AI Psychosis
call the number on your screen now
As you may be entitled to financial compensation.
ah shucks
So my rates reset an hour ago. Just got back to the pc. Doesnt look like it completed. hard to tell but 3 of the agents are still awaiting instructions.
Is there a way to trigger it to pick up where it left off without undoing all of the last changes and rerunning my last instruction?
You can sneak special signals into the system prompt. I have one called **DRIFT**, that tells the orchestrator they failed to handle something urgent. I like how the orchestrator knows precisely what the drift was without even investigating 😏 no more typing lengthy "you didn't do this and you need to do this", just slap em with **DRIFT** and they recover automatically
unironically gpt6.7 could release if openai keeps following the monthly incremental drops
"Robdex" 😂 💕
I should get the agent to save log detail as "Captain's Log, Stardate -297144.29 : Enhanced API to include ..."
[ Edit: Corrected for actual TNG stardate ... cuz, details matter ... ]
Hmm, It did say like a few days ago "renew 1 april" for the tokens, but now its still 0 ?? but says renew like 8
whats going on
there was a reset on the 1st which put everyone to the 8th. Maybe you hit a bug with that?
nice
Naturally my next question is - what does your drift orchestrator look like? And because you've ended up at that solution, I presume there's no way to simply continue, or rerun it easily?
Am I right in thinking if I undo the last bunch of changes - which I believe was a fraction of the work I needed doing, and then rerun the prompt, then I'd also be in good stead? Ofc it'll burn tokens doing the same work it's already spent 17 mins doing before I ran into my rate limit
https://help.openai.com/en/articles/20001106-codex-rate-card Maybe this? Note: As of April 2, 2026, we’ve updated Codex pricing to align with API token usage, instead of per-message pricing.
The **DRIFT** means at least one of the following conditions is not met
- All blocked QA agents have a worker implementing a fix
- All unblocked QA agents are actively working and not idling
- All unarchived workers are working on something and not actively awaiting one of the workflow gates (pre-implementation planning, pre-merge, cleanup)
Rather than typing it out I just send **DRIFT** and they know they need to reconcile the invalid orchestration state
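To make the three conditions above concrete, here is a minimal sketch of them as an explicit check. In the setup described, DRIFT is a keyword the orchestrator model interprets from its system prompt, not code; the `QAAgent` and `Worker` types and their fields below are hypothetical, invented only to illustrate the logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QAAgent:
    # Hypothetical model of a QA agent's state.
    name: str
    blocked: bool
    idle: bool = False
    fix_worker: Optional[str] = None  # worker implementing a fix, if any


@dataclass
class Worker:
    # Hypothetical model of a worker's state.
    name: str
    archived: bool
    working: bool
    awaiting_gate: bool = False  # waiting at planning, pre-merge or cleanup


def drift_violations(qa_agents: List[QAAgent], workers: List[Worker]) -> List[str]:
    """Return one message per condition that is not met; empty means no drift."""
    problems = []
    for qa in qa_agents:
        # Condition 1: every blocked QA agent has a worker implementing a fix.
        if qa.blocked and qa.fix_worker is None:
            problems.append(f"QA agent {qa.name} is blocked with no worker on a fix")
        # Condition 2: every unblocked QA agent is working, not idling.
        if not qa.blocked and qa.idle:
            problems.append(f"QA agent {qa.name} is unblocked but idle")
    for w in workers:
        # Condition 3: unarchived workers are working and not parked at a gate.
        if not w.archived and (not w.working or w.awaiting_gate):
            problems.append(f"worker {w.name} is not making progress")
    return problems
```

An empty list would mean the orchestration state is consistent; anything else is what the DRIFT keyword asks the orchestrator to reconcile.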
skills can do the same thing. It's a pity the skills standard never came with a way to configure a skill to be on command only. Lucky openai.yaml handles it. Anthropic sort of half cooked it though. They deprecated slash commands in favour of skills, but slash commands dont inject anything into the system prompt and skills do. So if you had a whole lot of convenience slash commands they begin to pollute the context.
Skills are for agents that do a lot of different things. Orchestrators have one extremely well defined and bounded job. I don't wanna be like $do-your-job skill on a general purpose system prompted agent, or be like $orchestrator-recovery for situations when they are failing to orchestrate.
You can configure skills to be just simple inline commands with codex. So they don't get injected into the system prompt. They are only invoked when you explicitly call them. It's a replacement for the "prompts" that they used to have. So skills wear both hats.
I get what you're saying, but it seems similar to me. In both cases you are giving the agent instructions to do some thing
Yeah, skills make complete sense if your agent is designed to do absolutely everything, but I do not want my orchestrator writing code. The built-in system prompt literally says stuff like "prefer writing ASCII instead of Unicode unless you see unicode there or the user explicitly asks for Unicode". Mine says "Absolutely no file editing under any circumstances", so all of the skills for normal agents become irrelevant, and so does the section about "how to edit code properly and use apply_patch", because they straight up do not do those things.
So their entire purpose collapses into a very simple job, and I have a keyword to remind them to handle unfinished business
I understand what you're saying, if it works it works. skills don't = writing code though, they are just prompt instructions, any prompt instructions you want.
like "handle your unfinished work"
Yeah, and after compaction they have to re-read a skill file. They are forced to consume the system prompt, and it takes precedence over everything. They treat it like Isaac Asimov's 3 laws, absolutely shall not be infringed. Skills are more like suggestions.
For better adherence after compaction makes total sense. That is a little bit of a sore point with codex, using xhigh doesnt usually leave enough room for the task to happen in one hit and often it's too close to compaction after thinking about it that it's not reliable.
On big tasks*
So orchestration is the solution
The other reason I did it that way is rooted in how LLMs work. Their "attention workspace" and neural network is designed around positional embeddings. That's why you can feed a model some massive document (haystack) and ask it to recite a particular passage (needle) with almost perfect accuracy. For my orchestrator, **DRIFT** is a positional embedding inside the system prompt in the beginning of the chat history, and it makes their attention workspace apply extra focus to that area without re-reading the prompt or anything. It saves tokens and redistributes their system prompt into their attention workspace.
Did you see the work kimi dev team did on this?
They worked out a way to prevent the diffusion of attention as the context grows.
Interesting! I have not read about it
It's directly about the point you're making
all the new ideas that are coming out of research are showing there is tonnes of room for large step gains
not just scaling
like every point of the process is like its just held together with glue and gum atm
and they are going to refine it
Wow that's incredible 🤯
turboquant is another big step - on efficiency though
anyone else feel like it’s easier to hit weekly limit than it is daily
i have 2 accounts on plus plan already at 70% weekly
What is daily limit?
Nice stats on Attention-Residuals
Like RAG, I see all of those NIAH issues being resolved over a short period of time. The tech is evolving very quickly and there are a lot of eyes on these common issues. I think limits on context are artificial, based on these very problems. Once the issues are gone, context limits can go up significantly. The next frontier is solving this ongoing issue of suggesting vs command and "why didn't it do x?". There are too many files in too many locations over too many platforms (system, user, project, codex, repo, context, turn) and most people don't recognize that there are so many tiers, nor what happens at each tier. I hope that the tech will evolve so that we can control everything under the system instructions in one dashboard, and see exactly what directives a model consumes (and not) with every prompt.
seems like you are looking for afk agents/ralph loop, but I don't use those so I just send continue prompt
I made a script a while ago you might like: https://gist.github.com/robertmsale/8aa81706735621165efb9262e187da33
It mirrors the AGENTS.md loading pattern in Codex, so you run it inside some CWD and it will compile down all of the AGENTS.md files, including AGENTS.override.md and any fallbacks following the same patterns as codex, lists the loaded paths, load order, and context cost in bytes. I don't use it anymore (no more AGENTS.md) but it gives you at least 1 of the 3 pieces of the puzzle
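A simplified sketch of the idea, not the linked gist: walk from a project root down to the working directory, pick at most one instructions file per directory, and report load order plus size in bytes. It assumes `AGENTS.override.md` wins over `AGENTS.md` in the same directory, and it ignores any global file or fallback filenames, so treat the rules here as assumptions and not as Codex's actual behaviour.

```python
from pathlib import Path
from typing import List, Tuple

# Assumed precedence within a single directory: the override file wins.
CANDIDATES = ("AGENTS.override.md", "AGENTS.md")


def collect_agents_files(root: Path, cwd: Path) -> List[Tuple[Path, int]]:
    """List (path, size_in_bytes) from root down to cwd, in load order."""
    root = root.resolve()
    cwd = cwd.resolve()
    rel = cwd.relative_to(root)  # raises ValueError if cwd is outside root

    # Build the chain of directories: root, root/a, root/a/b, ...
    dirs = [root]
    for part in rel.parts:
        dirs.append(dirs[-1] / part)

    found = []
    for directory in dirs:
        for name in CANDIDATES:
            candidate = directory / name
            if candidate.is_file():
                found.append((candidate, candidate.stat().st_size))
                break  # at most one file per directory
    return found
```

Calling `collect_agents_files(repo_root, Path.cwd())` and summing the sizes gives a rough idea of how many bytes of instructions are loaded before the first prompt.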
Will look. Thanks Bud!
I'm gonna code a ditty to notify me when Codex finishes a task, like send me a Discord or Slack message on completion. This is probably a simple hook.
Does anyone already have this published for Node? TY
The simple codex_hooks feature for Stop events is still not complete or stable, especially when using the VSCode extension. I'm gonna build a little more around that and will probably publish something.
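For anyone wanting the same thing, a minimal sketch of the notification half, in Python rather than Node. It posts a message to a Discord webhook, which accepts a JSON body with a `content` field of up to 2000 characters. How Codex triggers the script (a Stop hook or some other mechanism) is left open here and is an assumption; `build_payload` and `notify` are names made up for this example.

```python
import json
import urllib.request

DISCORD_LIMIT = 2000  # maximum length of a Discord message's content


def build_payload(summary: str) -> bytes:
    """Build the JSON body for a Discord webhook, truncated to the limit."""
    text = f"Codex finished: {summary}"
    if len(text) > DISCORD_LIMIT:
        text = text[: DISCORD_LIMIT - 3] + "..."
    return json.dumps({"content": text}).encode("utf-8")


def notify(webhook_url: str, summary: str, timeout: float = 10.0) -> int:
    """POST the message to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=build_payload(summary),
        headers={
            "Content-Type": "application/json",
            "User-Agent": "codex-notify-sketch",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Whatever runs at task completion would call `notify(webhook_url, "task summary here")`, with the webhook URL coming from an environment variable so it stays out of the repo.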
My 20x CC runs out tomorrow, and cancelled my sub, just reset today, be free
Maybe stop using high xhigh for everything
After working quite intensively within unity and Codex, I felt as if Unity was being an obstacle thats blocking Codex's capability to function. So I tried a new project and requested it to use Rust + Vulkan + SpacetimeDB and recreate my project in the custom engine and it did in a few hours. It's fully functional and now this game runs on cloud 24/7/365 using SpacetimeDB, you can play it now (account creation not required as it even has guest mode, and it even supports creation of your own world-dimension and you can travel to it instantly)
but its just so good
I suppose that should work too. I was trying to overthink it. I sent a prompt to acknowledge the incomplete changes, audit the files, and then continue with the earlier fixes and report back on the state of the repo
yall think i can make it?
It will finish it unless context compacts inbetween
Which it probably will
I am making a Rust CLI with Codex right now. It is an all-in-one toolbox with a plugin platform.
I'm adding AI features for it and u can just run
cat error.log | tinfo ai fix
tinfo ask "why is this slow"
to get a solution quickly in the terminal.
By the way, it's a fully open-sourced project licensed under the Apache 2.0 License.
https://tinfo.1234567890.dev/
holy, it was a heavy change and it only took 1% and 3% of 5hr usage 😱
are we cooked chat?
unless you're doing something truly revolutionary, nah 🤷♂️
Codex xhigh is so damn good
Does /fast apply to gpt 5.4 mini
its really not
gpt 5.4 xhigh does not perform significantly better than gpt 5.4 high
lol that's wrong
Can someone explain to me, why 4 prompts on a team account got my 5h window down to 10%
and the same 4 prompts on a plus account got it down to 90% ????????
like team account drops as fast as free account
this is ridiculous, for a higher price
whadafak
that depends on what the prompts were and how much Codex worked on them
nah man, I said "hi" on the team account with gpt 5.4 low, and it dropped by 4% of the 5h window
did the same with the plus account, it did not even drop by 1
this is not normal
team account is supposed to have higher limits than plus
I don't think that is correct
well I don't think either
but I did the test on 2 team account from 2 different teams
and it's the same
I can believe that the usage limits arent x2 anymore
but this is just nonsense
80 prompts on 2x in 5 hours sounds about right for some half decent prompts
did you make it? XD
if the goal is to use the 10% before the reset i think you can do it.
I'm cooked for 4 days 😫 do I gotta like.. type characters again? I thought altman said I don't need to do that anymore
there are cheap alternatives that can fill the gaps: minimax, glm. open router has free good models from time to time, currently it has qwen 3.6 plus, but you have to add 10 credits to your account to use free models for 1k requests per day.
a $10 minimax account gives you 15000 requests per week, with 1500 per 5 hours
Does anyone here only ever use GPT 5.4 on Extra High?
Has anyone got experience of not always using Extra High - how does it perform? I've found myself using the 5.4 / Extra High combo 100% of the time - not sure what I'm giving up if I drop it to High or Medium. Would be good to know if anyone's done any side-by-side comparisons (+ real-world token usage differences)
yeah it's good for persistent bugs, not centering a div
whats the limits for codex on the free trial tho? i have access to gpt 5.4 xhigh which feels illegal on the free tier
we can use kilo code that gives access to qwen 3.6 plus without any rate limits, open router is good but the rate limits on the free tier are deafening
personally I find no noticeable improvement in intelligence with xhigh, possibly a slight regression as it may overthink. It noticeably consumes more tokens and takes longer to complete tasks though. I just use high
Also the benchmark at https://voratiq.com/leaderboard/ aligns with my perception, high overall appears to be better than xhigh. Give high effort a try and see how you find it
My subjective feeling is I like the code that xhigh writes more. But with the usage limits as they are now.. 🙁
codex made this
Testing out 5.4-high for implementation and 5.4-xhigh for planning/research today, I dont think the difference is that big.
it is
it hallucinates way too much
you using gpt5.2?
yea
osu?
the circle clicking game?
osu!droid over to osu!cocoa (android to ios)
i bought an iphone 14 just to do ios development lol
i've never used an iphone before
so...
i now have the entire mainstream dev roster
mbp 16" w/ M1 Max (apple silicon)
mbp 15.4" w/ i7-4870HQ && R9 M370X (intel -- runs W10 on bootcamp) -- macos + windows supported natively
samsung s24 (android 16?)
iphone 14 (ios 26.4 but 15.6 set minimum for dev work on Xcode)
my main is the samsung, though
and the m1-max mbp
What is the MCP codex_apps ? It seems to try to boot without me having it configured 🤔
Ya that’s so annoying.
And u can’t switch model when its doing that
so you have to sit there and wait for it to boot 😐
Nice
[features]
apps = false # buggy, burns tokens
or open the codex APP and go to settings, disable there (github app is enabled automatically). apparently this carries over to the cli and burns a lot of tokens with the mcp tool descriptions
Is this only if you have the codex app downloaded?
idk, might also be a default setting otherwise, havent looked at the code. probably lives in the codex app server somewhere
jw - i never downloaded the app, just use via CLI or VScode... how can i check for that?
i'm not sure but i think its also in the 0.117 cli
"The app-server-backed TUI is now enabled by default, and the plugin/app rollout flags have been flipped on in normal builds."
xhigh is used for all benchmarks
for me xhigh performs so well for me, i dont even change level, i set xhigh and forget
just ask codex what tools it has access to. if it talks about a bunch of github mcp.. you're pwned. try the apps = false flag in config.toml and ask it again
why would they run evals on high when xhigh is there?
do you look only at first party benchmarks?
why would you ever look at first party benchmarks
"5.4 xhigh is overall 2.1% more accurate while consuming double the tokens. High is ~0.3% more accurate overall than medium while consuming ~33% more tokens"
where is that quote from?
i dont understand you o, if you love using high then so be it,
I dont love using high
I primarily use medium for most tasks
and xhigh I use on gpt 5.2
because its good at long horizon tasks
2,1% for double the tokens oof maybe i shouldnt use xhigh all the time
then so be it, no need to argue over it
then good for you
are we so fr
It's the data bits an LLM uses, like words but a word can break up into more than one token, and punctuation can also be tokens.
you are just objectively incorrect
that's in your own view
i know what xhigh has done for me, so you have your view and i have mine
Usage limits are often counted by tokens processed, and bigger models that use more resources server-side often use more of your usage limit.
Yeah, statistically if it takes gpt-5.4 medium to do something in 2 shots, when it takes 5.4 xhigh 1 shot, it still costs 33% more tokens to do it with xhigh.
token bs and usage limits are blah
what
like
two shotting something on medium
takes less time
and costs less
than oneshotting on xhigh
That's what I'm saying
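The two-shot vs one-shot comparison works out as follows if the quoted figures are taken at face value. The ratios are the ones quoted in this chat (high about 33% more tokens than medium, xhigh about double), and reading "double" as relative to high is an assumption; nothing here is measured.

```python
# Relative token cost per attempt, normalised so that medium = 1.0.
MEDIUM = 1.0
HIGH = MEDIUM * 1.33   # "~33% more tokens" than medium, as quoted
XHIGH = HIGH * 2.0     # "double the tokens", assumed relative to high


def total_cost(cost_per_attempt: float, attempts: int) -> float:
    """Total relative token cost for a number of attempts at one effort level."""
    return cost_per_attempt * attempts


two_shots_medium = total_cost(MEDIUM, 2)
one_shot_xhigh = total_cost(XHIGH, 1)
extra = one_shot_xhigh / two_shots_medium - 1.0

print(f"two shots on medium: {two_shots_medium:.2f}")
print(f"one shot on xhigh:   {one_shot_xhigh:.2f}")
print(f"xhigh one-shot costs {extra:.0%} more")
```

Under these assumptions a single xhigh attempt costs 2.66 units against 2.00 for two medium attempts, which is where the "33% more" figure comes from.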
does gpt-5.4 still have that adaptive reasoning thing that 5.1-codex or whatever had, on medium?
xhigh and high are diminishing returns for this model
high makes sense if you need high exploratory because gpt 5.4 already seems disinclined to do that and med made it slightly worse in my experience
like comp to 5.2
and 5.3 codex
Think about it like this, reasoning effort is the upper limit on reasoning. It still has the ability to choose less reasoning. xhigh is no upper limit at all. It could literally reason itself from 100% to 10% context window in one reasoning shot
gpt 5.4 tends not to look around
although robert
could u link the bench u reference for thinking
I remember looking at one
but I cant seem to find it
After having used Claude code on web, came to Codex with high hopes but
- you cant select latest models on codex web
- you cant directly commit in git repo from codex app
- you can select model but cant directly commit in git repo
Whats the best way to use Codex where we get all the options available at run time ?
@chrome raven linked to it. The introduction to GPT 5.4, in the SWE-Bench Pro stats, the difference between medium and high in terms of accuracy is 0.3% at the cost of 33% more latency: https://openai.com/index/introducing-gpt-5-4/
But I actually prefer to look at this one https://openai.com/index/introducing-gpt-5-4-mini-and-nano/
Latency, they measured to include tool call time, which doesn't really tell you tokens or requests. In the introduction to gpt-5.4 mini/nano writeup they have an Accuracy vs Cost measurement, where cost is directly correlated to tokens.
In any case, the difference between 5.4 medium and high is so small in terms of accuracy it makes zero sense to use high. And xhigh, I think that's only really useful if you want the agent to invent new physics formulas and not write any code.
ah yes more context bloat and compactions please
I use xhigh for plan mode occasionally but not with 5.4
I usually use 5.2 for that
Oh yeah, reasoning tokens do not get used in compaction, so all of that thinking they did to reach 10% gets wasted on the next agent. Reasoning tokens are ephemeral, and only useful for the agent producing them
like ive had 10 dollar single prompts with gpt 5.2 before
Exactly
exaccctly
Maybe Niffylord is using xhigh to invent new physics formulas which might be an ideal use case
thank you, i dont know why this is hard for him to understand, i am not coding with xhigh.
codex does these little updates though while thinking, so not completely wasted
my bad for thinking people in the CODEX chat the CODING harness were CODING
im sorri
in a bit of a bad mood
had a frustrating day so far
Commentary, sure, but all agents do brief single sentence commentary. The 200k reasoning tokens they made between commentaries disappear into oblivion during compaction, as in not included in the mental state blob at all for the next agent to glean from
Interesting i didnt know they werent part of compaction
Yeah, the way it works in the codebase is it replays the rollout log to /v1/responses/compact, which is I guess a special agent designed to produce mental state artifacts that normal models understand. The rollout log does not include reasoning, only user messages, tool calls, commentaries, etc. Reasoning is purposely excluded from chat history for security reasons or whatever. So unless the agent is dropping extremely insightful commentaries and not just "I'm rerunning the test to confirm if the problem is gone", all that thinking is wasted
You know if it includes tool call results? I'm a bit worried about poisoning context with posttooluse linting. In my opencode plugin I prune them from context myself. Might have to do that in my codex fork too
Probably need to check it directly 😄
You know what, I guess they updated it since the last time I looked https://developers.openai.com/api/docs/guides/compaction
The returned compacted window includes an encrypted compaction item that carries forward key prior state and reasoning using fewer tokens. It is opaque and not intended to be human-interpretable.
in the darkness
But this only applies if you send reasoning tokens to the endpoint. codex the CLI/IDE/App does not store reasoning tokens in the rollout log, and it sends that rollout log to the compaction endpoint, so the way it works in codex it will toss reasoning tokens.
the trees are full of starlight
So could potentially make better use of high/xhigh with a fork, hmm. I put codex on some self-investigation 😉
Seems like reasoning does go into history and compaction, and some older tool use results are already automatically pruned.
But what the endpoint does with them server-side is another question.
don't kid yourself - nothing is free or without rate limits. qwen is pretty terrible in my experience. Rate limiting and just doesn't work at all for anything that isnt super simple
lots of this and 429
Isn't it also training on you always? Or is that DeepSeek, or both?
its a chinese model ofc its stealing everything it can
the free openrouter is 1k requests so it last 1-2 hours a day if you use it all up.
I mean, the US companies aren't much better if any, but I already have enough strikes.
I'm not worried about them training on my data
yeah. but they are the only ones that 'just work' right now
GLM-5.1 : totally broken. GLM-5: nice but too slow
Ive heard that and I think I only tried via chat not agentic
me either
if its that sensitive you shouldnt trust anyone
The individual effects on me aren't what I'd be worried about.
what is your concern?
AI x-risk long-run, inequality escalation short-term.
I mean lets face it chinese companies dont have the hw resources the usa have, so they have to be innovative. and they are doing that pretty well
Indeed.
Hiya guys
My codex takes about 3minutes average to respond to basic questions that aren't even overly complicated
It's a bit annoying... anyone else experiencing this? Anyone know how to fix this delay?
maybe not the best place to ask question like that but anybody has experience and can tell if codex works well with C++ code? I get a lot of anecdotal evidence of codex doing either well or badly... depending on technology etc
i will probably go for some paid plan of coding agent in a month or two... claude looked like a well rounded option that isn't exactly "bad" at anything but now there is an absolute clown fiesta with usage limit problems of claude subscriptions
so it looks like opencode subscription plan with GLM model or codex are the only sane options left for "home use"
Yeah if you make it unit test its code and provide it with debugging capabilities it does just as well with C++ as any other language.
i consider using it for Unreal Engine game development so that may be complicated... 🤣
sorry if i should have provided more context in the first place
i saw a bunch of opinions about LLMs generally having the worst time with languages that are not exactly mainstream or are not for web development
because there is less focus on making stuff work right in these areas and there is less emphasis for providing training data there
Debugging maybe outta the question, but testing I think you can set up a harness for that. The models know and are trained on C++, it's just C++ gives you a lot more opportunity for footguns than most languages so testing is going to be a hard requirement
In any significant project you end up with some edge case to handle where the agent has a gap or does something wrong the same way every time. It's a bit of an art to handle. Every solution is just a few words away, and finding those words while being efficient isn't always intuitive.
Error running remote compact task: stream disconnected before completion: error sending request for url (https://chatgpt.com/backend-api/codex/responses/compact)
Anyone else getting this?
@delicate gulch I've been getting that all the time
My codex session keeps trying to reconnect session and always repeat the loop from the very first
It's been pretty unusable for the last couple days
I got an error saying its context size is too small to compact:
```
■ Error running remote compact task: {
  "error": {
    "message": "Your input exceeds the context window of this model. Please adjust your input and try again.",
    "type": "invalid_request_error",
    "param": "input",
    "code": "context_length_exceeded"
  }
}
■ Error running remote compact task: {
  "error": {
    "message": "Your input exceeds the context window of this model. Please adjust your input and try again.",
  }
}
```
Do nice web companions exist for codex? So codex can visually show and present me designs or ideas, where I see the design board, the todo list or the status of the tasks?
https://x.com/iruletheworldmo/status/2040508733944668408?s=46
is this confirmed anywhere else
lines up i guess, apparently they called the big daddy model spud. So i guess they are distilling that out into what they need.
Ugh
my usage jumped to 0%, no warnings or anything. normally takes me all week. 🙁
stealing my token for AGI I see
complaining so that we get our mid-week credits reset I see. keep it up, we need you!
mid week resets are actually bad
tbh
hot take
i would much rather have a better weekly usage limit
than getting weekly resets whenever
also it would be cool if there was a usage multiplier so like if you are doing stuff off peak hours it only consumes like 0.75 or 0.5x
Knowing these corpos they'll do the opposite.
Yeah. I've got a lot of automations and long running stuff I would queue up for off-peak if they gave me any incentive to do that.
I'm going to start running everything on Monday mornings just to spite them.
Malicious compliance is always a W. XD ❤️
to what end?
Messing with "Open"AI is a W by default at this point. 😛
Its a bit overkill though. I have pro account, and I eat through it all in 3 days, building only a single app atm. that is nuts. I used make multiple things at the same time and would barely touch my weekly rate. 5.4 fast must be really expensive
They raised like a bill i think and then also they're public so they get money from that
Bro, 96 hours later and my orchestrator is still maintaining an unyielding and adversarial stance on all code changes. I flippin love 5.4 🤯 first time going 3 days non-stop and finding real forward progress, waking up in the morning to 30 PRs and stuff getting done
sounds like gpt6 is going to be chaided
yeah i mean gpt 5.4 is super contrarian so if that works for your usecase it seems like an excellent choice
doesnt have the autism that gpt5.2 have
are you saying I am super contrarian? Because I love everything it's doing lol
lol no xD
but like
if you want llms to be able to say that other llms are wrong
i guess then it being contrarian is very good
whats your token cost?
/how many subs
~6 agents 24/7, and a couple agents in other projects
on a 200 dollar I presume?
what if on top of compaction
LLMs could call a tool to search thru the chat
for keywords
that would be cool
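A rough sketch of what such a keyword-search tool could look like, purely as an illustration. The `Message` type, the `search_history` function, and the sample history are all hypothetical; nothing here is an existing Codex tool or API.

```python
from dataclasses import dataclass


@dataclass
class Message:
    index: int  # position in the conversation, oldest first
    role: str   # "user" or "assistant"
    text: str


def search_history(history, keywords, max_hits=5):
    """Return messages containing any keyword (case-insensitive), newest first."""
    needles = [k.lower() for k in keywords]
    hits = [m for m in history if any(n in m.text.lower() for n in needles)]
    return sorted(hits, key=lambda m: m.index, reverse=True)[:max_hits]


# Hypothetical conversation that has since been compacted away.
history = [
    Message(0, "user", "Use 2-space indentation in the Rust crate."),
    Message(1, "assistant", "Done, rustfmt config updated."),
    Message(2, "user", "Now add a retry to the HTTP client."),
]

found = search_history(history, ["indentation"])
print([m.index for m in found])
```

The idea would be that the agent keeps only the compacted summary in context and calls something like this when it needs an exact earlier instruction back.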
if that is pro something is really wrong with my account. I eat mine in 3 days, using just 1 agent, only during daytime
thats odd
that what I thought too
I used to barely touch my weekly budget, now it's gone super fast
Fast is gonna bite you, but you're welcome to compare your configs to mine: https://github.com/robertmsale/.codex/blob/main/config.toml
I have a really conservative configuration, v0.116.0, and some heavy optimizations so I can fan out swarms without eating the quota
it also says it will reset 8th of april, even though the week ends 5th.. I am confuse
yeah, its probably due to fast.
thats the main change in how I work
weekly resets are based on the last reset day and when u started to use
yes pro
I thought losing the 2x until April 2nd thing was gonna hurt more than it does. I think if I'm at 50% weekly by tomorrow at noon I'll be on track to reach 0% before reset time
yeah
the 2x
ntb
well
I have like 180 dollars of credits left
so im gonna burn those b4 upgrading subscription
I might move to claude code if they fix their darn quota
cos id rather do a 100 dollar sub than switch between 5 20 dollar sub accounts or pay for a 200 dollar sub
I cannot justify 200 dollars per month right now
Its probably the fast use. we had a lot of bonus tokens recently, prob got used to that
The only reason I didn't do Plus + credits is because the 5 hour limit is so small you can't spawn a whole bunch of agents. I used to hit the 5 hour limit in like 1 hour or less lol
I dont have enough mental bandwidth to do a bunch of agents
when I need to get lots of stuff done I just turn on /fast and pray
I just did not want to bother with things, but from what I understand getting multiple plus accounts is the cheapest option, oddly
yes
I think
unless you want 5.3 codex spark
I could see that low latency being useful
I only really interact with 1 agent in each project (orchestrator). The workers are so sandboxed they can't do any damage. Worst they can do is waste time. But if 1 out of 6 agents drift, that's still 5 agents that made forward progress so it's worth it
anyway, grapevine says agi in 2 weeks, so I guess we can just spawn entire apps for 5 bucks
we already have agi
imo
like
opus 4.5 was agi imho
it can simply do whatever single thing you ask it to do
whatever that thing is
I don't think so until it can properly learn.
we have human level intuition machines now, but real work needs more.
artificial general intelligence
idk
I wouldnt say learning is an important thing
it is generally intelligent
but at this point more glue and optimization will get us very far indeed
learning is key for multistep tasks. many real world tasks simply are not solvable without the ability to gain deeper understanding
loading md files into the context can only get you so far
and it makes the bots crazy
but oai is working on it I guess.
research requires this ability. its basically the missing piece for research level agents imo.
the reasoning is there, it now needs to use its reasoning to find more reasoning patterns and absorb them so it can dig ever deeper into these reality patterns
peak vibecoding
#Techdebt
oh yea brother
Do you guys run your unit test suite sequentially or in parallel?
testing?
I only do 3 things
ask codex to do thing
git commit -m "stuff"
git push
👍
💀
Is it just me, or has codex use started being much more limited?
I'm on business, and the last few days it feels like my rate limits remaining drops like a rock.
its cuz the 2x is gone
or ur usage may have increased without knowing
Damn.
What's the most cost effective option now? The 2x was putting me on enough to work ok.
Second account
Yea, I just came to the same conclusion.
I have a PR coming with 150k lines of code xd
can i ask what you could possibly be cooking in that kitchen?
lol should find my PR when I refactored my entire app
Changing language for a big project (testing some things out to see if it’ll work out)
copilot is trash, though
and my agent framework does not support copilot
maybe i should add support for copilot next time
really need some reset relief after the 2X loss 😒
5h limit on top sub shouldn't be a thing imo. I can't see how it would affect them since the 1w limit averages it out.
By using it a lot
its ez these days. just single agent app dev with pro account is gone in a couple days
I have it running pretty much every waking hour though.
do people make it run auto?
I have pro subscription and will regularly hit limits. But I use it in my full-time job
can go pretty quick, especially if you use subagents and fast mode etc. I need cheaper smartz
I'm on Pro and use it quite a bit too, generally every day of the week, GPT-5.4 (high). Never have managed to hit the weekly limit yet. That said, I do use other things that may help save on tokens, such as rtk
did they remove skills or smth, i cant find it anymore
it shows just New Thread, Plugins, Automations for me
ok, something deffo broken. I just hit my full 5 hour usage limit with just 1 prompt, thats a new one
It's skill issue.
IMO: mostly wrong prompting, I've seen people continuing threads for 20+ messages, always trying to "fix" a bit more; this fills up context/usage quickly.
im pretty sure they are tweaking usage atm. this feels much much faster than normal. maybe need more compute for agi release
I was able to do a full week with just plus if I used it normally. now I can barely make a couple days with a pro account.
that explains literally nothing
this might be? I think so too. I think people with no skills just keep requesting "fix" and the codebase just grows crazy and the AI has to read everything
You’re trying to assert that the only people who are reaching limits have it on “auto” or are trying to “recreate google”.
I’m explaining through normal use, it’s easy to hit limits.
“I don’t hit limits so those that do must be doing something wrong” 😂
what I am saying is of course you use pro
of course you use it for your personal use or job
those informations are not adding any value is what I mean
it's pretty much "duh"
LOC isn’t a big problem if the code is still structured and doesn’t eat up context just trying to find the relevant part. It’s more problematic if humans still interfere or if it’s a legacy codebase (still IMO)
its just a case of use, use it a lot, use expensive models, etc, you will run out quicker, and it can scale dramatically. that said, there is no good way to actually see your usage, its all abstracted away. probably so they can dynamically adjust when they need compute etc. if you want hard numbers you should use api. but its way more expensive
You literally asked how people hit limits. My reply includes context and adds a bit more than “by using it…”
lol...
I’m using pro for my job, personal things, side projects and the occasional dumb thing. I’m not reaching limits.
I hope you are just messing with me cuz that's literally trolling level of answer xD
me too
same, but atm I ran out in just 3 days, normally I barely touch my week limits, unsure if I did something weird or usage changed
What kind of answer are you expecting? Different people have different uses. Who’d have thought. But by all means keep circlejerking around the fact that you must be a better user than someone else because you don’t hit a limit 🤷♂️
2x ended 2 days ago.
I think they are asking for use cases where the limit gets hit. Maybe we’re just missing something.
yeh, that is my guess too, a combination of lots of fast usage, and getting used to the 2x bonus, which all skewed my expectations
He was asking what specific usage patterns burn limits fast.
“People use it a lot” is too vague to be useful, and getting defensive made it sound worse.
A better reply would’ve been:
long coding threads, heavy agent use, expensive modes, and repeated edits can drain limits quickly.
Even shorter:
Your answer sounded dismissive, not informative.
He wanted specifics, not “different people use it differently.”
I hope it helps
I switched back to non-fast, but I think that I could have left it on. I’m far from hitting any limit.
that said, I switched to a plus account I have laying around, and a single prompt used the entire 5h budget, and 20% of the weekly budget. that seems steep. just regular 5.4 high
I do have a feeling things changed more than just 2x rate, or something is borked
but lets see next week
What nonsense. Look at your other messages. You didn’t want legitimate conversation or answers. You instantly started to deride others as “people with no skill”.
Have a conversation with your LLM if it gives you the answers you want to hear
You’re still arguing against something nobody said.
The point was never “people who hit limits are better” or “worse.”
The point was that your original answer was too vague for the question being asked.
were asking for examples of what kinds of usage burn limits fast. Once people gave specifics, the conversation became clearer immediately.
So this isn’t about “wanting answers I like.”
It’s about wanting an answer that actually addresses the question.
What is going on how come gpt5.4high burns up my complete 5h limit in just 1 prompt not even using 2x
You: “how do people reach limits”
E383838: “wrong prompting”
You: I think so too… people with no skills just keep requesting ‘fix’”
Hardly arguing against something nobody said when you literally said that
Yes, and that still does not change the point.
My issue with your reply was not that you disagreed with me.
My issue was that your answer to the original question was vague and then you got hostile when people said so.
The discussion got more useful the moment people started giving actual examples like long coding threads, agents, fast mode, repeated edits, and large context.
That is what I was asking for.
Not “people use it a lot.”
You weren’t asking for anything. It was purely rhetoric then you went straight in to attack others who may use it more than you.
No, I was asking why some people burn through limits so fast, and your answer to that was vague.
Yes, I made a separate comment about bad prompting. That does not magically make “people use it a lot” a useful answer to the actual question.
Once people started giving concrete examples like long coding threads, repeated fixes, fast mode, agents, and large context, the discussion became actually informative.
That is the difference.
Hahaha. I’m not arguing with an LLM - keep copy and pasting the replies.
You can keep pretending that instead of addressing the point.
Whether I wrote it myself or got help does not change anything I said.
Your original answer was still vague, and the conversation only became useful once people started giving specific examples.
So if you want to respond to something, respond to the actual point.
It’s an opinion/thought. I never said it’s true or can’t be different, it’s just what I observed and think.
Because you don’t say it. You fed it into an LLM and pasted the results back. The only original thoughts you had is that other people must be using it wrong because you personally don’t hit limits. The others must be “people with no skill” or must be “asking it to build the next google”.
Now you’re using an LLM to try to spin your original comments into something that wasn’t as condescending as you originally intended.
Your LLM also quotes your own/its own answers back to yourself, claiming that’s when ‘people’ started being more constructive.
Wonderful
Can someone give me their experience of using a coding agent? How is it different from Codex?
I find Codex amazing as it can build an amazing game demo from scratch and easily integrate many 3rd parties. Is this something that these "claw" coding agents can do easily as well?
if you are chatgpt plus subscriber (20$) what models do you get in codex?
everything except spark I believe, and of course Pro but nobody gets Pro in codex unless you pay API 😛
so theres 5.2 and its versions?
yeah you should see 5.2 and 5.2 codex on plus I believe
ty
It all depends on what the agent's role is. I have orchestrators that have quite literally hundreds of user prompts, but they remain accurate for weeks because their purpose is tightly scoped and the standard operating procedure never changes. Workers building features into the software, if you have em continuing for a long time they can drift.
Also it doesn't burn quota having long convos if the agent is constantly active. If they are idle for 15 minutes or more then OpenAI will clear the cache for that convo, so having an orchestrator constantly receiving small messages from workers and making quick decisions gives you a cached input bonus.
You know what does burn quota faster though? Something totally sneaky that you would never think about? Code formatters and indentation! A lotta formatters do 4 spaces and do stuff like this:
pub fn some_function()
{
^
}
Putting the curly brace on the next line, you get a whole bunch of whitespace if the brackets are deeply nested. The agents have to read that whitespace, and they have to apply patches to it. Total waste of tokens. I've seen some measurable cost benefits to enforcing 2 spaces and making max line width more than 80 characters. Also if you disable code formatters completely, they never have to re-read that section of code after formatting before applying a patch again. Little optimizations like that can make the quota last way longer.
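To get a feel for the indentation point, here is a minimal sketch that counts leading-whitespace characters for the same nested block at 4 and at 2 spaces. The helper names and the sample code are made up for illustration. It counts characters, not tokens: real tokenizers often merge runs of spaces, so the actual token saving is an assumption you would have to measure yourself.

```python
def leading_whitespace_chars(source: str) -> int:
    """Total number of leading space characters across all lines."""
    return sum(len(line) - len(line.lstrip(" ")) for line in source.splitlines())


def reindent(source: str, old: int = 4, new: int = 2) -> str:
    """Rewrite space indentation from `old` spaces per level to `new`."""
    out = []
    for line in source.splitlines():
        stripped = line.lstrip(" ")
        depth = (len(line) - len(stripped)) // old
        out.append(" " * (depth * new) + stripped)
    return "\n".join(out)


# Hypothetical snippet with braces on their own lines, as in the example above.
four_space = "\n".join([
    "pub fn some_function()",
    "{",
    "    for x in items",
    "    {",
    "        if x.ready",
    "        {",
    "            run(x);",
    "        }",
    "    }",
    "}",
])

two_space = reindent(four_space)
print(leading_whitespace_chars(four_space), leading_whitespace_chars(two_space))
```

Halving the indent width halves the leading whitespace the agent has to read and reproduce in patches, and the effect grows with nesting depth.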
something is off with codex because it drains the usage like crazy with one single prompt now
i know x2 rate is gone but its ridiculous rn. i could use it much more before
many people tweeted the same
lolwhat
I use my subscription pretty aggressively
on 20 dollar
just allocate as many tasks to gpt 5.4 mini as possible
codex --version also what features out of this list do you have enabled?
when you allocate a task
use this chart
there is almost no reason to use gpt 5.4 xhigh
nor gpt 5.4 high
gpt 5.4 low is slightly worse than gpt 5.4 mini high so use mini high for those kinds of tasks
or mini on medium for a bit cheaper
of course nano on xhigh would do better, but that isnt available in default codex api
do you know if theres something on the desktop app to enable the llm to use the question tool during non planning mode
or what the config.toml would be for it
I'm not sure. If the question tool is a structured output response format then it likely only works in planning mode
I noticed that planning mode came out right around the time app-server began supporting structured outputs so I think that tool is probably inextricably linked to plan mode
although I have to say gpt 5.4 high is much better than medium if you have prompting skill issues
ah k
thx
It would just be super useful if it could just ask a question when my prompts are unclear
rather than me watching and having to steer
only thing i used was skills creator., everything else is disabled by default. i use microsoft store app so it gets auto updated or smth because i dont specifically update the app myself but i see changes
god
couple hours earlier it wasnt so bad rn its draining like crazy
im saying like 7-10% on basic 1 prompt that is to do with coding but previously it used barely 1% with 6 similar prompts
do u have plus or pro
plus
i dont use fast, 5.4 and high
cc?
if yk how to get that
Oh gotcha, sometimes the underlying https://github.com/openai/codex updates, and with it comes some undesirable side effects. For example before v0.116.0, with features.unified_exec=true, the agent would poll background terminals every 5 seconds and it would reprint the entire command output + new outputs since the last poll, so commands that ran for a long time would exponentially eat away at input tokens. So with the GUI you automatically get bugs like that when the underlying server updates.
0.117.0 had an undesirable side effect where suddenly MCP servers require approvals and stuff, which was really annoying because the workaround I came up with for the command output polling bug involved using a MCP server to make em wait for the command to finish, and suddenly every single "wait" call was requesting approval. They sometimes very annoyingly and drastically change the way it works under the hood which is why I think that's probably what you're dealing with
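A back-of-the-envelope model of the polling behaviour described for versions before v0.116.0, as a sketch only. The function name and the numbers (50 tokens of new output per poll, 120 polls for a ten-minute command at one poll every 5 seconds) are assumptions for illustration, not measurements from Codex.

```python
def cumulative_input_tokens(tokens_per_poll: int, polls: int, reprint_all: bool) -> int:
    """Total tokens fed back to the model over the lifetime of one command."""
    total = 0
    for poll in range(1, polls + 1):
        if reprint_all:
            # Every poll re-sends all output produced so far.
            total += poll * tokens_per_poll
        else:
            # Every poll sends only the output since the previous poll.
            total += tokens_per_poll
    return total


polls = 120  # assumed: 10 minutes at one poll every 5 seconds
reprinted = cumulative_input_tokens(50, polls, reprint_all=True)
deltas_only = cumulative_input_tokens(50, polls, reprint_all=False)
print(reprinted, deltas_only)
```

In this simplified model the reprinting variant grows roughly with the square of the number of polls, which is why long-running commands hurt so much more than short ones.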
1 prompt drained right now the following:
from 5h limit - before: 25%, now 9%
from 57% weekly limit to 53%
quickly tested it, not going to touch it rn anymore
this is the 3rd time I hear this now, 1 prompt eating all tokens
ive had 2 prompts drain 100% of 5 hourly
pretty sure its a thing
1 prompt is enough of a metric as a comparison, if the difference is huge.
my pro account was ded in 3 days
and my plus account dead in just 1 prompt
drain 60%
something borked
1 prompt was 20% of weekly plus tokens, that is insane
that it could drain 100% of 5 hourly after the halvening
15min of ai work and account is dead
is ur codex up to date
usually gpt-5.4-mini could be used without a thought of how much is left. Now it gets used up every 5h period, and the total in 2-3 days.
What account?
that's... interesting
model = "gpt-5.4-mini"
model_reasoning_effort = "medium"
personality = "pragmatic"
[features]
multi_agent = true
holy crap! I just noticed that multi_agent
It was not there before
I was really surprised by the token use today, but I thought it was just me. now I see multiple people complain about the exact same issue
something is wrong
but oai is probably celebrating praise jebus atm
Where would that setting come from? An update that set it as default?
Nah, they are still losing money
thing is, chatgpt account is not static, they can throttle how they please, no actual token cost like with the api
doesn't websockets make it able to cache things?
so if they decide they need more compute for their coming agi release, they will just throttle us to bits
either borked, or on purpose, but it deffo changes
changed*
🤷 it seems to drain quota fast
Weird
"Hey chatgpt. Change the quota stuff so we lose less moneys"
tbh oai can most likely make tons if they just stop agimaxxing
wut
They are competing with other companies that do the same. Customers are not loyal.
they would get outmaxxed quickly, that is true
if oai stops agimaxxing
they will become nonexistent
and even just inference oai prolly loses money on codex subs still
I just wish I could get gpt5.4-mini capability offline. I would spend unnecessary money for that. 😛
Does the Plus plan actually help with the limits, like is there a good difference compared to the free plan?
thats true, but they can make bank before that happens. its a choice
is a thing on codex?
there is
you get 0.1 prompt with the current usage
So any idea? Like do you actually hit the limits fast?
plus is much better in every aspect
kimi k2.5 thinking is above gpt 5.4 mini on high
From your perspective, how are the limits?
much faster, more tokens, smarter, etc
how much do you code
Does it actually work as an agent in vs-code, or does it just provide code?
Like 2 hours a day as an indie game developer (student)
I used to be able to code all week, if I did regular human hours of work, single agent, etc
now I just have it run non stop essentially, so I need pro.
you will be fine on a plus sub if you are smart about it
imo it is worth
2 hours a day is totally fine with plus
Like it literally spends a few % of the weekly limit each time I write a prompt regardless of what it is
not totally fine
I've gotten sick of it lately
you can prob do 6 a day if you dont use multiagent fast hyper mode
wut
no u cant
as a plus user who allocates time efficiently
well
I suppose if you do 6 every weekday
just youll really be hitting that limit
I do not plan on fully-automating my workflow, I apparently am actually capable of writing smart prompts
youll have to use medium probably etc
Thanks!
have to use medium?
dude
medium
is literally
worth
its the best price/perf
and its not even close
I've used medium forever and it works perfectly honestly
So 2 hours a week is likely to be fine as a student, right?
*2 hours a day 😭😭
if ur using 5.4 medium and 5.4 mini on high
2 hours a day is good
you might want to buy like 1 set of credits per year
I use 5.4 medium all the time so I assume I'm likely to be fine, thanks!
a day will be fine, you should have plenty tokens for that from plus
isnt there a student deal also?
no
thought so
you get 100 dollars of free credits if youre a student tho
thats a lot
anyways, if we have to believe strawberry man its agi in 2 weeks
have they went and defined AGI now?
AGI is undefinable
opus 4.5 is generally intelligent
just like intelligence, its just, you are more smartz
some people think agi = asi
^
when ai starts doing things we dont understand anymore, we will be like, wow you smarts, you agi
ASI is what is assumed not to be possible in the near future no?
lol, I just looked at the requirements. At least a TB of vram. 😂
I agree on it being impossible
maybe tomorrow google drops a paper saying well we might have maybe developed asi
yes, but the gemma 4 models are highly capable
I think its the same thing essentially, buts thats the impossible definition bit.
imo agi needs to be able to learn on the spot, do actual human level research etc, but once you have that, it will essentially instantly self develop into asi
gemma models have always been insanely efficient
and the latest are a continuation in that trend
I tried using one as an agent, and it didn't work at all
it kept putting chain of thoughts into the documents it was supposed to edit
right now
for gemma 4?
I think so, yes
Maybe I did something wrong
or what they did
anyways
gemma 4 26b a4b is insanely capable
for a 4b activated
moe
I will check it out
its about equal to gpt 5.1 on high
thanks for the suggestion
and you can actually like
run it on your gpu
rather than having to have a $20,000 setup with 10 5090s
I have to settle for one 5090, until I have success with a game. 😛
actually, 5080
I'm investigating now. If I only need openai for the occasional heavier problem I'm happy doing the rest locally
in what? I found it completely useless in any agent coding... opencode, ollama
did u enable thinking
which quant and specific model did you use
It's just the usual hype, on a given task or bench it might be great, but it's not even close at general work
b/c MLX is broken currently with this in LMSTUDIO I use
lmstudio-community/gemma-4-26b-a4b-it
enable thinking where?
it does well in llm arena
basically as soon as you ask it to scan your code base for anything it dies and goes into a loop others reporting similar things
apparently its only useful for chat currently. about what I expect from google
well 256k context window
and what quant
are you using
like
tool calling sucks if you arent using 6 or 8 bit quant
why can't it call tools? it does just badly
it doesnt matter. its the same with and without like I said
was only testing
right now the llama.cpp that lm studio uses is broken for gemma models.
yeah but its equally bad in ollama pi
will just wait a week, I think its just too new
also
moe models in general do worse at tool calling
recommend to push for the 31b if you can
31b didnt help . plus it was too slow
I noticed very strange behaviour with codex recently. It somehow loses "flow" and starts replying to previous prompts instead of the new one
this is an issue with gpt 5.4
you can prompt your way out of it
Yeah usually next prompt works correctly, it just didnt happen to me before (and i use it a lot :P)
yeah I see this all the time in 5.4
it seems to forget the last one often just repeat it and its usually ok. or start fresh
you can even ask it "what did I just say?" it doesn't know
codex is amazing
it completed what gemini via antigravity struggled to complete for a week in less than an hour
Why no "Open in CLion" button in the drop down still? You have multiple other jetbrains IDEs already, this is annoying for doing C++ work.
looks like this https://github.com/ggml-org/llama.cpp/pull/21418
just gonna wait for next beta
sh8t limits though
One would think a business account should be at least able to work 8h a day for a few days a week. 😛
but damn, is it good
I've spent a few hours trying to get a local llm to help me instead, but I'm not succeeding even a bit
on the plus side, I could have spent the time reaching my limits.
Is codex/plus worth it?
What would you use it for?
Vibe coding
More specifically, what kind of stuff, and how much?
alright what is the consensus on this new rate card for the monthly plans: https://help.openai.com/en/articles/20001106-codex-rate-card
Errr there's a ongoing GitHub project and I'm gonna use it a lot haha
But other than that.... I run my own llms
So hmm
Mostly try to fix the GitHub or add features
So I've not gotten local LLMs to work directly as agents, but gpt codex does a marvelous job. It's a bit limited though
The best you can do is probably get one months subscription and try it
Hmmm
Cause I did try Gemini cli
It's okay? But I feel it can do better
From what I've understood, openai and anthropic are the best
Atm deciding Claude or codex
Yea yea
The project is run on rust
I do wanna see if I can vibe code to handle an sdk
I've heard that anthropic is worse when it comes to price, so I guess how much use you get out of it before needing a larger subscription, but that's not confirmed
How does openai work?
you can get a plugin for vscode, and use it directly there
you have a few different models to choose from, that have different limits and capabilities
What about limit? Weekly? Or daily?
And it's not uncensored right
I would still go back to local for that
I don't quite know how the limits work yet, so someone else has to chime in here. All I know is they recently got reduced.
No, not uncensored. But you probably don't need it to be. Content shouldn't be part of the code anyway, but rather read from files.
It will probably help you build jiggle physics, without you needing to talk about what is jiggling, for instance.
Speaking of limits, this is painful
I accidentally had it able to spawn sub agents, which quickened the pace this ran out
there is a 5h limit, and a 7d limit
So you constantly always use it up? 🤣
unless you're a free user, then it's just a weekly limit 😛
Yea, I do
limits though are pretty generous imo, well, certainly on Pro, you can get a lot done
Pro cost like.
Gulp.
$200 or depends on your currency yeah, not cheap but worth it depending on your use case and effective profit I guess
next week I'm really hoping we see the prolite plan (Pro 5x, unless they change that when they release it), which currently is set to be $100/mo. That said, I had to renew Pro today so it's too late for me to consider that plan, on the bright side at least there's four weeks to get it done
I am making a game. Haven't seen income in over a year. 😂
What game?
Don't think I'm allowed to talk about it here
Also a bit off topic
Feel free to PM me though
anyway, if it succeeds I suspect I will get pro. Doesn't need to be a huge hit for that.
I see I see
havent really used much this week so i get to give fast a run
Sign-in failed: failed to start login server: An attempt was made to access a socket in a way forbidden by its access permissions. (os error 10013)
Anyone know how to get past this error on the windows 11 codex app? Trying to login for the first time and getting this error
I'm having the same issue trying to sign in
does a business seat have the same usage as a plus seat for codex?
the fact that this isn't easy to ascertain is terrible
no, pretty sure it's less quota than Plus
oh
it's the same as plus for usage
https://developers.openai.com/codex/pricing?codex-usage-limits=plus#frequently-asked-questions - compare Plus and Business messages there
Business has lower message count compared to Plus
i could have sworn i got a message saying the GitHub code review usage was being folded into normal usage buckets on april 2nd, but it does seem to have happened
I remember talking to a guy who had 2 business subscriptions (obviously on trial) and some random plugin to automatically switch accounts when the quota on the first one is gone, so this is kinda fair tbh, even though that's still possible to do
In a discussion about a Python library written in C, regarding data round-trips between the C code and the Python, codex started emitting Chinese lottery spam at me out of the blue O_o
ah I see, but it's not very clear whether the new rate card is better value or not 🤔
Well time to test out all of the changes, freaking miserable PR
that's... a lot
Yeah I'm testing out how it would look in a different language, it isn't even the full migration just the concept of it
Could that be the type of task that should be handled via the API, using one of them batch things?
Maybe but I'm still just using the pro sub
The mobile game gem economy applied to developer tooling. 👍
Huh they changed usage?
going to change in the coming weeks
Anyone analyzed what it means practically yet? (Like more usage, less usage, more targeted usage, etc)
give it to codex to do
**Verify this yourself:**
1. Go to Codex settings → Usage → "Buy more messages." Note the price: 1,000 credits = $40. That's $0.04/credit.
2. Go to the Codex rate card (Business token-based table): GPT-5.4 output costs 375 credits/1M tokens.
3. Multiply: 375 × $0.04 = $15.00/1M output tokens.
4. Go to the public API pricing page (developers.openai.com/api/docs/pricing): GPT-5.4 output = $15.00/1M tokens.
5. They match. Repeat for input and cached input. They all match.
**Conclusion:** Codex credits are a 1:1 wrapper around API token pricing at $0.04/credit, sold in $40 bundles through a UI that never shows you the equivalence.
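The arithmetic above can be sketched as a few lines of Python, using only the figures quoted in that message (1,000 credits for $40, 375 credits per 1M GPT-5.4 output tokens, $15.00 per 1M on the API page). The function and constant names are made up for illustration, and the prices are whatever was posted at the time, so check the live pages before relying on them.

```python
from decimal import Decimal

# Figures as quoted in the message above; they may have changed since.
BUNDLE_CREDITS = Decimal("1000")
BUNDLE_PRICE_USD = Decimal("40")
OUTPUT_CREDITS_PER_1M = Decimal("375")     # GPT-5.4 output, Business rate card
API_OUTPUT_USD_PER_1M = Decimal("15.00")   # public API pricing page


def usd_per_credit(bundle_price: Decimal, bundle_credits: Decimal) -> Decimal:
    """Dollar cost of one credit, given a bundle price and its credit count."""
    return bundle_price / bundle_credits


def credits_to_usd_per_1m(credits_per_1m: Decimal, price_per_credit: Decimal) -> Decimal:
    """Convert a credits-per-1M-tokens rate into dollars per 1M tokens."""
    return credits_per_1m * price_per_credit


price_per_credit = usd_per_credit(BUNDLE_PRICE_USD, BUNDLE_CREDITS)
codex_output_usd = credits_to_usd_per_1m(OUTPUT_CREDITS_PER_1M, price_per_credit)

print(price_per_credit)                            # 0.04
print(codex_output_usd)                            # 15.00
print(codex_output_usd == API_OUTPUT_USD_PER_1M)   # True
```

`Decimal` is used instead of floats so the comparison is exact. To repeat the check for input and cached input, plug in the credit rates from the rate card and the matching API prices; those numbers are not quoted in this chat, so they are left out here.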
Okok, and for the plus and pro subs? Will they be the same? I'll check it out if no one has done the analysis
