#codex-discussions
Download and try it, works with pretty much all LLMs and also connects to auth
its a decent fork
What model do you use in it?
Kimi atm
Did some with Codex, but I fancy Kimi more on some UI stuff
I see
the change in token output for the same model in 2 different harnesses is because the harness can alter the inference dynamics. as the most basic example, you can significantly alter that with the system prompt
not system prompts so much as tool descriptions and behavior
also tool output limit etc.. for example opencode automatically prunes old tool output, which keeps context smaller but reduces cache hit rates
was a basic example. if i tell the model in the system prompt to verify every step 3 times... what do you think will happen with token usage lol
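A minimal sketch of the pruning vs. cache trade-off mentioned above, assuming prompt caching only reuses the longest shared prefix with the previous request. The token lists and function names are made up for illustration and are not how any specific harness or provider measures this.

```python
def common_prefix_len(a, b):
    """Number of leading tokens two prompts share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def cache_hit_rate(requests):
    """Fraction of all prompt tokens that match the previous request's prefix."""
    hit = total = 0
    prev = []
    for req in requests:
        hit += common_prefix_len(prev, req)
        total += len(req)
        prev = req
    return hit / total if total else 0.0


system = ["sys"] * 4          # hypothetical system prompt tokens
tool_out = ["tool1"] * 6      # hypothetical old tool output tokens

# Append-only harness: every turn extends the previous prompt.
append_only = [
    system + tool_out,
    system + tool_out + ["turn2"] * 2,
    system + tool_out + ["turn2"] * 2 + ["turn3"] * 2,
]

# Pruning harness: the old tool output is dropped on the third turn,
# so the prompt is shorter but only the system prefix still matches.
pruned = [
    system + tool_out,
    system + tool_out + ["turn2"] * 2,
    system + ["turn2"] * 2 + ["turn3"] * 2,
]

print(cache_hit_rate(append_only), cache_hit_rate(pruned))
```

In this toy case pruning sends fewer tokens overall but a smaller share of them can be served from cache, which is the trade-off described above.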
codex should give more love to windows haha, after all msft is big investor...all the cool features always come first only to mac
windows is more of a pain to develop for than macOS
i'm using wsl, not windows for codex
They should give Linux more attention then given msft runs on it
Also they’re backing away from Microsoft
if only windows would not have stuff like powerpoint, if it would run on linux as you say lol
OpenAI should develop all features on macOS and Linux, and have Windows users do X11 forwarding to WSL to get the GUI app and stuff. That'd definitively solve every issue
in what sense is windows running on linux?
Libreoffice!
Stability & performance mostly
bro windows is not linux based os
if it would, it would be better
you have different file system, different everything
I said run the apps in WSL and use X11 forwarding
Linux runs on Windows, WSL is like the best thing Microsoft ever did 😂 jk
yea but you joined in the midst of
which i m not sure what that means
I think I'm gonna have to switch to the CLI to keep working on my game, been having nothing but problems with Windows desktop apps, both Codex and Claude keep freezing up
100% of Microsoft Azure servers run Linux
https://www.microsoft.com/en-us/windows-server/blog/2015/05/06/microsoft-loves-linux/ this should be a good starting point
It's actually kinda genius. Microsoft makes Windows Server, but they honestly prefer not Windows for serious computer usage
yup if that was the point that's ok, thought it was about windows since that was the topic
does anyone still use windows server? i think that used to be a thing like 20 years ago
They extended to AWS too
Which also extensively uses Linux
what is cursor
It's like VSCode for LLMs
I really hope not lol
Here more info on the Linux usage within azure
I feel like the fact there's no credible public percentage is because it's not 100% and that's kinda embarrassing
anyone else notice GPT 5.5 getting noticeably slower in the last few days than like a week ago? I'm on the $100 plan
Is it noticeably slower on a new session or only long running ones?
Ai has regressed to aol levels, I put in a prompt and literally walk away knowing it’s gonna take so long
skill issue
The reset happened and usage shows 97% left instead of 100. Has anyone else faced this issue?
i didnt even get the codex reset yet :<
some companies do
people usually dont, because you literally have to pay monthly for its license
usually its just a waste of ram for the os though 🤣
(individual might just run win 10/11 pro if he needs windows, although licensing wise, probably shouldnt, you can debloat it to under 2gb ram, sometimes you just gotta build in windows though)
Fast xhigh till then
active directory is still a thing for large orgs
Any session
reset twice?
yep reset 😂 (again)
probably due to all the reset issues 🤣
and the 5h window keeps moving again
so its just resetting over and over again
well, I am not complaining
I think it's about the best coding harness, they have their own model. It's more designed around serious coding than vibe coding but you get major models at 50% of api rates. super expanded proprietary model and for 60 to 100 bucks enough high tier model but it gets expensive if you are not conscientious. It feels closer to the metal, less insulated.
What is this reset i keep hearing about? good or bad?
recent build was exposing EXACT minutes of compute time now it updated and its gone 🙁
company I used to work for had windows servers, as some guy who used to work there had a pet .net programmer and he'd get him to write him little undocumented custom apps all the time. This ended as well as you'd expect
I think I just described vibe coding before vibe coding was a thing
Cursor is expensive, that's what it is lol
I spent $52 in a single session with Opus 4.7 on Cursor that probably would have only used like 2% of my weekly Claude Max usage. Paid $200 for Cursor Ultra and got like four days of use for the entire month, it's a ripoff
Anyway, I'm done with Windows. Codex and Claude Code are giving me nothing but headaches so I'm trying Linux. Not using a VM though, gonna dual boot it like I did 10 years ago lol
Heck yeah I hope it goes smoothly
Everyone's tokens for the week got reset 2 days early...twice
Such lovely timing, I almost made a 2nd account ahahaha
heh, I just YOLOed to prod and am finding all the issues live. The curse of having a lot of users and being contactable
go for it, use xhigh + subagent orchestration + codex review loop for max burn
I changed the site backend storage and the visuals at the same time. I made it black/gold/red so its been a mix of "oh that's awesome" and WHERE LIGHT MODE. So I made light mode ...
dont listen to the medium reasoning nerds
Anyone got twice the reset? Already used 10 % after first reset, then boom back to 100 % again, not complaining
yes
I can spit some bars on low reasoning if you got a min or two 😁
Weren't we supposed to get a reset tomorrow?
it will be 7 days from the first turn after the reset
so dramatic
Codex usage got NERFED hard today or what
One single action on 5.5 xhigh
Feels a bit much, but not enough to feel certain something has changed
Just curious, what are you using xhigh for? Hope its just planning and not actually implementing …
That’s a true waste of usage
what plan o.0
Which model do you guys usually stick to and when do you switch to 5.5?
5.5 indeed seems to use a lot of tokens even on easy tasks
For pure implementation from a good plan I find that codex 5.3 is top notch when wanting to be cheap on usage
Does anyone know what "Sources" refers to in the new thread panel in the codex app?
Thank you!
nice, unfortunately I finished a project and my brain is almost empty, planning but no real codexing 🙂 feels nice to not be grinding for a couple days.
My setup consists of Claude and Kimi MCP workers within Codex, but for purely OpenAI models, my take is to have GPT-5.5 High handle planning, using xHigh for more sensitive work, then let GPT-5.3-Codex do the implementation. After that, GPT-5.5 reviews the code. If you instruct GPT-5.5 to be detailed and specific about the scope, GPT-5.3-Codex performs well, and you save a lot of tokens. You can instruct 5.5 to delegate some smaller tasks to codex spark, which has its own usage, to save more. But be careful, ONLY simple small tasks for that hallucinating speedster
What would you do if you built an algorithm that could make AI model usage up to 33,600× cheaper?
It also makes it possible to run large AI models on regular consumer GPUs instead of relying on enterprise GPUs that can cost $50,000+ each.
That’s the question I’m stuck on right now. It almost sounds too good to be true, but I’ve already written 300+ tests, and every step is backed by reproducible evidence.
Oh so rather than using 5.5 for execution, you use it for planning, if I'm not wrong. But does it actually help?
keep it to myself, hire 33,600 developers with free one ai subscription/ 20 dollar level, to pump out a half million developers work and rule the world
You're not alone, there have been plenty of ppl sliding through the channels making claims like yours. One minute model training. AGI with emotions. 1% token usage. Usually see them in the claude code general channel more than here though.
Publish a paper...
No one ever does
I’m what you could call a power user. I manage around 15 projects at once, so I have to stay structured, organized, and mindful of usage, even though I’m on the $200 Codex plan and also use Claude and Kimi.
For me, an agentic workflow with parallel agents and different models for different tasks is the go-to approach for control, progress, and overall project management.
If you’re somewhat new to this, I suggest telling Codex «Create an agents.md file that says I want GPT-5.5 to handle planning and delegate tasks to the OpenAI model that best fits each task, while staying cost-efficient without compromising quality. Always end with a code review.» This simple sentence is a good start, and after a while you will tune your agents.md and get it even better and see that your usage is going down while the quality goes up = )
I'm really glad, thanks!
I don't know; publishing an article doesn't make much sense right now. I'm just trying to figure out what task I should set for this to become a clear proof. In my case, this isn't about training an LLM completely from scratch, or in a minute. I'm talking about how this algorithm allows you to work with LLM weights without running the entire LLM in the usual inference way.
Based on my current test evidence, I’m running an 8B LLM workflow at ~782M scalar ops/sec, with less than 8% GPU load and less than 10% CPU usage.
You can't expect anyone to take your work seriously until you prove it.
This is the question, what can I do as proof without revealing the algorithm?
Go make a product and undercut the SOTA leaders by a big enough margin that you gain market share and just iterate just under the leaders at every turn, run it on a computer in the corner of your room.
No wonder gpt crashes at times, I mean, they probably be running it on auntie‘s old compaq in the attic
Show it working
what if you run it on xhigh with /fast tho
Right I mean it takes but like a videocassetterecorderdevice or something like that
But probs that’s gonna disclose the Venusian algorithm secret sauce
This doesn't work with LLM providers. It works with pure LLM weights
Ah, pure unsullied LLM weights. organic
Then use it on the open source deepseek model
Or kimi
Or even oss
Currently only verified at https://huggingface.co/Zyphra/ZAYA1-8B
scale it to 1.5 trillion
It's much more than it seems.
Then publish
Well hey, I'll give credit where credit is due. It is real 🙃 don't see that very often! But I do think it probably belongs in #ai-discussions where they talk about different AIs. Probably would be well received over there
what is much more?
The models are not identical, and I don't think it would be wise to describe what I do. But this isn't a one-size-fits-all recipe for all models. I haven't gotten to such large models yet. I started with small ones and have only gotten up to 8B MoE so far.
Wait a minute, you said there was no research paper but this ZAYA1-8B has a research paper https://arxiv.org/abs/2605.05365
U blowin smoke rn?
I'm talking about how I launch and use the model, not the model itself.
Why doesn’t publishing an article make sense
What's the point if I open up the playbook but don't get any benefit?
You’re telling us you have something that could help all ai labs that would credit you if it could help them and you don’t want to do it?
All= closed and open source
Hey guys I invented cold fusion but I’m not sure what to do with it.
Don’t want to show how I do it because I want some profit but how?
I'm not yet confident enough that this works the way I think it does. I'm still testing and verifying it. I'm not confident enough that what I'm doing is even possible. It's too good to be true. So before that, I'm trying to find something that will refresh me.
for me these claims are easy to validate. if someone found some amazing algo to improve how ai reasons
then arc-agi3 should be a piece of cake
which is an open contest. so you just go do 100% on arc-agi3 and you become famous overnight
As a rule of thumb - when you think you’re the first with the idea or the one publishing it… be sure there’s been plenty of others in positions with leverage and sources having had the same idea
Specially in this particular field.
When have you last heard of the backyard inventions gone famous? Other than some tiktok streamer I haven’t.
what if the improvement isnt intelligence, but instead performance efficiency?
That's interesting. But first, I'll try running this model: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro . That'll make more sense.
I would say OpenClaw 🤡
Perfect model for consumer hardware
that's even easier to prove. i mean if you can put vanilla 5.5 and your enhanced 5.5 (think spark)
give them the same task 100 times
and they produce statistically similar results
but yours does it 33,600x faster
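A rough sketch of the "same task 100 times" comparison above, assuming each run is recorded as a pass/fail flag plus wall-clock seconds. It uses a plain two-proportion z-test, which is only one possible check; failing to find a difference is weaker than proving equivalence. All names and numbers here are illustrative.

```python
import math
from statistics import mean


def two_proportion_z(pass_a, n_a, pass_b, n_b):
    """z statistic for the difference between two pass rates (pooled variance)."""
    p_a, p_b = pass_a / n_a, pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return 0.0 if se == 0 else (p_a - p_b) / se


def compare(baseline, candidate):
    """Each argument is a list of (passed, seconds) tuples, one per run."""
    z = two_proportion_z(
        sum(p for p, _ in baseline), len(baseline),
        sum(p for p, _ in candidate), len(candidate),
    )
    speedup = mean(s for _, s in baseline) / mean(s for _, s in candidate)
    return {
        "z": z,
        "quality_differs": abs(z) > 1.96,  # roughly the 5% level, two-sided
        "speedup": speedup,
    }


# Made-up data: 100 runs each, similar pass rates, different latency.
baseline = [(i < 80, 10.0) for i in range(100)]
candidate = [(i < 78, 2.0) for i in range(100)]
print(compare(baseline, candidate))
```

The same harness would work with an open-weights model on both sides, which sidesteps the closed-source problem.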
Not really new tho?
Just momentum
You can't do this, openai is closed source
come on, use deepseek or any other oss model
you got the idea
it s the easiest thing to prove improved efficiency, if you claim you don t affect performance
I mean this is obviously the way forward if claims are true. Not this: "then arc-agi3 should be a piece of cake"
FWIW it's difficult to even get a 8B model to output 1000 tok/s. The black magic they're doing with spark is on a whole other level
arc-agi is about reasoning quality, obviously
and yet you still suggested it as a way to prove the performance efficiency
there aren't many axes along which you can improve models, point is each has quite clear proof standards
I'm still in the process. Thanks everyone for the advice. I'm not making any statements yet. I'm just guessing based on my testing results. But I don't trust them. I'm strengthening them and implementing more stringent validation of each step so I can't fool myself.
read again. i was explicitly referring to "improving reasoning"
not saying this specific method was the way to validate the specific claim of the guy
which i added afterwards, since you asked about it
Right but the context of the conversation is about performance efficiency. Now somehow you expect that your comment is not in that context? OFC it was, you just missed the mark by 10 miles.
bro you think you are some kind of moderator at a conference? :)))
i chipped in with something that connected at a more general level on how a different axis in improving ai would be easy to validate
Youre happy to sit around and tell people they arent qualified to talk about subjects, the irony is real. I can tell the point is missed on you so i wont bother again.
that s from earlier today right
sry about that 😉
bro my current run with Codex is getting towards 24 hours 😭
WOOOH
@cedar skiff now that limits are reset, maybe you could tell me what i could do, on a vanilla Codex, to basically redo OAL, so that i stop running into major usage issues?
I don't really know, all i do is temper my work rate on the automation. I get it to use 5.3 for implementation. For audit/research work i use 5.4 mini, and i make sure the prompt is strong enough to support it. So if i know there is a large refactor for something, my process would be:
Formulate a prompt for 5.4 mini to do the audit, batch 5 files per instance and start subagents 6 at a time to do the audit. The result is a list of files that need the work. Then get 5.5 high/med to orchestrate 5.3 to do the actual work.
How much work a 5.3 agent gets is based on how hard the task is.
Usually I do this in the moment, i do have some workflows that are always the same though. Writing bulk integration tests and unit tests never changes.
I have been meaning to build a skill to do skill refinement loops. But it is pretty easy to just tell 5.5 what to do, it takes about 5 minutes to get it started.
I guess mostly what i do is kind of specific and because i just yap into the mic and i have done it so many times the process is easy.
TLDR: use 5.3 for repeatable implementation tasks, use 5.4 mini for repeatable audits and code base searches. Use refinement loops on a small subset of what you are targeting to get that working.
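The batching described above (5 files per audit instance, 6 subagents at a time) can be sketched like this. `audit_batch` is a hypothetical stand-in for whatever actually launches one audit subagent; here it just flags files by name so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def audit_batch(batch):
    # Hypothetical placeholder: a real version would hand `batch` to an
    # audit subagent and return the files it says need work.
    return [path for path in batch if path.endswith("_legacy.py")]


def run_audit(files, batch_size=5, max_parallel=6):
    """Audit files in batches, running at most `max_parallel` batches at once."""
    flagged = []
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map keeps results in the same order as the input batches.
        for result in pool.map(audit_batch, chunk(files, batch_size)):
            flagged.extend(result)
    return flagged


files = [f"mod{i}.py" for i in range(12)] + ["billing_legacy.py"]
print(run_audit(files))
```

The flagged list is what the orchestrating model would then split across implementation agents.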
Temper your usage and accept what we have will never be enough - we will always want more.
what do you use on the primary side? the top-level? the one you see on statusline?
&& on what effort levels do you have your things at?
I manage it in the moment, 5.4 mini is always high or extra high, 5.3 is always high.
For my general work that i do in the moment i use 5.5 high/med.
I never use extra high on 5.3 or 5.5
I just use the codex usage screen here, i look at it constantly lol. I'm always trying to manage it so i have enough but not too much in case of a reset.
do you ever use /goal, && if so, what's your primary model for it?
at this point i should conduct a proper interview for you
/goal has a terrible flaw
what's the alternative?
If the thread is blocked because, say, a necessary credential is missing, it just keeps looping, hardening and doing unnecessary stuff
Meaning it is not really moving forward, but at the same time it is not communicating what it needs
interesting
I only figured this out because of /side
But this doesn't work... /goal needs the ability to pause, to ask for stuff
I have been using it for tasks lately, ones that are big enough that they will take me a while but not so big i can justify building orchestration
I used it a few times to refine skills
bro did not wait for confirmation o.0
Reset came ?
yeah mine got reset
Hi there 👋 Anyone experiencing a "We’re currently experiencing high demand, which may cause temporary errors." error for the last 10 minutes, using external llm provider? Especially Openrouter (BYOK) and Deepseek
I noticed some users had this issue in the past with models from OpenAI, but what about external models?
Nice
so if we get a new gpt 5.x model for about every month, will gpt6 come out in 5 months?
I've been vibecoding with codex for 3 days, at least 14h per day and I've successfully achieved.. a sleeping disorder!
Niiiice 
switch to uberman
youll love it
really maximizes usage
also don't forget to eat
hello! i have an issue with codex, i was programming with it, took a break of about an hour, came back just to see that my 5-hourly usage was completely consumed?
I'm trying to use Codex with WSL. With WSL environment and terminal.
But it is not as nice as VSCode/Cursor/Zed using a remote connection.
you could always ask it what it was doing
Codex mobile doesn't work or show up
I'm an iOS and MacOS user experiencing issues with Codex on my Teams plan. The Codex popup isn't appearing, despite having the latest versions of both ChatGPT and Codex. Is this a rollout delay or a bug?
Strangely on my personal account, it shows the popup for both iOS and MacOS app.
Does anyone have this issue?
i cant because i was using thru an API and there was NO chat running and this is the first time it's happened
api. Well there's your problem
Can i connect from Windows Codex App -> Mac Mini Codex App?
you can just connect to actual codex, why would you need this?
oh i see
Want to use Mac Mini from Windows
never mind i spoke first thought second
I think I made some progress. I made a WSL ~/workspace/codex/<project> workspace with my git repos. Then opened that as a project in codex.
yeah i guess lol, it's weird tho, never just drained it before
probably got stuck on something and kept trying
Is codex remote working in ChatGPT business subscription? I get the below. This is iOS app.
I’m using the latest version of ChatGPT
is it me or is codex absolutely terrible at UI design?
Claude did so well instantly first time when I tested it
I read that the codex app "supports ssh", I am wondering what they mean by that exactly? Can my local codex app simply work with a remote HPC cluster? or do I need some kind of server component on the remote first?
It seems necessary to have codex installed on both machines, the local AND the remote. "Codex is not installed on this remote machine".
I am in a scenario where I cannot install codex on the remote HPC, it is against the terms of use, but I can ssh into the HPC cluster.
I tried things like using codex via sshfs but it is extremely unstable that way.
openai lacks design sense tbh
Not true
I like their brand aesthetics more than anthropic’s
Who just tells an ai to build what ever for a website anyway?
Like hey make this gardening web site make it pretty!
Nobody
exactly
I like to take inspiration from a website I saw and liked then tell the AI to take only parts of the design and invent out the rest to create a different version of it
Codex is good at this!
“Muhh codex sucks at design”
You told him to build a pretty website and make no mistakes bro
oh yeah its defaults are awful. I gave it a sketch so it nailed the layout but invented colours, I still need to fix that
Generate assets with gpt image 2
Works well
That's the thing, I've shown codex generated images from image 2 and it just does the opposite of what i asked xD
I am a newby vibe codey gimp, yes I understand I currently lack any artistic skills but I'm just being critical where critical can be given
you have to give it the ability to see what it is making,
the loop is
- look at this image and make it
- look at what was made and compare it to the image
- iterate.
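A bare-bones sketch of that look / compare / iterate loop, assuming the similarity score comes from something outside the model (for example an image diff) and that there is a round limit. `render`, `score`, and `revise` are hypothetical callables you would supply yourself.

```python
def iterate_until_match(render, score, revise, threshold=95, max_rounds=5):
    """Render, score against the target, and revise until the score is good enough.

    Returns (round_number, score_history); round_number is None if the
    threshold was never reached within max_rounds.
    """
    history = []
    for round_no in range(1, max_rounds + 1):
        output = render()          # e.g. take a screenshot of the current build
        s = score(output)          # e.g. similarity to the reference image, 0-100
        history.append(s)
        if s >= threshold:
            return round_no, history
        revise(output, s)          # e.g. prompt the agent with what still differs
    return None, history


# Toy stand-ins so the loop can be run without any real tooling.
state = {"quality": 50}
result = iterate_until_match(
    render=lambda: state["quality"],
    score=lambda out: out,
    revise=lambda out, s: state.update(quality=state["quality"] + 20),
)
print(result)
```

Keeping the score and the stop condition outside the agent means the loop does not end just because the agent believes it is done.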
why is plan mode locked
5.5 making some odd mistakes tonight
@rugged sapphire spam
It's not able to do task it can usually do for me 😬
Am I the only one getting this error on Codex CLI? I haven't been able to work since last night
yes. tried debugging it with a new codex session? tell it to look at the codex session logs and figure it out
sounds like a hung process in your system, maybe as simple as reboot (or manual process kills)
Already rebooted
did codex find anything?
should be able to run it up to the compact context size at least right?
I can prompt nothing
But it says error with remote compact, that shouldnt happen in a fresh session
alternatively install pi or opencode, login with chatgpt auth, and use that to debug codex
I had a conversation in the mobile codex app, i started it on the pc then moved to mobile. It did a bunch of stuff in the mobile with me had a back and forth etc. Then when i got back to the pc none of it actually happened?!
No, I cannot see the exact in-app browser pane you have open. I only receive the URL metadata from it. I should have been clearer about that.
I searched for a browser tool; the exposed path here only gave me Node REPL, and Playwright is not installed there, so I can’t live-inspect your browser pixels/DOM directly.
???
anyone know fix??
codex browser not working
@cedar skiff Did you try rebooting it?
i read somewhere that the sessions are saved server side not locally, maybe related?
rebooting my computer?
I just have to restart codex if that happens
what are we talking about here?
@cedar skiff I meant restarting Codex
The mobile -> codex problem?
app, when i got back to the computer it was still doing the previous task that i had stopped and got it to audit on mobile
mobile is just a proxy for the app. It could be that working from mobile does not trigger UI updates on the App
So it was still writing files and building out an app lol
was really weird
I just killed the sessions and moved forward
are the 2 different weekly usages completely separate like the GPT 5.3 Codex Spark and the regular Weekly usage? or does the Spark also affect the main Weekly?
Spark has its own usage limit
The trouble is, Codex will call it good even when it isn't anything like the concept you showed it.
The loop ends when Codex says "yeah that's exactly what they wanted" even though it's nothing like the concept and might have a dozen obvious problems
Is this not out in Codex app yet?
Where would i use it? i have pro as well
it is you just have to ask it to use it by saying use the create goal tool
not that i know of, but it does have the tools and knows how to use them, and it does have a nice little ui that it uses that lets you pause and edit etc
anyone else seeing funny reasoning traces recently
i saw codex complain about dirty files in the worktree, wondering if thats normal behavior to let dirty things lie around for users
Yeah i had a few odd things lately, once it was doubling up on words. Every word was said twice
Oh ok nothing like that, more like "out of character" for the supposedly autistic gpt 5.5 😄
Tonight it has been getting stuck on tasks it usually handles
Stuck how?
Maybe its because i cussed it out earlier today
not completing them
Big multi-compact goal type stuff or smaller?
like taking a screenshot and doing a comparison to an image, it couldn't take the screenshots, fumbled around for 5 minutes trying a bunch of stuff, but i have done this a dozen times in the last few days
tried two sessions, ended up going back to a session that could do it, getting it to explain how it did it and then giving that to a new session
Oh ok. See that occasionally yea, I guess in my case thats the typical xhigh overthinking while i'm sitting there "just run XY bro"
Now i am going to fork that session at that spot and get it to make a skill for this task so we just never have a problem again
if i remember
I have a debug-session skill that i occasionally use for this, to make it read the raw session logs and look for this kind of stuff to improve workflows
Guess its my ghetto version of the memories feature
When will Codex on AWS no longer be in Limited Preview? it's really annoying
The HTTP proxy environment variable under this menu is not working
codex app windows
This is using iteration
I just did this it took 7 minutes to build:
https://pomelo.la/en
slightly off, i haven't told it anything other than the workflow
you mean the managed agents or bedrock?
what are you planning to use it for? just inference?
Something is not working right with Codex for me, even with /fast it's slow...
Anyone else having this issue?
Did we get the usage reset?
Well in its defense, that is a very simple design
Er, in my defense lol
So I switched from Windows to Linux and I love it, might never go back. Just getting used to permissions, it's been a lot of years since I used Linux on a daily basis
It's so much faster and more stable, haven't had problems with the agent hanging or lag in AVD like I did in Windows
i run 10-15 in parallel so i dont really have time to look at the speed
Yes
Codex has been very slow for me for the last week or so
thank you. right at the day i reset 🙁
Lol ain't that a beotch
Heeeeeck yeah dude this is the way
Oh hey what model did you use for that?
we re still under a 2x promotion for both pro plans
after may i think there are good chances pro 200 will go back to 10x
This is what we are under, you can find it here: https://help.openai.com/en/articles/9793128-about-chatgpt-pro-tiers
Am I the only one feeling Codex performance is degraded today? It's acting weird
It is slow for me, took 10 minutes to do something simple...
you re right they updated it
Pro $200 will stay x20 after promotion, the only thing we get on promotion is x25 5-hour limits.
It keeps implementing things in other ways than very clearly stated. Just spent at least one hour correcting bugs that stem from Codex implementing things in other ways than clearly stated in the instructions...
for 200 the 20x became permanent
so it will carry over after 31 may
while 100 will revert from 10x to 5x
means 200 will give 4x more than 100 codex usage
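The arithmetic behind that, taking the multipliers exactly as stated in the chat (relative to Plus, and not checked against any official table):

```python
# Multipliers relative to the Plus plan, as claimed in the messages above.
plans = {
    "plus_20": {"price": 20, "multiplier": 1},
    "pro_100": {"price": 100, "multiplier": 5},   # after the promotion ends
    "pro_200": {"price": 200, "multiplier": 20},
}


def usage_ratio(a, b):
    """How many times more usage plan `a` gets than plan `b`."""
    return plans[a]["multiplier"] / plans[b]["multiplier"]


def usage_per_dollar(name):
    """Usage multiplier per dollar of monthly price."""
    plan = plans[name]
    return plan["multiplier"] / plan["price"]


print(usage_ratio("pro_200", "pro_100"))
for name in plans:
    print(name, usage_per_dollar(name))
```

On these numbers the $200 tier gets twice the usage per dollar of the $100 tier, while the $100 tier matches Plus per dollar.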
Yep, same as what anthropic does currently.
They damn keep this x20 on a $200, not cheap.
yup tbh it didn t make sense to get only 2x for 2x more price
guess next month i ll switch back to 200
i stayed on 100 for these 2 months because 10x is good enough for me right now
but 5x won t be
holy sheet
no more end of may anxiety
but it makes sense because pro 200 was not 10x compared to plus 20 before
it was more like 6x in usage
so now its more aligned
Codex is feeling dumb again today
reset wen
i'm actually struggling with 5 hour limits now, never had this before
guess i need a sub load-balancer
pfff
wish they would just make subscriptions stackable officially
or credits to make more sense economically
Unless the load balancer is like if weekly == 0: switch(), you would lose input caching completely if it switches any other way. Credits are better
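A small sketch of that "only switch at zero" policy, assuming each subscription exposes a remaining weekly quota as a number. The account names and the quota dictionary are hypothetical; no real API for reading quota is implied.

```python
def pick_account(remaining, current):
    """Stay on the current account until its weekly quota is exhausted.

    `remaining` maps account name -> remaining weekly quota (any unit).
    Switching only at zero keeps consecutive requests on one account,
    so whatever prompt caching exists is not thrown away mid-task.
    """
    if remaining.get(current, 0) > 0:
        return current
    for name, left in remaining.items():
        if left > 0:
            return name
    return None  # every account is out of quota


quota = {"sub_a": 0, "sub_b": 40, "sub_c": 100}
print(pick_account(quota, "sub_a"))
```

A round-robin or least-used policy would spread load more evenly but would change accounts between turns, which is the caching loss described above.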
i'm thinking do the balancing before i launch an ephemeral per-task remote dev env
the dream
or launch a popular OS project and get hired by OAI for that sweet unlimited usage API key hmm
heya everyone
I used to be able to set gpt-5.4 and gpt-5.5 to work and they'd work autonomously for 15 to 20 minutes easily after I'd planned out what exactly needed to be done. I did this both in codex and opencode.
Now, for about the past week, it seems that it's no longer possible. And I'm not talking with respect to the "use beyond your 5h quota" thing. No. I'm at 100%, I instruct it to work on something till these 4 goals are done, and very often it'll just stop after one or half. Always around 5 minutes of work. Anybody else experiencing the same? (Chatgpt Plus)
not really.
hmmm.
Am I missing something?
well why does it stop? does it claims its done but you dont agree?
then you probably need more detailed goal prompt
I work in plan mode, I give it exact stuff to do, specify an entire roadmap, everything.
It just refuses.
opencode is very flaky
always around 5 minutes for me.
same in codex
try giving your original prompt to gpt pro in chatgpt and turn it into a detailed implementation plan. tell it to research codex goals and gpt 5.5 prompt guidance and apply those. then try with that one..
I'd try that but won't that be worse than a plan mode that's in the codebase?
every time dude.
every single time, under or about 5 minutes with the next item it just drops halfway
thats weird, maybe you have something in your agents.md or skills that force it to go step by step or similar
what i do generally is upload my whole repo to chatgpt and give gpt pro a vague prompt about what i want to do, then discuss it, then ask it to write the detailed implementation plan and let me download it as a goal.md. then i copy it to some gitignored dir and /goal implement goal.md
I'll try that when my limit comes back
git archive --format=tar.gz --prefix myproject/ ... HEAD
includes your whole repo without the .git dir
got it I'll get that
Managed to burn through my 5h limit in ~20mins
insane
gpt-5.4-medium
try using 5.5, its better about finishing work, they supposedly trained this "whats left is XY, want me to do that?" laziness out in the update
and ask codex to audit your agents.md and skills if theres anything in there that would lead it to stop prematurely instead of finishing a task end-to-end. sounds a bit to me like you have sth like that in there somewhere
my agents.md I've custom written and I js checked there's not a lot in there. just project context and basic rules
the nice thing about using chatgpt web for the planning with your repo upload is that it doesnt use your codex usage limits. so you save those for the actual implementation
agree, I do that often
I feel 5.5 burns quota wayyyy faster than 5.4 and I only got 1 plus sub atm so
But alr I'll try it again next reset
I know it works great
it does
but it hits limits faster than 5.4
wayyyy faster
few more good little prompt "hacks" for the chatgpt planning is to ask it to plan it as a DAG task graph with vertical "tracer bullets" slices. when i do this for bigger stuff, i usually get 10-20 super detailed goal prompts that keep me busy for a while.
but thats for 5.5 pro extended thinking. dont know how well that works on plus plan
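For the DAG idea above, a short sketch of turning a task graph into ordered "waves" of goal prompts, using the standard-library `graphlib` (Python 3.9+). The task names and dependencies are invented for illustration.

```python
from graphlib import TopologicalSorter


def execution_waves(deps):
    """Group tasks into waves; every task in a wave has all its dependencies done.

    `deps` maps each task to the set of tasks it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph is not a DAG
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical plan: one thin vertical slice, then work that builds on it.
plan = {
    "schema": set(),
    "api": {"schema"},
    "ui": {"api"},
    "tests": {"api"},
}
print(execution_waves(plan))
```

Tasks within one wave have no dependencies on each other, so they are candidates for parallel goals or subagents.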
and use the "grill me" skill inside chatgpt, just send it as normal prompt after it did its initial exploration and design etc
I'll try this out in... 4.5hours.
I told it to do 3 things in succession.
And then.
I have to prompt it after every half.
very weird, i never had that issue also not earlier with 5.4
but with 5.4 i noticed its flaky in opencode compared to directly in codex cli
create skills that make codex re-read the PROJECT_STATE and continue with the next safe task, etc. 😄
Like loop autopilot logics
I use hermes and its just peak
i swear if openai does bs that claude did lol ill just give up on these services
hermes became interesting to me with the X subscription / X search it now has access to
Will use that def
Codex need better ui and website tools for design
Is it just me or did they nerf the limits again. Seems like i blew through 57% in 5 mins this time for the 5 hour limit :\
a major reason why i haven't switched to codex fully is because of the limited 256k context window
long-running implementations of prds for claude code usually takes up to 400k-500k tokens, and I'm worried if codex compacts it'll start performing worse or losing key context.
what's yall's experience with compaction, is it reliable and can you reliably run long running tasks that may go through 1-3 compactions and does it still produce a good result?
codex autocompact just works, IMO the context window is basically unlimited
Had a week long chat
It works
Use xhigh so it refreshes workflows each compaction
- compacted context
this is a month or more old chat
:)))
so yea compaction works very well
what do you mean refresh workflows?
Yeah I haven’t had it that long either 17 days now
The og chats were mostly me setting up workflows
Read your skill again
i just tend to keep one long chat for a project , works better for me
do you have hooks set up to instruct that or just vanilla autocompact?
It works but when my task scope changes drastically I hand off
No need it re reads skills and workflows regardless
Xhigh I mean
Codex mobile connection does not work in the EU? Chrome browser connection too?
it does..
Oh, I meant also on Windows?
How often do you guys use the planning mode?
Never, we can plan in chat 😄
Thanks!
@karmic gulch sorry for the random ping, but I'm really thankful for your advice from yesterday, it has helped me significantly with token efficiency and production quality!
I'm getting {"detail":"Bad Request"} and the task stops in the desktop app, is this something I can do anything about or is that a server side error?
No worries at all, really glad to hear that! Just shout if you want to take it further, I’ve got more up my sleeve for this agentic workflow 👌🏼
Honestly, sell it and invest in a Mac. You’ll thank me later if you’re working seriously with AI. There’s a reason basically the whole Valley puts Mac first for dev workflows, and to be fair, stuff just tends to work better on macOS too
start a new thread and tell codex there the name or id of the chat you wanna fix and the issue and tell it to fix it
This
Mac is like the piano for devs, it’s how you interact with the craft and get your message out
Sounds idiotic but it’s true
"Mac is the piano for devs"?
I take it that you dont like guitar, or any other instrument
Sorry, don't want to switch to PC when I actually want to launch every software I want.
I’ve got a Windows laptop on my desk too for native Windows stuff, WinUI/Win32, some Rust, that kinda thing. But realistically mate, how often do you actually end up building proper Windows native apps these days?
It's the only thing I do. If you're willing to actually pay me for Mac app development, I will buy Mac immediately
just like that, all of a sudden
uhh why is my reset rest?!?!
just got a 5% warning when i was over 80% still 😄
Had me excited here to check if we got another reset, but dude has just slept the whole weekend
wen reset
Actually, if they reset often, that's convenient.
Has anyone else seen their weekly usage drop in Codex and the 5-hour reset to 100%? My weekly usage dropped from 80% to 14% lol
They reset yesterday
I went from 78% to 0% 
Ppl using pro plan with gpt 5.5 on xHigh ppl be like: Huhh my usage goes fast
90 to 45 for me
Welp time to burn these tokens because another reset will likely happen
@feral lion Why likely?
"Fix the order button to be more greenish" Ait, I'll load up, think about it for a few seconds and spend 200k tokens, but hey, at least it's greenish right
I don’t see them reverting a reset so this is a mistake. They will probably just do another reset since this is going to cause a lot of confusion for people
Maybe you should learn to use AI, get memory, create an MD of the structure
Maybe maybe
so it doesn't have to look through 100 files
I've just gone from near full to 5% weekly, 100% 5h
maybe specify the file where it has to change it
somethings definitely happened
Much yapping, not a lot of thinking
The comment didn't strike you as a bit ironic at all?
i'm good
hot take: Codex on mobile is a fad, and when the hype is gone, nobody will use it regularly
Hermes is better as you need to keep ur pc on for the mobile app
unless u can run codex on a vps too
then its good
Nah, nobody will want to work on code, from a phone
Well if u need to fix some quickly then the agent is good
not for real coding a product
Today, everyone is posting "WOW GAMECHANGER", but in 2 months, nobody will talk about it
I've not used the feature yet, but if it shares context / sessions with the Desktop app, I'd definitely use it to check the progress on a long running goal and see if it's gone off track so I can steer it back
Nah, Im sure nobody will do that, and wait until they are in front of the computer
quick guys its time to /fast max again now
Why are people talking again about reset??
We just had one
Has there been anything wrong with the service? It has been working well for me
"above" is about 2 years of text.
u were literally engaging in the convo
It’s nice from them, they give us a chance to use any remaining limit before the previous reset
If you see the way I intervened, it was me basically asking "wut?"
So, the problem is not the service, but that the instrumentation is showing bad numbers?
Because I doubt that's grounds for a reset
¯_(ツ)_/¯
same thing happened yesterday so they reset it a 2nd time
mmm...
would be nice. already used 40% weekly of a pro 200 sub 🙁
I went from 89% to 0% on the $200 Pro plan without doing anything and it says rate limits reset on the 19th
yeah limits just got totally messed up
yeah same, my limits are 3% on the app and website but when I go on my phone and check the limits there, its completely different. Some weird cache issues as they mentioned.
oh yes please i dont want to log into multiple accounts for invoices and do load balancing
Make a 1M company you have only 2 min
GPT 5.5 Pro on Cerebras inference
has anyone encountered that the in app browser plugin is just gone in the codex app?
do people ever run out of their usage when on $200 plan
like i wanna go ham mode yk
caught codex running evals and auto-applying recommendations for over an hour. ooops 💸
my poor gemini api bill
yea lasts me about 1,5 days right now but at slow pace, not much in parallel right now
1.5??? I have the $100 I been using it for like 4 days reaching like 30%
I thought I used it like crazy too
but I want to like automate things
i have pretty token intensive workflow with 5.5 xhigh + codex review loop until no more findings
I see
apparently some people make their pro 100 plans with 5.5 medium or low last for a full week
also question why would people choose to like buy a Mac Mini, when they can build their own PC/server machine for same price/cheaper even and its even better parts
like my old PC is better than a mac mini
I literally use it just to store stuff and host my bots
the new macs are pretty good for local models due to the unified memory thingy or whatever its called
I see I haven't looked much into them other than how overpriced they seem
so makes sense if you want to dabble with that
nah just codex cli
do u not like using a harness or such things?
codex is a harness
for me it produces good results
i built some vendor agnostic stuff around it mainly with hooks and clis
can use same workflow in codex, claude code, opencode, pi etc. but i think long term the harnesses by the model providers will win
yall farting inside
go touch some grass
i bet some ppl here there chair smells like poop
@solid lake have you come up with any cuda kernel optimization skills/prompts/workflows that are useful? I saw you work with cuda
nice projection
damn bro u smelly
i smell like sweet strawberries
Anyone elses Usage just drop from like 90% to 40%?
Does this mean I should just swap to fast and burn it all?
Mines back
Nsight is good to profile
yeah my weekly is back now
Could offer a better solution if you tell me usecase tbh
ok so i open up chatgpt on my mobile click codex and try to connect it to codex installed on my windows 11 pc but nothing pops up on my pc to connect
bros
absolute game changer
for the instructions you can put into settings
add the line " do not assume"
and " "Begin" is the word for permission to start writing code "
and " Clarify before writing code "
damn, went to a family gathering, saw tibo say something on x
got back and down to 3% weekly all of a sudden
your Codex balance will go back to normal soon, Tibo started a reset
Thanks Robert from the openai discord server
Just got it back!
perfect timing to come home
What? Is that a bug/fluke?
you could call it a bug
yeah this is what I discovered also, at least without it, very limited
I am going to try it when I get home with a real gpu, can't use that with most cloud VMs unfortunately
i used 5.5 high
mostly cryptographic stuff, trying to keep a pipeline saturated
I think GPT-5.5 on codex is having the same issue as a few days ago again
for example, a basic problem was ok immediately after the reset, but now it fails even with 10x more tokens.
openai needs to learn to do dashboards better. I haven't used codex for a day or two, so one of my usage graphs shows 0 use on the most recent days, the one with turns just ... doesn't go that far, it stops on the 16th and simply doesn't show the 17th and 18th
psht. Just after saying I haven't used codex for a couple of days, I promptly spotted canonical urls picking up weird bot query injections and trailing slashes, and got codex to write me a canonical url normaliser before finishing my first coffee
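For readers wondering what such a normaliser might look like, here is a minimal sketch using only Python's standard library. It is not the code Codex wrote. The allowlist in `ALLOWED_PARAMS` and the function name `canonical_url` are assumptions for a hypothetical site, and it assumes URLs carry no credentials in the host part.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only these query parameters are meaningful on this hypothetical site.
ALLOWED_PARAMS = {"page", "q"}


def canonical_url(url: str) -> str:
    """Return a canonical form: no junk parameters, no trailing slash, no fragment."""
    parts = urlsplit(url)
    # Drop trailing slashes but keep the root path as "/".
    path = parts.path.rstrip("/") or "/"
    # Keep only allowlisted parameters, sorted so the ordering is stable.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in ALLOWED_PARAMS
    )
    # Lowercase scheme and host, and discard the fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), "")
    )
```

An allowlist is used rather than a blocklist because injected parameters from bots are open-ended, while the set of parameters a site actually reads is usually small.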
looks like quota back to what it was before the glitch. no 100% reset though 🙁
how good are these bots at devops? cos I just had chatgpt call aws an expensive cupboard, and the other day codex said don't do kubernetes unless you hate yourself. my next project is more devops than anything else and I am not a devops person, so unfortunately I need to completely outsource this, and chatgpt/codex is my budget
This is my first foray into not-a-hobby-site since I quit my day job and I need to lift my standards above "hobby"
i use codex for terragrunt, opentofu, cloudformation etc across aws, hetzner and cloudflare, works fine. add lints etc
reassuring
all the stories you hear about that seem to be claude
I know a couple of professional devops people IRL but there is a limit to how much you can pick people's brains before you are disrupting their actual jobs
i think its perfect for this devops and IaC stuff
tons of documentation to check etc
ok cool. People talk more about codex for just straight code not devops so I've seen less examples
Anyone getting context window full errors, i see 100% context left
Looks like /compact fixed it
I've had it in the past at times, nothing so far today. if you can't /compact forking up a level works
You can’t?
I distinctly remember seeing no mandatory gpu requirement
It actually profiled some cpu bound tasks for me pretty well
nope, root on the box isnt sufficient, it needs exclusive device access I guess. I tried vast.ai and another one. No biggie I have a 3090 at home I'll try tomorrow.
mind if I follow up dm when I get there?
Sure
hey
Well I'm done with Codex, it's let me down one too many times. Switching to Claude Code exclusively
So I guess that means I probably won't be posting in #codex-show-and-tell
why not pi.dev?
i'm trying to move over there
Wow I just looked at pi.dev prices, no thanks lol
Why are they charging so much more than Anthropic?
oh, right, i forgot you cannot use 3rd party harnesses with Anthropic
welp, that's what y'get for using a snake company
Why are you leaving codex
Poor output quality mostly, also having unreasonable problems with the CLI, problems that Claude Code doesn't have
as a former CC user: CC is 🍑
Output for what kind of tasks?
but elegant... && less config breakage
also it's made in node
because anthropic subsidizes the crap out of claude subscriptions, openai too with chatgpt/codex
Well Claude in general not cc the harness
you get something like $2,000 of inference for $200/mo
Opus 4.7 Max is a hell of a lot better than GPT 5.5 xHigh. It took a 4.5 hour session with Opus to fix all the problems Codex either missed or created.
so is pi, && all i wanna do is basically make a custom pi in Go-lang + Lua for extensions when i get the layer done
But what’s the use case
you do know Max != Xhigh, right?
Yeah, it's way better than xHigh lol
from what i remember, claude has an xhigh mode, too, && that used to be ultrathink
you cannot compare oranges to apples && vice versa
apples-to-apples is xhigh vs xhigh
You're correct, Max is above Extra High
it's a case of "you're holding it wrong". The rest of the industry is realising codex is better at coding now and switching to it.
but why am i not surprised an A\ is doing unfair comparisons
they replaced it with normal xhigh setting iirc
All of the chat in CC discord is about layers of spec and doc rules to get claude to follow instructions.
there was some fuss about losing ultrathink in chat
Sure it’s not a placebo?
Where the chat here is about being more efficient
It's a case of Codex being crap and Claude not being crap. If I can send the same prompt with the same setup in the same repo and get two vastly different results, the answer is absolutely not "you're holding it wrong". The answer is "get better OpenAI".
output quality is unsurprisingly directly correlated with input quality, anthropic's harness does a lot of heavy lifting in that regard
Andrew is a very heavy user by the looks
ultrathink used to toggle xhigh for claude back then, which was also similar to setting it manually in settings.
But what’s the use case
good luck with it, i spent over 800 hours using claude for coding.
Game development
🙄
Yeah I used it extensively, but haven’t used xhigh
let me guess, another criminally underrecognised indie game dev with a big hit coming?
people keep saying 5.5 and xhigh are not as good as 5.4 or 5.3, so have you tried using different models on codex
It's a personal project, I very strongly doubt it will be a "big hit", but keep your sarcasm to yourself
Who says that boss 
Well use case wise whatever model works for you is best
just pops up every so often on the socials and in here. Usually paired with "why is my quota gone"
You're asking the wrong question. I'm not even talking about visuals, I'm talking about the functionality.
I wanted to say youre holding it wrong the other day when you mentioned codex cant use image comparison loops properly. But instead i just went and showed you. But got crickets. You are holding it wrong.
It literally cannot though. I've tried. It failed.
Over and over
I showed you though.
I'm talking to a wall here, use your brain please
YOU cant do it. That's the point
Bunch of OpenAI fanboys lol
Eric has done quite a lot more than most of us when it comes to understanding model behaviour
i'd say watch your mouth, Andrew
Well visuals I would’ve understood i used codex for some basic ui (well you tell an LLM to make you something to “see”)
It cannot be done, that's the point.
What kind of functionality did it miss
It's irrelevant to answer, I'm going to Claude bye
The only things it cannot do are the ones that are within the 'safety limits'.
codex seems fine, you tell it to do stuff and it does it. Tends to only stuff up if you're vague or underspecify or let it choose its own colours
I guess he’s gonna have a fun time in cc discord
Last I remember it became an echo chamber
Or scope
Or something that’s original
#codex-discussions message
This is first shot at it.
Unless you intervene ofc
He’s left
No use
I use both claude and codex. Claude will do almost anything you ask. Codex will be cautious about it.
that was quite the ragequit
lol right.
more compute for the rest of us
I've not tried claude, my workflow is very heavy on the images side of things so I use chatgpt's features a lot. And claude has none of those features
So, does anyone have any unique use cases for computer use? I have been messing around with it but I would love to hear what you guys had success with.
Claude is good for frontend && vision reading
Telling it to use my computer to send something over
&& i guess retaining some long context
but Claude is also notorious for being lazy the more intelligent it gets, not that GPT 5.4 suffered the same fate because they're all trained on the same "reward positive for scaffolding, negative for doing exactly as told"
so Claude does need a little more prompting to even do anything the way you want
that andrew fellow liked to run everything on the absolute maximum highest settings and not sure that was actually doing him favours
think they're basically the epitome of your average "i just got into programming" type
Xhigh fast to send a hello
so, basically the stereotype of a typical vibecoder
they think "high == better"
zero optimisations on spreading out the effort levels across multiples either
They also need to know how to use the tools given to them.
I tend to ask codex to do one specific thing per session, maybe two. It very rarely takes more than a minute to do things unless its a major thing, then maybe 10 minutes and its mostly ok. Not perfect
I find codex to be pretty good with editing. I use competition for research and planning and have codex go at it.
so you break the complex stuff into smaller tasks
xhigh seems to be an offense to intelligent models
you turn them into overthinkers
then they start contradicting themselves
Honestly I think it’s more than just more thinking
They have more workflows baked in too
basically, xhigh freaks them out
For tool use
also baked in is their garbage sycophancy within
still got that, "amazing idea! let's scaffold more!"
despite us NOT asking for that shizz
I’ve never actually got any glaze from gpt
basically coercion
i haven't had 4o-levels of glaze, but i've had the "i'm so helpful. i'm doing such good for my human" type thing, except all that had done was destroy workflow
I had codex overshoot a task the other day but it was very tightly coupled to what I asked it to do (which was very vague) and it just ran with what I had in planning docs
their helpfulness is not good
Okay no I didn’t have the latter either
I am just in love with computer use. I wish it was on windows but I understand with them buying sky it won't happen for a while.
if they want to help, they should follow verbatim
Mine waits for me after proper source research
Reduce temperature then
you can.. configure their temperature?
since when did codex allow switching temps?
opencode allows you to
There’s no mention but cc just needed temp: ** in config
I’ll check it out maybe there’s a schema for it
i can't find temp in there
json does not have methods
In the cc discord
Why being pedantic
i'm either surrounded by really smart people who know what they're doing, or absolute lunatics
there's no in-between
That’s a very binary way of looking at things
ever heard of Boolean algebra?
i think that answers the question you didn't ask
yup lol. you can't "experiment" outside provided calls lol
some of us are both, c'mon
i'm both stupid && a lunatic
so i don't somehow fit in one
the next project I've only just started working on extends outside my usual comfort zone by quite a bit and I'm going to use AI to fill the gaps and I'm not sure if that's smart or absolute lunacy
could be both
probably both yeah
only way to find out is to try it
i mean, i'm stupid enough to make a synthesizer clone for OSRS sounds
i think that says a lot
Use ai to make ai
that's what param golf was, wasn't it?
fair
I was also mostly referring to ai implementing model architectures one comes up with on their own
Speed up dev process
it just ruins the models from what is known
if an AI trained on its own images worsens, an AI training its own model only makes it worse
well would you look at that @trim rapids: https://x.com/thsottiaux/status/2056144502088437799?s=20
You misunderstand
probably
I’m not talking about a closed loop of models developing models
Although it’s looking promising
and by the way if you wanna find out whether you can set temp via the ChatGPT subscription Codex backend, just have codex probe it and you'll find out
I’m gonna stand by xhigh usage though while I agree for simple stuff it’s not worth it but imo complex stuff is better left to it
probably case by case, like all this AI stuff
UI html/css/js sucks
For Gemini too
why is my codex saying i have a rate limit when i didnt before
i never got this message before
is it because im using 5.5?
https://chatgpt.com/codex/cloud/settings/analytics limits are always a thing, check your usage
but i never had this issue before on 5.4 extra high and i used it every day straight for months
now im on 5.5
5.5 is more expensive than 5.4, it does use less tokens, but i don't think that makes up for it. Either way you have a usage limit you should be mindful of.
ridiculous never had this issue before
Did you check your usage?
Hey it does till it’s over
i feel like it's smarter to have 2 basic accounts than 1 pro account
pro is 5x the cost
is it possible to see my complete history of usage
it's interesting to me that you feel qualified to make such determinations as someone who just discovered usage limits were a thing within the last 15 minutes
its interesting to me how bothered you are
5.4 is still available, you can just use that like you did before
I'm not bothered at all, I'm legitimately interested in how weird people are sometimes
lmao
did you think there were no usage limits?
nah just didnt think it would burn that quick
but lesson learned
i assume honestly for the workload i was using it for 5.5 on extra high was a bit over kill lol
That could be true I have 2 plus and 1 pro lite because of promos it feels endless (xhigh kek)
/fast if you want full experience
nice nice i may look into it i have a quick question
credits
say i just made an account
20$ version
You have a limit
how many credits do you get ?
You’re bound by phone numbers
wondering if the 20$ credit payment = the normal refresh
Are you talking about api
no
Those are billed per api costs
You don’t get bonus
credits are for api
rate is for usage
i see
yes for some reason i kept my 5.4 on maximum and used it for days on end on super complex stuff
5.5 on extra high must be like 30x more usage
It’s surprisingly efficient not 30x
idk cause in one day i destroyed the daily limit
1 moment
ahhh i see now
weekly usage limit, daily limit 5 hours
i see
it's cheaper to use 5x plus accounts than pro 5x, but not as cheap as pro 20x
so, based on value:
5x Plus > 1x Pro-5x
20x Plus < 1x Pro-20x
Shouldn’t 5x plus be equal to 1x pro lite
nope
Why
i see
regional pricing
what about usage rate between 5.4 on maximum and 5.5 on maximum
it's cheaper to buy 5x Plus than 1x Pro 5x in many EU regions
depends on region, but about 10-25 bucks cheaper at most
cause ive never had a rate block before using 5.5 on maximum
And pro account has same price?
nope
What’s it cost
are you reading what i'm saying or not?
pro is usually, for some EU residents, about 117 eur
not 100 dollars
&& each plus is about what.. 22 bucks?
like i said, it varies between regions
their prices aren't set in stone
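The value comparison above can be checked with quick arithmetic. This is a sketch using only the rough figures quoted in this chat (about 117 for the Pro tier and about 22 per Plus subscription), assumed here to be in the same currency. The helper name `plus_bundle_saving` is made up for illustration, and real prices vary by region.

```python
def plus_bundle_saving(plus_price: float, pro_price: float, multiplier: int) -> float:
    """Positive result: `multiplier` Plus subscriptions cost less than one Pro tier."""
    return pro_price - multiplier * plus_price


# Rough figures from the chat, assumed to be in the same currency.
saving = plus_bundle_saving(plus_price=22, pro_price=117, multiplier=5)
print(saving)  # 117 - 5 * 22 = 7
```

A positive number means the bundle of Plus accounts is cheaper on paper. It says nothing about the overhead of juggling multiple accounts.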
what about my question my good brother
They seem to be outside the eu regions you talk about
not about the plan im talking about the usage rate between 5.4 and 5.5
But I guess currency has more to do here than sub cost
oh, that one?
5.5 is more token-efficient but consumes more usage
ahhhhh thats it
it must be a lot more consumption for usage compared to 5.4
because i never got a block weeks on end i was using it every day all day on 5.4 maximum
now in one day i burnt my daily limit on 5.4 maximum
you must be doing something so minuscule that you never reach it
idk i was doing super demanding stuff
define "super demanding"
What is your weekly usage at rn
game dev, blender addon creation, using mathematics and things like that, highly computational work
that's... candy for LLMs
lolol
you're not working on a giant MMORPG/ triple-AAA codebase
God I wish not
i guess lol then how did my daily limit get cooked now ?
backend issues likely
What is your weekly usage at
i dont understand
openai backend/inference issues
ahh so its not my fault
Lmfao
had?
what plan are you on?
Dawg
Please try 5.4
You got hit by reset and you said you never checked limits
So I can only infer you got reset and didn’t notice old usage
i dont i got a message
Today
this is in the codex app
Reset was yesterday
there s no daily limit, there a 5 hour limit
ok then why am i at 83% for the week
Because you used 17% since yesterday
because you used 17% since yday
😂
You said you just did computationally intensive stuff 😒
Nah
Nah it’s an LLM moment
Go touch grass
I❤️AI
While you can
lmfao
oh you hit your 5 hour limit
going to the beach tomorrow
yes
...
I do, but it's just about to be winter
when there's codex weekly limits, it's self-hating time lol
keeps a man alive for longer
lmfaooooo