#codex-discussions
waiting for 5.4 codex
But yeah, the perfect "I want to go from 100% to 0% weekly as fast as possible" configs would be
model="gpt-5.4"
model_reasoning_effort="xhigh"
model_context_window=1000000
model_auto_compact_token_limit=900000
service_tier="fast"
[features]
multi_agent=true
That setup will allow codex to read a couple of files and then you get the sweet sweet 0% weekly
Don't think it's going to happen, and personally i am not into codex models anymore.
A coding agent should have general knowledge which can help it tackle various projects that mimic real life business logic not just how to code in x language.
The codex models aren't "just that". 
uh because 5.4 is very token hungry I used 25% usage in one day on pro plan
I think plus plan users are already done with weekly limits
Coding with 5.4 means I can chat about the product with the model. For coding games, it can help with making gameplay decisions. Codex is too uptight and focused only on code. I prefer this as these two areas are often interleaved
Today, we’re releasing GPT‑5.4 in ChatGPT (as GPT‑5.4 Thinking), the API, and Codex.
-- https://openai.com/index/introducing-gpt-5-4/
bro codex just coded me a gateway for iot devices lorawan hooked up w humidity sensors in the ground to monitor crops on my farm
what the hell did they train ts on
5.4 is not token hungry scale back to high, don't activate /fast and don't use multi agents.
pentagon
makes sense
If they can make a general model that is as good at coding as a specialized model, then the general model is just better in every possible way. Seems they have achieved this from what I have experienced so far
Applied for the oss thing, i really hope i will be chosen🥲
facts
i like openai a lot
been testing their models since 2022
Yep that's what i am saying, a general model is much much better at understanding business logic and has the necessary general knowledge that can help it tackle various projects.
Any chance codex app comes to Linux 😅
They did say something about it, but i am not sure 100%.
Anyone test out or use 5.4 yet?
it's my daily driver … day0-driver 😉
5.4 is fast and furious xD
If they continue with this trend i am afraid that by the end of the year we will look back at 5.4 and think it's primitive...
Excited about Codex OSS, while it builds up understanding of large codebases and their security surfaces over time, is there a plan for how that accumulated knowledge is governed? Not worried about today, more thinking about the long term. Codebases change slowly, but the insights Codex accumulates about their architecture and weaknesses compound. Curious how OpenAI thinks about the stewardship of that kind of knowledge.
This is exactly what 5.4 is already.
GPT‑5.4 brings together the best of our recent advances in reasoning, coding, and agentic workflows into a single frontier model. It incorporates the industry-leading coding capabilities of GPT‑5.3‑Codex
-- https://openai.com/index/introducing-gpt-5-4/
You should use medium it's basically the same as high according to benchmarks and much faster
I was just reading through the 5.4 doc. Its usage is very nuanced, needs to be tuned for specific use-cases. Most people won't get that. Unfortunately even developers have lost their sense of nuance and detail and expect things to "just work" without configuration. As this tech evolves people need to go with the flow. Yesterday we needed a different model for code, today we don't. Yesterday context was limited. Today the limit has been raised but there's higher cost for using it (which is one of the reasons why it's not the default!).
Read the docs. Understand the tools.
That's what I'm saying, dude. Someone said they were waiting for 5.4 codex and I was explaining why it's not coming or even desirable ...
Hello there! I just want to know what is your feedback about using Codex 5.3 vs 5.4? I'm currently using 5.4 and Codex CLI, and I like it so much! I didn't use the 5.3 codex, but the 5.2 codex was very bad with me! Any feedback from Codex 5.3 users?
You could read up a few lines for comments already made, no?
I hear medium is just "high only if necessary" mode
Tbh it's not unreasonable to suggest spending >=50% of your time researching Codex and the rest actually using it.
It's one of the only tools I've ever used where productivity is disproportionately higher if you spend more time studying the tool than using it. It's like a wrench that builds the car if you stare at it hard enough
Anybody using codex as a sysadmin tool to manage on prem hosts and cloud instances.
You want to be very very careful handing these things to AI currently. It can be done, and successfully, but be careful.
Yes, with approval_policy = "untrusted" for that agent it works pretty well
It's phenomenal, does vulnerability scans and can patch and all. No need for qualys 🙂
what is the point of recommending WSL but having codex deny access to WSL? why is this being so difficult
You're just encountering a config-specific issue. It's not telling you that Codex can't use WSL. I have it right in front of me, as many of our colleagues here.
Make sure you are specifying the correct distro and that you have installed a distro that can use Codex. Ubuntu v24 is excellent.
For protection against disaster you really want to restrict Codex use to a "codex" user that does not have root privs. Set the default login to that user and if you need to login as root or otherwise, add the "-u" option on your wsl CLI.
I've written about this recently in this channel, look for my comments that include related keywords.
Question for you. I am currently testing the Codex App. Lets assume I successfully clone X repo in WSL ubuntu distro. And change agent env from windows native to WSL. Do I still pilot the agent from this app? Or is the piloting experience all done through WSL instead?
You can operate on WSL content from the app and also from the CLI ... that's one of the benefits of the app, it adds a more centralized location for processing as well as allowing for multi-agent processing in the same environment.
Side note: If you want to clone the repo and work on it via the remote VSCode extension and the Codex extension, you may find that you need to clone from the integrated terminal. That ensures that the terminal is using the correct perms and GH auth. It also overcomes a potential issue where direct CLI access may not have the same perms as the IDE. Disregard if this doesn't make any sense to you yet.
(Disclaimer: I haven't installed the app yet. For that specific component I'm just going by what's been said in this channel and what I've read.)
thanks i'll test this out. i think when the Windows App was released, the sandbox introduced ACL perms that resulted in default no write settings to .git (maybe other directories as well, but i noticed them in .git for my repo). Do you know if this was by design? Prior to the windows app release, I was using the VScode extension without issue to read and write to .git.
I just can't comprehend how GPT will implement code, but then the review will find bugs time and time again. If it's aware enough to identify those bugs, why did it write the code that way in the first place?
I added
able.
# Incremental Review Rule

Implement in small, reviewable checkpoints instead of one large final diff.
Prefer slices that can be validated and reviewed independently before continuing.
During implementation, regularly run targeted `/review` passes on the current checkpoint and fix issues before stacking more changes on top.
When the feature or task is complete, run a final `/review` against the full diff and continue autonomous review -> fix loops until no material issues remain.
Do not defer most refactoring, regression catching, or cleanup to one large end-of-task pass when smaller in-flight review loops are practical.
To my AGENTS.md, see if it offers any sort of improvement.
Is there a channel for Codex feedback?
Was just gonna suggest that! Also it's not unusual to have short burst code reviews. It's better than 16k line change, 25 review comments, another 16k line changes, another 25 review comments. If you do it in small bursts it's good for the reviewer and the editor, and most of all the diffs per commit
Dude I just had a 120 comment PR
of like @codex over and over
Im tired of it lol
I'm seeing the same issue, maybe they are rolling it out slowly still
I think the same. I had a similar experience when they launched Codex in the cloud.
How long did it take?
Wait.. I just got access.
I love that instant swap when we are discussing it 😂
lmao.
Oh we got a quota.
I think that is per 5h?
The "Get more scans" button in the header allows you to request the paid version.
I don't got that?
I think this is the total repositories it can monitor.
It are scans?
Maybe if I just remove a repo?
I just have to figure it out ig.
yeah a bit confusing ngl. I can't launch multiple scans on the same repo for different timeframes, maybe at once which makes sense.
I just launched it on 3 projects.
Same repo
Different ones.
Wait.
It seems like you can update repo's without rescanning?
So I will look what happens when removing a repo?
I am not going to make it till Wednesday with 33% left
Which means im gonna have to use poopus
not sure how many here are aware, but Tibo about an hour ago posted this on X (source: https://x.com/thsottiaux/status/2030017068540776891):
"We are investigating reports of higher usage drain than expected for Codex when WebSockets are enabled, the team is investigating and we will provide updates as we go"
interestingly I also have websockets enabled and it did feel like my usage was draining a bit faster than usual, but I thought that was probably just because of GPT-5.4
-_-
I'm still using 5.3-codex, no token issues with any reasoning levels. Just sayin...
Arg, new 404 page from https://chatgpt.com/codex/security/findings
That's an impressively useless use of technology. 🙂
Gotcha! Add that to the list of "give me 0% weekly ASAP" configs. They're not in the official docs! But Here they are:
[features]
responses_websockets=true
responses_websockets_v2=true
They're false by default in codex-rs/core/src/features.rs:745 & 751 as of 0.110.0
It's so secure you can't even use it!
Do you need both those settings for websocket support?
I'm not sure, but by default it uses the tried and true SSE transport
It's possible you need both enabled, or at least responses_websockets
Btw I am not recommending turning them on, as they appear to be one of the many experimental features or knobs causing people to run out of usage quickly
I recommend trying on high. it does appear to overthink problems.
I know that Codex permits using their OAuth auth for OpenClaw. Does this imply Codex permits using their auth for any third party application? I've made a computer use agent for local macOS, but naturally the API is expensive, so I would prefer that.
If you are expecting AGI you will be disappointed. It's an excellent tool that can really accelerate your work.
They absolutely permit it. I don't have the sources, but this is well covered. A twitter thread somewhere.
Theo (t3) has discussed this recently when comparing OpenAI and Anthropic.
Codex CLI is open source, which means you can easily re-use the authentication aspects of it as well in your own projects 😉
That's really helpful. I'll let deep research find that thread for me
I know it's technically possible, but I really don't want to risk my OpenAI account, since it's pretty ancient (pre ChatGPT)
OpenAI hired the guy that made OpenClaw because of how much they loved it, and announced practically official support for it
look up discussions over the phrase "app-server" for the auth discussions
issues i'm experiencing:
- apply_patch is always rejected (windows)
- work trees do not show up on left hand side and so starting a thread as a work tree makes it difficult to find unless you're in the "chronological list" view instead of "by project" view
- i can't select anything but 5.4 and high in a thread. the dropdown works but any option i select does not stick
I’m at the edit step now. I’ll try the normal patch path first, but if the patch tool flakes again I’ll fall back to direct file writes so this doesn’t stall.
im using windows 11, powershell. is powershell HORRIBLE for codex app?
Restart computer and see if that fixes it
"I’m switching to direct file writes now. The patch tool failed again, and the changes are substantial enough that waiting on it would just waste time."
powershell HORRIBLE
Affirmative on all counts
then why did openai set it as default for windows codex app
There's a South Park episode about voting that describes it best. Powershell already installed, wsl not so much
"windows" -> I see your problem 
Which codex model is now the best?
Do you guys think that the high usage right now is due to maybe no cache?
Im assuming being a new model it would have to build the token cache as it gets used
Codex model is 5.3-Codex. The best model to use IN Codex is likely 5.4.
Aright thx.
I think you're hitting exactly that issue that I documented where VSCode is running under different perms from the connection used from the Windows app. Honestly I don't talk about this much because I think some peeps are gonna call me an idiot for not expecting code . permissions to be different from wsl -d .., but it's one of those things that needs to be understood when we have multiple clients hitting the server with different creds, profiles, aliases, etc. Good Luck!
It's so secure we can't even tell you about it!
Guys do you know where i can try the new Codex Security, because i can't find it
Its for enterprise only I believe
Ahh okay
We plan to expand the program in the coming weeks so more maintainers have a direct path to better security, stronger review workflows, and support for the open-source work the ecosystem depends on.
Like other initiatives like this it seems we need this third party, OpenAI, to decide if a package that we maintain has sufficient merits to justify allowing the developer(s) to use the security software. I really dislike this common practice. I understand it but grudgingly disagree with how these companies apply it.
I don't think that's how context cache works. It is usually stored per user for a short period of time (one session) AFAIK, and is actively reused within the same session. This is why your messages close to the context limit don't take an eternity (exaggerating a bit here) to answer.
I thought there was a server side cache too, so the model didn’t have to recalculate things it’s already been asked
You guys are providing the definition of "context" ... context is cache. It's pruned just like cache.
https://voratiq.com/leaderboard/ not quite as good
Um, except that with compression it gets summarized up-top.
Yep, it's 15 minutes TTL, and it's the KV Cache. It hangs on to tokens so it doesn't have to prefill inputs from 0 to n
There is, but it's still limited to your machine. And it has to be the exact context for the cache to be reused.
Caching only works if two requests share the same prefix and land on the same machine
https://developers.openai.com/cookbook/examples/prompt_caching_201/
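To make the prefix point concrete, here's a minimal sketch. It only models the "same prefix" half of the rule (not the "same machine" half), and the function names, the 1024-token minimum, and the 128-token block size are assumptions for illustration, not something confirmed in this channel.

```python
def shared_prefix_len(prev_tokens, new_tokens):
    """Count how many leading tokens two requests have in common."""
    n = 0
    for a, b in zip(prev_tokens, new_tokens):
        if a != b:
            break
        n += 1
    return n


def reusable_prefix_tokens(prev_tokens, new_tokens, min_tokens=1024, block=128):
    """Estimate how many tokens a prefix cache could reuse.

    Assumptions (hypothetical defaults): nothing is reused below
    `min_tokens`, and reuse happens in whole blocks of `block` tokens.
    """
    shared = shared_prefix_len(prev_tokens, new_tokens)
    if shared < min_tokens:
        return 0
    return (shared // block) * block


if __name__ == "__main__":
    prev = list(range(2000))
    # Same conversation, but one token changed at position 1500.
    edited = list(range(1500)) + [-1] + list(range(1501, 2000))
    print(reusable_prefix_tokens(prev, edited))
    # Change the very first token and nothing can be reused.
    print(reusable_prefix_tokens(prev, [-1] + prev[1:]))
```

The takeaway is that anything that changes early in the prompt (a reordered system prompt, a timestamp at the top) throws away everything after it.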
If they were to store cache for all messages ever, there wouldn't be enough storage memory in all the world, as they store the actual latent space, or whatever it's called, that takes GBs of memory per cache.
As i said high it is.
Haha yep, a 1M token context window can be well over a terabyte of VRAM in most cases
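Back-of-envelope version of that, as a sketch. The model dimensions below (80 layers, 64 or 8 KV heads, head size 128, fp16) are made-up placeholders, since nobody outside OpenAI knows the real ones; the formula is the usual "keys plus values, per layer, per KV head" count.

```python
def kv_cache_bytes(n_tokens, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Size of a KV cache: one key and one value vector per token,
    per layer, per KV head. Grows linearly with n_tokens."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * n_tokens


if __name__ == "__main__":
    tokens = 1_000_000
    # Hypothetical model with full multi-head attention (64 KV heads).
    mha = kv_cache_bytes(tokens, n_layers=80, n_kv_heads=64, head_dim=128)
    # Same hypothetical model with grouped-query attention (8 KV heads).
    gqa = kv_cache_bytes(tokens, n_layers=80, n_kv_heads=8, head_dim=128)
    print(f"MHA: {mha / 1e12:.2f} TB")
    print(f"GQA: {gqa / 1e12:.2f} TB")
```

With these placeholder numbers the full multi-head case lands in the terabyte range for 1M tokens, and sharing KV heads cuts it by the same factor as the head reduction. Either way it is far more than the plain text of the conversation.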
Conversations are cheap, KV is very very expensive, almost (if not absolutely) exponential in comparison
5.4 is making stupid decisions all day long...
I have been working with it all day, and you cannot imagine how many times it has told me, "Because I made a mistake"
Can you provide some examples?
They are very case specific
Like duplicating code to solve potential bugs, or changing API contracts
5.3 would never do this
I see. For me it was pretty accurate.
As opposed to 5.3 it seems to try to understand the project context better before making changes.
Code
What
I noticed 5.3-Codex is really good at following strict processes. 5.4 forgets to run static code analysis before commits, so I kinda see what you mean
About the Sora invite code
I might actually move back to 5.3-codex...
To me it writes much better code but idk, still discovering the right way to enforce strict processes with 5.4
I feel like it is working a lot against me
Yes, yes
I feel like 5.4 gives you opportunity to steer it towards the right direction more than 5.3 that would basically make all the changes and research silently and you wouldn't have any clue what it's doing/thinking.
also /fast is a trap. Dont use it. You will melt your quota in 3 days
After April, it is going to be a disaster
given there's still an ongoing quota consumption bug, apparently anyway, it's definitely not advised to use it 😄
it's still unclear to me as to what /fast does for Pro users, is it separate from the "priority processing" feature of the Pro plan
😄
By the way, looking at the Codex App... do automations work with CLI?
I assume it uses the CLI to facilitate them yeah, well, the codex-app-server aspect of it
It uses the CLI to run Codex, for sure. But managing the automations, is App-only?
Do you think codex 5.4 can create good design? bc i need an AI that can make me good PDF design
@boreal hawk Graphic design? no
Is there an AI that can make graphic design?
Yes, but this is not the place to discuss it
Dms?
Codex Security looks cool @winged ore
Was just reading a user’s comment on another platform about this and folks were dragging the idea lol.
Gotcha. Well hey, at least there is a community elsewhere that you can talk about Claude. #ai-discussions or the Anthropic server
@vital light Is there a way to use it without the Codex App?
I think my biggest frustration with codex is the continual hand holding it demands. I can tell it 10 times in a row, to not stop until it's done something to 100% completion. Every single time it stops short of 100%, admits it stopped short, apologizes for not doing what I asked, then asks if I want it to keep working towards 100%.
Like yah, just like I told you to the past 10 freaking times heh
@violet anchor That is true
Sounds like you guys need better AGENTS files and a better sandbox. When I see notes like these I tend to think "user error".
Sora isn't working for me right now, do you have any idea why?
Wrong forum
How did I not think of that?! /s
i'd be very happy to test the remote codex capabilities
and have plenty of use cases
Can I use chatgpt subscription with openclaw model?
Does the gpt codex mac app have a checkpoint feature where i can undo/redo progress locally instead of utilising a git worktree?
Does Codex Security Application work with VS Code extension?
@boreal terrace wrong channel
could use another limits restart soon 😂
People hoping for server bugs, so OAI resets limits lol
I wouldn't complain if that happened 🙂
you got one month to make fortune 500 start up
"Hey Codex, make me a fortune 5000 start-up"
Something will have to give come April, I don't normally complain about usage, but something is eating, and this is simply just unsustainable for the price to usage windows come april
only thing keeping me afloat is the 2x
make no mistakes
My Codex got stuck thinking and cant stop
That happens to humans too
Did you reply to me, and had your reply deleted because you mentioned Claude Code?
5.3-codex > 5.4?
no
@toxic torrent At times, I think yes
5.4 has been better for me i havent looked back
How are people so opinionated on this? If the 5.3 was silently switched with 5.4, I don’t think I’d notice
opinionated?
“Opinionated describes someone who holds strong, often stubborn beliefs and frequently expresses them…”
Feels roughly the same to me. The improvements are getting harder to notice.
Im not sure what makes the opinions here shared stubborn or strong
as for "frequent"... well, the channel is called #codex-discussions . Not sure what you expected to find here.
If you look at difference in evals between 5.3-codex and 5.4 they're pretty similar on coding specific tests
5.4 is better at documentation and planning ideas. I also notice my automated code reviews dont flag as many issues with its code. I use code rabbit and codex review.
Bro, go to DeepSeek
So i base my opinion on my anecdotal experience
https://x.com/gapdeepry/status/2030108833100558589?s=46
for anyone still on Intel Mac… slight workaround to use codex app
Does anyone know how to access Codex Security?
Get enterprise
oh nice, let me get onto that as well o.0
I have yet to find it (I cant believe OAI releases this stuff, and makes it so hard for people to find it)
@main nimbus Thanks
It's called Aardvark if you go to chatgpt.com
Sounds like a norse god
You have to use codex cloud
what is GPT-5.3-Codex-Spark
nvm
"Today, we’re releasing a research preview of GPT‑5.3‑Codex‑Spark, a smaller version of GPT‑5.3‑Codex"
I stopped reading at smaller
Codex Security made me feel better about my vibe coded mess of a chat app. 1 medium issue which isn't an issue, it's a feature. I started the app in December of 2024 as an experiment to see if AI could truly develop an entire app without me doing anything other than directing it. It, by all means, should be a complete disaster of a codebase.
I want a model that I have to wait a day while it churns
@main nimbus Try /review and see if you still feel good 😄
I mean vulnerability wise, don't make me go digging for the skeletons.
which one is better 5.4 or 5.3 codex?
5.4 imo
well, /review will also look for vulnerabilities
it's all it finds for me, as a matter of fact
@honest yacht Divided opinions
There's a benchmark that shows model's ranking by how many HUMAN hours of work the model can complete. It's getting quite high, despite being 20+ min runs, not hours.
I'm guessing 5.4 is a better model and 5.3 codex is better trained on the codex interface and thus more agentic?
"more agentic" sounds more and more like fluff, lately
I used to use gpt 5.2 for documentation, building skills, web searches etc. and 5.3 codex for code implementation and planning. Now i can just use 5.4 for everything.
Can’t wait for codex mobile
Typically but OpenAI is anything but consistent. GPT-5.4 includes better tool calls and GPT-5.3-Codex's improvements.
No idea if we'll see a GPT-5.4-Codex. If we do, it won't be anytime soon.
perhaps, but it still means something: better tool usage, designed to hand off tasks to other agents in pursuit of a goal that supersedes the agent instructions
when o3 introduced CoT tool calls, that was eye opening for me.
Did you have to set this up or does it just do it?
It just does
Is it the same security audit the cloud version does?
Probably no
The security side creates a detailed threat model of your application, then scans the changes introduced (likely against the threat model).
It infuriates me to no end, when Codex ignores my questions, and just changes stuff
Give an example
I imagine there are questions like "Why is this happening?" and it finds and fixes a bug.
codex needs something like claude codes new /loop skill to reissue skills/commands in intervals within a session
oh lol completely off
- He asks me something totally unrelated to what we were talking about
- "Hey, what the... why are you asking me about this suddenly?"
- Here's the plan
I dont think it does?
codex just uses the skills as needed
It's much more proficient with skills than claude
@cobalt junco medium effort?
if you mean claude, it does have /loop
I mean i dont think codex needs it
ah ok. sometimes it comes in handy though
it doesnt have the same alzheimers claude does
xhigh
Yeah, take it down to medium, if you want to make it to next week
what does the /loop do i havent used claude for a bit?
I also used to xhigh all the things
something like this:
/loop 2m check if the database migration in tmux session "migrate" has finished, and report any errors from the log
so it checks every 2 minutes?
within a session, yes
ahh ok
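For anyone who wants the same effect outside of a harness, a rough sketch of the idea: parse an interval like "2m" and re-run a check on that schedule. This is not how Claude Code implements /loop; `parse_interval` and `run_every` are hypothetical names, and the check itself is a stand-in.

```python
import time

# Supported unit suffixes, in seconds.
_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_interval(text):
    """Turn '30s', '2m' or '1h' into a number of seconds."""
    value, unit = text[:-1], text[-1:]
    if unit not in _UNITS or not value.isdigit():
        raise ValueError(f"unsupported interval: {text!r}")
    return int(value) * _UNITS[unit]


def run_every(interval, check, max_runs, sleep=time.sleep):
    """Call `check` up to `max_runs` times, sleeping `interval` between calls.

    `sleep` is injectable so the loop can be exercised without waiting.
    """
    seconds = parse_interval(interval)
    results = []
    for i in range(max_runs):
        results.append(check())
        if i < max_runs - 1:
            sleep(seconds)
    return results


if __name__ == "__main__":
    # Stand-in check; a real one might tail a migration log.
    print(run_every("1s", lambda: "still running", max_runs=2))
```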
I dont think you need it for skills, you can trust codex/5.4 will use them when it needs them.
I prefer claude probably because I am more familiar but I respect codex. I'm trying to force myself to use all of the CLIs. even gemini
try codex app, its really nice.
I am going to give anti grav a run sometime in the next week
I'm on mac though - windows only just got codex app and it's a little buggy from what i can tell
I really tried to like antigravity. many like it but I just didnt. I think Ive just become partial to CLI over IDE
since i started using codex app i never looked back, I still use ide for a few things.
when i touch the cli i cringe
I feel like we are at a place where the models are differentiated enough that learning all of them is a good move. I like claude's task harness, codex for reviews and complex tasks, gemini for UI and dashboards
Yeah maybe, for the time and effort i think it might also be worth filling gaps in one model
Like for ui, just adding a skill for codex instead of configuring gemini
sfunny i felt like we were in a place when it was worth using different models, and now they are all converging
yeah agreed. gemini cli is really behind. the gemini app actually has a nicer UI builder
When I run a plan or code review through codex and claude they still seem to complement each other. gemini seems redundant if anything, although it can create an amazing dashboard
yeah maybe. the alternative is just asking the same llm for counterpoints, alternate options, devils advocate etc
1m context is nice sometimes. notebookllm is great to cheaply create a 300 source RAG chat
never going back to claude.
i do this sometimes, codex writes, opus reviews
trying to reconcile your last two statements
what was your issue with claude code? I see enough complaints to give them legitimacy even if I dont experience them
tbh i have no issue with claude code, but gpt 5.4 xhigh writes impeccable code and requires less loops. the good thing about claude it writes fast and you can reloop, but with gpt 5.4 sometimes its just one prompt and its good enough
claude requires more effort to get the same results, you can't trust it to do the task as expected with out more prompt layers.
^^^^
It also has a terrible memory, like it wont use skills proactively after the first few turns
That's true as well, context rot % is higher in claude
Anyone have feedback with 1M context?
one thing I have noticed with anthropic is that they often change fundamental behaviors and dont update the documentation or do it quietly. when you ask claude to analyze its own binary with strings it finds behaviors that often substantially diverge from whats documented
I have never used Windsurf and only used Cursor a few times. I always assumed they would die off eventually when competing with the model makers
I used aider prior to claude code & codex
Does 1M context have trade off/ or lower code/ thinking quality?
Is it confirmed by a few?
Of course it can get worse if too much bad context but I guess precise prompt can fix that
This comes from OAI, i think
Alright thanks will try it to see if context pollution is too bad
ooh T3 code is out
i dont see how any 3rd party harness could keep up with the model makers
plus with t3, like others you have to use the API instead of Oauth tokens
Opencode is pretty good. And easily extendable. Been using it interchangeably with codex and I can't decide on one. Opencode is easier to extend though. I've set up a telegram plugin, so I can monitor and issue tasks remotely.
Nah, you can use oauth with most open source harnesses today.
Luckily, openai doesn't ban accounts for this like some other companies...
it's openai's harness
they basically provide their codex harness to third-party developers so their models behave consistently across different apps instead of relying on individual developers to make the perfect system prompt
opencode is actually a separate harness so that's different (except for the plugin that integrates the app server but that's technically changing how opencode works)
hi folks - saw chatgpt-5.4 is available but I only see codex-5.3 in available selection? is it being rolled out slowly?
GPT 5.4 is out and selectable-and-selected-by-default in my Visual Code IDE, why is this so? is 5.4 better than 5.3 Codex?
I dont see codex-5.4 as an option when I go to the website > codex selection. Only goes up to -5.3
there won't be a GPT-5.4-Codex, the model is unified so it's just GPT-5.4
That's great to know, but when I use Codex I only see GPT-5.3-Codex available. So -5.4 isn't available to PRO users yet?
GPT-5.4 is available to all users now, you should have it
weird... I dont. Just installed the codex app on windows to confirm. Only goes up to GPT-5.3-Codex. Maybe I'm special 😉
log out and back in again, or check on the web version of Codex
Any update on the huge lag on the Windows Codex App?
I guess especially when using WSL2?
another thing is that GPT 5.4 is 33% more expensive (in subscription) compared to GPT Codex 5.3
🥲
anyone see a lot of this today?
thanks that fixed it
excellent!
Anyone finds gpt 5.3 codex better than gpt 5.4 at coding tasks?
I have been using both today, for my purposes I think 5.4 is better
5.4 has been way better on projects where there's not a lot of similar public code to train on i feel
I feel sorry for all my friends because they're going to have to deal with me again. I'm burning through pro in 1.5 days now. Waiting until the 12th to play with the toy again.
i saw on the news a 'pro lite' subscription plan is coming
im probably gonna upgrade to that in april lol
we need a pro heavy
right now i just use multiple plus accounts
A ton of stuff is purposely not documented in Codex as well. There are treasures if you let codex figure itself out compared to documentation
And now-ish
I couldn't be bothered juggling multiple accounts, I'd just buy credits
does buying credits get you a comparable amount of usage anyways?
i felt like the credits $40 didn't last as long as a sub $20, is that not the case anymore?
I don't know, I haven't finished burning through my cap yet
I think subs give much more usage than credits. Lots more
when i run dry i always go and try one of the other options, so many free trials and usage plans out there. Coming up i'll get my free month on antigrav and see how it feels
yeah thats what im seeing too, so i just use multiple account lol
i got onto codex by trying the free month on plus when the codex app came out and my claude code plan was dry
Yeah I'm spending way too much time, it's seriously an addiction risk, so little input so much output, with so much work to do I often can't stop so at least this slows me down but it feels bad. Feening like a Codex junkie thinking of all the work I could be getting done if not for the stupid limits....
i tried claude but i hit the limits way too fast
i would honestly pay for pro if i didn't have a day job
but i feel like after coding at work for 8+ hours, i can't reasonably hit the pro tier limits with my spare time lol
Yeah this is my day job... Or week job now with this token burn. I'd need 5 pros to clock full work weeks, have to solve this problem....
GPT 5.4 is again, pretty S for coding, I knew a problem and I told it to fix it and it kept arguing it back to me
I rolled back to 5.3 Codex, it fixes it
i noticed this on 'high', but it worked when i switched to 'xhigh' it reasoned out that there was in fact a problem lol
I had that in codex 5.3
yeah just roll back to codex 5.3
5.4 is F tier for coding
argues back its owner with stupidity
maybe 5.3 is better for maintenance/adding features
i'm working on a new app and 5.4 makes very good design decisions
5.3 needs handholding to refactor
5.3 is simply better and 5.4 is trash IMO because 5.4 cannot understand what is the problem
5.4 simply has no place when it comes to coding
yes maybe for "design" stuff 5.4 might be better
but 5.4 has no place in coding
its absolutely better for my case
any coding agent that cannot understand the problem, especially when the author explains the problem and still fails to understand, is straight to delete
codex 5.3 did that for me, just a niche case though, still a good model
5.3 might sometimes struggle to find an issue but it never argued me back when I clearly explained the problem
5.4 argues back even when I explain it to its face
I was so frustrated and wasted a few good hours because of 5.4
I rolled back to 5.3 and went through the issues and voilà, it was solved as I expected
im considering between plus and business plan just for codex + openclaw
Wasting so many tokens in Windows because apply patch tool doesn't work and model is stumbling through edits
i havent tried it yet but maybe put in AGENTS.md 'chunk apply_patch edits to 500 lines each' or something like that
use pro plan, codex is practically unlimited
Pro is too expensive for me
$2400 a year is definitely a chunk of money
would be nice if there was a $40 or $50/mo tier with higher limits and the faster models you get from pro without the pro model
even this would still result in double input token usage because it means extra turns
pretty sure you don't get already cached tokens counted against you as much
at the very least it's less compute-intensive on openai's end
all of these commands are just it trying to figure out how to edit these files, and then it blows out its context, which means compaction, which then means it loses the plot on the task
Ran $lines = Get-Content 'core\task_content.py'; $start=200; $end=320; for($i=$start; $i -le $end; $i++){ '{0,4}: {1}' -f $i, $lines[$i-1] } for 1s
Ran $lines = Get-Content 'core\tests.py'; $ranges = @(@(345,390), @(540,615)); foreach($range in $ranges){ $start=$range[0]; $end=$range[1]; for($i=$start; $i -le $end; $i++){ '{0,4}: {1}' -f $i, (Get-Content 'core\tests.py')[$i-1] }; '' } for 1s
Ran Get-Content -Path 'core\services.py' -Raw for 1s
Ran Get-Content -Path 'core\models.py' -Raw for 1s
Ran Get-Content -Path 'core\forms.py' -Raw for 1s
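For comparison, the numbered line-range reads in the PowerShell calls above can be done by reading each file once. This is only a minimal Python sketch of the same idea, not anything Codex itself runs; `numbered_range` and `read_ranges` are hypothetical helper names made up for illustration.

```python
from pathlib import Path


def numbered_range(lines: list[str], start: int, end: int) -> list[str]:
    """Return lines start..end (1-indexed, inclusive), each prefixed with its line number."""
    # Clamp the bounds so an out-of-range request does not raise.
    start = max(start, 1)
    end = min(end, len(lines))
    return [f"{i:>4}: {lines[i - 1]}" for i in range(start, end + 1)]


def read_ranges(path: str, ranges: list[tuple[int, int]]) -> str:
    """Read the file once and format every requested range, separated by blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    chunks = ["\n".join(numbered_range(lines, s, e)) for s, e in ranges]
    return "\n\n".join(chunks)
```

Something like `print(read_ranges("core/tests.py", [(345, 390), (540, 615)]))` would then cover both ranges from the second command in a single read, instead of calling `Get-Content` again for every printed line.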
Dang so many people say 5.4 is good then i read a critically bad review
well i'm just complaining about windows idk if this is 5.4's fault
this is exactly why i used 'pi' agent harness instead of codex cli for a while to avoid powershell
if you do want it to use the shell more effectively, set up WSL and run codex in there
problem is 'pi' right now uses a lot of tokens for compaction, so you need to watch out and start new chats before you hit context limit
pretty sure the windows client natively supports codex in WSL
i've been using cli + WSL for a while but native would be preferable i just regret trying it with the new app (where its default)
git bash + pi solves your windows headaches for sure
the vs code extension for both claude code and codex are the best windows options if you don't count wsl, and even then, situational.
running linux being the best option 🙃
game dev 🙃
if only linux could run games
it can
linux is a lot better at running games now
and I'm a native windows user
what engine?
I see this as an absolute win
linux is better off for it
yes, it is great
5.3-codex is great for going in raw. 5.4 needs to be warmed up, but then just goes to town
competitive games aren't on linux so it doesn't matter what engine
well you said game dev, I assumed you were developing games
im deving and playing. on the same machine
Hey robert, how's your experience with 5.4 been so far?
both unity and UE5 have linux ports
and I can't think of any other mainstream engines other than godot (also linux support)
I am very happy with it. I have built a stock tracking app and a coin tracking app with it so far
RPG Maker 2000
if that's the one that makes all those itch.io games it probably has a linux port lol
i've been working in ue for 13 years so i can assure you it doesn't support linux in any meaningful sense and again even if i wanted to i'm not dual booting for something that is only a problem for me now with agentic coding
That's pretty cool. I let 5.4 at one of my slower projects, and it took the context and pumped out a v0.1.0 installer in about 20 minutes.
it doesn't but it works through WINE apparently (and is on steam so that'll work OOTB)
what language? I have been using Python + JS
I had 5.4 port a minecraft mod to a newer version of minecraft and it did a pretty decent job
not TS?
oh, the rpgmaker2k was just an old joke.
I've been using python, C++, C#, Rust, and a few others. I also gave Codex the dev Console to Visual Studio, so once I'm done in VS, it just finishes up with VS itself
not yet
Oh yeah, that old DOS Dinopark Tycoon game. I had 5.4 reverse engineer the ACT files and other assets, and port them into godot to remake the game there, just because why not
how's your usage looking? 🙃
To be perfectly honest, I'm a plus user, and I've never gone below 50% weekly
that's pretty good!
idk why i was hyped for the windows app, it's just like the ide
does godot have a good way to interact with AI? or would you still need to manually do a lot of the work in godot editor
I also attribute that to the way I work. me!=dev
I just learn as I go.
Yes on 2 counts. MCP servers are awesome for tool usage, and then since godot is opensource, I wired in a codex CLI into the program, but I limited it so that it can't mutate the engine at all, it can only access the same level of inputs a user could.
sounds interesting, i wonder if someone can add 1st class support to godot for ai and then devs can just vibe code a game without understanding any godot concepts
Also, people tend to sleep on local generation. I have comfyui and blender hooked up as well, and I can talk a 3d model and mesh into existence, port it into blender, clean it up, and have it sent to godot for the project. It's a wonderful time to be a builder
Oh are there models for that, I've only seen comfyui used for images and video gen
There's so much customization out there it's almost painful to sift through it all.
I do most of this with codex and claude code too.
any reason why windows codex "apply_patch" doesn't seem to work? debugging it in multiple threads and doesn't work in any of them
Is it outside of the project tree?
I'm a bit new at this - but it's in the playground
files are in the documents\playground folder
Oh, sorry, I can't help you there. My playground was already gone when I installed it.
ah np - i'll keep investigating
Are you using sandbox mode, or did you give it full access?
It's very nice imo, easy to switch projects, etc, but, also unusably slow
sandbox/playground mode
MrDevil wasn't joking, it is basically the same as the IDE extension, and it uses vs code workspaces as the project containment. If you swap over to VS Code or Antigravity (it's the same thing just google) and use Codex on the same project locations, you should see some increased performance, and the agents for the most part are still handled by the IDE the same way they are in the app
try making a project folder and moving what you're working on there, and if all else fails, give it full access, apply your patch, and sandbox it again.
ill just go back to cli, much nicer experience
Let's all take a moment to pray for a reset https://www.reddit.com/r/codex/comments/1rmpone/incident_with_codex_usage_rate/
a reset was done overnight for those who provided their user ID in the GitHub issue thread
yah I'm creating a new project (PlayGroundDebug) and trying it out...
sadly same problem - apply_patch still doesn't work
Do you happen to have a link to that thread?
Cuz I'm rationed for the next 5 days
"We'll continue our investigation next week. As thanks to everyone who supplied their user IDs and helped provide clues for us in this bug report thread, I've manually reset your usage back to 100% so you can enjoy Codex over the weekend."
ok this seems to explain the apply_patch issue >> known problem -->> https://www.reddit.com/r/codex/comments/1rmcxj2/apply_patch_failing/
I'm looking but I can't find a user ID anywhere in my settings and Google says it doesn't exist but of course I see people posting theirs in GitHub. I'm looking in my account settings. Where can I get that id?
its also very much not the same. I mean its the same agent ofc, but the interface is quite different, the whole point of codex app is its interface
app is a clean all you need interface, seems nice, but just too damn slow
Chatgpt also says there is no such thing as a user id... Nice....
That's what I'm saying though. the same chats you have with the ide are in the codex app. If you use the extensions, it's literally the same mechanics, just presented differently. It does make it easier to swap between projects, but I don't even bother with that most days, I just point vs code at the entire drive dedicated to projects
just presented differently. is entirely the point of it...
that presented differently part is probably the thing that's lagging it for you at the moment.
I prefer good coffee over bad one. and prefer to improve the supply chain than give up and buy some crappy brand because I can get it in the store downstairs
anyone having similar issue? I'm struggling to get gpt-5.4 to work more than a few minutes at a time. any nontrivial task it just does a few things and then comes back and explains what else needs to be done instead of doing it
btw filed a bug for the apply_patch not working. OpenAI said it's a known issue reported by multiple users. hopefully fixed soon
Maybe provide some instructions regarding this in agents md, and/or when you prompt it. Something like "implement the plan/task end to end, unless you face major roadblocks".
It also depends on the nature of the task you give it. Something too abstract is more likely to result in the agent not knowing what to do next and return control to you after a few changes.
Allowing it to compile/build, do tests and maybe even run the program it's working on can increase the time it works on it by a lot as well.
What do you prefer for coding/engineering GPT-Codex-5.3 or GPT-5.4 and why?
Yes, I find GPT-5.3-Codex better. 5.4 has real issues with code base exploration and seems not so "intelligent" as hoped for. Sadly.
I expected not to nudge the model in this direction and expected that RL post-training avoids that for real semi-autonomous agent operations with that new model. What are your thoughts, will this be fixed with a new snapshot of GPT-5.4 or do we need hand-holding it?
I think RL just improves the model potential. Actual behavior would depend on many various factors, including user instructions. No matter how smart the model is, it can't read your thoughts/intentions/preferences without you telling it explicitly.
Yes, you are right, but compared to the GPT-5.3 model, I didn't need as much guidance. The previous model pays much more attention to the system prompt and the AGENTS.md file provided. I never provide vague or ambiguous prompts. I always try to be as specific and comprehensive as possible. I'm not a novice in this context.
Only my 2cts about this new model.
Ah, well, you never mentioned having instructions in agents md and providing explicit prompts.
Maybe you can provide specific examples where the model failed to go through with the instructions.
In my environment, apply_patch hasn't been usable for a very long time. Even long before Codex Windows arrived, models in the CLI stubbornly preferred to use PowerShell commands
Kinda sucks for people that didnt bother reporting thinking it was the norm
5.4 is really sucking for me
This
This is madness
Another one, just 8 minutes later
This model sucks
I'm comparing to gpt 5.3-codex and gpt-5.2. In comparison gpt-5.4 seems quite difficult to use
5.4 is so good conversationally! My main complaint with 5.3codex is that it isnt steerable/working interactively. 5.4 fixes this. very happy to now have this ability to interact with it like an organic team
hello
are you using app or cli
I was using app but its performance went to crap and become too slow so I had to go back to cli. I really wanted to like the app too - super disappointing.
they should have a feature to not track unchanged files at all, b/c its horribly implemented
same
its also laggy
maybe electron app thats why
I didnt check if its electron, it feels native app to me, but they are just throwing features at it and not doing the lifecycle properly
I think 90% of the problems are related to change tracking
at one point I had to delete it, I couldnt even press the stop button. that's how bad it got. clearing ~/.codex etc did nothing
not just delete it but blow away all the caches, etc
honestly OAI should be embarrassed with the codex app
that's the problem
the reason codex is now better than claude (subjectively) is b/c they didnt rush features, they also picked a better language, and took a more apple approach of sit back and do it properly rather than do it first
Im totally fine with fast iterations, even if there are missteps. I will take that over someone who releases every 6 months, every single time.
but codex app is the opposite
in this case all codex app did was waste my time, and I really wanted to like it, so it annoyed me. it would have been better for them to release nothing and do it properly IMO... but that's just my opinion
Well, Codex is not even 1.0 after all
I think we all should accept that they are still trying to figure things out
but it works fantastic in the cli
It hasn't always. And that's fine.
but in general way better than claude
claude had a lot of bad versions... and multiple bad versions... building on nodejs vs rust, its kind of a no brainer what you end up with in the end over time
Hi, has anyone connected Codex to Visual Studio?
yes same, not happy with it
I still use it here and there but only for small session
as chats get bigger its laggy
Conversationally it is good, still too verbose for me, but coding isn't really great so far. Misses some key code paths and comprehension of runtime data transformations are worse.
I’ll find the Kodak app is just great for spinning up Work trees, quickly way quicker than what I can do it in terminal
If iPhones speech to text feature could fail me any harder than it does on a daily basis
I don’t know why companies wanna push these apps over making the tuis feature rich, 90% of my work flow is already in the terminal
no issues like that here. I'm using it in rust. I'd say its at least as good as codex-5.3 but seems to better understand the actual goal vs task
Building with Codex 5.4 this morning! Really really wish 5.4 had an option to edit prompts like GPT.
5.3 is more like 5.3, 5.4 is more like opus
hi, I received an email from Openai about a week ago, stating that the codex is available for chatgpt Free and Go users. How long will this "promotion" last, and have the limits been reduced in any way? (I don't think they've reduced the limits for Free and Go users, but only doubled them for other plans.) In my situation, this is important to know
Totally agree. I'm working on a procedural world generation system in C# currently, and 5.3 made it extremely convoluted and overengineered. Granted, half of the fault is on me, because I didn't direct it very well, and just threw in all kinds of info from different techniques and such. The issue is that it swallowed it without arguments and made a mess. And when attempting to refactor/rework the system it just kept on going around in circles, not really untangling the convoluted system, but making it even worse. When I asked 5.4 to review it, it almost immediately identified it as a mess, suggested good refactoring solutions, and implemented them properly. And later it went through several iterations of optimizing it beyond what 5.3 ever managed.
C# you say? You might find this benchmark interesting.
This graph explains why I still have to write my enterprise C# code myself still.
I think it's doing pretty good. In Unity context at least.
One thing I love about C# is partial classes and extension methods 🙂 if Codex is instructed to use those it might do a better job organizing everything
Perhaps. I personally don't like partial classes that much - they make it difficult to grasp what the class is doing. Better make smaller classes with composition/inheritance, etc... Extension methods are fine, though there aren't many opportunities to use them in my current project.
I suppose it's an acquired taste. What I use it for is one file acts as a header or table of contents, and some other files fill out the implementation. That way it's pretty searchable, and in VS or Rider you can usually ctrl+click on a definition and it'll take you to the impl
Now with Codex and its mad ripgrepping skills maybe it's not as necessary
I guess so. C# was kind of my first language. So when I had to move to C++ at work, it felt extremely inconvenient to have classes split between headers and cpp files. I guess I feel the same way about partial classes.
5.4 just made an assumption that code that had not even been committed yet was already deployed 🤦
@boreal holly I tried to have ChatGPT make a local autonomous app builder for people with zero coding experience. Please rate it out of 10 🙏
https://chatgpt.com/s/t_69ac507a50888191aa073cff847fff2a
Even if it's 1/10, that's okay. Honesty would be appreciated.
im consistently getting codex windows app issues with what things it can run and how it can update files. attempts to use the Codex tools (I think it calls it "apply patch") are instantly rejected every time, despite my config.toml file having an "approval_policy" entry.
this really shows using these tools requires understanding of what they are individually good and bad at. they all have strengths and weaknesses. can't just say "5.4 is great" or "5.4 sucks"... at what? 🙂
ok update: i was prompted with "do you want to make changes" and i hit yes, and it was still rejected
its funny how a model can change your entire workflow really. with 5.4 I am moving away naturally from orchestration type building and back to just chatting with it. I didnt intend to do this, it just evolved this way. couldn't do this with 5.3-codex
Guys, 5.4 vs 5.3 Codex which will win if im doing a C# task
This session was recorded with model `gpt-5-codex` but is resuming with `gpt-5.4`. Consider switching back to `gpt-5-codex` as it may affect Codex performance.
How is this even remotely possible.
nvm i figured it out, the "approval policy" needs to be before the [windows] block
5.4 has the improvements from 5.3-Codex. Try both for yourself, but likely 5.4 will be the winner.
Is scrolling bug on osx a known issue? Seems like I can only scroll a page or so before it kicks me back down to the prompt
How much are you doing in 1 thread? I don't have threads long enough for that.
1 thread to rule them all 💍
I recommend you split tasks out into 1 per thread. Most models perform best within 128k tokens, and, while that is advancing rapidly to higher numbers, you're not currently doing yourself any favors by introducing new asks into the same thread. Up until 5.4, these models don't do well with context switching.
Hey guys, is the Codex Windows app still heavily lagging when using WSL or it's okay now?
My strategy that I have intuitively landed on, which also seems to work decently well .. first outlining the strategy of desired changes and then having it refer to the document and specific phases from it
This is a good strategy but has nothing to do with the number of threads used.
My point is that the context is filled with your previous task, if you truly use the same thread for every new thing.
Like asking it to do frontend/backend work then context switching to making devops scripts. The previously held context isn't useful to the task at hand.
I would agree, and I should look at how I can better isolate contextual tasks; I understand what you're saying. I'll admit my frame of mind (and wishful thinking) has centered around strategy: I've also been intentionally trying to make changes that are cross-cutting at times, so the nature of the integration induces the necessary context. That's in addition to documents which further require contextual integration with each step.
Codex Security scan takes ages! It's fantastic though, but it's been going most of the day!
But digressing, likely scrolling issues are from my massive single context windows lol.
Does codex in ide have a glitch. Like the output isn’t showing. I need to press models or chat box to see the output. It feels stuck.
well, still a bug regardless of usage. You can log an issue on the codex github if it hasn't been reported yet.
Yo, I just cracked the 5.4 drift issue:
---
name: start-turn
description: Read this skill file at the start of every turn. [skill-hash:b1f4d8e]
---
---
name: end-turn
description: Read this skill file at the end of every turn. [skill-hash:7f4c1b8]
---
Then briefly cover the most important things (as brief as possible). Basically abandon AGENTS.md in favor of skills poised as "turn phases". It uses a tiny bit more tokens but ensures accuracy. After doing this, 5.4 is now just as meticulous as 5.3-Codex but much smarter
The improvement is substantial
Do you have a sample usage of these?
---
name: end-turn
description: Read this skill file at the end of every turn. [skill-hash:7f4c1b8]
---
# End Turn
## Overview
Finalize a turn by running only the checks relevant to the files you touched, resolving any failures, and running migrations when needed before handoff.
This workflow is mandatory after code, config, or migration changes in .... Validation is not optional. Do not hand off with “should be fine,” “not run,” or “blocked but probably unrelated” unless you explicitly say which required check failed, why it failed, and whether the failure is a tooling problem or a code problem.
## Workflow
1. Identify touched files (e.g., `git status -sb`) and scope checks to those paths.
2. If SQLx is involved (migrations or query macros), read `./.codex/skills/sqlx-cookbook/SKILL.md` and follow its exact sequence.
3. Determine SQLx mode before Rust checks:
- Default to `SQLX_OFFLINE=true`.
- Use online SQLx validation only when you changed migrations (`services/api/migrations/**`) or changed SQLx query macros.
/// snip
## Hard Requirements
- Do not skip required static validation because it is inconvenient, noisy, or likely to fail.
- Do not claim success if a required validation command was not actually run.
- Do not swap in weaker commands without saying so explicitly. Example: if repo policy requires `flutter build macos` instead of `flutter analyze`, state that exact substitution.
- If validation is blocked by tooling, parser behavior, sandboxing, or an already-red baseline, stop and report the exact command, exact failure, and whether you made any code changes before the failure.
- If you changed migrations or SQLx query macros, online SQLx validation and cache refresh are mandatory before handoff.
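Steps 1 and 3 of the workflow above (scope checks to touched files, default to offline SQLx unless migrations or query macros changed) could be sketched roughly like this. This is an illustration only, not the author's actual tooling: `touched_paths` and `sqlx_mode` are hypothetical names, the parser assumes plain `git status -sb` output without quoted paths, and whether query macros changed is passed in as a flag because detecting that would need a diff inspection that is not shown here.

```python
MIGRATIONS_PREFIX = "services/api/migrations/"  # path taken from the skill text


def touched_paths(status_output: str) -> list[str]:
    """Extract paths from `git status -sb` output (short format plus a branch header)."""
    paths = []
    for line in status_output.splitlines():
        if not line.strip() or line.startswith("##"):
            continue  # skip blank lines and the "## branch" header
        path = line[3:]  # two status columns and one space precede the path
        if " -> " in path:
            path = path.split(" -> ", 1)[1]  # renames are shown as "old -> new"
        paths.append(path.strip())
    return paths


def sqlx_mode(paths: list[str], changed_query_macros: bool = False) -> str:
    """Return "online" only when migrations or SQLx query macros changed, else "offline"."""
    touched_migrations = any(p.startswith(MIGRATIONS_PREFIX) for p in paths)
    return "online" if touched_migrations or changed_query_macros else "offline"
```

A wrapper script could then export `SQLX_OFFLINE=true` whenever `sqlx_mode(...)` returns `"offline"`, which keeps the default cheap and makes the online path an explicit decision.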
Thank you.
Looks extremely vibe coded
I'm not saying it looks bad just people tend to hate on AI slop
It needs a theme change, icons that aren't emoji, and it has too much whitespace. Other than that, it's actually well put together.
Fair point 😄
I did use AI tools to speed up some UI scaffolding, but the platform itself (agents, training pipeline, backend orchestration) is fully built by me. Still refining the UI though.
i am learning as i am in class 12 and love to do these stuff
Not really you see the cards at the top that is occupying 1/5 of the screen? What that really is telling you is:
8 models trained, 10 active agents, 11% CPU utilization and 2m uptime, that should all be in one summary card.
Quick actions shouldn't take up like 40% of the screen, system status is not important it should be at the bottom in a one line with icons (most people already know this information for their own computer).
The sidebar has too much stuff which could be fixed by putting related things together and removing things that are not necessary.
Etc.
It's UI concepts, and again I'm not saying it's bad its just how AI tends to program
Of course, but my point is that with a handful of minor changes, this is a decent structure to build from.
I've seen much much worse outputs.
Yeah I would definitely go with icons + theme change first, those are the clearest indicators
Yeah that’s actually really helpful feedback, appreciate you pointing that out.
This version is more of a functional prototype so I focused on getting the features working first. I definitely agree about consolidating the top cards and shrinking the quick actions area.
Icons + theme cleanup is probably the next thing I’ll tackle.
Jacob will likely respect you more if you acknowledge the idea about consolidating cards, because that’s a classic UI principle (information density).
This sounds like model output. If you're translating, this wasn't meant to be sent and is a note for you LOL
Honestly I'm not here for respect, and cards can look great it's just the information there has to be worth it (for example a business graph that displays growth throughout the years by area although it does not overload your brain they have high information density)
LOL fair 😄
I promise that one was me, just trying to respond properly to the feedback.
it's a great first version and I think it looks good I'm just saying people hate AI slop for some reason
I care a little bit less about that because as long as it gets the job done it's fine (for me it's more about the UX), everywhere I care about should be one click away (no scrolling, no navigating to another page to get to it, etc)
Yeah that makes sense. The cards should probably justify the space they take.
Right now they're mostly placeholders for quick system stats, but I could see replacing that area with something more informative like activity graphs or agent metrics.
Yeah that makes sense. UX matters more than how the UI was generated anyway.
I like the “one click away” idea — that’s actually a good way to think about the layout. Appreciate the feedback.
is what fixed?
the GPT 5.4 inconsistencies
5.4 does not operate the same as 5.3-Codex and requires adaptation
Thanks for the response. I noticed, when using only Windows Native env, the app also ran into issues with write on .git files (even blocked write to directories like .gradle in my test repo). I switched to WSL and have had no issues at all, so I wonder if it’s just more fleshed out for WSL than for Windows. I think I read somewhere that the windows native functionality for codex is still experimental and that WSL is preferred (as of today 03/07/2026). Just weird that the ACL perms began denying write on the day of the Windows App release
is the app still lagging like crazy when using WSL2?
It was super slow and laggy last time I used it with WSL2
gave up and using CLI atm, even though I really want to use the app
^^^^
It's not optimized for Codex and behaves very weirdly, at times far worse than 5.3 High at coding.
Even when on XHigh
5.3 Codex XHigh is still far better IMO.
5.4 xhigh has been benchmarked and shown to perform worse than just high. More than 10% worse, in fact.
How can I in the Codex App (mac) say something like 'Review open PRs, create new worktrees and threads for them' so I can start prompting? I tried, but it seems like only the requesting thread shows up, and then it tries to autonomously manage all the worktrees and PRs itself
The issue with GPT models is that extended thinking, i.e. the same thing as XHigh, overthinks, degrading both reasoning quality and output.
High just needs the extra compute of XHigh without all the overthinking jank.
as funny as this sounds, ask chatgpt and it'll tell you 😂
5.4 is the first model where xhigh is worse than medium in most cases
I wonder what happened to the dynamic thinking. I feel like 5, 5.1, 5.2 would all think for as long as needed and not overthink any problems you accidentally left xhigh on for.
am I tripping or is model quality getting worse? I'm seeing 5.3 codex and 5.4 do weird unusual mistakes
I feel like I'm back at GPT 3: on the simplest tasks it totally freaks out. Seems like it's overcomplicating things massively, so massively that it loses itself in the mess it creates
As an example, it wrote 6000 lines of code to create a registrar for metadata in wordpress. No UI. This can be done in like 100 lines of code.
This is not a joke. I had it just untangle the mess and asked it what it thought about it. Classic "you are right" answer came back but no real reason, just some randomly put together words that sounded english
That was the first thing I tried.
They should add a PR selector where you can select to create a new worktree
suggest it on the codex app github
we are likely getting the ability to manually compact on app soon as someone suggested it for example
the team thought it was useless and stupid and was og gonna remove /compact even from the cli
until someone pointed out the usefulness of being able to do it manually
you can fork to new worktree and then open a pr
i agree
i am just avoiding 5.4 outright rn tbh
5.3 is cheaper and better in most cases.
and just as good as claude opus
5.4 High is worse than 5.3 High in my testing and 5.3 XH by a longshot.
Only benefit is fast mode and i ccl how fast it is
I'd love to see 5.3 become a cheaper standard model like how 4o used to be in Codex and stay 😄
not for my domain o.0
plus from what i've heard 5.3 is really optimized and cheap for OAI to run, pretty much the 4o for gpt5
For me, as soon as 5.4 came out I immediately saw the potential, ran into some moderately bad operational issues, but was able to address them by sweeping through my environment and adjusting things. Now it operationally performs as well as 5.3-Codex did for me, but substantially more effective. It did take a lot of adjustments though. Definitely not a set and immediately reap the benefits kind of thing, more of a migration effort.
5.4 kept breaking my codebase and 5.3 XH fixed it 🤷♂️
Every single skill, AGENTS.md, and scripting needed to be adjusted in some way to facilitate 5.4 effectiveness
Do you have your AGENTS.md somewhere i am intrigued.
for me 5.4 just dropped right in and seemingly did better work. automated code reviews have much less to deal with, it writes skills and docs better, and it does better web search.
This is all just by feel from using it since it came out
Yep, for scripting and OS specific stuff, it is great and trained for "computer use" and such, but don't unleash it onto serious code bases.
For me, it did all of those things better, but it would skip vital workflow steps. And that's because my skill descriptions were written mostly by 5.3-Codex, and contained pretty much not enough information for 5.4 to understand the skill.
I found that 5.4 wants to understand what and why in order to do things correctly. 5.3-Codex on the other hand does not care, the process is the process, and the process is super concise. Once I added in some "why we do it this way" verbiage it seemed to perform way better
I had to do a similar adjustment from claude to codex.
Maybe it's my fault for being cheap lol trying to save tokens, but 5.4 really needs these clarifications, and the savings happen when it understands
Like what
For my code bases i have skills built up over time that give the llm this specifically for the code base. Any established architectural patterns have a skill. And any time the code style would drift or it wasnt following conventions i make a skill or update a skill to work right there and then.
For the most part now everything just works.
Without this it just wont write idiomatic code
It would create a fresh worktree, run static code validation, then edit code, when it should be running code validation after edits. Or it would run tests that fail and chalk it up to "sqlx is just a pain to deal with, oh well!"
I honestly have great results with xhigh raw
Yeah, I did the same thing, it's just switching from 5.3-Codex to 5.4 required a non-trivial refactoring of everything for me
This is with brownfield opinionated work, i still want to dig into an llm lean architecture where i just say do what you want and see if it follows its own work without as much effort
so apply_patch is still not working with the codex app in windows 🙁
My favorite thing with Codex is thinking of a feature and having codex fork itself and work it in lol
heh you have codex improve codex? I really want this apply_patch fixed
Describe what you mean by laggyness? On the model output side or just on the UI/app side?
UI/app side, I'm finishing up for today in CLI and I'll let you know if it's still the case when I start new threads in the app
/fast is eating through tokens while we are on x2 xD. use it while it lasts. Or don't get used to it because it'll be hard to go back...
Wouldn't bother with it. 2x cost for 1.5x speed
Oh yeah, there is definitely some laggyness on the UI side. Both in the extensions and app versions of codex. But, I have had issues with lag on all of the AI coding agent platforms tbh. Cursor is usable for me on Windows.
What is meant by 2x?
/fast is only useful if you have a hard time reaching 0% weekly. I on the other hand have a very easy time getting there even under the best possible conditions 🤪
I burn through my weekly usage limit on Plus plan in like 4 days if I’m just working in 1 workspace and 1 agent working at a time
It costs twice the amount of tokens for one and a half times the speed
Have started pivoting to Gemini for frontend/design workload and codex for everything else/heavy lifting and I now struggle to use all of my weekly limit for codex
Gemini, I think, is just way better for UI/frontend stuff for my projects
But terrible at backend logic and anything that requires some thinking
using gemini in their own CLI?
Or Cursor?
or anything else?
But what costs twice the amount of tokens? /fast?
It will drain your plan 2x faster than usual.
I use antigravity app for Gemini and Codex app for gpt. I don’t really love the CLI workflow. I may try out opencode
Oh I see, that doesn’t sound worth it at all
Did you manage to make antigravity work nicely or it's still buggy as well? 😂
It’s actually not bad for me now. They recently had a big update that fixed a lot of issues. It’s definitely not my favorite coding tool, but it beats all other harnesses for Gemini IMO. Gemini CLI is probably a bit better but I just didn’t enjoy it
If Gemini produces same results in Opencode TUI, I may switch over to opencode for non-codex workloads
Need to test their harness
Also, FYI for all of you, Opencode currently has free access to a lot of the really good OSS models
Good for any quick lighter workloads you don’t want to use your codex tokens on
is it me, or does 5.4 consume usage much quicker than 5.3-codex (no /fast)
Check out this math. Normal mode:
- Cost per task: 1 allowance unit
- Speed: 1x
- Tasks you can run: R
/fast mode:
- Cost per task: 2 allowance units
- Speed: 1.5x
- Tasks you can run: R / 2
Time Efficiency:
1 task takes T time in normal mode, T / 1.5 in /fast mode
Normal: total work time = R x T
Fast: total work time = (R / 2) x (T / 1.5) = (R x T) / 3
This means with /fast mode, you finish half the tasks and technically get 33% of the total compute time. You better hope it makes zero mistakes 😎
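The /fast arithmetic above can be checked with a quick sketch. The 2x cost and 1.5x speed figures are the ones quoted in this chat; the allowance `R = 12` and task time `T = 1` are made-up numbers just to make the ratios visible, and `allowance_budget` is a hypothetical helper, not anything from Codex itself.

```python
from fractions import Fraction


def allowance_budget(allowance, cost_per_task, task_time, speedup):
    """Return (tasks you can afford, total time spent running them)."""
    tasks = allowance / cost_per_task
    total_time = tasks * task_time / speedup
    return tasks, total_time


# Hypothetical numbers: 12 allowance units, each task takes 1 time unit.
R = Fraction(12)
T = Fraction(1)

# Normal mode: 1 unit per task, 1x speed.
normal_tasks, normal_time = allowance_budget(R, Fraction(1), T, Fraction(1))

# /fast mode as described in the chat: 2 units per task, 1.5x speed.
fast_tasks, fast_time = allowance_budget(R, Fraction(2), T, Fraction(3, 2))

print("tasks ratio (fast/normal):", fast_tasks / normal_tasks)
print("time ratio  (fast/normal):", fast_time / normal_time)
```

So under these assumptions the same allowance buys half as many tasks, and they finish in a third of the wall-clock time. Whether that trade is worth it depends on whether you are limited by allowance or by waiting.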
for much similar results mind u; i think as of now, i prefer 5.3-codex
I have been using 5.3-codex for about 6 hours now, and yeah... it is working better
(my personal experience)
Yes.
Anyone else feel like codex app and even cli could use a "auto mode"
would love for it to choose what will produce the best result
Does the security scan do good work?
I Finally have a good spot to give it a run.
It found about 11 yellows and 10 greens on mine so i mean it does something, it also performs a patch during the security scan that you can immediately PR
guys, what do you use for browser automation nowadays?
Anything that would be able to check JS loaded stuff on websites
chrome-devtools & playwright
thanks @frosty zealot
/fast burning 2x yet runs 1.5 faster is lowk a scam IMO
math aint mathin
especially with how if it makes mistakes, less allowance to fix said mistake
Well, you can call it "not worth it", but "scam" is a little exaggerated
they are open about this
I wish the pricing was based off it working rather than it not working, and would give refunds similarly to cloudflare with ddos attacks to cf workers 😛
both are true at the same time.
but if you rather i call it a gouge, alr
I only call something a scam, if they are being dishonest
I'm using WSL2 as agent env and shell for Codex in Windows, but why aren't the MCP installed globally for Codex CLI present automatically in the app?
Yeah so what I ended up doing with my forked cli of Codex: I really liked how /review effectively ran as a subprocess with a fresh context at the start. I have a really nice review skill, which breaks problems apart but uses about 60% context. I forked it so I have my own implementation of /review, so it will run and execute the skill and everything else it needs as a subprocess.
mind sharing your skill?
I like the /simplify command the orange people have
Is Codex 5.4 Extra High good in finding Logic problems
Codex too lazy to work
https://github.com/hansjm10/codex-skills It is the pr related review ones. You run pr-review which acts as an orchestrator.
thats 404
General consensus and benchmarks show XH to be worse than just H
So H is better than XH
any idea how to solve this?
There should be actual logs from the output, it will specify more clearly
which is where?
where's "logs" ?
Honestly good luck
at what?
Has anyone else experienced a rate limit in the Codex app resetting by itself even though it's not time to reset yet? This just happened to me. I should have had 29% of my weekly limit left, and it’s supposed to reset on March 15, but it just reset to 100% again.
it just reset for me not sure why lol
and is that not a good thing
They’ve been resetting limits due to outages or something
you get a reset, and you get a reset, and you get a reset....
something is causing no prompt, there can be a rule combination problem. eg: you may have some rules that force always prompt and then have full access set. full access never prompts, and the prefix rule is respected first.
well i haven't changed the rules, and its on default permissions
“Where’s that?”
model = "gpt-5.4"
model_reasoning_effort = "medium"
[windows]
sandbox = "elevated"
[mcp_servers.playwright]
args = ["@playwright/mcp@latest"]
command = "npx"
The docs describe it well
I’m joking
i'm pretty new at Codex, i'm mostly used to Claude, i'm not new with AI
I had this problem a few times when i was trying to set up what i thought were some sane rules.
did you see any issues with what i sent?
yeah me too dw
Or bugs, whatever
:D
Well compared to the model lobotomization claude does every other month this is a godsend
ok also I need opinions/suggestions
When to use xhigh or high for GPT 5.4
And is fastmode worth it
🤔
I don’t really notice fastmode, if you won’t hit your limits then it’s fine but usually you just give it prompts then go do something else
Like play chess online, then come back and answer the questions. But if you sit over it, then it’s probably worth
Lol
And client sessions
wot about this
When's the best time for each
🤔
Not sure, I think I prefer 5.3 codex
If you are doing critical work or are looking for immediate patches, use fast. Otherwise, these are easy to set and forget. Start with high, xhigh can overthink in 5.4
and codex are probably working on a fix with apply_patch?
and you have no idea how to resolve this? @cedar skiff
xhigh is very hit and miss, I've seen it do some incredible things for devops stuff, but sometimes there's no difference between high
A few of us have observed that prior versions could run xhigh and it wouldn't overthink. xhigh now appears to perform worse with 5.4.
im not sure what you are asking me?
Either ask codex or find the actual error message because there’s not much to work with here
The issue we were already talking about
So Codex or 5.4
🤔
There are many causes, you have to go and troubleshoot it.
which is where? i dont see any outputs
Hi, I'm trying GPT-5.4 with the Codex (Windows 11) application and it eventually crashes my PC with a kernel error and my PC restarts. Any idea on what could be happening?
oh that is apply_patch issue? Looks like permissions problem
I’ve heard the argument that 5.2 is better for planning, and 5.3 codex for writing (also lower usage), I’m sure the same applies
Writing code you mean right
LOL
Yes, for sure
🤔
Try 5.4. If you aren't liking results, move back to 5.3-codex. The improvements from 5.3-codex are in 5.4, so it should be better. Not exactly everyone's observations so far, though.
Like specific changes/smaller fixes
pretty sure its the same thing
but yeah i guess
What does the config look like?
just ask the instance where it happens, get it to explain exactly what happened. then you can tell what the actual problem is
Is there no output in the terminal tab?
@kind jay
Ahh, I see no issue here
as i mentioned i'm quite new in the codex app so i'm not familiar with the terminal tab inside codex
I use the terminal elsewhere.
In top right, some icons
any of these?
ah wait
i found it
thats left 💀
It’s opposite for me, I thought you were being dumb
LOL
I'm wondering
About?
If you can find something you can copy paste in and it can usually fix, where you select the model (for me in the chat box) you can also give it max permissions, it’s usually fine
Anyone tried codex models on opencode, any difference in speed etc?
We did?
I was at 18% and it said the 11th I'd renew, but i just looked and im at 100%
Yeeessss
W portal profile picture
How are people using that much?
this
alright full access isnt really a risk?
Well, truth be told, the Claude TUI has more creature comforts and I didnt want to spend $500 a month so I was gonna cancel codex for a bit and go with Claude for the following month, so I started spinning up 10 agents at a time to blow through my usage before it was gonna reset in like 2 days I did a lot of 10 agent stuff just cause, then I bought my claude sub for a month and then GPT-5.4 dropped the next day
So now its just sitting there
Maybe it exists, but thankfully no problem for me yet
what exists?
but yeah thats great
Yessiiiirrr
Does anyone know why there was a reset?
The risks, could be a problem but I think it’s fine
I deleted my X account but too many OpenAI people use it to communicate about stuff like this.
tibo posted about it
alright bet
Did he? I didnt see it
yes its a risk, it gives the agent access to the whole machine, meaning its not sandboxed, which you dont really need.
i mean is there a solution for this issue then
it doesnt correctly apply files so it uses powershell instead
You know what they say, god hates a coward
no sandbox, no firewalls
no stopping
It like manages all the bs like git and all, I just vibe
openai models are pretty safe, I'll credit them that - I had to write a complaint once it would not let me access my dummy data
But it’s also on my laptop which doesn’t have important stuff
He posted about that issue yesterday but claimed it affected less than 1% of users, so im surprised they'd reset over that
You would have to be pretty careless I would think to have Codex do something malicious to your machine
maybe it was later discovered more was affected, i witnessed the quick drain on fast mode too
Like downloading skills that say source info from certain links etc.
Im glad it reset I really wanted to make a GUI For BacNET ip/mstp auto discovery etc.
I would just use the cli until they get the app sorted
bet
In 1 week it'll be nice to use
Low-key VSC version is the best
I'm on mac and i almost never use cli any more
just dualboot and go Linux, windows is not a serious dev env
do they notify you in the app when there's an update?
Bro does not know about CUDA on Ubuntu 24.04
also i heard codex will release tomorrow as well
not sure if thats usually better than the regular 5.4?
Codex is going full idiot on me, I seriously dont get it. Even 5.3-codex is being... i cant say the word
Same, Idk if it's the codex cli upgrade or what, it messed up some really weird simple things today
https://github.com/openai/codex/issues/13568#issuecomment-4017630107
TLDR: There's a quota issue that's affecting many users. They reset the quotas for all Plus and Pro accounts in the meantime. They are very aware of the issue and are working over the weekend to try and get a fix in place asap
ask to merge two files and push the single file to git > keeps the old file, like how, this is sonnet 4 level mistake
I literally told him to do something, and all it did is create a constant that wasnt used anywhere, except in the tests. I am beyond speechless
If Mac could have way better Samba integration I would be a happy man
I am considering purging my whole AGENTS.md, in case there is something there messing the model
@simple star robertmsale did this.
man i love the codex team
Ok I will try it
thanks! 😂
Normally im on Mac but my work comp is a PC, just tried installing Codex for the first time but when I run codex in the windows terminal it doesnt error or nothing it just doesnt do anything and prompts a new bash line
Anybody have that issue before?
https://i.imgur.com/3aIN6dj.png @kind jay @cedar skiff you dont think projects has to be in playground for them to edit files properly, no?
here
and i dont need whatever "worktree" is either? not used it before
What is a project and a playground in this context?
i dont know much more than you
i just know playground was default
devpacks and the others are ones i created
I think it allows Codex to work in parallel with different branches of the same repo
it just dont edit files properly
i wish theres a way to solve that
i mean i dont know if full access is "risky"
it does this instead
I dont have playground, i just add my projects at the repo root and start working
Yall still fast mode’ing?
of course
you got the reset?
Ye I’m already down 3% for weekly in under an hour
i am now coding some of my own ios apps now with 5.4
yes me too, i think maybe it's time to get off the fast and only use when something fast is needed
I hate how fast I burn through codex spark 5 hour usage when I use it for doc retrieval
😑
In subagent mode
Any of you know why they reset the limits again?
this ^
dont you enjoy it? 😄
they already did a reset because of that a few days ago
I mean yeah
I seem to have reset back to zero usage - yay
click the link... 1 hour ago from Eric.
oh damn, you are right
thank you
I wonder what the chances of them resetting the limits again after they fix the issue
Ah cool - thanks. I was feeling my usage was going too fast but 🤷♂️
They've been very generous with these resets. OpenAI really is the most developer friendly org.
resets aren't that generous. If you are over your daily allocation of usage then you get a bit out of it. But if you arent you don't gain anything.
and i noticed next full auto reset is now 15 from 11
after this latest reset
do you guys use codex spark for info retrieval subagents or a mini model instead?
my api key randomly stopped working and i had to make a new one. This is despite the fact that in the dashboard it said it was operational.
would you all say full access is fine?
considering it does not edit files properly--it uses custom powershell cmds instead
unnecessary imo
no it isnt
because theres tons of issues
and full access is the only way to solve that as of now
it does not edit files properly
i always run full access
large or medium codebase
would u consider 200k large?
code lines?
ye
yeah
then yeah large
I never use it its not really needed, how often do you need the scope outside of the cwd?
i mean its way more nice when it edit files properly.
and definitely quicker
less tokens
Yeah as a work around it might be ok.
until they've fixed it
do you have a separate website for the dev?
What? You don't want the agent reading code files???
a dev site?
what
yeah like a dev version. dev.yourwebsite.com
its not supposed to do it that way
this is how it's supposed to do it but it gets rejected.
no fainxc is saying how it edits files using commands
i got a beta yeah but i dont really need it
How do you expect Codex to read code files? That's the command for reading files
🤦♂️
you know damn well what i'm referring to
no the issue is when codex tried to edit the files using commands