#codex-discussions
Codex really needs to be able to stop, to ask questions, even outside of "plan"
it is just funny how you guys wanted it not to ask questions
you should be able to build that into your skill file
ive barely used plan mode in codex yet and get plenty of questions 🙂
not via a question tool though if you mean that
lots of complaints back then
You will not have heard one from me. Conversation is necessary to get good results
Thats why Plan mode is great
@lean lark Thats gotta be fake
they closed the coding gap at about 5.2 imo. They were better at compaction and coding at that point, but 5.2 was really tedious to prompt because it didn't get the semantics of words like this, that, it, etc. It would pick the wrong referent and do the wrong thing: you would say "let's do this" and it would pick the wrong "this", same with "that". It also never adjusted its conversation to match the level of the user. It was technical and hard to understand if you didn't know SWE terminology and architecture. So despite being better at code it wasn't seen that way, because vibe coders and weaker devs still had trouble understanding it and prompting it. Claude matched the vocabulary of the user it was working with, spoke at their level, and understood the semantics of this, that, and so on. So it was easier to understand and easier to prompt, but not better at coding. People felt it was better because of its communication skills. gpt 5.5 is a huge step up in conversation skills; it still isn't quite at Claude's level of just getting it, but now the gap is small enough that the coding difference can shine past the language barrier. Fun fact: as a coder alone, 5.5 is about the same level as 5.3.
Oh and 5.2 was REALLY slow
What you're citing is exactly the issue I had with that v4 project where I was trying to drive its semantic understanding. In 5.2 I couldn't continue, the project stayed on the shelf. Now we can do it.
Coding while I am on a Lyft ride. What a time to be alive
you aint coding tho, you are yapping in discord 🤣
last time i used a macos app written by openai before this week was last june, and it was the chatgpt app. it was such a pile of trash i had low expectations on codex
Asking a machine to code for you
Have you tried codex app?
I love it
I’m tired boss
Yeah who codes by hand anymore
yeah it's so good, i'm wondering how long it's been like this
I started using it at 5.3 but it wasn't as good
Folks who are on the verge of or already have been promoted from SWE to customer
I did ooga booga code my open source project years ago
But now I wanna vibe it to golang
the chatgpt app is good now too 😄
Yeah me too. Also got a BS in Comp Sci like 2 months after GPT 3.5 came out 🤡
I dunno about "Fun fact" but the numbers are good for 5.3-codex and 5.5: https://chatgpt.com/share/6a07ad6b-eb68-83e8-a81d-686bad359b65
Codex 5.3 medium is my goto
it legit shows it in the first line of that chat o.0
1.8 points diff
It's better at everything else
Agreed on "The shared overlap is narrower than the pages make it feel" but all of these models are progressing in a similar "evolutionary" manner, not "Revolutionary".
Can someone explain Codex to me and what I could use it for?
The 5.5 page doesn't even chart swe-pro; they mention it mildly in a sentence. If the diff was big they would have it front and centre.
I wrote about this another time:
#codex-discussions message
This one is interesting as well
https://www.reddit.com/r/codex/comments/1t5ipjd/gpt55_xhigh_is_the_strongest_coding_agent_weve/
A lot of people are coming to the same conclusion on their own terms and then talking about it in public forums, YouTube videos, etc.
Where ChatGPT presents a series of questions and answers, prompts and responses, Codex actually does things in the real world. It reads and writes files, assembles data from different places, kicks off other processes. It's active, not passive.
Howzzat for an elevator pitch?
Interesting
write code mostly, but can do much more than that
I'm not sure i want my context thinking about pizza and happy hour before it updates documentation
Yeah, that's why I don't use memories, but I have a session retrospective skill that I sometimes run. It puts stuff directly in tracked repo docs though, not hidden in some memories folder
same haha
yeah you would probably prefer bananas 😄
a reset is coming
oh boy
and I'm over here at 85% left with 4 days
😮
is codex mobile supported on Windows now?
I heard somebody got it working on Windows
I tried to connect codex mobile to my Windows machine but there's no reaction
maybe just skill issues?
There are some ppl in the chat who mentioned how to do it for Windows; there are feature flags you have to add to the config.toml
gpt5.5 fixes itself whilst performing worse
Idk 5.5 seems.. dense to me. Very much so. It makes rookie mistakes in coding and dumb decisions, often ignores instructions, or on the other hand follows them way too literally.
In coding it feels like a regression to be honest. I still prefer 5.3 codex
is there a big difference between 5.5 and 5.1? we only have 4.1 or 5.1 available and not sure what to expect if we update
obviously
its huge
Does anyone let the llm do 100% of the coding while you check for mistakes and test to make sure things work and are safe?
We don't really test things around here, why would we? It's not like we'll use the stuff afterwards
I mostly do this nowadays
I look at the code that matters but i dont write any of it
My projects have robust unit testing and integration testing
😅 but do you let codex for example create and run the tests? I think for my project, after every feature i have it create tests, test locally and then test it on production (thankfully i got approved for the cyber stuff so it can actually do it)
how many beers does it take to develop a 100M b2b saas
codex writes all the tests
I curated and adjusted skills for writing them until they worked
Codex loading insanely slow for anyone else?
Taking forever to respond and keep getting connection issues
Without skills codex falls into lots of bad testing patterns: it won't use helpers and will reinvent the wheel in every test file, it will create low value tests, it will test implementation details, it will manually mark blocked tests as passing and just tell you about it in a report, and it will mistakenly update implementations to make tests pass.
yup good skills are key
anyone use Codex on Windows? Is the app best way to use?
I’ve used CLI and have gotten way better results
up to your preference, the core functionalities are the same across all ux surfaces
Ima go touch grass guys
that photo plays tricks with your mind
How
codex just told me to use kubernetes if I hate myself enough :\
I hate kubernetes
it looks like its on its side
What is the most unlikely thing you’ve done with Codex that you thought would be impossible?
New motivation prompt for codex "Make no mistakes or the next project will be using kubernetes"
Fr? U think 5.3 is better. I would use 5.3 if it was good at steering messages. But it forgets instantly.
5.4 ig
guys how get damn near infinite ratelimits
i still don't know why i hit such fast weekly limits...
I have plus n my shi going dowwn fast
I mean I didn't realise codex would go off, look at websites, parse them and then come back with opinions but that was just newbie surprises not codex being fancy
i don't know what else am i supposed to do to reduce that
maybe anyone here knows what i gotta look for in local .codex?
Mostly replace manual testers for flutter apps in mobile and web lanes.
I just didnt think it would be able to do it
does any plan give very high limits ?
"surely there is something out there already so I don't have to do this myself" => codex goes off, finds the data I need on the ABS site, tidies it up, turns it into json, writes an editor, fills in an entire extra field from the next step of my documentation without being asked
me: "oh"
I got muted lol
I just used half my 5 hour limit on ... idk, something, it was very annoying anyway
damn
I had infinite limits
ended tdy
I might need to buy the $100 a month plan atp
I need to keep codex away from drupal. Boy does it eat the quota
me playing Spyro 1, later 2, while awaiting limit reset
lol
@cedar skiff how do you mitigate heavy weekly consumption on Pro 20x?
like, how do you check what makes rapid consumptions without ever being --yolo or xhigh on 5.5/5.4 ?
i wonder if it uses more since I'm using wsl
I posted an offensive it seems. Damn pg 13 here
I make my orchestration use weaker models where they can
Thats it
same as mine
yet... still
i try whatever i can to make orchestrators just... work, but even still
at that point, either what i have is so fundamentally wrong, or Codex as harness is not something that works for my workflow, so that means making a custom harness with the flow i need
I also dont go crazy with it, i could spin up 10 agents but i only have a sequential 1 agent run going instead. This way i can manage my usage.
so, you limit threads to 1 only?
@cedar skiff how do you lower usage
so that only 1 subagent can run?
is this normal usage for plus ?
I say do this huge list of tasks sequentially with codex 5.3 high reasoning subagents
Or something similar, maybe if its information gathering for a task i might have five 5.4 mini models do it.
Then it just churns away and i can stop it every now and again and get a progress update
https://github.com/xsyetopz/OpenAgentLayer is a thing of my own that i use. on pro20x, there is a major split between many 5.4-mini's && 5.3-codex's. one or two use xhigh when you choose pro-20x as your sub for codex
some i recently switched to 5.4 from 5.5
but if all subagents are to be centred around 5.4-mini/5.3-codex, then there are fundamental trim-downs needed next reset
but i'll also contact openai support in regards to what i'm doing wrong, && hopefully get some guidance
if you can see what i do wrong, maybe asking me for specific files may help further?
like from my local .codex
as you seem to know the workings more than i do
not sustainable to keep paying 150-180 bucks per month && hit weekly before i even make it to half a 7-day window
to get 5.3 working properly i do a refinement loop first: I get 5.5 high to write a skill, get the subagent to do a task, then 5.5 audits, interrogates the subagent about the problems and then runs the same ask again after updating the skills. There are nuances to this system though.
You have to tell 5.5 not to start making explicit rules for the given task; all skill changes need to be generalised and agnostic to the task. When it talks to the subagent it needs to make sure the subagent knows we aren't looking for an apology, we want to know the reasoning behind the mistake, otherwise it will just say "oh yes, i should have done this other thing instead, my bad..."
Back on the topic, i could use heaps more usage if i had it.
So i just temper what i can do.
this is my usage this week so far
Damn you are doing too much
It's not enuff
Too much context = hallucinations
I think you are confused
This is pretty much how goal works
It's just iteration on the same task over and over until we get what we want
5.5 crushes this work load
I got mad at codex 5.3 cuz it didn’t adhere to my rules
So I had it write to the rules to explicitly follow it which it does now
Thats why i do a loop to refine the skill
I should do that but it doesn't learn until a specific event happens
Until then I'll call it out on its mistakes
you can go from 1 task to 5 and then scale to automated or similar
It took a dozen prompts for it to get the semver rules right
From dozens of prompts
Like it would iterate the version on every prompt. I’m like no, compare the local unstaged commits with the local head to the remote
here is an example of it iterating on a skill using goal
It's just going to keep churning until it works
Share code
ㅠㅠ
What's going on? The post I just wrote has disappeared.
Is it true that Codex Mobile isn't available on Windows yet?
Is anyone using it?
I wonder if I used a banned word or something controversial.
you're asking Codex Mobile on Windows
i think you realise your misunderstanding, right?
You know how the Chat-GPT mobile app recently got an update that lets it integrate with the Codex app?
But that feature hasn't been updated for the Codex app on Windows yet.
My friend uses a Mac and gets along with it just fine, but I'm on Windows, so I haven't quite got the hang of it yet...
long-term solution: move to 🍎
short-term solution: hope that people actually care about Windows here
ㅠㅠ
not even microslop cares about Windows--see Windows 11 for more information
I just wanted to vent in case there’s been an update and I’m the only one who doesn’t know about it T_T
so why should OAI lol
That's really sad FACT news for Windows users.
Codex mobile connect not working?
is anyone getting "Oops, an error occured." msg?
are we getting a reset maybe?
codex mobile is working, but my phone is a bit hot even leaving it standby mode
If we get a reset this will be one of the times we actually needed it
Hey, quick question — is Codex working normally for you guys? Since yesterday I’ve been constantly getting this error:
"We're currently experiencing high demand, which may cause temporary errors."
And I literally can’t code because of it.
hit the reset button
Guys i feel claude code really tries to address the user requirement more than codex
Codex just does something for the sake of the request
This is the same issue with gpt i feel!!
Just a personal opinion
heh, commented to codex that I was staring at this bug for half an hour before actually seeing it and it gave me the equivalent of *pats the human on the head* "it is ok feeble human, you sometimes miss things on the page when you are not looking directly at them"
man codex is just going downhill
decided to update and rip
and its so slow and still freezing
mine has been fine. I'm the wetware failing the testing
are you on mac
yeah but not in the app
For Windows nothing?
im going back to the extension
yea convo loads fine in the extension
sucks because even t3 code doesn't work for me anymore
we back in cursor using the extension lol
not working...
codex mobile not appearing
I've had that a few times. "That's completely normal. Humans have limited attention spans which means they can miss things. An AI doesn't get bored and can stare at code forever with the same attention level." kind of
yes like that, a very slight change in wording would be condescending but it worded it very diplomatically
Mine often sounds condescending, in the absolutely kindest and most caring way
"You are weak and pathetic. Luckily you have me to take care of you and I'm always here for you" is the feeling it can give off at times. 😛
cli or app? They seem different
I'd fold if somebody told this to me
oh chatgpt itself can be overly nice
64% 3 days left - 5x plan
not bad ..
codex 5.3 .. best model
usually doing parallel agents ..
codex 5.3 xhigh
nice israel flag
You should probably try to use it all up
Tibo is resetting tomorrow
oh man
validated ?!
Hi, I just created a local solution for the latest Windows Codex version. FYI
https://github.com/openai/codex/issues/22985
Im all g with it
code maxxing now
saw it like 6 hours ago
when reset 🤣
reset maxxing .. he is on
I have a couple of /goals running in the background overnight on 5.5 medium fast
imagine he would say ... sorry no reset lol
He wouldn’t, he already mentioned a reset and to /fast max
tokens... omnomnomnom
Do you have any good skills to share mayhaps?
mine are all project specific and tuned to the local workflows, let me check what generic ones I have
the only really generic one i have is one i extracted from claude superpowers, i use it a lot as well. it's the systematic debugging one from here:
https://github.com/obra/superpowers
Thanks 🙂
I use it in most of my work flows
i do task-of-some-sort -> blocked or complete -> if blocked, new 5.5 high agent with systematic debugging to fix the problem -> back to the worker to finish the task.
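A minimal sketch of that worker -> blocked/complete -> debugger -> worker loop, assuming agents can be modelled as plain callables. `Result`, `run_task`, `fake_worker`, and `fake_debugger` are hypothetical names for illustration, not part of Codex or any skill.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    status: str       # "complete" or "blocked"
    detail: str = ""  # blocker description, or a summary of the finished work


def run_task(
    task: str,
    worker: Callable[[str, str], Result],
    debugger: Callable[[str, str], str],
    max_rounds: int = 3,
) -> Result:
    """Run the worker; on a block, ask the debugger for a fix and retry."""
    notes = ""  # guidance handed back to the worker after each debugging pass
    for _ in range(max_rounds):
        result = worker(task, notes)
        if result.status == "complete":
            return result
        # Blocked: hand only the blocker to a separate debugging agent.
        notes = debugger(task, result.detail)
    return Result("blocked", f"still blocked after {max_rounds} rounds")


# Stand-in agents (assumptions) so the sketch runs without any model calls.
def fake_worker(task: str, notes: str) -> Result:
    if not notes:
        return Result("blocked", "test fixture missing")
    return Result("complete", f"{task} done using: {notes}")


def fake_debugger(task: str, blocker: str) -> str:
    return f"fix for '{blocker}'"


outcome = run_task("add update dialog", fake_worker, fake_debugger)
```

The `max_rounds` cap is a design choice of the sketch, so a task that stays blocked gets reported instead of burning quota forever.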
Unpopular opinion… skills and agents and plugins are extremely overrated
I use only one tiny agents file per project - it’s maybe 10 lines of scoped instructions.
Mostly, everything else is just noise in the context
if youre not using skills and orchestration youre behind
Hooks are much more powerful and controllable if you need to interfere in llm‘s chain of thought
Name me one thing the llm can’t do without a skill or agent
it cant write idiomatic tests.
It also can't be on top of the current best practices; it has holes in its thinking that you have to guide. It doesn't know the entire spec of something like material 3 or the ios design system etc etc
Neither does the skill you add
An MCP could solve that somehow, but also not entirely, because llm does not know == means it needs to search for specific answers
I don't use "skills" much per se. I have one I made
But that's just a format choice.
I use written procedures a lot for LLMs. And it can easily be argued they are the most important thing
it also doesnt follow architecture ideas or understand how to work within a given projects policy
If you want an llm to reason how you want, you need to give it very explicit meta instructions
You misunderstand the difference between an mcp and a skill.
Written procedures you mean basically a project scoped agents file with instructions specific to it?
Otherwise it will just vibe how it does stuff for you based on whatever generic defaults are salient in its training and generic instructions
mcp tools LIVE in the context; skills are called when they are needed.
"Neither does the skill you add": yes they do, you clearly don't understand how to use them
Ok 🤷♂️
Yes, written instructions. I'd say that's what matters. How you pack them and how you bring them into context is more of a style choice
Yeah
I mean I'm with you that "skills" per se are not some kind of magical skills like many imagine they are. They're just some text that gets a bit of special treatment, but nothing more
Plus a bad skill is worse than no skill
😂
the models are trained on the skills standard, and they use them really well. they are either proactively called or explicitly called and the context is not loaded until that time. they don't pollute the context; their information is injected at the very time it is needed or not at all.
Using your own home rolled system wont compare
you mean the cli, not the codex app
Once you work out the power of skills you will kick yourself for not using them sooner
If you never work it out you will be doomed to babysit your agents forever
Models are trained to follow instructions that's important.
The part with "trained on skills standard" and what you imagine it means, it's hard to argue what it adds on top of instruction following without a serious scientific A/B
You literally say that same instructions, will work 10x better if they re put in a skill as opposed to "home rolled system".
Which personally from my experience I don't find it to be so
this could get as heated as claude v codex
i dont remember saying this?
I do remember saying your home rolled system wont compare.
I would bet my house they do post-training on skills usage.
Your home rolled system does not have the same ability no matter how you look at it.
Yea i added the 10x, but everything else is basically your claim 🙂
Because you do need to compare the 2 on the same instructions expressed
I used to use them extensively actually
Figured a) they aren’t guaranteed to be loaded and b) using a custom orchestrator you’ve much more control over what the llm does and how/when
Looking at all those skills people use ranging from „you’re an expert xyz“ … doesnt make your llm any smarter.
Loading a skill that has let’s say a tool to extract/preprocess specific file format it wouldn’t understand out of the box etc, yes, that’s useful. But it’s still arbitrary whether it’s used or not.
Having an actual orchestrator is the way if you need that level of control, but it’s not as easy as loading a file saying „you’re an expert“
Like both skill and home rolled system tell the same sequence
Cause otherwise the comparison is not about skill vs home rolled, but about the quality of instructions
from what I understand, skills can also include code that gets executed (assuming it can run in the agent sandbox)
giving them a certain level of determinism
Exactly what I say above
Bro...how you school your orchestrator to do stuff, for me is 100% the most important thing
I sent it to Harvard. Is that good?
Was thinking homeschooling but it’s frowned upon
Haha
skills have a small piece of metadata that is injected into the system prompt at the start of the conversation.
From that they are proactively called by the agent at the moment they are needed. Your home rolled system can't do this.
For example:
"Add a dialog to this page that tells the user they have an update and get a subagent to write unit tests for it." The subagent will get the topical data from the skill without me having to do anything. It will just use it. But the main orchestrator won't have it at all. If I told the main orchestrator to write the unit tests, it would read the skill when it was time to write them.
A skill is just a prompt; it can have scripts it calls, or you can get it to do anything a model can do. The main advantage is that it is called upon only if it is needed and it doesn't exist in the context until that point in time. So you can have a really huge amount of utility that doesn't take up context.
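A rough sketch of the "metadata up front, body on demand" idea described above. This is not how Codex actually implements skills; `Skill`, `SkillRegistry`, and the example skill text are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    name: str
    description: str  # short metadata, always visible to the agent
    body: str         # full instructions, only read when the skill is used


@dataclass
class SkillRegistry:
    skills: dict[str, Skill] = field(default_factory=dict)
    loaded: list[str] = field(default_factory=list)  # which bodies entered context

    def add(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def index(self) -> str:
        """Text that would sit in the system prompt: names and descriptions only."""
        return "\n".join(f"- {s.name}: {s.description}" for s in self.skills.values())

    def load(self, name: str) -> str:
        """Called at the moment the agent decides it needs the skill."""
        self.loaded.append(name)
        return self.skills[name].body


registry = SkillRegistry()
registry.add(Skill(
    name="unit-tests",
    description="How to write idiomatic unit tests in this repo",
    body="Use shared helpers. Test behaviour, not implementation details.",
))

prompt_index = registry.index()         # cheap: one line per skill
test_rules = registry.load("unit-tests")  # full text only when a test is written
```

The point of the split is that `index()` grows by one short line per skill, while the bodies cost context only for the agent that actually calls `load()`.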
Your home rolled system can absolutely do anything in terms of what gets injected into the context, how and when. You need to use at least custom codex cli for that though
My homerolled System absolutely writes tests and docs and runs 2 verifications and commits to git and ensures it’s written as I want following the standard I want… with 10 lines of agents file
It is injected into the system prompt and it is trained to use it though.
the meta data is injected into the system prompt!
It even tests stuff on multipass docker
You can home brew your system prompt structure
and can you train the model to use your system?
You speak as if system prompt is some secret sauce
That part, neither me nor you have access directly
So what you or me think it is trained to do, it's a different story
i dont need access i just use the skills system and it is very powerful.
im out, good luck with your home rolled systems.
Oki skill master 🙂
You are out here trying to sell ppl on reinventing the wheel like it's no big thing. Skills clearly have some post training, all of the major vendors have bought into them deeply, but you still think somehow your trinket solution is worthwhile.
i won't go into a debate about meaning and semantics with someone that has no prior background in these matters :))))
take a seat then
😂
no reset yet though. which is good since i still have 40%
i spent 20% last night in a few hours after that tibo announcement, let's see how much time i have left till reset
using fast is not very good. if you run 2-3 threads in parallel, at least on pro 100, you finish the 5h limit in 2h with fast
It's nice they give us a bit of time to use some extra power till the reset
yup!
never saw that one
looks like one of my apps haha
App too
5.5 so much smarter about making choices
I only have 8% left might have to back off the background tasks
some ppl found work arounds
you should use /goal, and aim further. it works even when you exceed your limited quota
I'll do that if i get to 1%, i will run an orchestration and see how it goes.
but i want to be able to keep working on tasks i cant automate as much for now.
Well I signed up for Claude Max and I'm pleasantly surprised. It has about as much usage as Codex, generous weekly limits. Definitely worth $100.
So I'll keep GPT Pro for $100 and Claude Max for $100, that's probably the only AI I need to develop my game
Plus a benefit of Codex is that it can use GPT Image 2 with its imagegen tool and generate hundreds of images before I hit the usage limit.
hi, what game engine are you working with now?
Be very careful with keeping the image style consistent. People are very picky about such things.
I'm not using a game engine, I'm using a JS stack with Svelte + Vite + Capacitor with a WebGL UX layer.
Well that's why I have an art bible
That most certainly helps, but don't trust it.
It's literally the only thing that keeps the style together and it's quite comprehensive, I think it's plenty trustworthy
ok. hope you're right.
can you share a bit about your art bible please? is that a markdown file that tells the agent to be consistent?
Yeah, I'm right. I trained a LoRA on Scenario with Gemini 3.0 but I don't even use it because GPT Image 2 has better quality. I upload reference images and I have strict prompting guidelines so every prompt uses the same wording.
Yeah that's what it is, I can send it to you
Just don't share it with anyone else lol
i just wonder why x'D
I can make it into a skill for you if you like
Because I put a lot of work into that document and it's valuable to me, and I don't work for free.
got it sir, respect that ❤️
The art bible isn't actionable, I have related skills that use it
But I appreciate the offer
skills are very useful being topical
i also create my own skills. just curious how people work with them
anything that needs guidance on a regular basis i put into a topical skill.
I found myself constantly correcting agents about material 3 spec details, integration tests, unit tests, package usage etc etc.
So i generate topical skills with that guidance and then i never have to give it anymore.
I also have scripts in skills and build runs with tools that call scripts and do validation of stuff etc
They are just short cuts to common repeated usage that you dont have to remember to use
skills are pretty useful; they even work in the regular web UI if you upload them
did they give an api for them on web now?
I don't know if there's an actual documented API or not but they work if you upload them ad-hoc in individual sessions
Ah yeah i see what you mean
Would be really useful if they actually supported them in the chat
you can get a surprising amount of work done that way, particularly if you convince the model to just try installing stuff with pip even though it knows it doesn't have a network connection
it turns out it has a gigantic package cache available
When will that happen?
5 minutes after you find enough work to queue up to where you believe you're actually going to take advantage of it
my team is waiting, all my team members were at 0% weekly as of last night
hahaa same
oh, you're out? in that case the reset won't be happening
i moved back to free models for now
wen reset
$100 plan also wiped out? hahaha
Lol no I'm just being weird
y no wake pet shortcut? 😿
also why does right-clicking along the top of the app crash it on macos? 🤔
oh man why did i try that ahah
It's the reincarnation of the old alt+f4 for the special menu
wait usage will be reset soon?
allegedly yes
did you make yourself a pet? 🙂
I havent done that yet
why is everyone vibe coding bonzi buddy and clippy now
They probably giving ppl a chance to burn some usage
if they made it show all the chats rather than making me have to scroll the tiny chat bubble this would be my main ui 😄
i feel like 5.5 performance is still degraded..
it's been killing it for me today
it's legitimately been so stupid for me: not following directions, doing something else, not remembering simple basic stuff, confidently claiming things that aren't true
working well here
Lol bonzi buddy, I legit used that back in the day on my eMachine. Little did I know it was spyware 😂
I think a lot of stuff back then would have been classified as spyware or malware these days but it was a different world and people just didn't look at it like we do now
Yeah absolutely
Internet privacy today means something very different than it did back then
it was the wild west, you could /whois someone on IRC and dump their IP into a script kiddie tool and literally BSOD their machine
Lol yeah
and early broadband internet was literally set up like a LAN where you could print stuff on your neighbor's printer lmao
early cable
All this nostalgia is making me want to actually make my own bonzi buddy pet lmao
early cable was really lax with any kind of auth stuff, to the point where you could spoof the tftp server it would pull the modem profile from and change your own bandwidth limits
the first broadband i could get was adsl 1 and it was capped at 50kb/s or something similar. I downloaded from usenet 24/7 for as much as i could for a month and got something around 30gb
Why do I feel like they changed the usage limit again
Last month I calculated my credit usage at 4293 credits/week (I guess the limit was 4300)
Last week I calculated my credit usage at 4194 credits/week (I guess the limit was 4200)
This week I calculated my credit usage at 3891 credits/week (I guess the limit is 3900)
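For a sense of scale, here is a quick sketch of the percentage drop between those three self-reported figures. The numbers are the ones quoted above; `percent_drop` is just a helper name for this example, and nothing here says anything about the real limits.

```python
def percent_drop(before: int, after: int) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Self-reported weekly credit usage, oldest first (from the messages above).
reported = [4293, 4194, 3891]

# Drop between each consecutive pair of reports.
steps = [round(percent_drop(a, b), 1) for a, b in zip(reported, reported[1:])]

# Drop from the first report to the latest one.
total = round(percent_drop(reported[0], reported[-1]), 1)
```

That works out to drops of roughly 2.3% and then 7.2%, about 9.4% overall from the first figure to the last.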
I still use usenet 😄
The usenet was also provided by the service provider and it was loaded with rips etc
ok, 8%... time for a reset
Lol yeah that was the good old days of P2P apps like Kazaa and eDonkey
and DC++
Credit here means this: GPT Plus weekly limit = how many credits
made a material 3 spec skill by setting up a skill to rip and curate the web site. It's working pretty well, made an example app.
Just the start of it, but it's looking as ugly as material 3 so making progress.
I made the skill so it would build responsive layouts
mobile, tablet and desktop all specced to material 3
It's not just visual; i also ripped the flutter docs and api and turned them into skills, and made a specific opinionated skill for the mvvm architecture they recommend.
So the code is also idiomatic, not just slop
Still a bit of fine tuning to do though
Niiice, yeah their default thing with flutter is material widgets with copious rounded rectangle cards. Takes quite a bit of effort to steer em away from it
That is the next step after i get this working, the goal is vibecode an app as a starter and get a really good base to work from. I thought material 3 on flutter would be the most straight forward because it is already most of the way there.
we can get an ios android and web app all in one hit.
I think getting a custom looking app will be much harder
It is... But worth it! One thing that I found works really well is set it up so you have
- The main client application
- A design system package
- A web-only "design_lab"
All widgets, atoms, molecules, pages, shell are created in the design system package, imported into main client app and design_lab client. That way for design work, they use the lab to fire up flutter run -d web-server --dart-define=RENDER_PAGE= and use bun webview to take screenshots and hot reload against that particular page. As long as they're making edits to the design system package, the client app will look exactly the same as the design_lab screenshots. Higher fidelity than golden tests 🙂
npm i -g @OpenAI/codex@latest does not seem to work for me, it throws the error npm error 404 '@OpenAI/codex@latest' is not in this registry.
npm i -g @openai/codex@latest
case sensitive
Oh thanks!
does using /fast affect quality?
No, it simply gives you high priority processing
i see
it does route your requests to a different cluster though, so it's totally possible for something to be wrong on the backend in a way that causes degradation in one cluster but not another
🤔
And here I thought OpenAI ran on a single server. 😛
big one
just a box in the corner of sams office
Seriously though, do they own tons and tons of servers, or do they rent them from other companies?
both
they'll probably be on aws soon too, after the microsoft deal ended/changed, guess that will add a lot of capacity
How long until our entire economy is based on AI?
if you look at stock market in US it already is
probably still some time before its core at "normal" companies
Can't wait for the world economy to get to a point where we are forced to leave the old economic paradigms behind.
Funny enough, if I didn't have to work for money, I would be doing the exact same thing I am now. 😛
i found a new way to burn tokens fast. integrated codex review in my git pre-push hook and blocks pushes unless codex doesnt find anything
makes it painfully slow sometimes to land fixes 😄 need more parallel dev servers
I need I rarely share
Completely worth the overhead though
yea it finds a lot
What does codex review do exactly? Is it like plan and goal, just a different prompt chain injected?
Or a totally different model and such?
i completely rebuilt my workflow from PRs to trunk based. and with these new hooks i'm using a "stacked local diff" to compare against upstream. so its like a little local PR with codex reviews until everything fixed, works really well so far
Yes! It has a different system prompt. Basically says "focus on changed code, not existing bugs outside of the current diffs" and stuff about briefly describe the bug
its the same as running /review inside codex, but non interactive
starts with a fresh context
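A minimal sketch of the pre-push gate described above, in Python. The review command is passed in as an argument because the exact non-interactive invocation and its output format are assumptions here; `run_review_gate` is a hypothetical helper name, not part of Codex. The only firm fact it relies on is that git aborts a push when the pre-push hook exits non-zero.

```python
import subprocess
import sys
from typing import Sequence


def run_review_gate(review_cmd: Sequence[str], timeout_s: int = 1800) -> int:
    """Return 0 to allow the push, 1 to block it.

    Assumption: the reviewer exits non-zero or prints findings to stdout
    when it has something to report. Adjust to your reviewer's real behavior.
    """
    try:
        result = subprocess.run(
            list(review_cmd), capture_output=True, text=True, timeout=timeout_s
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Fail closed: if the review cannot run, do not let the push through.
        print(f"review did not complete: {exc}", file=sys.stderr)
        return 1

    findings = result.stdout.strip()
    if result.returncode != 0 or findings:
        # Surface whatever the reviewer reported so the fix can start right away.
        print(findings or result.stderr.strip(), file=sys.stderr)
        return 1
    return 0
```

In `.git/hooks/pre-push` this would be called as `sys.exit(run_review_gate([...]))` with whatever command runs the review in your setup. Failing closed on timeouts is a design choice; it is also what makes landing fixes slow.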
well 😅 i never runneded that, lol, hence why asking
I will try it out
its really good
seemed like nonsense, who reviews code
I mean 5.5 is pretty darn good at not making bugs, but the thing review tends to catch is drifts. Oh yeah, unlike the regular agents, the review agent will scrub through all relevant AGENTS.md files and ensure the implementor is following those guidelines
prompts written by AI, implemented by AI, reviewed by AI - what could go wrong?
yolo
you forget harness application written by ai and self-modifying based on level of user swearing in steering window
I made it detect if I am just rambling or truly point out real issues
If I just ramble I have a special agent replying with a meme.
There must be some fun in all the sad reality.
Did you ever figure this out?
I've been having abnormal experiences with codex and web chatgpt recently
I think my account is flagged or something and it's purposely poisoning all output
it has a memories feature but i think its opt-in right now
saves stuff to ~/.codex/ outside your repo
yeah i've opted in to the codex memories feature but it doesnt explain the weird stuff that's been happening with web chatgpt
but I think the issues are related to something with my account
maybe related to the behavior people are reporting in last few days which led to the investigation yday and reset promise
i have memories disabled everywhere, prefer to build the context from scratch even in chatgpt.com
one of my accounts got flagged for suspicious activity, guess i overdid it a bit with the oracle usage
I noticed the issues in codex & chatgpt output quality happened immediately after I started getting 403 and 404 errors when using shared projects on the website (its an upstream issue; i've already tried every combination of local troubleshooting steps)
My only theory is that my account got flagged so now all the output is poisoned, but I cant find any information on if openai does that
maybe i got flagged for uploading zips with 500k loc for gpt pro extended thinking deep reviews lol
nah I dont think that would trigger it
holy f* my codex just crashed i queued like 30 messages
can someone help it wont even launch
and i think an update after bricked it
ask claude code to fix it
seriously, thats why i still keep a $20 claude sub around
😄
genius
or opencode or pi or whatever
i still have claude code too
so u can send 4 prompts and have ur 5 hour limit be used up already?
so codex chrome isn't available in europe or something, or for brave ?
well the occasional "fix codex" prompt yea 😄
if you mean computer use, i dont think its available in EU yea, maybe with vpn
waiting for codex chrome/computer use to come to europe/windows
why is openai doing this???
we're so lucky to have technocrats protecting us from innovation and ensuring our safety
i also can't seem to have the codex desktop thingy work on my mobile chatgpt app, nothing happens when it says to follow the instructions on desktop
dang
i havent tested it yet but the mobile thing seems to be quite buggy from what ive read
the mobile functionality is great
Did anyone of you ever had this case where you are working in a given CWD for days and days and suddenly upon a codex command it goes back to ask
Do you trust the contents of this directory? Working with untrusted contents comes with higher risk of prompt injection.
Trusting the directory allows project-local config, hooks, and exec policies to load.
You can then press ctrl c and it opens the chat anyway
maybe after codex update? i see same with hooks approval
then i found the new --dangerously-trust-hooks or whatever its called flag
Idk if anyone here can do anything about this, but Tibo told all codex users to start using fast mode to use up your usage faster because he was going to reset limits in a couple hours and then didn’t do it
That sounds rather malicious to tell people to use fast mode because using usage wouldn’t matter
I think he wants to give ppl a chance to make use of their usage first
many ppl lose out if they dont get a chance to use extra
Could be but he said that evening. I would just hate if it didn’t happen at all
Follow up communication and it’s all good
i still have 40% to burn before reset so im glad its taking longer.
Ive never hit my limits on the $200 plan ever but with the goal skill i hit 0% 2 days in 😭
i hit my limits all the time
this is my second acocunt, first is already empty
pro 20x
5.5 xhigh + goal + auto-review + parallel = 🔥
a cocunt? or did you mean 🥥
No homo reseticus ever
mobile codex not working for anyone. says go to desktop after i scan but nothing is there
I know it has to be a MacOS desktop but I think even those people are saying it’s buggy
Yeah i have mac. Just nothing shows
maybe i will try to restart
Anyone else having this issue? I cant open previous Codex chats. But I can start a new one. Seems like its happening when the session ends.
flippin text encodings I'll tell ya h'wat
the great mojibake incident of '26 lmao
when do you guys think they will reset usage?
after Tibo has had his morning coffee
hopefully its right when i exceed my weekly limit
I am on 0%
still waiting for reset
i still have 22% left, dont reset yet Tibo
I must say though, codex does seem snappier and smarter again
wish i could say the same
no he said "this evening"
same error - any fix?
Not that I know of
Open a fresh chat, ask it to track down the chat you want and print the command with the identifier
track it down and do what?
Print you the command for that chat
what command??
codex resume <identifier>
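A minimal sketch of the "track down the chat" step, in Python, for when you remember a prompt or keyword from the lost chat. It assumes sessions are stored as `.jsonl` files somewhere under a directory you pass in; the on-disk layout and where the identifier appears in each file are assumptions, and `find_sessions` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List


def find_sessions(root: Path, keyword: str) -> List[Path]:
    """Return .jsonl files under root whose text contains keyword (case-insensitive)."""
    needle = keyword.lower()
    matches = []
    for path in sorted(root.rglob("*.jsonl")):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable or locked file; skip it rather than abort
        if needle in text.lower():
            matches.append(path)
    return matches
```

For example, `find_sessions(Path.home() / ".codex", "design_lab")` would list candidate logs; the identifier from the matching one is what goes into `codex resume <identifier>`.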
ggs no pro model for me bye bye 20x was good
itll just reopen the same convo?
or ur saying open codex in cli?
It shouldn’t if you found the chat you wanted
damn 1 prompt used 16% of weekly and 100% of 5h damnnnn
Yes
very long way to say open in cli. and thats not a fix...
you can easily find it in the UI... u dont need to ask the ai
literally copy session id
Dude these are for chats people seem to lose
Have you ever used codex before?
hes not complaining his chats are lost
hes complaining its not opening properly
The chat isn’t loading
You're slow
So see if the session is discoverable by codex
You’re dense
@cinder lynx Anyone else having this issue? I cant open previous Codex chats. But I can start a new one. Seems like it’s happening when the session ends. Reading comprehension man
at least in vscode extension windows/wsl
yes
after I close vscode and then reopen, it can be tough to load previous conversations, the bigger the worse
sometimes its about waiting a bit, but other times it just never loads it
but when the chat does not load, I have to close vscode, open it and try again, sometimes up to 5 times and eventually it loads
maybe also dont do it too fast, open vscode, let it load the text about model, then open the conversation...no idea if that helps 😄
pretty much the only issue I have with the extension, but eventually I can always load it
> console.log(this.evening)
[object Object]
Are you on Windows or macOS (assuming not linux because desktop app)? Did this start happening after an update?
@rocky fog try this
Also that was copy paste of op’s question
I’ve unexpectedly closed many vscode chats or codex chats to have this problem
@lost drum
interesting
I will try that next time, thx
what?
Yeah it helps if you remember specific prompts or keyword from that chat
I thought spud came out cause of notification and I am not on pro
If you are on Windows, you should download https://learn.microsoft.com/en-us/sysinternals/downloads/process-explorer
See if a program has an open file handle on that thread's rollout log, terminate it, then open the chat. That's usually why on Windows I recommend "unplug USB devices (rule out hardware driver interference), restart computer (clean up file handles)", but if it takes multiple attempts with eventual success, you might just have zombied processes hanging onto the rollout log.
Also if you use codex for desktop app and vs code extension at the same time you will probably have the same issue specifically on Windows due to an architectural decision made back in the MS DOS days
I think it also happens through restarts and then reopening (and whether its in WSL or not, same)
but good to know, will keep it in mind
What was the decision regarding
Oh yeah, restarting just VSCode is not enough. Typically requires restarting entire computer.
i think they didnt reset limits yet because there are so many other bugs they would ask to reset limits for that theyre waiting to fix them all and drop one big reset
these menus in codex just love to appear and disappear randomly 😄
nothing new 😄
Yeesh, windows getting more problems than linux is saying something
as for what im working on, ive been making a pokemon gold romhack that is supposed to make you feel like youre playing for the first time again (new type chart, buffed pokemon, buffed moves, harder gym leaders)
Ng+?
They decided a file should only be accessed by 1 process at a time, whether it's reading or writing, that file gets locked by the owning process. On unix-based operating systems, everything is treated as a file, including hardware, so by design they cannot lock files like that. Your keyboard & mouse, storage devices, audio devices, everything are files in the file system (byte stream with input and output), so they need to be readable and writable by multiple processes 🙂
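A minimal sketch of the Unix side of this, in Python, assuming a POSIX system (the `fcntl` module does not exist on Windows). It shows that `flock` locks are advisory: a second handle that asks for the lock is refused, but a handle that never asks can still read the file. Windows sharing modes are not shown here, and `demo_advisory_lock` is a hypothetical name used only for illustration.

```python
import fcntl
import os
import tempfile


def demo_advisory_lock() -> dict:
    """Show that a POSIX flock blocks other lockers but not plain readers."""
    fd, path = tempfile.mkstemp()
    os.write(fd, b"rollout log line\n")
    os.close(fd)

    result = {}
    with open(path, "r+b") as holder, open(path, "r+b") as other:
        # First handle takes an exclusive lock on the file.
        fcntl.flock(holder, fcntl.LOCK_EX)
        try:
            # Second handle asks for the same lock without waiting.
            fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
            result["second_lock_acquired"] = True
        except BlockingIOError:
            result["second_lock_acquired"] = False
        # Reading does not consult the lock at all.
        result["plain_read"] = other.read()
        fcntl.flock(holder, fcntl.LOCK_UN)

    os.remove(path)
    return result
```

The lock only constrains processes that cooperate by asking for it, which is why a stale lock holder on a Unix system rarely stops another program from opening the same log.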
its not meant to be like a crazy difficulty hack, but you should feel your heart rate increase a bit when you face a gym leader
regular trainer ai is actually completely unchanged, just the gym leader ai
so they truly feel like a different beast which would make sense in the actual universe
There’s an Elden Ring mod that tunes everything too that I’ve been meaning to try
Meant to be unplayable but people have beat it
I’m on the windows codex app, yes this happened after the update
its more like a process can decide to do that, but does not have to
Was there a reason
They don't treat hardware as files. Idk what the reason was, but since files are a separate abstraction they decided "let's create locks in the kernel for files". And this is the reason why software development can be a herculean process. Build failures in Visual Studio are oftentimes solved by restarting the computer for example
lol go compiler does funny stuff too, just updates randomly after giving anxiety
In vscode
Yes, but if you have anti-virus software running, those obtain exclusive locks on files by design to secure the system (don't let viruses touch bad things, intercept handle before damage done). Sometimes they have bugs. Sometimes you're running builds that are updating lots and lots of files in rapid succession, the anti-virus leaves an exclusive lock on something and bricks up your PC. Doesn't have to be just anti-virus though, lots of software stacks make you opt into shared mode, all it takes is one bad process to hold a permanent tombstone lock on a file and cause issues
Oh yeah I guess disable anti-virus is the best first step, before unplug all USBs and restart PC
I dunno what kind of A/V you've seen that takes regular locks on files instead of implementing a proper minifilter driver to transparently intercept access but I don't want it anywhere near me 😂
Even the minifilter driver will block writes pending analysis. Not all compilation processes are single threaded and serialized. It is however better than the way I described for sure
yeah but even if it blocked forever, e.g waiting on the user to click something they never bother to click, at worst it should just result in an async i/o worker (or whatever is being used) never returning a result, so if anything actually hung up from there the bug would squarely be in the build system (or whatever) in question
sora is gone
rip
dead
https://help.openai.com/en/articles/20001152-what-to-know-about-the-sora-discontinuation
(well, does not answer why, but more info :D)
did not turn out to be what they wanted
Do you all know those satire videos where they make fun of "if google were a person" and in come all those awkward questions..
I wonder what GPT would have to say to that
"where dis"
"give code"
"not work why fix now"
"I am poor make me rich"
"My wife left, fix it"
"I need exam done, now"
"😠 🤯 😰 "
there's plenty of other services for that I believe
Also, Users between the ages of 13 and 17 must have permission from a parent or legal guardian to create and use an account
Do you?
It's funny when I use Codex to diagnose problems with the Claude desktop app 😂
Got one agent diagnosing the other
"Where are you from?" Being 13 is no excuse to have poor grammar in a channel that is literally about prompt engineering lol
Lol yeah and what do we do with Codex? 🤔
not prompt engineering lol
Streaming interrupted. Waiting for the complete message…
Here we go:
no prompt ängeniiiireeing
If you're not engineering your prompts, what are you even doing? Seems like a waste of money to use Codex with bad prompting.
why, I have codex write my prompts
or gpt
Huh
I only say "why dis. fix"
gpt pro writes my prompts for codex cli
and cli for gpt. Its a closed loop
The only time I get really ängeniiiireing is when I ask for memes
Interesting. I never considered using AI to rewrite my prompts. I'll spend like an hour and a half writing a detailed prompt because hitting that enter button costs money lol
How do you manage hallucinations
noob
It’s more fun if you edit the prompt with gpt after
you mean his hallucinations?
OK I guess I have to step away, the /s is having a moment with me
Yeah
No, gpt hallucinations
not worth perfecting prompts/spec imo. just get stuff implemented with hard guardrails for code quality etc, then iterate. even with perfect spec it will never be correct in one-shot mode anyways
wat dat
Pain in the butt mostly
Yeah that's a real concern for sure, writing my own prompt prevents hallucinations (assumptions about things that don't actually exist, in this case) from creeping in
Very easily noticed if you read
idk what you are doing but i never have problems with hallucinations
when it does that I just have it create what does not exist yet
Lol you're cracking me up today
not going to say it... but who reads?
yeah me too
one shot right now actually means a long meta-turn of the orchestrator who with the proper meta-program can one shot a lot of stuff
oh, they had human errors?
Thats normal - humans make errors all the time bro
even with orchestrator + subagents in isolated worktree + pre/post tool use hooks with hard guard rails + post commit review-loop, it will never be perfect because your spec/goal/prompt will never have all edge cases in it
Yeah like not reading
Doesn't need to be perfect, it just needs to be like 80% correct about 90% of the time lol
i m currently one shotting programbench tasks at 100% score with this method lol
but sure you re entitled to your certainties
yea exactly, thats what im saying. so not worth writing the perfect spec/prompt
That's what the human is for! Agents haven't replaced us yet! 😂
I mean in his closed loop that’s exactly the case 🤔
It’s also insane when I check chats from 2-3 years ago to see differences in output
Closed loop, I guess that's fine for processing huge amounts of information if nothing changes and you're just processing sort of the same thing over and over. But in a software engineering environment, a closed loop with an AI is asking for problems
then what exactly do you mean by "long meta-turn of the orchestrator who with the proper meta-program"?
Yeah this, determinism
not familiar with programbench exactly but i looks like you are building from a verifiable spec for something that already exists. thats a bit different than building something new in the real world
LLMs are great at that
The human ideation is what aid the second imo
i mean the bun guys recently proved this with the rust rewrite based on existing tests
Infinite closed loop, maybe
i'm building new stuff where i dont know every single spec detail upfront, so for me its not worth defining everything up front, i just iterate
Well ain't that something, Codex fixed my performance issues with the Claude app. Runs like butter now.
There’s a part 1 from 2025 too
Very cool read and maybe viable once models get better
Anthropic already uses Claude to code Claude, they stopped coding by hand a few weeks ago.
I don't think the same is true yet for Codex though
Months ago no
They said it a long time ago already
pretty sure they are doing stuff like that in their research dept, incl synthetic data etc
Months? Wow time flies
They probably help with the generation
codex is entirely 100% vibe coded I am fairly sure
They even have a blog post somewhere describing a mysterious app they created solely with gpt, yet never disclose which it is
Looking at the pace of release, its def codex
also the release notes
Speaks for that famous "didnt read it" approach
It was smart of them to wait for the models to progress to vibe code it
Took apples approach
very important to dogfood this stuff so it actually gets better. they should also actually use normal subs to see real limits from time to time while working internally, not just their unlimited /fast API keys
Also stuff like the models having supplemental training to get them used to tool usage etc
Can’t just deploy the harness without the foundational model learning how it works
So like every time ig they make major changes to codex they must respect that process
We're slowly moving into a world where two things will become true. Code won't be written by hand anymore, and anyone can just make their own app. But SaaS has been around for a long time so it'll just get even more popular, people won't be buying software licenses anymore, instead they'll be paying monthly for support and updates.
Have you guys seen the vibecoding video by andrej karpathy
It would be a nice “where we are now”
Yeah dude, it makes zero sense to do less than 100% vibe coded. If you choose not to 💯, you're opting out of adapting to the future of all SWE. The sensible thing to do is learn to 100% vibe code right now with current technology, so when it inevitably evolves you wont still be stuck banging rocks together in a cave somewhere haha
I like the image of banging rocks
i managed to burn 50% of the pro 100 quota since tibo said he will reset lol
i m down to 12%
reset isnt happening. we got trolled
yea similar, wen reset
whats your workflow you teased earlier for those 100% scores, or is it secret 👀
it s not secret, there are experiments in conceptual reconstruction using only general meta-rules of reasoning (that s the meta-program basically) + what's usually allowed in that eval, namely only the readme of the program plus the compiled program to test its behavior
that s the current most abstract form of the steps
will update you when i'm done, if it proves stable value
thanks, sounds interesting, but this is specifically for reverse engineering right?
or are you also using this approach for other stuff
it's the same for new programs, just that there you discard a couple of late steps
the goal is what i call Semantic IR from intent till actual implementation in code
but with something like programbench it's easier to bootstrap since you have a hard base of what the intent and the actual program are
so i can just focus on seeing how to get the model make the transition between the 2 better
and you use this only as instructions in prose form or is it a custom harness or similar?
it relies a lot on the ability of current models to obey machine format structures, so it's not really prose
how is codex usage for chatgpt go plan?
great i m happy with this one, managed to use 50% and lost only last 10% in the last 18 hours
phew finally! he didnt forget lol
good reset for me, first plan at 5% and second at 30% with 4-5 more days to go 😄
it s awesome like this, to have 18 hours heads up
if they just announce 1 hour before, that s not much you can do to use what u have
yea definitely nice this way
although i dread the end of may when 2x ends
working hard to get some stuff finished before that 😄
if they cancel the 2x then mad expensive it becomes
is there a latency for reset?
@boreal holly so umm, is there stuff going on I need to worry about?
yea it takes a bit to roll out
well i'm out of 5hour haha, so i need to wait for that reset to roll
I was hoping it would take longer for the reset rofl
I know that there are ways around the codex mobile on windows, but any news on the official update?
Apparently some folks had
[features]
stupid_mode = true
Enabled in the backend, OpenAI toggled that off and reset limits. Other than that no stuff to worry about!
??
what file?
why doesn't mine work
is that in the toml file or a meme
It's a joke 🤡
joking
<3
did you try fresh session or logout/login
yep
the reset isn't instant
rip then i guess, upgrade to pro like it suggests 😛
it's a db update that has to touch millions of rows
TY for reset ❤️
Oh yeah, still no Linux GUI for Codex, and some folks are still trying to get Codex for Windows to work with varying levels of success. Other than that you missed nothing!
yay
try resending the message
just in time haha
nothing
I may or may not have my own ui at this point
think of how much electricity they would have saved if they just released their own vibe coded port as an alpha instead of having like a thousand different people do the same thing heh
Anyone else didnt get the reset?
It takes a few minutes
It took like hour so far
maybe you are on the DNR list
Still cookin
@boreal holly thank you for your time sir!
how's spark for you?
auto reload credits huh, living dangerously 😄
Really good! I use it for quick refactors or "how we lookin overall in this area" stuff
i used spark maybe once when it came out then never again, but its seperate limit so maybe i should tokenmaxx it hmm
If only I could reach 0% in a week 😅 seems only possible if I wanna be egregiously wasteful
spark is okay if you don't make the mistake of trying to do complex work with it, the context window size is like that movie 50 first dates
did any one get the reset yet?
yes
ok so it takes time to spread out
also not here yet
actually as soon as I made a call it reset
(didnt have it running for a bit)

😂
When will we get a codex model thatll cook on frontend
I resumed goal, I checked usage and the plus icon usage and it went to 100% 100%
then I went here to say that
then I focused back to vscode and it changed in front of my eyes back to old usage
codex trolling me
There s something fishy with this reset. Don t think it usually takes as much as 30 mins to take effect across users
"/goal reset usage for all users"
... Working 35 minutes
"Alright Tibo, I copied the quota rows to a backup table, erased the production columns, and copied the quota rows back in place"
So is it better to use 5.4 now or what
Full usage reset? coulda told me yesterday and i wouldnt have saved my usage lol
You mean they didn't announce it in the Anthropic server? 🤡
That woulda been nice indeed, could one hope next time?
they announced it yday on x
Gotta tell my hermes to keep me updated on these things
most important follow if you want to optimize around resets https://x.com/thsottiaux 😄
It needs to watch tibo's x account
oo
i mean i was at like 50% weekly, shouldve used more
but ill take it
i got about 1,5 pro 20x subs out of this one, perfect timing
would have needed to buy a 3rd if no reset with 4-5 days left on both 😄
I literally just went from 0% to 100%
about same, also got 3 days of claude 20xmax on their reset yesterday, good weekend I guess
was it always like that when we had extra reset it was basically starting new weekly cycle earlier? Because reset happened, but it also moved my previous reset timer to date 7 days from now
yeah it moves to from whenever the reset happens
yea sometimes you get lucky and get it in the middle of your 7 day window, sometimes it sucks and doesnt make a difference
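A small worked example of the window arithmetic described above, in Python. It assumes a manual reset simply restarts the 7-day window from the moment it happens, as reported in this thread; the dates and the function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

WEEKLY_WINDOW = timedelta(days=7)


def next_reset_after_manual(manual_reset_at: datetime) -> datetime:
    """A manual reset restarts the weekly window from the moment it happens."""
    return manual_reset_at + WEEKLY_WINDOW


def days_gained(old_next_reset: datetime, manual_reset_at: datetime) -> float:
    """How many days earlier than scheduled the fresh quota arrived."""
    return (old_next_reset - manual_reset_at) / timedelta(days=1)
```

With a manual reset 4 days before the scheduled one, `days_gained` is 4.0. With a manual reset 15 minutes before the scheduled one it is about 0.01, which is the "doesnt make a difference" case.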
When I set it to GPT 5.5 xhigh, it says it's impossible, but 5.5 high works infinitely xd
Been optimizing the workflow in Codex now tho, callin in Kimi and Claude workers within the codex window for 5.5 to have agentic control over agents and workers, with specialized and optimized tasks fit for their model, to try and reduce my 200 codex, 200 claude and 100 kimi to about half. Has anyone had any luck in making this work 100%?
I have GPT and Claude $200 tiers (only month I’m doing this)
Yesterday Anthropic reset limits 15 minutes before my limits reset
🙃
You are destined to win the lottery at age 95
mines still not reset
not bad honestly
careful with that if you use claude -p since it will now use your extra credits instead of sub
i just use 5.5 xhigh codex for all coding + chatgpt 5.5 pro extended for planning
yeah ik, that’s exactly what i was worried about too lol
not using claude -p for this anymore. got it running through a local mcp/tmux thing instead, so codex kicks off a real claude code session on my logged-in sub and sends the prompt into the tui, then tails/stops it from there
kinda janky, but avoids the sdk/print credit bucket
i think most of us would suffice using deepseek
Bro I didn’t even tell you the other part
use 5.5 high as the main brain/final call. you find xhigh worth it?
my rough setup is sonnet as the main external worker via claude code/mcp, 5.4 is the main codex-side worker. 5.3 for smaller scoped stuff, spark for tiny fast edits, opus for deep review/security, haiku for scout/logs, kimi-worker for background/bulk/final docs but not blocking the main loop.
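A minimal sketch of that routing as a lookup table, in Python. The values are the shorthand labels from the message above, not real model identifiers, and the task-kind keys and `pick_worker` are hypothetical names chosen for illustration.

```python
# Task kind -> worker label, following the setup described in the chat.
ROUTING = {
    "external_worker": "sonnet",
    "codex_worker": "5.4",
    "small_scoped": "5.3",
    "tiny_edit": "spark",
    "deep_review": "opus",
    "security": "opus",
    "scout": "haiku",
    "logs": "haiku",
    "bulk": "kimi-worker",
    "final_docs": "kimi-worker",
}

# The "main brain/final call" handles anything that is not explicitly routed.
DEFAULT_WORKER = "5.5 high"


def pick_worker(task_kind: str) -> str:
    """Return the worker label for a task kind, falling back to the main brain."""
    return ROUTING.get(task_kind, DEFAULT_WORKER)
```

Keeping the table as plain data makes it cheap to rebalance when one subscription runs low.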
I’ve been waiting since last night for codex usage to reset
I didn’t have anything to do all weekend besides a baseball game, was hoping to set a goal before I had to go
Get in my car to go to the game, check twitter at the first light and Tibo reset usage
💀
honestly i cant really judge the diff between xhigh, high, medium.. im sure xhigh is overkill for some work
I use medium or low mostly
Xhigh is worth it
so i just KISS by using the max reasoning for everything right now, if i get into more budget constraints i might do a smarter system
You're better off with 5.4 high than 5.5 low, 5.5 should be used for high/xhigh imo
I disagree, thinking too much tends to cause more problems than it solves and also the model runs slower
5.5 low is quite fast and token-efficient
medium is a good middle-ground if you need something better for planning
Lets go, we got a reset
Haha bro, just get a minimax 19 bucks sub, and you wont be waiting all weekend next time, atleast you can do some groundwork with the 2.6m ✌🏽
i dont really care about speed right now, i use big goal prompts that run for hours or even days in some cases in remote-dev tmux. most of the time spent is actually compiling, lints, tests, mutation tests etc, not inference. and i just have a lot of remote dev envs at this point
this is a good feeling every month
in that case why do you even need xhigh?
you mean because i have detailed prompts or what?
What are you guys thinking about codex mobile? Right now some chats I start on mobile disappear if I close the app, and cannot find them in mac afterwards
if the model is simply executing predefined tasks, there's no need for higher reasoning levels
Also it's not following repo permissions on new chat.
works 80% of the time, some chats wont even open, some work perfectly. I just hate that "wait 10 sec before we group your chats with repos" after you do something else..
maybe. but i have very aggressive hooks to enforce specific code architecture etc, so i think xhigh is still worth it
to make the most out of the hooks feedback, custom lints, etc
ehh
Not only that... If you gotta crank it up to xhigh 24/7 to get the results you are looking for, that's a huge red flag.
I like to think "If my agents are struggling on Medium, I have a problem and I need to make it so they don't struggle."
I think what would be more helpful is having more stuff in the model's context that's not unnecessary reasoning traces
one thing xhigh is useful for is finding contradictions and other reasoning traps in your setups that will cause issues with lower reasoning levels
unfortunately 5.5 doesn't survive compaction as well as 5.4 did so it is important to minimize those
presumably to be fixed in 5.6 or gpt-6 depending on what they plan on branding it
Huh, I've had the opposite problem. 5.5 does unrealistically well across practically infinite compactions
when your setup is smooth enough that even xhigh isn't overthinking every step of it, you know you're in a good place
i also dont see problems with 5.5 compactions
Odd, my weekly usage just reset. But I still had 4 days left.
I notice it gets off track a lot more than 5.4 did with multiple compactions
the caveat being 5.5 doesn't go through as many compactions because it's more token-efficient
my usage reset back to empty actually after it reset to full just now meh
i find 5.5 to be much better at staying on track in long running sessions
What is even going on here, my usage is resetting every minute, it's staying at 100% remaining even while running a session
infinite usage glitch
I'm thinking it's most likely just a visual error
But if it really is unlimited I'm not restarting the Codex app ever again lol
My usage definitely reset since i was at 0% so its not visual only
Still not reset
Mine is continuing to reset every minute, the 5 hour window keeps moving up every time it happens
well, dont say you guys didnt ask for it 😄
Please reset this, it just killed my project, the fake reset...
Sounds great
how did it kill your project
Basically, it shifted my reset day, like it was as if I hadn't had a reset today :/
nonstop reset
no more wen
just make usage unlimited
and my next reset will be on the day my subscription ends.
well those are formatted terribly in Discord lol
but if you expand them, you can see I'm not lying 😛
I doubt its a fake reset if it pushed your next reset back
Well, my week is back to 0%.
on website or /status?
Website doesnt seem to update as fast as /status in codex TUI
my token allowance on plus was on it's last 6 percent for the week with 2 days remaining...
It's just suddenly reset to 100% with a week left from now. Is that normal?
oh im catching up
just now I am at 100% (again)
lets see if it stays
is it a lie or not
think it is lying. It says i have 99% left but getting message that my 5hr limit has been reached
OpenAI is experiencing some technological differences right now
im at 100% again too 😄 hopefully it doesnt flip flop again
so far longer than the last time 😄
It will all be settled in monday night rehabilitation
I just checked the website and confirmed that it was all a lie. My weekly usage remaining is right where it should be, 61%
This next /goal is impossible 💀
it is all weird. website says 99% remaining for 5 hour, 100% remaining for weekly. but getting "You've hit your usage limit. Upgrade to Pro (https://chatgpt.com/explore/pro), visit https://chatgpt.com/codex/settings/usage to purchase more credits or try again at 6:04 PM."
gonna have to do another reset for the reset issues 😄
That is such a high quality image, I can't get over how good GPT Image 2 is. There is soooo much right about that image that I would seriously think a human made it.
Fixed that post, sorry about that. Thanks @boreal holly.
Topic: How to connect from Android to Codex Windows.
I think my 5 hour window just moved
and I guess its not going down from 100%
so yeah 😄
maybe tibo just spammed the reset too many times and its all coming through
Yeah
Poor OpenAI trying to deploy a reset on a saturday and I'm over here burning up GPUs on dank memes 😂
Curious to know what this system is
Bro please spare some gpu for us
There’s many dank memes
Images and low quality requests should be routed to a different data center.
not system, not rack, maybe a different country. 😜
That would be the endgame with all requests depending on model inference and nature
and then, who decides what a low priority request is? 🤔
/model low
well, we already have that. I'm kinda trollin'. 😁
Cursor tried it with subscriptions, and all of them give priority access based on tier, but request-specific would be, I guess, by inference and speed etc.
Since EU can't computer use or chrome extension they probably got some GPUs to burn 🤡
in which chat tab can I talk about chatgpt behaviour getting out of hand
and I mean, really out of hand
No reset here still. Not that I need it, but thought to mention it, resets wen
/goal make gud HUD, don't stop til HUD gud
gpt-6 wen
I kid. It's coming along nicely. But Claude is much better at UI stuff
It's kind of frustrating having the entire game fully functional but being trapped behind a terrible interface, it's a process
speed i need this
ironically both openai and anthropic likely use the same vendor for UI training data, anthropic just updates it more often than openai does
my usage is kinda homeless
usage not coming down?!
Interesting
google does too but gemini is less capable in general
You didn't get a reset? lol
Gemini usage is a joke tbh
gemini is pretty nice for some stuff, just not coding really
i use it a bunch via api
Did some work in antigravity last week, 3.1 pro did fairly well, but what the heck is that usage?
I may not have said "make gud HUD" but that was basically the summary of my most recent prompt to improve just one page in my game. Codex and Claude both seem to understand problems with UI if I give them a preview screenshot, but for some reason, Codex can't seem to actually fix things. It will try and then it will take another screenshot and say "yup looks good, I'm all done here" while there are still obvious problems with the layout. Then I'll feed that exact screenshot back into Codex and it will quickly identify the problems that it didn't identify before. But Claude doesn't do that, it will see the screenshot and keep iterating until the problems are actually gone.
hey guys i need some help sign up for school SSO
did you ask your agent
what agent
codex, since this is the codex channel
looks interesting https://github.com/wesm/agentsview
ask codex mate... "Sure. I can help you through it.
For school SSO sign-up, you usually need:
- Go to your school’s login portal or app.
- Choose Sign in with SSO, School login, or Continue with institution.
- Enter your school email address.
- Select your school if prompted.
- Log in using your school username/password.
- Set up MFA/2FA if required, usually with an authenticator app, SMS, or email.
- Accept permissions and finish registration.
Do not send me your password, verification code, or recovery codes.
If you paste the exact screen text or error message you’re seeing, I can tell you what to click next."
and I'll let you get back to codex and it will help you
unfortunately http://letmegptthatforyou.com/ seems broken 😄
finally got the reset
seems it eventually settled and is now not moving the window and usage goes down
seems to perform worse than composer 2 (cursor's budget model)
with gemini and share window this gets even easier, askin real people for stuff seems 2020 ish
quite funny that claude performs better in cursor than claude code
anyone tested grok build?
i was hoping for some kind of high usage glitch/promo but seems pretty limited
he is special
Does your usage stack up or something?
I stopped looking at these charts; whoever puts Claude Opus over 5.5 is nuts. I've been a Claude maxi for a long time, preaching highly about Claude, but since 5.4-ish, OpenAI is once again #1. But of course, the masses lag behind and are catching up to Claude now, and businesses are no better, signing deals with Anthropic and taking the devs hostage within Anthropic, when OpenAI clearly has the best agentic setup atm.
I still hold my Claude Max sub, but will cut it in half next month.
But rest assured I will be back preaching Claude again as soon as they stop releasing nonsense and actually start to focus on agentic coding again.
their models are kind of a joke in agentic tasks so idk if you could do much with them
that's opus over 5.5 only in the cursor harness
in claude code it performs worse
Nah. Don't see any reason to use anything other than codex right now
Is the 400% coz of the promotion on the Pro plan?
A magician always shares his tricks or whatever that saying is
Okay buddy...
cursor's whole thing is being the most optimized harness for models, so it's very possible 5.5 in cursor will surpass opus in cursor at some point
You are saying 5.5 codes better in Cursor than in Codex?
But the usage must be a lot less
oh significantly yeah
Yea don't worry. I have 500%
5.5 in cursor is super efficient in particular
you dont know then 😄
Wow so it consumes less tokens to do the same thing?
in cursor but yeah
can you use the openai sub in cursor?
the harness is getting a lot more important than the model at this point
I don't believe so
Damn claude code is really bad at caching
yeah idk why those open-weight models were included running in claude code though
I'd want to see benchmarks in opencode for those tbh since that's the harness actually designed for them
anyone using hermes with codex? considering swapping out kimi for codex in my hermes setup.. but not sure it will be a lot better, or just cost more tokens..
Have you guys tried OpenDesign yet?
Yes