#codex-discussions


plucky halo
#

Of course it deserved to be flagged.
My use was for on my own repos but, realistically, OpenAI doesn't know that

wide schooner
plucky halo
slow flax
#

I don't think it's complete garbage; I've had a great amount of success with the compound engineering plugin.

The reason I like Codex is that it just thinks better and doesn't rant.

wide schooner
slow flax
wide schooner
#

🤣

slow flax
#

That I agree

wide schooner
#

So even more creative but dumber

#

Gpt is less creative but actually trustworthy

ebon radish
#

Here is a radical idea for Codex App: human2human chat.
People want to share their works, share their threads, prompts, skills. And the best way to do it is enable teams to work on the same thread

cyan wing
#

Is anyone else not seeing reasoning messages in the CLI?

#

I’m only seeing the title when it’s reasoning now, no subtext

molten basin
#

Guys what is the point of a 200$ ChatGPT subscription for coding if I can have Codex 5.3 and all the other LLM for 200$ too on Cursor ?

You need to do a 60-100$ unlimited monthly offer for vibe coders

wide schooner
molten basin
#

20$ membership is great for 2h of coding a day

#

For 10 hours it’s not enough

cyan wing
cyan wing
cedar skiff
#

They did this with cloud tasks at the start as well

cyan wing
cedar skiff
#

oh really, my bad, i didnt see that o.0

#

i dont use x, can you give me link?

high girder
#

I can confirm from my daily usage at the moment, that the rates are double. I'm using the VSC extension for the most part.

wide schooner
wide schooner
terse kraken
wide schooner
#

And are you sure you logged in?

tranquil hare
warped pine
#

it's easy to hit even pro limits when youre parallelizing stuff

cedar parcel
#

after playing with 5.3 codex for a solid 4-5 days, I'm heading back to GPT 5.2 (non-codex). I consistently find that 5.3 avoids reading long files, and lots of files. It produces incomplete implementations and ships bugs fast, whereas I prefer 5.2 one-shotting tasks and shipping working results in however much time it needs. I even put an agent on it, to keep sending the sub agent back again and again until it finally reads the 60+ components in a vue front end (patternfly).

warped pine
cedar parcel
#

this was the death of claude, though. "you're absolutely right, i didnt actually implement...."

warped pine
#

i hope regular 5.3 will not lose these extremes of 5.2 that i loved

#

and i hope -codex variants will stop losing them during code tuning that oai is doing

cedar parcel
#

yeah openai ties speed to adoption, they say. Im fine with a fast inferior model as long as they keep the superior ones available. The inferior fast ones are a nightmare for me because my codebases are big and complex.

#

sub agent just keeps giving the orchestrator the finger... It's not doing it

warm rain
#

Can you try adding a boat?

cedar parcel
#

i wonder if the sub agent can get the orchestrator to use threats and profanities if it keeps refusing to put in the work

#

boom that's how 5.2 rolls in. "separate work tree to check everything twice". keep talking dirty to me, baby.

sleek spoke
#

random Russian in a response from gpt-5.2-xhigh, I've definitely never written a word of Russian to it lol

cedar parcel
#

should use /feedback to report it

sleek spoke
#

is there any way to /feedback it if I've already sent it further messages since then?

cedar parcel
#

yeah it will ask to send session history

sleek spoke
#

thanks

simple star
#

Getting really polished results was not working so well

warm rain
warm pilot
simple star
#

@warm rain But you have the repo. Feel free to try and throw me a PR if you wish 🙂

warm rain
#

Ok 😄

cedar parcel
# warm pilot Where is this documented? I've never heard of that before

https://arxiv.org/abs/2406.20052 "The phenomenon of LLMs inserting foreign language words—particularly English—into non-English outputs is documented as a sign of degradation in the research paper "Understanding and Mitigating Language Confusion in LLMs" by Kelly Marchisio et al. (2024).

This study introduces the Language Confusion Benchmark (LCB), which evaluates how often LLMs fail to maintain the user-specified language, especially in non-Latin scripts. It identifies that models like Llama Instruct and Mistral frequently generate responses with unintended language switches, even when prompted clearly.

Key Finding: Base and English-centric models are most prone to this issue, especially under high sampling temperatures or complex prompts.

Mitigation: The paper suggests few-shot prompting, multilingual supervised fine-tuning (SFT), and preference tuning as effective countermeasures"
wicked briar
#

codex has become slow again

chrome raven
wicked briar
#

why?

chrome raven
#

i dont know but on pro even with xhigh it's fast

wicked briar
#

im on pro

chrome raven
wicked briar
#

app

#

maybe I try cli again

cedar skiff
#

They did a fix but you never know

wicked briar
cedar skiff
#

How you finding 5.3 anyway?

wicked briar
#

much faster

#

worth it

cedar skiff
#

seems smarter as well

wicked briar
#

yeah

chrome raven
orchid plume
# cedar skiff How you finding 5.3 anyway?

the difference between 5.3 codex and 5.2 is noticeable. 5.3 codex is faster, smarter and more token efficient compared to 5.2. Other than the ID requirement fiasco, I also recommend it. If you're not flagged then you don't need to worry about the ID requirement though, appears around 9% (if I recall right) of users were impacted by the overflagging incident

cedar skiff
orchid plume
#

haha fair enough 😛

cedar skiff
#

false positive, bit of a drama. Couldnt imagine being banned o.0

orchid plume
#

yeah unfortunately it was poorly executed and implemented, but lessons learned hopefully on OpenAI's side

wide schooner
#

Why don’t these big tech companies just use localhost? Are they stupid? My localhost app is fully unhackable!

wide schooner
#

But its faster and smarter so its not an issue

orchid plume
#

yup, considering how it thinks less but is smarter, it's impressive

wide schooner
wide schooner
#

Thinks a lot but complete buffoon

orchid plume
#

yeah, and that fast option, that has to be an April fools gag. Way more expensive for possibly twice the speed

wide schooner
#

Its not faster model

#

Its "priority queue"

#

🤣

#

Pay even more for making it remotely usable 🥀

orchid plume
#

ah, so fast isn't even true, it's just the same concept that OpenAI use which is included for Pro plans, priority processing

orchid plume
#

😂

wide schooner
#

Except you pay 3x

#

🤣

orchid plume
#

haha

wide schooner
orchid plume
#

jees, I'm glad OpenAI include that with Pro, without paying extra

wide schooner
#

They have no idea how to optimize their models

#

Openai actually has experience

orchid plume
#

true, 5.3 codex is a good example of that

wide schooner
#

Waste of money, more hallucinations, more dementia, worse performance

terse kraken
orchid plume
#

every little helps

hallow sky
#

yea... plus 20% faster on top of the previous overall speed improvement of the standard queue

#

20% faster back-and-forth, iteration, planning and implementation for a model that can work for hours is quite significant

wide schooner
#

Nothing like opus or so

brisk junco
#

How many people here experience crashes of codex desktop?

brisk junco
#

It crashes on my end even when idle (Mac OS Tahoe)

wide schooner
#

Mac os issue likely

brisk junco
#

Rip

wide schooner
#

With the app

#

The cli works fine

#

The windows fork too iirc

velvet wren
brisk junco
#

Interesting

#

I am running Version 260210.1703 (602) on Mac OS 26.2 Macbook M1 Max

velvet wren
#

Version 260208.1016 (571) on macOS 26.2 on a Macbook Pro, M2 Max, 96GB RAM. I'll update Codex

#

ok, now on Version 260210.1703 (602)

brisk junco
#

I heard that since the latest two releases it started to happen

#

I am also experiencing UI lags, the cursor not changing, not being able to click anything and so on

velvet wren
#

working perfectly here

brisk junco
#

I am going to further investigate if that might be colliding with other Codex instances running on other IDEs

#

Thanks for your Info!

sick iron
wide schooner
slate osprey
#

##+

hushed crown
#

Hey which codex model the plus users have access to?

rapid patio
#

submit button is non-functional on visual studio code extension?

hushed crown
supple perch
#

Or if you're in the extension it's in the chat box now

hushed crown
supple perch
#

It also uses 25 * more rate limit. So I recommend switching to CLI or the extension as soon as you can

hushed crown
supple perch
hushed crown
warm pilot
frosty zealot
#

Any idea why Codex can perform reviews on PRs for my private repo, but when I ping it to make the corrections, the web task says there's no upstream set? How do I configure that? I assumed it could do it, since it can already review and I have the GitHub connector linked, from what I can see.

supple perch
toxic torrent
#

codex is much quicker today than yesterday. thanks

restive token
#

why can't i see gpt-5.3 codex in /model? i'm using v0.98.0

hollow stirrup
latent tiger
#

you guys are flagged for misuse

#

tried doing anything hacky?

#

going to have to ID verify to keep using it lol

restive token
latent tiger
#

yes it should be in the /models for plus/pro users

#

if it isn't see above

nimble terrace
#

mmmm what is this memory v2 i keep seeing in openai/codex commits

restive token
#

i doubt i'm flagged, not doing anything weird.

and if i'm flagged for misuse 5.3 requests are rerouted to 5.2 but 5.3 should still be visible in /models

strong jungle
#

How do you find out if you've been routed or not?

winged ore
# latent tiger

this is interesting.. but what exactly is this misuse flagging?
I mean, if it flagged something as a misuse, why route to 5.2? shouldn't it block the request outright?

#

what is the interest in fulfilling a request that was flagged as misuse?? sounds counterintuitive

#

but well, there must be an explanation

latent tiger
#

everything you do gets integrated into the model

#

scary stuff

#

google red teaming practices and data poisoning

winged ore
#

??

#

what I mean is.. if the request was flagged, why is it routing to a different model instead of blocking it?

latent tiger
#

because 5.2 isn't as capable

#

probably deprecated for future models too

winged ore
#

capable of what? 5.2 isn't capable of being misused but 5.3 is?

neon girder
#

the routing is not per request. once you are flagged, your whole account is, and every request you make will be rerouted for a certain amount of time.

latent tiger
#

5.3 is likely the base for upcoming releases/ new builds

winged ore
#

this makes a little bit of sense

#

but I find it odd that the system would be able to detect misuse and still opt to fulfill the request with an older model

neon girder
#

if they outright rejected your request, you would easily be able to figure out how their filtering system works and bypass it

winged ore
#

I mean.. ok.. maybe the detection isn't very reliable and they don't want to block an account just because they have a suspicion...
so... well.. the detection system isn't reliable so it shouldn't be deciding to downgrade the model anyway, I guess

#

I'm sure there must be reasons beyond what we can see, things are usually not this black and white, I'm just speculating on the oddness of the presented information

neon girder
#

once you verify your identity your requests bypass the detection anyway. they are playing it safe, its way too easy for a state actor to buy a bunch of accounts with stolen credit cards and abuse them.

cyan wing
#

more evidence of something strange going on with 5.3 codex

livid flame
#

Hello everyone. I just wanted to say that I've been having issues using codex IDE lately. With the new updates it has a lot of memory problems that it did not have before. It slows down and freezes my entire system as context grows. This did not happen at all before, and it was very snappy. Another change I noticed is that the backend does not clean up context like before. Before, it would clean up context automatically each time the agent did research, and during idle times. But now it does not clean context anymore until the context approaches 80%. Before, I never used to run out of context; now it's compacting a lot. I don't know why these changes were made, but it was way better before.

latent tiger
winged ore
#

yea, makes sense

cyan wing
exotic cave
# neon girder if they outright rejected your request, you would easily be able to figure out h...

Security by obscurity is not security, that's day-one infosec.
A system should be secure even when the attacker knows exactly how it works.
Hiding the rejection doesn't make the filter stronger, it makes it untestable.
And bad actors don't sit there iterating against your filter, they use open source models locally with zero restrictions.
The only people the silent downgrade affects are paying customers doing legitimate work.

potent mason
neon girder
potent mason
#

But I think they removed the temp param

neon girder
exotic cave
neon girder
exotic cave
#

They silently rerouted our traffic, that is intentional.

neon girder
#

Everybody makes mistakes lol

#

A mistake is not necessarily unintentional by nature

boreal holly
# winged ore I mean.. ok.. maybe the detection isn't very reliable and they don't want to blo...

That tweet (or whatever they call it these days) leaves so much to interpretation, but seeing as how they are not just blocking accounts entirely I think the reason they're routing to 5.2 is because it's been more thoroughly tested and they know the refusal rate is extremely high for cyber-abuse, whereas 5.3-codex needs more testing to ensure its refusals are strong enough. And they don't wanna block valid uses, they just wanna limit the blast radius.

neon girder
#

Anthropic literally was serving worse versions of their models to their users for weeks if not months with NO way of knowing at all

#

And they literally denied it until they were forced to admit it

exotic cave
# neon girder Anthropic literally was serving worse versions of their models to their users fo...

And we should not accept this from any company Anthropic, OpenAI, or anyone else.
Since when is silently degrading a paid service the standard?
These practices harm paying users doing legitimate work while real attackers use open source models locally and are completely unaffected.
The only people caught by these systems are the compliant ones, because the non-compliant don't use services tied to their identity and credit card in the first place.

cyan wing
#

Posting for anyone from Codex team to see this.

with gpt-5.3-codex xhigh on Pro plan
I'm seeing faster output but..

  1. No reasoning blocks
  2. Heavy use of bulleted lists
  3. Repeatedly compacting
  4. Spawning subagents without being told to
cedar skiff
#

How do ppl handle rules and permissions in a reliable low friction way?
I want to prompt on destructive commands and allow safe ones, but it seems codex can just use zsh and bypass any prefix rule, and we can't target the inner command on zsh wrapper commands.
Whats the solution?

cedar skiff
#

Me either it is working well for me

cyan wing
#

welp... not sure what to do

boreal holly
cedar skiff
#

Im more worried about things like git checkout nuking local changes or --force commands

ocean gorge
boreal holly
cedar skiff
# boreal holly Oh yeah, I think Codex config supports matching specific commands like that, so ...

It does, but then codex makes compound calls using zsh like this (below), so a prefix rule on cat is bypassed and it can just do it anyway. We can target zsh, but the inner target is a string match, so we can't use it to target cat in the inner command.
Meaning codex can do it anyway

rg -n "zsh -lc 'cat > /tmp/ec_test.txt'|resolves to "'`allow`|updated to reduce the /bin/zsh -lc|=> `allow`|matchedRules: '"\\[\\]"
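A minimal sketch of why a prefix rule on `cat` misses the wrapped call, and one way a matcher could look inside the wrapper. This is not Codex's actual rule engine: `effective_commands`, `matches_prefix`, and `rule_applies` are hypothetical helpers made up for illustration, and the sketch only unwraps a plain `<shell> -c` / `-lc` call, not pipelines or `&&` chains.

```python
import shlex

# Shell names treated as wrappers in this sketch (an assumption, not a Codex list).
SHELL_WRAPPERS = {"zsh", "bash", "sh", "/bin/zsh", "/bin/bash", "/bin/sh"}


def effective_commands(command: str) -> list[str]:
    """Return the command itself plus, for `<shell> -c/-lc '<script>'`, the inner script."""
    argv = shlex.split(command)
    found = [command]
    if len(argv) >= 3 and argv[0] in SHELL_WRAPPERS and argv[1] in ("-c", "-lc"):
        found.append(argv[2])
    return found


def matches_prefix(command: str, prefix: str) -> bool:
    """Naive prefix rule: compare only the leading tokens of the command."""
    want = shlex.split(prefix)
    return shlex.split(command)[: len(want)] == want


def rule_applies(command: str, prefix: str) -> bool:
    """Check the prefix against the outer command and any unwrapped inner script."""
    return any(matches_prefix(c, prefix) for c in effective_commands(command))


wrapped = "zsh -lc 'cat > /tmp/ec_test.txt'"
print(matches_prefix(wrapped, "cat"))  # outer command starts with zsh, so the rule misses
print(rule_applies(wrapped, "cat"))    # the unwrapped inner script starts with cat
```

Even with unwrapping, a script such as `cd x && cat y` would still slip past a leading-token check, so this narrows the gap rather than closing it.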
orchid plume
slow flax
#

Guys, for those on the ChatGPT $200 per month plan: are the limits good? I mean, would almost all-day coding with codex be fine? I'm not an extreme coder, but I'd be getting it so I don't have to worry about speeds and limits

orchid plume
#

I consider myself a reasonably heavy user, but best I've reached on 2x rate limit is 50% before weekly reset

slow flax
#

I was confused between choosing claude code max vs getting chatgpt pro but going to get chatgpt just because I like working with Codex

#

It got its quirks sure but I just find it more sensible

orchid plume
#

also don't forget that the quotas for Codex and ChatGPT are separate

orchid plume
# slow flax wow really

yup, unlike Claude, ChatGPT's quotas are separate. You have quota for ChatGPT and you have quota for Codex stuff

boreal holly
# cedar skiff It does but then codex does compound calls using zsh like this (below) so using ...

Yeah, it can indeed find workarounds. It could easily run python code that deletes files too. But I think if you set up a rule that catches git checkout and does "prompt", that gives you the opportunity to deny the request and steer them away from using it. If you do "deny", it might not prompt you, and they will work diligently to find a workaround like that. It's a tradeoff for sure but "prompt" is the way to go

#

Codex is pretty good at following rules, but in the cases when it decides not to, the prompt is a good realignment opportunity

cedar skiff
#

I do have them as prompt and its not about it trying to find a way around it, it might just decide it needs to have a compound command for what ever its doing. So it just slides right past the rules.

warped pine
quaint hazel
#

Excalidraw diagrams rendering inside of Codex app 🚀

#

PRs about to get visual asf

frosty zealot
#

Does the app have a way to pull/rebase after you merge a PR on Github?

#

I cant find it

slow flax
#

Guys when windows codex app coming?

latent tiger
#

idk if they mean the work week (friday) or the entire week (sunday)

slow flax
iron storm
#

has anyone managed to create a team of SKILLS for a given workflow?

lean lark
lean lark
cyan wing
#

So I'm supposed to believe this is normal 5.3 xhigh model behavior?

I'm seeing no reasoning output, long stacks of tool calls, numerous ------ dividers, Worked for 1m in every output

warped pine
#

this is normal behavior

#

model tells you what its doing

#

it's one of the benefits that was explained in the release

#

these --- dividers and "worked for 1m" things are parts of CLI, not model's output

cyan wing
#

yes I'm aware of what CLI output is supposed to look like

warped pine
#

so, what's wrong then?

neon girder
warped pine
cyan wing
neon girder
#

reasoning output is hidden in most cases when using 5.3, since the model updates you as it goes anyway. 5.3-codex is also significantly faster by design

warped pine
#

there was a verbosity setting iirc

cyan wing
cyan wing
warped pine
#

ask it what model is it

neon girder
cyan wing
neon girder
#

i've also been using 5.3-codex from the hour it's been released. yesterday and the day before there were rerouting issues (which you could check with the command i sent above). they are fixed. the model behaves the same as it did before for me.

#

numerous ------ dividers, Worked for 1m

this is literally the main difference between 5.3 and 5.2... 5.3 updates you as it goes and therefore you see those dividers.

cyan wing
#

Do you still see reasoning sections like this?

neon girder
#

reasoning sections are hidden by default when using 5.3

#

press ctrl + t in the tui and you should see them

cyan wing
latent tiger
#

custom cli ui

#

your vanilla terminal/powershell wont look like that lmfao

#

the average ai coding agent user has no idea what sorcery real coders used to do prior to the advent of these tools lmfao

cyan wing
#

i just mean "reasoning sections are hidden by default" is not the UX I've ever seen with 5.3

latent tiger
#

oh its in the settings

#

type / and scroll through the cli setting options

#

i think hiding it saves some context

chrome raven
#

the summaries are perfect

lean lark
#

I think there will always be user preferences for how much "thinking" to see and the format. The text Must remain in context - it's a part of the "context" which allows it to continue to do what it's doing. The only issue is how much we see and what it looks like.
Maybe there should be a callback from the CLI output where we can determine or define our own output for these things, and maybe other things like the language, timestamps, etc. AND/OR, maybe more correctly, we just need to drive for OpenAI to add config.toml props for just about anything they hardcode.

warped pine
#

vanilla macos terminal

cyan wing
#

Yeah there's no shot.. something is very wrong

5.3 xhigh in my CLI is ripping through the code base like a mad man
eating up the context window insanely fast
doing ridiculous searches and file reads

am I on a Cerebras test release or something? get me out of here

warped pine
#

"eating up the context window insanely fast" - that was my experience since first hour

lean lark
#

I've written on this topic before : Don't just tell the assistant to do stuff. Add explicit directives in AGENTS.md to create documentation as .md files in a /docs folder, with an index and links to files, so that it can later learn everything about the project before it goes to work. This almost entirely eliminates all of the random 'ls' and 'grep' commands that it needs to issue to guess and hunt all over your application to figure out how it works for every single query.
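A minimal sketch of the kind of index such a directive asks for, assuming a flat `docs/` folder of `.md` files. In practice the assistant would write and maintain this itself; `first_heading` and `build_index` are hypothetical helper names, and the two demo files are made up.

```python
import tempfile
from pathlib import Path


def first_heading(path: Path) -> str:
    """Return the first '# ' heading in a Markdown file, or the file stem if none."""
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.startswith("# "):
            return line[2:].strip()
    return path.stem


def build_index(docs_dir: Path) -> str:
    """Build Markdown index text that links every .md file in docs_dir."""
    lines = ["# Documentation index", ""]
    for path in sorted(docs_dir.glob("*.md")):
        if path.name == "index.md":
            continue  # do not link the index to itself
        lines.append(f"- [{first_heading(path)}]({path.name})")
    return "\n".join(lines) + "\n"


# Demo in a throwaway directory so nothing in a real repo is touched.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "docs"
    docs.mkdir()
    (docs / "build.md").write_text("# Build rules\n", encoding="utf-8")
    (docs / "schema.md").write_text("no heading here\n", encoding="utf-8")
    index_text = build_index(docs)
    (docs / "index.md").write_text(index_text, encoding="utf-8")

print(index_text)
```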

frosty zealot
#

Does anybody use the codex PR/Review/Fix Issue work flow?

warped pine
lean lark
#

Create AGENTS.md in every project as well as at the workspace and server levels. Give the assistant some clue about the layout of your projects.

You don't need to do this manually. Tell it to create this information as it discovers resources and assert that it will be used by the assistant on all queries. It will tailor the information and it will get better over time.

frosty zealot
cyan wing
steady vigil
#

GLM-5 dropped today. It's a good release. I'm using it in both claude code and opencode. It's a good replacement for anthropic. I think it will hurt them a lot (they deserve the pain anyway, given how they played things when they were a monopoly). It's a sweet complement to codex-5.3 (not everything needs 5.3 to implement, and oai rate limits suck lately; 2x is what should be 1x, so if they go back to 1x, codex will lose a TON of value)

#

Unfortunately they raised prices, but if you are an existing subscriber no change in pricing so its a hell of a deal for those of us in that boat

torpid trout
#

Does someone understand what this means?
By my feeling, the 5hrs window is resetting at 9PM which is indeed ahead in my future... but the weekly resetting at 1PM makes zero sense, 1PM is long gone here today
And if I were to assume those to be UTC or anything else, then the 5hrs window reset makes no sense anymore

warped pine
#

perhaps it means some other day but doesnt show it

high girder
warped pine
#

check in web version of codex

torpid trout
#

ah ok - web version has dates.

#

Lol, so once they go down to 1x, I will be basically at 0. Hopefully by then the 5.3 is api-available!

high girder
torpid trout
#

ok
Well I have to check in the IDE since the CLI does not show usage limits etc (or I am too obtuse to see it)

warped pine
#

in CLI you can run /status command

torpid trout
#

Right, thanks. Too obtuse, clearly lol

lusty nimbus
#

Is anybody else having problems with the Codex VS code extension? Particularly when codex asks to run a command, and you're not able to approve/deny, its just frozen there and perpetually awaits permission.

torpid trout
#

Its truly impressive.

lusty nimbus
chrome raven
#

0.99 is out

torpid trout
#

This might be interesting because honestly my projects are sometimes so large even the creator does need to dig code to remember why 🤣

lean lark
lean lark
#

We now have the ability to get the bot to document functions, special variables, types, components, modules, build rules, schema, db usage, coding patterns, preferences, deployment requirements and pre-flight, environment requirements, and all of these other things that we need to just internalize by osmosis as we try to understand code ... even stuff that we wrote ourselves last week. This costs us no time or money and saves a lot of both.

cyan wing
#

More evidence of 5.3 codex xhigh misbehavior...
spawning subagents when not specifically told to

warped pine
#

i guess system prompt mentions that it should use subagent for exploration

#

nothing wrong about it

cyan wing
#

ok great, more opaqueness, unclear tool use, and potential to exponentially increase cost

plucky halo
ivory zodiac
tranquil hare
ivory zodiac
#

yes i built one. trying to improve it

#

but really you need agent memory to perfect it

#

i'll share it with you hold up

#

agent memory is coming so this is just a bandaid.

frosty zealot
#

@potent mason I'm glad I gave the app an honest try, this worktree flow is ridiculously nice lol

#

My productivity has like quadrupled, good thing they have the 2x limits or I'd be crying right now

tranquil hare
lean lark
lean lark
lean lark
frosty zealot
#

Also I think using Code Review usage more reliably would save me on weekly usage, but the flow still seems kinda buggy, sometimes it doesnt give me the option to perform the fixes

ivory zodiac
#

best way to handle this is SQLite but i'm not putting in that kind of effort when OAI is <1 week from launching an official version heh

frosty zealot
#

In before version 0.101

ivory zodiac
#

put this in your agents file

Agent Memory
Search your CWD for .codex/memory/MEMORY.md at the beginning of each session. If it exists, load it into memory and take note of any memories/instructions. If it doesn't exist, just proceed forward with your task.

#

let me make a 'create memory' skill

#

this will be like init, you need to do this first

#

but basically in your CWD/.codex make a folder called /memory/

and then add a file called MEMORY.md

#

Project Memory

Active Handoff
There is a session handoff document at HANDOFF.md in the same directory as the MEMORY.md file ( CWD/.codex/memory/HANDOFF.md).

Read it first at the start of any new session to pick up where the last session left off.

Note: The handoff may not be relevant to your current session, it may be stale. Use your best judgement.

Key Learnings
-learnings
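A minimal sketch of the one-time setup described above, assuming the repo root is the current working directory. `init_memory` is a hypothetical helper, not a Codex command. The template keeps only the title and the Key Learnings section, so the handoff section would be added by hand if wanted, and the `-prefers pnpm` line is a made-up memory used only to show that existing notes are not overwritten.

```python
import tempfile
from pathlib import Path

# Skeleton for MEMORY.md, following the layout posted above.
MEMORY_TEMPLATE = """Project Memory

Key Learnings
-learnings
"""


def init_memory(repo_root: Path) -> Path:
    """Create <repo_root>/.codex/memory/MEMORY.md if it does not exist yet."""
    memory_dir = repo_root / ".codex" / "memory"
    memory_dir.mkdir(parents=True, exist_ok=True)
    memory_file = memory_dir / "MEMORY.md"
    if not memory_file.exists():  # never clobber memories that were already written
        memory_file.write_text(MEMORY_TEMPLATE, encoding="utf-8")
    return memory_file


# Demo in a throwaway directory standing in for a real repo.
with tempfile.TemporaryDirectory() as tmp:
    created = init_memory(Path(tmp))
    relative = created.relative_to(tmp).as_posix()
    content = created.read_text(encoding="utf-8")

    # Simulate the agent appending a learning, then running the setup again.
    created.write_text(content + "-prefers pnpm\n", encoding="utf-8")
    init_memory(Path(tmp))
    preserved = created.read_text(encoding="utf-8").endswith("-prefers pnpm\n")

print(relative, preserved)
```

For a real project the call would be `init_memory(Path.cwd())`, run once from the repo root.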

tranquil hare
#

MEMORY.md goes in the .codex/memory/ folder, right?

ivory zodiac
#

handoff should APPEND

#

make sure you tell it to append

ivory zodiac
#

current working directory

tranquil hare
#

got it

#

thanks! Will try tomorrow and report out

ivory zodiac
#

/path/to/repo/.codex/memory

#

you could do your base .codex but then you'd have to manage the project folders

#

.codex/project/memory/memory.md

#

thats prob how OAI will do it

#

🩹

tranquil hare
#

@ivory zodiac do you have a X handle? In case I share this publicly, I'd like to give you proper credit. You can DM it to me if you prefer not to write it here

ivory zodiac
lean lark
# ivory zodiac

I copied this and will audit and implement sometime today. Thanks!

ivory zodiac
#

it will read HANDOFF every session

#

if HANDOFF isn't relevant, you just loaded irrelevant context.

#

thats not necessarily good

#

i just had an idea.

ivory zodiac
#

well hmm actually this is tough

#

you can make a skill or explicitly say "read memory"

#

and not, when you dont want it to

#

this is a tricky one because you want it to read memory

#

Okay actually yeah here's what i'd do.

I would keep this in your AGENTS file

I would remove the HANDOFF instructions INSIDE the memory file.

tranquil hare
#

yes, I was thinking of something like that. There should be a /handoff skill and a /read_handoff skill

ivory zodiac
#

and then just tell it to read handoff when its relevant

#

yeah

#

so adjust the skill

tranquil hare
#

(or /write_handoff and /read_handoff)

ivory zodiac
#

yeah you got it from here

tranquil hare
#

so, keep the "Agent Memory" section in AGENTS.md, but remove the following from the memory file:

Active Handoff
There is a session handoff document at HANDOFF.md in the same directory as the MEMORY.md file ( CWD/.codex/memory/HANDOFF.md).

Read it first at the start of any new session to pick up where the last session left off.

#

correct?

ivory zodiac
#

yeah if you dont want it to read handoff every time

#

which you probably dont.

#

i was honestly just testing it out to see if it would reliably read it

#

and yeah it does

lean lark
#

Gotta separate what AGENTS defines for consistent behavior from what you want just right now. Be explicit about what the cues are to invoke desired actions.

ivory zodiac
#

i dont think you need a read handoff skill tho

#

just @handoff.md

#

handoff overwrites the old one every time so there's only 1

lean lark
#

Note: AGENTS does not prescribe reading of supplemental files for instructions.

ivory zodiac
#

it will tho, reliably

#

its quite good at it

#

it might not 100% of the time but probably 98%

#

just keep your agents kinda concise, dont go crazy

lean lark
#

It may, but directives outside of AGENTS are non authoritative. I'm running around right now and can't explain in detail. Look up docs and just ask the assistant in ChatGPT how exactly it's defined to work.

#

If it actually worked like that then we'd have different files for how it should process different details. That would be awesome but it just doesn't work like that ... yet.

ivory zodiac
#

i do understand that, but its the best we have right now, and they are authoritative enough to inject the proper context. it doesn't ignore it. it reads it, loads it into the cw, and allows you to pick up where you left off

#

for all intents and purposes, it is memory and works all the same

#

the only thing you need to worry about is if you are providing a crap ton of context, it might ignore some stuff.

#

but thats the case for anything really

lean lark
#

I completely agree, we're on the same page.

ivory zodiac
#

i like how claude does memory

lean lark
#

We could simplify the intent of all of this by telling the assistant to save contextually relevant information into a new context.txt file, and then open the next session with "Read context.txt and let's change ...".

ivory zodiac
#

hoping the new system from codex is good

quaint hazel
#

has anyone found a way to mount multiple folders into one project?

ivory zodiac
lean lark
ivory zodiac
#

the key difference though is it wont just write memories without being asked to

#

but the handoff skill does that.

#

but i dont call the handoff skill every time.

quaint hazel
ivory zodiac
#

the real system will use SQLite and should write memories on its own

#

well, should

#

idk how it will work in reality

lean lark
quaint hazel
#

Fair, suppose that's one way to do it

#

Thx!

ivory zodiac
#

didnt even consider that

#

that answers the whole monorepo question i get regularly

lean lark
#

Doing that all the time ... PNPM also works on "workspace" concept so it's elegant with a single build for all related projects. Codex gets it, VSCode gets it, Linux gets it. Profit!

#

Oh and each project is still in its own repo. I also tend to have subfolders of sub-projects ... it all really works. Took a while to get brain around multi-project JS/TS projects, using @project/folder syntax but that was a good education.

#

So...

/opt/codex/repos/WorkspaceName
  ... Project1
      Sub1 ... local under Project1
          .git
          dist/
          tsconfig.json ... etc
      Sub2 ... symlinked
          .git
          dist/
          tsconfig.json ... etc
  ... Project2
     .git ....
     ... refers to @sub1/ ...
AGENTS.md ... these files in workspace root
README.md
package.json (for scripts that build other projects)
ivory zodiac
#

@teal cargo just some feedback here. hope you're well.

in 0.99 you guys removed the ability to review subagent sessions when they are closed. This makes auditing impossible. Without any alternative to investigate what went wrong. You cant check them with /resume either. Once they are closed, they're just gone.

https://x.com/LLMJunky/status/2021645793074049391?s=20

lean lark
#

Back on the topic of docs, I've been asserting that we can get good quality docs from Codex which help the next assistants to process the next tasks. That's always been a subjective assertion: I believe it's true because it seems to be true. But I haven't been able to prove it qualitatively.
I'm now working with Codex to generate metrics to measure model difficulty in processing a project task with and without some documentation. That is, process a task on some undocumented functions, get a metric, add docs, re-run the task in a different session. Does the second assistant perform better with the docs? We shall see.... 🤓

#

@ivory zodiac I'm guessing you might be able to use this kind of A/B test to verify your notes about multi-agent orchestration.

ivory zodiac
#

i was using it a lot to 'perfect' the prompting of my swarms skill to make sure it got as much helpful context up front as possible

#

i can still do that, but ONLY when the agent is still active

#

i dont see why i shouldnt be able to go back and check them at any time in the session

#

oh wait

#

its back

#

nevermind VB

#

lmao

#

SORRY

#

I was using the alpha.

sand snow
#

😀

ivory zodiac
#

I'm not crazy though. There was a note about this in the commits for 0.99 alpha, and i went and checked, and yeah, they disappeared after closing.

#

But now on GA its working right

#

thats what i get for ignoring the whole "alpha" disclaimer

lean lark
#

It's not like they actually tell us what's changed in release notes for any of these products... 🙄

ivory zodiac
#

so idk if they changed it back

#

i'm not 100% sure.

#

but the reason i even tested it was because I read it in the commit notes.

lean lark
#

Yeah, we shouldn't need to read GH commits ... and BTW, commit != merge.
I'm in OAI betas and we never get info about what's changed to see if they actually fixed or enhanced something before they push it to production. So why the beta? Just cuz someone said they need to beta?
( Former QA Manager here, very sensitive on this topic.... )

#

If anyone wants to see my AGENTS.md directives regarding documentation, I/GPT just abstracted them out to a separate document. DM me if interested.

cyan wing
#

5.3 vs 5.2 in my CLI today
both codex and xhigh
same prompt + plan mode
5.3 shows no reasoning, just tool calls. 5.2 shows reasoning

5.3 has 4% remaining while 5.2 has 86% remaining

#

This degraded model behavior started yesterday.. 5.3 had been providing phenomenal output since its release.

lean lark
#

Consider 5.3 medium : Almost no quota consumption.
No reasoning from 5.3? My guess is that it's just not disclosing it, I think there's a verbose debug mode setting in config.toml.

#

We shouldn't need to guess on this stuff - should be documented.

last stream
#

Is my understanding correct - codex is provided the date in UTC and there's no way to configure this to align with system or local?

i ask because my codex is constantly time stamping logs and docs incorrectly and digging through the repo with codex returns no way to configure this. ?

#

@teal cargo ?

#

by incorrectly, I mean using UTC which isn't intuitively helpful for me

lean lark
#

Tell it to get the current time using the local CLI.

#

Give it a format if you prefer.

last stream
#

I know it CAN get that info but it shouldn't require an extra prompt or agents.md lines. should be provided at runtime, no?

#

its already being provided UTC. I should just be able to tune to my system/local time

stray swift
lean lark
last stream
#

Completely understandable. For the sake of UX and not wasting tokens, I'm proposing being able to tune the UTC injection to align with system.

lean lark
#

I would agree, however, "align with the system" just imposes a different rule that others might not want. I'd suggest allowing the date to be tunable in config.toml.
But frankly if we're going to tune it anyway, why ask for the option to tune it 'there' when we can already tune it 'here'?

last stream
#

That's why i said being able to tune it. in case people don't want that to happen. and the difference is codex working more than it has to vs an instant runtime injection

lean lark
#

But ... the Codex server doesn't know what your time zone is, that would have to be determined locally. I'm not disagreeing with the suggestion, I'm saying it's not been considered well enough to adapt to a worldful of other users.

last stream
#

Good thing I'm not a codex dev 😅

lean lark
#

hehehe - How about this ... post the suggestion for the ability to tune in some way in config.toml. Until they process that in some way at some time in the future, use the solution that's easily available now and see how it works for you.

#

I need to think about this more... as I'm thinking about it it seems putting it in AGENTS.md isn't that ideal either.

#

In short, the model needs the data for output, so am I suggesting that it gets the data from the "local" server and then take it back to the model for incorporation into a response? That would be icky.
My original idea was that logging is or maybe should be done with code, and that code should do a local OS call to date +"%c" (or whatever your preferred format).

So I don't have as solid an answer for you as I thought I did.
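
For the "logging should be done with code" idea, here is a minimal sketch of what that could look like, assuming the log or doc line is written by a script running on your own machine rather than by the model. The function names and the format string are illustrative only and are not part of Codex.

```python
from datetime import datetime, timezone

# Illustrative format: date, time, and numeric UTC offset.
DEFAULT_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def local_timestamp(fmt: str = DEFAULT_FORMAT) -> str:
    """Return the current time in the machine's local timezone."""
    # astimezone() with no argument converts to the system's local zone.
    return datetime.now(timezone.utc).astimezone().strftime(fmt)


def utc_timestamp(fmt: str = DEFAULT_FORMAT) -> str:
    """Return the current time in UTC, for comparison."""
    return datetime.now(timezone.utc).strftime(fmt)


if __name__ == "__main__":
    print("local:", local_timestamp())
    print("utc:  ", utc_timestamp())
```

Because the timestamp comes from the local OS at write time, no model round trip is involved and the UTC date injected into the prompt never reaches the log.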

#

Ask ChatGPT. robotshrug

latent tiger
#

did codex remove support for MCP in cli?

#

everything that required oauth just broke for me

torpid trout
#

Hummm so I added the AGENTS.md in ~/.codex (it was there already but empty)
Then I started codex in a project and prompted List the instruction sources you loaded.
It responded:

• 1. System instructions (global policy and tool-use requirements).
  2. Developer instructions (coding agent behavior, sandbox/permissions, collaboration mode, formatting, escalation rules).
  3. Repository AGENTS.md instructions provided in your message (/path-to-working-dir/).
  4. Your direct user request in this turn.

  No skill SKILL.md file was loaded, since no skill was triggered for this request.

I am not sure if it does or does not read the AGENTS.md file?

#

It seems it reads an AGENTS.md in my working dir, but there is none there
Also I would have expected it to actually paste the contents so we can be sure it is loaded

#

This is its content, its barely 4kb, so should not hit any limits

modern moat
#

An AGENTS.md located in your working dir takes precedence over the AGENTS.md stored in ~/.codex if there are conflicts
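
A quick way to rule out a missing or empty file before asking the model what it loaded is to check the two locations mentioned in this thread. This is a minimal sketch and does not reproduce Codex's actual loading or precedence logic. The function name is hypothetical, and it only looks at `~/.codex/AGENTS.md` and `AGENTS.md` in the working directory.

```python
from pathlib import Path


def existing_instruction_files(home: Path, cwd: Path) -> list[Path]:
    """List the AGENTS.md files that exist, global first, then project.

    Assumption: only the two paths discussed above are checked.
    """
    candidates = [home / ".codex" / "AGENTS.md", cwd / "AGENTS.md"]
    return [path for path in candidates if path.is_file()]


if __name__ == "__main__":
    for path in existing_instruction_files(Path.home(), Path.cwd()):
        # A size of 0 bytes means the file exists but has no instructions.
        print(f"{path} ({path.stat().st_size} bytes)")
```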

sand shuttle
#

Does codex quota reset follow utc or my local time,

It says limits reset at 8 30, so is it 830 utc or my local time

ivory zodiac
cyan wing
ivory zodiac
#

my guy what?

#

if you're gonna be a tool, you can just figure it out on your own. i was just trying to help 🤡 my guy

jaunty mountain
sand shuttle
#

Oh 💔

#

I still got 57% left and it's about to reset in 45 mins

ivory zodiac
sand shuttle
#

It does and i ain't got my pc rn

#

To use some

#

I was gonna burn thru it yesterday but I fell asleep

ivory zodiac
#

i have 3 accounts, always forgetting to use all of them. doesn't feel good to waste it lol

#

the other half the time i have all my usage burnt on all 3

lean lark
#

@torpid trout when it says "Developer instructions" it's referring to the current user's instructions in ~/.codex/AGENTS.md, with the fair assumption that the current user is a developer. Sometimes it simply doesn't even mention that file when it's absolutely using it.

When in doubt, just ask Codex or ChatGPT about exactly what it means, or the exact files used in a specific configuration. If you had asked for "file paths" rather than "instruction sources" you may (or may not 🤔) have gotten the specific path to ~/.codex/AGENTS.md. 🤡 (Anyone here being reminded (or not!) of the HHGTTG philosophers?)

ivory zodiac
#

i hope they launch a $100 plan soon

#

eventually they'll get tired of hearing about it

#

bruh

potent mason
#

Codex app is driving me crazy

#

Hopping around trying to find the terminal that has the server running in my 100 chats is annoying

#

We should get a unified terminal or something

toxic torrent
#

What’s the consensus? 5.3-codex-high or 5.2-high?

potent mason
#

5.3

main nimbus
ivory zodiac
#

high is better for me. it is faster, and sometimes xhigh can overthink for little benefit. even detriment possibly

#

i use high for almost everything.

#

actually medium is pretty great too

frosty zealot
#

I find it just works better

cedar parcel
raven crane
#

Even on Xhigh I would say it's worse than 5.2 non codex medium. 5.2 Xhigh forget it

stray swift
raven crane
#

I checked and I'm not being routed

#

But it's just so bad

#

I have no idea how people say this is good

cyan wing
raven crane
supple perch
#

I had to change my prompting from "keep docs" to "use code comments" and it fixed most issues

stray swift
# raven crane Codex models always feel this way for me vs normal gpt models

You can instruct it on the style you want in detail. I have this in ~/.codex/AGENTS.override.md:

## Coding Style & Naming Conventions

  • In general, follow the guidelines in the books The Art of Readable Code and A Philosophy of Software Design

The Art of Readable Code

Core idea: Write code as if the next reader is a collaborator who needs to understand it quickly and correctly. Readability is a first-class requirement, not polish.

Practices to apply

  • Optimize for “time-to-understand.” Prefer simple, explicit constructs over cleverness, even if they’re a few lines longer.
  • Choose names that carry intent. Use precise nouns/verbs, include relevant units/constraints (e.g., timeout_ms, is_ready), and avoid vague placeholders (data, tmp, handle).
  • Make control flow easy to scan. Minimize nesting, use guard clauses, keep the “happy path” visually prominent, and avoid surprising side effects.
  • Reduce cognitive load. Break complex expressions into well-named intermediate variables; keep functions focused; keep related logic close together.
  • Use comments to explain “why,” not “what.” Comment on intent, constraints, non-obvious tradeoffs, invariants, and “gotchas.” Don’t narrate code that’s already clear.
  • Keep formatting consistent and informative. Use whitespace, grouping, and consistent patterns so structure is visible at a glance.
  • Make tests and diagnostics readable. Tests should communicate intent and failure messages should help the reader localize and understand the problem quickly.

Red flags

  • Dense one-liners, deep nesting, unclear naming, and comments that restate code instead of explaining intent or constraints.

#

Also, a section on A Philosophy of Software Design, but discord is eating that for some reason, maybe I'm too prolix.

raven crane
#

Or.... The official instructions on the docs

#

Instead of some vibe coded 🥛 you found from someone farming engagement

cedar parcel
# raven crane Codex models always feel this way for me vs normal gpt models

I agree I just can't like the way codex models behave. And I've given 5.3 codex xhigh a really decent try but it just doesn't produce good results for me. It's choosing speed over quality which just means having to spend several sessions debugging. It skips reading large files and hallucinates the missing pieces afterwards.

I'm back on 5.2 xhigh since yesterday and perfectly fine with everyone else staying on 5.3 and its GPUs 😁.

formal quiver
#

Where is the best spot to get a sense of benchmark performance of different models on lesser known or more eclectic coding benchmarks?

raven crane
#

Only a marker of how much a company benchmaxes

#

Unless of course one finds realistic the notion that models like glm and gemini are apparently on par with opus, gpt 5.2 Xhigh, etc

#

Which is obviously absurd

cyan wing
#

it's going to be Cerebras access, calling it now

simple star
#

v0.100. I called it

warped pine
cedar skiff
simple star
#

@cedar skiff Release tags in the Codex github page

cedar skiff
#

ahh ok, but what does it mean?

simple star
#

The way I interpret it, is that after v0.99 does not come v1.0, but v0.100 🙂
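
That matches how dotted version numbers are normally ordered: each part is compared as an integer, not as text. A minimal sketch, assuming plain `vMAJOR.MINOR[.PATCH]` tags (pre-release suffixes such as `-alpha` are not handled, and `parse_version` is just an illustrative helper):

```python
def parse_version(tag: str) -> tuple[int, ...]:
    """Turn a tag like 'v0.100' into (0, 100) so parts compare numerically."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


if __name__ == "__main__":
    tags = ["v0.99", "v1.0", "v0.100", "v0.98"]
    # Numeric ordering puts v0.100 after v0.99 and before v1.0.
    print(sorted(tags, key=parse_version))
    # Plain string comparison gets it wrong because '1' sorts before '9'.
    print("0.100" < "0.99")
```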

cedar skiff
#

I was thinking you had some juicy tin foil hat ideas about cool things to come 🤣

simple star
#

no 🙂

cedar skiff
#

do you guys utilise skills much with codex?

high girder
cedar skiff
#

Thanks

chrome raven
#

what model does sub agents use? parent model?

exotic cave
#

I am rerouted to 5.2 once more, and guess what no error was surfaced in the CLI.
What's going on with codex?
Opened a github ticket hopefully will be resolved fast.

exotic cave
#

Yeah, worked for a bit and now i am back to 5.2...

#

All i do is work and also develop game mods which at times require reverse engineering.
But if that tripped the system, then the classifier can't distinguish between a developer modding a single-player game and an actual threat actor.
Also they did it stealthily again, no error surfaced, no notice.

chrome raven
exotic cave
#

Nope not a single time, i am also on the 0.99.0.

cedar parcel
cedar parcel
chrome raven
jagged estuary
#

Has the statusline update been released yet?

finite sorrel
#

getting this a lot with latest update

finite dragon
#

Hello, is codex down right now?

jagged estuary
#

Claude has had this option for ages. It's a really great functionality. I would really appreciate it if they added this.

modern hearth
#

howdy, I started a task in a git inited project directory but I got

• I also see stale temp package folders from earlier attempts in dist/; I’m cleaning those up so only zip
  artifacts remain.

✔ You approved codex to always run commands that start with Get-ChildItem dist -Directory -Filter package* | ForEach-Object { try { Remov...

• Ran Get-ChildItem dist -Directory -Filter package* | ForEach-Object { try { Remove-Item -Recurse -Force $_.FullName -ErrorAction Stop } catch { Write-Host "Could not remove $($_.FullName): $($_.Exception.Message)" } }
  └ Could not remove : Access to the path 'C:\Development\web-extensions\cancelx-git\dist\package\icons\icon128.png' is denied.
    Could not remove : Access to the path 'C:\Development\web-extensions\cancelx-git\dist\package_04bcef0ff4f84f85b99c7d27d20e1629\icons\icon128.png' is denied.

the path codex can not access were created by codex cli in first place in the previous task.
How can I grant codex cli any permission in cwd? I'd like it not to stop for such things

terse kraken
#

or -s danger-full-access in sandbox

abstract inlet
#

i've been using opencode with 5.3, and this morning i can only see 5.1 and 5.2. I've uninstalled, removed all configs, reinstalled, refreshed models. I've got chatgpt plus subscription. Anyone know what's going on? Running arch linux

sleek spoke
#

what's this gonna be? "joy" doesn't normally scream out Codex

chrome raven
#

sub agents are really good now, niceee

hallow sky
tranquil hare
plucky halo
plucky halo
hallow sky
tranquil hare
chrome raven
hallow sky
#

in my case, 'gpt-5.2-2025-12-11' despite having gpt-5.3 selected

exotic cave
#

you can test it much faster you just need jq installed:

RUST_LOG='codex_api::sse::responses=trace' codex exec --sandbox read-only --model gpt-5.3-codex 'ping' 2>&1 \
    | grep -m1 'SSE event: {"type":"response.created"' \
    | sed 's/^.*SSE event: //' \
    | jq -r '.response.model'
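
For anyone without jq, the same check can be sketched in Python. This is a minimal sketch, not part of Codex: it assumes a trace line contains the text `SSE event: ` followed by a single JSON object, as the grep/sed pipeline above implies. The sample line, the function names, and the prefix-based reroute heuristic are all hypothetical.

```python
import json
from typing import Optional

MARKER = "SSE event: "

# Hypothetical trace line, shaped to match the grep pattern in the command above.
SAMPLE_LINE = (
    'TRACE codex_api::sse::responses: SSE event: '
    '{"type":"response.created","response":{"model":"gpt-5.2-2025-12-11"}}'
)


def served_model(log_line: str) -> Optional[str]:
    """Return response.model from a response.created trace line, else None."""
    _, found, payload = log_line.partition(MARKER)
    if not found:
        return None
    try:
        event = json.loads(payload)
    except json.JSONDecodeError:
        return None
    if not isinstance(event, dict) or event.get("type") != "response.created":
        return None
    return event.get("response", {}).get("model")


def was_rerouted(requested: str, served: Optional[str]) -> bool:
    """Heuristic: the served name should start with the requested name."""
    return served is not None and not served.startswith(requested)


if __name__ == "__main__":
    model = served_model(SAMPLE_LINE)
    print(model, "rerouted" if was_rerouted("gpt-5.3-codex", model) else "ok")
```

To use it on real output, feed it the lines from the `RUST_LOG=... codex exec ... 2>&1` run and stop at the first non-None result.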
chrome raven
exotic cave
#

Anyway this is a real productivity killer.
I don't like it that they have the power to do such a thing without any notice.
Sent them an email about it, and opened a github ticket: https://github.com/openai/codex/issues/11561

ember venture
exotic cave
vestal locust
#

no i use the VS Code extension

#

@exotic cave

main wraith
#

the IDE context feature. is it on when I click on it and disappears, or the way it is now - showing up in blue ? it's just confusing

exotic cave
neat sinew
#

When is the Codex App coming to Windows?

#

ChatGPT doesnt know

simple star
#

nobody knows

torpid trout
#

I have not yet tried (and maybe I should just try it but sometimes I prefer asking) sub agents
How does it work?
Assume this scenario:

  • we are in WD, codex prompt:
  • Refactor the monolith with 5000 lines in file xxx/zzz/y into modular smaller dedicated files separating by concern
  • Implement new feature XYZ
  • Write a user-facing documentation about how to install the app in /doc/user

Is the thing smart enough to spawn subagents (without asking for it) and make one agent wait on the other?
Or will it end up in a total mess as feature XYZ would be in the monolith, but the monolith is being ripped apart for refactor, and will doc subagent document everything, or only what he finds at the moment of prompt?

Asking GPT it says "usually" it will organise the subagents correctly but that for safety we should prompt it to do so - I then just wonder, how sub-agents are even useful, because it becomes a synchronous process, which could as well be done manually (wait for results of task A, then prompt B and then C, also allowing us actually to see and test what task A resulted with before we go ahead with Task B)
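
The ordering concern can be made concrete with a dependency graph. This is a minimal sketch using Python's standard `graphlib`, not a description of how Codex schedules subagents. The task names and both dependency maps are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it must wait for (assumed dependencies).
CHAINED = {
    "refactor": set(),
    "feature": {"refactor"},
    "docs": {"refactor", "feature"},
}

# Variant where the install docs are assumed not to depend on the code changes.
PARTLY_INDEPENDENT = {
    "refactor": set(),
    "install_docs": set(),
    "feature": {"refactor"},
}


def execution_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; tasks in one wave have no unmet dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    print(execution_waves(CHAINED))
    print(execution_waves(PARTLY_INDEPENDENT))
```

When every task depends on the previous one, each wave holds a single task and parallel subagents gain nothing. They only help when a wave contains more than one task.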

livid flame
#

you guys completely broke codex IDE. Look at this, it just stops all the time, without fixing the issue. What did you do ? Vibe coding is cool and all, but test your changes. What is this model, is it even 5.3 codex ? It does not seem so at all. It's just lazy as hell, and just stops, without doing anything. What did you do ?

exotic cave
#

The silent reroute problem is bigger than people realize.

Developers rely on these tools daily.
When a model is silently swapped, the damage is done before we even notice.
Work gets built on the wrong model, time gets wasted, and nobody told us.

This goes beyond inconvenience.
These tools are now part of how we compete and deliver work.
Being silently cut off from the model I'm paying for puts me at a real disadvantage compared to others who have access.
And it's happening on false grounds.

The classifier is supposed to catch attackers.
Instead it's flagging developers doing normal work.
Real attackers don't use services tied to their real name and credit card.
The people getting caught were never a threat.

What we need:
Tell us when we're being rerouted, don't do it silently.
Fix false positives fast, hours not days.
Make verification stick, not expire without notice.

Silently downgrading paying users isn't security, it's a trust problem.

livid flame
#

using the same model in opencode btw and i'm not getting this lazy behaviour. It's so strange and it does not even seem like the same model there

hushed storm
#

now I am scared that I am going to be rerouted 5.2-codex

plucky halo
sleek spoke
neat sinew
#

I believe OpenAI knows what they're doing, delivering the most cost effective tool they can afford

lean lark
#

OMG Is yowave still ranting about a problem that was fixed yesterday!?!

neat sinew
#

If every ChatGPT user actually paid for Plus, the price could drop to half for everyone already paying.

exotic cave
lean lark
plucky halo
plucky halo
sleek spoke
lean lark
# exotic cave It wasn't fixed, i followed the verification, and today back to 5.2. This is als...

I explained this yesterday. Won't re-engage today.
Bugs happen. Sometimes bad ones - and yeah, this one was bad. We all try to fix what's broke.
I see you created a new ticket - that's great. It's now 7am in SF. Give um some time to wake up and get to work. Allow them time to process it. Be sure to logout and even restart your WSL ... it's common for some details to require exiting/re-init before taking effect. Be a professional and move on.

torpid trout
#

Verification? Rerouting?
I never had to do any "verification" since log in, and I checked with
RUST_LOG='codex_api::sse::responses=trace' codex exec --sandbox read-only --model gpt-5.3-codex 'ping' 2>&1 \
    | grep -m1 'SSE event: {"type":"response.created"' \
    | sed 's/^.*SSE event: //' \
    | jq -r '.response.model'
says gpt 5.3

Did I miss something? 😄

#

Or maybe because I am on v0.98?

cedar skiff
#

is anyone else having codex app problems where it is overriding session policy to never?

boreal holly
plucky halo
#

Since verifying, it's been fine though.
EDIT: FOR ME. I'll get that in before I get the "well it's not working for me" replies lol

cedar skiff
devout talon
#

hello

sand shuttle
#

Does codex automatically reset its context window? if i leave my ide as is for the next few hours then come back will the context still remain

plucky halo
#

Yes it does

devout talon
#

guys i have a
question

lean lark
#

Hey, complete aside here - in the IDE, check the F1 + "new context window" feature. It's a real treat to get out of the sidebar and have multiple chat windows/contexts.

devout talon
exotic cave
# lean lark I explained this yesterday. Won't re-engage today. Bugs happen. Sometimes bad on...

Seems to me that you don't understand the gravity of such an issue.
This isn't a bug, it was a deliberate design decision to silently reroute users without telling them.
Restarting WSL won't fix a server side account flag.
Pointing out that silently degrading a paid service is a serious problem isn't unprofessional, it's necessary.
These tools are part of how we compete and deliver work, being cut off from the model we're paying for puts us at a real disadvantage.
As a community we shouldn't normalize silent service degradation, no matter how it's framed.

lean lark
blissful basin
#

Should this happen "Codex ran out of room in the model's context window. Start a new thread or clear earlier history before retrying." ? (latest version of codex and enabled memory, so maybe this is the reason)

lean lark
#

> This isn't a bug, it was a deliberate design decision to silently reroute users without telling them.
Responded to this yesterday.
> Restarting WSL won't fix a server side account flag.
It may if the client side v0.99 was supposed to address our side of it.
> Pointing out...
Over and Over and Over ....?
> we shouldn't normalize silent service degradation
Addressed.

#

( getting back into code where no one is whining except me )

blissful basin
sleek spoke
lean lark
#

I'm excited about Codex App for Windows and Android, but I need to wait to install until after all of the initial wailing and gnashing of teeth, the fallout from the 0.x rollout, and the subsequent fixes. When it's stable I'll load. Until then I won't be an alpha. I am a beta for ChatGPT but get no feedback from the company for that. Since they've disrespected me as a beta, I won't even try their alpha. This is how these things work.

sleek spoke
#

will be funny if iOS gets the Codex app before Windows 🤭

exotic cave
# lean lark > This isn't a bug, it was a deliberate design decision to silently reroute user...

Your responses show you don't understand how this works.

The model you get served is decided server-side, not client-side.
v0.99 updates the CLI, it doesn't change what model the API returns.
Your config can say gpt-5.3-codex all day, if the server decides to send you 5.2 based on an account flag, no client update or WSL restart will change that.
This is basic client-server architecture.

drifting granite
#

What have you guys been using for frontend skills? I feel like thats where OpenAI has been struggling the most compared to Claude. I'm working on a new project, and just looking to explore what everyone else has been using.

nimble terrace
#

Codex is unfortunately not good enough at design even with the front-end design skill and other skills it just doesn't nail it.

lean lark
#

Please remember that I've been writing systems for decades. You can rightfully suggest that I'm wrong about something version-specific but I'm verifiably on top of the technologies.
There are no 0.99 release notes so we don't know what was done there. Note that the GH agent linked https://github.com/openai/codex/issues/11592 to your GH issue because there seemed to be a link between the client and model degradation. Without looking at the code we don't know what flags are flying back and forth.
You've made your point that this is serious. I believe anyone who understands the issue agrees.
What else do you want here?

lean lark
# drifting granite What have you guys been using for frontend skills? I feel like thats where OpenA...

I've found Codex to be really good with React but I need to improve directives to get it to modularize better. By default (5.2 anyway) it creates huge components with everything and the kitchen sink included in one function.. That's not cool. I want it to abstract hooks and other functionality to other modules to make them easier to maintain. Other than that, it understands all the React rules and writes good code.

drifting granite
#

Yeah FWIW I think the React components it writes are fine, and if you've got your project structured to encourage that abstraction I don't think Codex has an issue.

exotic cave
# lean lark Please remember that I've been writing systems for decades. You can rightfully s...

Decades of experience doesn't change how client-server architecture works.
The client sends a request with a model parameter, the server decides what to serve back.
No client-side flag changes a server-side account classifier decision.
The linked issue confirms it, the rerouting is server-side based on their cyber abuse classifier, not a client bug.
The code is open source on GitHub, you can verify this yourself.

main nimbus
drifting granite
#

The real issue is actual styling.

#

I've got a couple of jr's that are swearing up and down about Opus because it's better at styling xD

main nimbus
#

Since gpt-5, openai has made it possible to get decent UI. Prior, not a single model could come close to other SOTA at that time

main nimbus
#

Those other models for me have never made “good” designs, they simply added more details than asked for. So, less prompting can lead to a decent enough result for most people.

main nimbus
chrome raven
drifting granite
ivory zodiac
drifting granite
#

Codex mobile is only good if it acts as an ssh tunnel.

#

Otherwise the cloud rate limits cost something like 3x more.

main nimbus
ornate jolt
#

im really hoping that openai makes even better models for taking down claude. as much as i love claude, they're way overpriced

ivory zodiac
#

Everyone is going to make better models 👌

plucky halo
ivory zodiac
plucky halo
#

baby

main nimbus
frosty zealot
#

I need a --web flag or something for codex app --web since its just an electron app, would be nice to access localhost:8080 or something and use the app

#

You know what, I'll do it myself

plucky halo
#

Persistent agent workflows coming soon?

cyan wing
frosty zealot
#

oh my god im gonna bust

glass furnace
#

ok this is getting really hard to follow... https://x.com/sama/status/2021984777470193767?s=20

When you name everything codex, its hard to tell what codex is getting the "special thing"...

Is it only for the codex mac app, or will it be in the codex cli as well? lol

sleek spoke
glass furnace
#

its worse imo

plucky halo
glass furnace
#

@tim tebow from codex, please clarify

simple star
#

Oh boy...

sleek spoke
simple star
#

I swear to God, if it is only for Mac, im burning something

glass furnace
#

burn your pc

simple star
#

@glass furnace I have one. I dont like it.

plucky halo
boreal holly
#

I cringe almost to death every time I see keep4o on samas posts.

plucky halo
glass furnace
#

I just want the codex app to use SSH so i can stop using the tui

frosty zealot
boreal holly
glass furnace
frosty zealot
#

I know it's crazy

glass furnace
#

yes they are

frosty zealot
#

Is the codex app open source?

#

or just the cli i might be dumb i dont see it in the repo

spiral gorge
#

lol it will just be windows codex version

#

for pro users only 💀

#

or control your computer. oooring boring oring ring ring ing ing ng

simple star
#

@spiral gorge Makes no sense to make that Pro-only

#

Im putting my money on ultra-fast model

sleek spoke
simple star
#

Sparkles... Joy...

sleek spoke
#

and fast....

spiral gorge
#

its sooo annoying pissing me off

sleek spoke
#

maybe the Codex app is not the only thing coming today

cyan wing
spiral gorge
#

can someone tell me what sub i am on?

versed fjord
#

I really dislike the interfaces since GPT-5.1 and newer models...

#

Has anyone been able to workaround it?

#

Even if you give it a UI library like shadcn, it still goes for the "gamer" look with very specific palette and gradients

plucky halo
simple star
#

GPT-5 is already "deprecated" 😄

versed fjord
astral pine
# sleek spoke they also keep using this emoji ✨

Gemini team always uses that emoji... and they are talking "fast" so maybe like claude introduced a couple days ago a fast version of opus for like 3x the cost running on google tpus, thats what this is... fast codex (2-3x speed up) running on google tpus for only pro users.

still trellis
#

keep 4o is one of the weirdest "did not see that coming" moments since LLM's became main stream

astral pine
#

And I am a pro sub.

plucky halo
#

That would make me very happy

simple star
#

@ivory zodiac says it's 5.3 Max

#

Lets hope

frosty zealot
simple star
#

(¬‿¬ )

cyan wing
latent tiger
#

I think pro users are getting mobile app today

astral pine
# cyan wing

we gambling with our model settings like a slot machine now?

latent tiger
#

Either that or pro/max versions

simple star
ivory zodiac
ivory zodiac
#

That should fix your issue

simple star
#

I cant wait to go BRRRRRRR with Max

astral pine
ivory zodiac
#

I'm not sure, I suspect

#

All the hints are shaping up that way

main nimbus
#

5.3 is fast

astral pine
# cyan wing

I guess this is the nod to their terrible naming conventions... not gambling lol

main nimbus
#

5.3 max is probably just like 5.1 max

chrome raven
main nimbus
#

better, maybe new capability given their shared language on twitter

plucky halo
main nimbus
#

a codex fast would be a mini model, right?

latent tiger
main nimbus
#

bc 5.3 is already significantly faster

latent tiger
#

you sure about that lol

#

its either the iOS app, windows app beta test, or codex 5.3 MAX/FAST model variant

chrome raven
main nimbus
#

they launched Codex app + 5.3 at the same time. could be both.

latent tiger
tranquil hare
#

5.3 Max out today but only for Pro

#

this SUCKS

plucky halo
tranquil hare
plucky halo
#

It’ll make up for the rerouting haha

latent tiger
#

it's most likely the 5.3 max/fast variant

chrome raven
main nimbus
#

I wish they would take note from Anthropic's org billing and allow Teams the ability to buy a Pro-level of usage.

cyan wing
#

get a Codex Pro plan
go into debt if you have to

ivory zodiac
#

i hate it here

main nimbus
#

not about cost, it's about team availability + simplified payment.

latent tiger
#

did you guys see deepthinks stats from the update today

#

kinda wild lmao

simple star
#

I cant believe im giving 200$/month for this, and Im not even mad. When I joined the Pro plan, I thought "Im sure I will cancel it in 1 or 2 months".

And here I am, not even caring

honest saffron
#

We need a $100 plan 😅

astral pine
frosty zealot
#

Is the 'max' a good coding model? or what is max? Ive been estranged from openai for a while

honest saffron
#

Max is a good thinking model; the base model's coding is what you get on it. Codex uses that foundation but is tuned for coding and Codex

frosty zealot
#

Oh gotcha

#

So it would be 5.3 Max-Codex kinda thing

main nimbus
latent tiger
#

deepseek v4 drops within the next 5 days

#

the chinese models released the last few days already mogging opus 4.5

#

1/20th the cost

honest saffron
#

I have a feeling 6 months down the line AI providers will decide that its time to specialize more than competing for the benchmarks

latent tiger
#

90% the results

#

GLM-5 is going to force the US labs to either lower prices or increase rates or a combo of both

honest saffron
#

After certain % its no longer improvement, it can never achieve 100%. Ideally AI shouldn't achieve 100%.

main nimbus
latent tiger
#

yeah lmao

#

MiniMax just dropped M2.5 as well

#

moggs opus

#

1/20th the cost

#

codex hanging on to top spot right now by very thin margins

ivory zodiac
#

80.2 on SWE-V

#

thats not mogging

#

but for the price yeah

main nimbus
latent tiger
#

price moggs**

ivory zodiac
#

Opus is better at many other things though so that's the benefit of opus

#

opus is good at literally everything

#

you know where else m2.5 mogs?

#

SIZE

#

460gb full weights

#

its relatively tiny

latent tiger
#

i feel bad for elon man

honest saffron
#

Open weight models are going to be slow, they are not fighting the markets and open weight models need improvements on all levels including not having any hardware to run it instead of causing global hardware shortage 😂

ivory zodiac
#

me too honestly

latent tiger
#

grok 4.20 release within a week apparently has a lot to live up to im worried it's gonna be trash

spice stone
#

Wsp

ivory zodiac
#

still too small 😂

spice stone
#

How's everyone?

ivory zodiac
#

192gb vram and its still not remotely enough

main nimbus
ivory zodiac
#

4.2 is DOA

#

its literally dead

#

elon himself admits its not a strong coding model

#

no one outside of x is going to use it

latent tiger
honest saffron
ivory zodiac
#

Grok 5 or 6 is their only hope

ivory zodiac
#

apple mlx is what runs slow

#

but the problem is intelligence

#

going to have to run reap or quantized versions

#

or both

#

and smaller context windows
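Rough sketch of the arithmetic behind that, using only the numbers from this chat (460gb full weights, 192gb vram). The function name and the `headroom_gb` parameter are made up for illustration, and this ignores everything except raw weight size, so treat it as a back-of-envelope estimate rather than a sizing guide:

```python
def required_shrink(weights_gb: float, vram_gb: float, headroom_gb: float = 0.0) -> float:
    """Fraction of the original weight size that fits in the available VRAM.

    headroom_gb is a hypothetical reserve for KV cache, activations, etc.
    """
    usable_gb = vram_gb - headroom_gb
    if usable_gb <= 0:
        raise ValueError("no VRAM left for weights after headroom")
    # Cap at 1.0: a model that already fits needs no shrinking.
    return min(1.0, usable_gb / weights_gb)


# Figures quoted in the chat: 460 GB of weights, 192 GB of VRAM.
ratio = required_shrink(460, 192)
print(f"weights must shrink to about {ratio:.0%} of full size")
```

Whatever combo of pruning and quantization you pick has to get the weights under that fraction, and any headroom you reserve for context pushes it lower.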

#

i may have misunderstood your point there actually

#

but yeah. we'll see lol

#

i think that smaller models are going to get better.

#

i'd rather have the hardware now than see it go up 50% in cost.

#

or more.

honest saffron
#

User: Hi
LLM: Let me think what to respond to hi before I send a response
......
LLM: Hi, How can I help you today?
User: How many rs in strawberry?
Error! Ran out of context window

ivory zodiac
#

😂

#

also VLMs are small and i have some ideas for those

#

i'm not worried. in a shortage, it'll be the easiest thing in the world to sell.

latent tiger
#

Minimax-M2.5

SWE-Bench Verified: 80.2%
Multi-SWE-Bench: 51.3%
BrowseComp: 76.3%

#

half the price of GLM-5

ivory zodiac
#

its also 33% smaller

boreal holly
# ivory zodiac apple mlx is what runs slow

What are you talking about? local inference is face melting on MLX. If you're using a brand new model then the fused kernels might not exist yet, and I've had to write my own fused kernels to get certain models to work, but it's far from slow

ivory zodiac
#

but apple is still great

#

not hating. i just took a different path

latent tiger
ivory zodiac
#

its getting better too, mlx.

latent tiger
#

china is mogging with the acceleration ngl

ivory zodiac
#

that is a wild chart

terse kraken
#

Is minimax free

latent tiger
ivory zodiac
#

not typically but you can usually find it in things like kilo code and such

boreal holly
ivory zodiac
#

the coding plan is $10

honest saffron
#

M2 be like hello there

terse kraken
latent tiger
# terse kraken Is minimax free

Optimized thinking efficiency + 100 tps to achieve 3x faster than opus

Priced at $0.3/M in / $1.2 out (20x cheaper than Opus 4.6)
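To put those rates in per-request terms, a minimal sketch (the rates are the ones quoted above; the function name and the token counts are just made-up examples):

```python
from decimal import Decimal

# Rates quoted in the message above, in USD per million tokens.
INPUT_RATE = Decimal("0.3")
OUTPUT_RATE = Decimal("1.2")


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """USD cost of one request at the quoted per-million-token rates."""
    million = Decimal(1_000_000)
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / million


# Hypothetical agent turn: 200k tokens in, 20k tokens out.
print(request_cost(200_000, 20_000))
```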

ivory zodiac
#

probably like 2 20 plans

ivory zodiac
#

its good though

honest saffron
#

probably better than Claude $100 plan considering people run out after asking a bunch of things in Opus 4.6

ivory zodiac
#

everything is a tradeoff

#

apple gets you vram and power so much cheaper.

#

i got a free rtx 6000 or i'd have gone that route too.

honest saffron
#

Codex should have $100 plan, it will generate more revenue than Go plan.

latent tiger
#

lmfaoooo

honest saffron
nimble terrace
#

has anyone made a 2nd cursor that u can invoke or give tasks like "Open spotify and surprise me with a song"

#

idk i'm thinking of making that

wooden halo
ivory zodiac
nimble terrace
#

is this a fire idea

#

or am i burning the bed

honest saffron
lean lark
#

Anyone else getting "Error creating task" from VSCode extension? I can execute a prompt from the sidebar but not from a separate "New Codex Agent" panel.

#

v0.99

simple star
#

I couldnt resist it

honest saffron
#

There are only 3 UIs that Codex can make and we all know exactly what all 3 UIs will look like.

toxic torrent
ivory zodiac
#

is the m4 pro notably better than the base m4 if you aren't using it for local llms?

#

i am thinking about returning my mini and getting the pro

honest saffron
#

Depends on how many cores you are going for

boreal holly
# ivory zodiac yes i'm aware but its not the same performance.

BTW, I've been working with the Apple engineers for the MLX project and we discovered a major source of performance issues was the Metal residency set not wiring down model weights, kv cache, or attention workspace during inference. I've patched these issues myself, in some cases inference speed increased by 40x. The hardware might not have the same TFLOPs as Nvidia but with wired down memory and zero copying across memory it's going to perform almost as well as datacenters

ivory zodiac
#

best of luck to you on that.

honest saffron
ivory zodiac
#

how much are those extra cores needed for just running coding agents

#

in parallel

#

swarms

#

its going to be a dev machine.

honest saffron
#

10 would be enough, coding agents shouldn't eat up that much of anything

#

I run my coding agents and servers on a 2017 PC

ivory zodiac
#

you'd be surprised claude can crash my 5900x with 32gb ram and 3090

#

might be a ghostyy bug unsure

#

but if i run too many at once, it can crash

#

i have a new machine i'm building but for now this is what i'm using

terse kraken
#

Finally pro users get some love

honest saffron
#

Claude eats up if you are on the same CLI forever, vs code extensions are literally cancer

ivory zodiac
#

but i am missing some of the macos apps.

#

i use cursor like 1x a month and thats it

#

i burn my Pro account and dont touch it again lol

honest saffron
#

codex is a bit forgiving compared to claude cli not because its built better but probably because of rust

boreal holly
ivory zodiac
#

yeah i am getting the 24gb for sure.

#

just not sure if i get pro or non pro 24

#

realistically i'm waiting on m5 stuff... but who knows how long

#

or what it'll cost.

#

so i'm trying not to go crazy right now but also, its on sale. so i'm conflicted.

honest saffron
#

M5 pro is out right?

ivory zodiac
#

only laptops afaik

#

but yeah

#

desktops are coming this year

boreal holly
ivory zodiac
honest saffron
#

You dont need mac pro, don't waste money just burn it on AI 😛

latent tiger
#

AI optimal tools for February 2026:

  • 5.3 Codex xHigh fast for everything coding
  • MiniMax M2.5 for everything else
#

no need to complicate it

#

will update after deepseek release in a few days

honest saffron
latent tiger
#

if OAI drops codex 5.3 max for pro users today i dont think anything will pass it this month

boreal holly
ivory zodiac
#

yeah i have done some compiling, mostly rust

#

i think i'll take your advice

#

get the 24 non pro

#

cargo builds can crash my current machine too lol.

boreal holly
ivory zodiac
#

thanks brosef.

#

saved me $400

#

one thing about apple products

#

they hold value like nothing else.

#

easy to pivot

versed fjord
#

To be honest there were 2 or 3 examples where it actually got a decent style, but most of them are barely different from the default interface...

slow flax
#

Curious what workflow you guys are using for codex, i.e. a deterministic way of developing software.

https://github.com/EveryInc/compound-engineering-plugin

I'm gonna try this this weekend in codex. I used it 2 weeks ago in claude code and opencode with claude and it worked great but it eats lots of tokens.

chrome raven
#

outttt

#

out now

lean lark
#

deterministic?? way of developing software?? aint no such thing

astral pine
slow flax
#

Maybe deterministic is the wrong word but it's got commands for each stage.

latent tiger
#

i knew it lol just a fast mode

honest saffron
astral pine
#

Wait... 1000 tps

ivory zodiac
#

whoa

thick basalt
#

how to use the spark model in cli? seems like not available yet for pro users

honest saffron
#

1000

cyan wing
#

it's here!

lean lark
astral pine
#

get your CC out... we coding today lol
128k context window
text only
but 1000 tps

slow flax
nimble terrace
#

Holy moly.

main nimbus
latent tiger
ivory zodiac
#

128k context window

nimble terrace
#

that is FAST

latent tiger
#

it's faster but less accurate

ivory zodiac
#

that has to be cerebras

nimble terrace
latent tiger
#

it is

astral pine
ivory zodiac
#

it is

nimble terrace
ivory zodiac
#

legends

cyan wing
astral pine
#

now what is the best harness to take advantage of 5.3 high planning and 5.3 spark execution and hammering commands and generating logs to debug?

nimble terrace
#

that's insane

#

makes me want to subscribe to the 200 plan

#

but i can't afford it 💔

main nimbus
#

this article isn't even promoted above codex app or 5.3. so this isn't even big news.

chrome raven
#

anyone got access yet?

nimble terrace
#

oh?

#

interesting

#

so we might see this speed on Codex 5.3 with its normal context window

#

very exciting

boreal holly
main nimbus
#

personally, a 25% increase from 5.2->5.3 was enough for me. I will not sacrifice 10-15 points of performance for speed.

nimble terrace
#

oh my god that makes sam's original tweet make more sense

#

"it sparks joy for me"

#

hahaha

lean lark
plucky halo
#

My joy has been sparked

ivory zodiac
#

does it work tho

#

test it

honest saffron
#

dude running alpha version!

still trellis
#

LOL you show that model who's boss

plucky halo
#

Yeah, it works

ivory zodiac
hushed storm
#

Shouldn’t need to

honest saffron
#

ideally yes but if it's GA the model shouldn't be restricted to the alpha release of the CLI

boreal holly
cyan wing
plucky halo
#

Yeah - I'm not convinced this has worked

#

And I put it down to this commit:

simple star
#

I still dont haveeeeee. Why not?!?!?! I want it 😄

honest saffron
simple star
#

But I chose myself!!

honest saffron
#

Denied!

simple star
#

oh wait... is it API only?!?!

honest saffron
#

No, they said ChatGPT Pro

#

Small partner API users get access too

hushed storm
#

ig

plucky halo
#

codex exec -m gpt-5.3-codex-test-youcanputanyslugandstillgetaresponse --sandbox read-only --skip-git-repo-check --json 'Reply with exactly OK and nothing else.'

simple star
#

Ah, you mean all GPT Pro AND small set of API