#codex-discussions


boreal holly
#

Oh man I know it does not look nice. I always expect the first pass to look really bland lol but 5.4 is the first model to make it their mission to fill every page with developer jargon 😂 I have never seen this before ever until 5.4

torpid trout
#

Honestly, the design itself looks way less bad than GPT usually makes
(Not referring to the "content", more like the layout/overall look)
Which is why I rest assured you already gave it a good amount of spanks to go there lol

#

Because, otherwise it would look like this

#

You will see this same style on like at least 5 other "I made" pictures in this very channel lol.

#

Same background, same radius, same colors, same font.

craggy cloud
#

last month they had an Opus fast thinking version with 9x 😂

deft sable
#

5.4-Mini is out.

In Codex, GPT‑5.4 mini is available across the Codex app, CLI, IDE extension and web. It uses only 30% of the GPT‑5.4 quota, letting developers quickly handle simpler coding tasks in Codex for about one-third the cost. Codex can also delegate to GPT‑5.4 mini subagents so that less reasoning-intensive work runs on the cheaper model.

https://openai.com/index/introducing-gpt-5-4-mini-and-nano/

slate bough
#

Gpt 5.4 codex out when

cyan wing
#

5.4 Xhigh for planning,
5.4 mini for implementation?

worldly crown
#

now 5.4 is everything

boreal holly
deft sable
cyan wing
main nimbus
#

I have been waiting for an update to 5-mini!!

olive sandal
main nimbus
#

Thought the mini line was dead

boreal holly
#

I asked for this yesterday 😂

astral pine
#

anyone know how to update the windows codex gui app?

#

It's so easy on Mac but I can't find the update button on Windows

boreal holly
exotic cave
#

You are not imagining things, Anthropic models have this "intuition", they just fall short in execution. If they play their cards right they might get on top because their models do have this side to them.

olive sandal
# boreal holly It has the same context window so that'd work

My understanding is that usage limits are based on tokens, and tokens are measured in text read and output. Since the research phase would involve the most file reading, does that mean there won't be much token savings on the input side when switching to 5.4 mini for implementation, since the file reading / research is already done? Mostly output token savings?

boreal holly
#

It's interesting that gpt-5.4-nano has higher latency than mini

orchid herald
#

If ai is ur power what are u whit out it?

boreal holly
olive sandal
#

I see, so it sounds like trying to incorporate mini into subagents to summarize file reads and also using it for implementation could save a lot on usage limits
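
A rough way to put numbers on that split, as a sketch only. It assumes quota scales linearly with the share of work handed to mini and reuses the 30% figure from the announcement quoted earlier; `blended_usage` and `mini_share` are made-up names, and real savings depend on how input and output tokens are actually metered.

```python
# Assumption: mini consumes 30% of the GPT-5.4 quota for the same work
# (figure taken from the announcement quoted in this channel).
MINI_QUOTA_FACTOR = 0.30


def blended_usage(mini_share: float, mini_factor: float = MINI_QUOTA_FACTOR) -> float:
    """Relative quota use versus running everything on the full-size model.

    mini_share is the fraction of total work (0.0 to 1.0) delegated to mini.
    A result of 1.0 means no savings; 0.30 means everything ran on mini.
    """
    if not 0.0 <= mini_share <= 1.0:
        raise ValueError("mini_share must be between 0 and 1")
    return (1.0 - mini_share) + mini_share * mini_factor
```

If research and file reading stay on the full model and only half of the work moves to mini, `blended_usage(0.5)` gives 0.65, so the saving is real but well short of the headline one-third cost.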

boreal holly
olive sandal
torpid trout
# orchid herald If ai is ur power what are u whit out it?

A person who managed to grow from a farmer's son to a programmer with 10+ years experience on the subject, a licensed boatbuilder, construction worker and forklift driver lol
If AI is all you "can"... then you have a problem

But also, its off topic 🙂

torpid trout
pure quest
potent mason
torpid trout
#

I only saw iron man while in his suit lol, does he take it off?

#

👀

#

oh wait lol

potent mason
#

Yeah he does a couple times but the main point is not that he takes it off.

Someone who thinks he's better walks up to him and asks him what he is without his suit and he proceeds to deliver some sick lines.

I don't really remember them though it's been like 5-7 years since I saw it

torpid trout
#

I misunderstood you, all clear now

scenic iron
#

anyone know how i can make money using codex

#

?

#

.

#

.

#

.

#

5.2 codex

potent mason
simple star
#

Finally someone asking the important questions

scenic iron
#

do i program

#

and where do i sell

#

i suscriptions for claude, chatgpt and cursor

#

have*

potent mason
#

Program the solution to a problem you're having (assuming there isn't a solution already)

#

Sell it to other people having that problem

scenic iron
#

hmmm

potent mason
#

Really dumb example:
If I'm having issues expanding the b2b section of my restaurant business because I don't know my clients, build something that, when a payment goes through, takes the information it has about the customer, researches them, and sees whether they are a potential b2b customer.

#

Then you sell it in a restaurant forum

scenic iron
#

hmmm

#

your lowk right

#

but aint no body paying any $ these days

#

people are becoming really broke

#

@potent mason have u built anything good w ai?

potent mason
#

My really good products are pre-ai and now that AI came I'm improving them faster.

A big portion of my business comes from b2b businesses so yeah for them we build really quickly and personalized with AI

deft gyro
#

does anyone have data on inference speeds for codex with 5.4 mini

#

anybody using it, how does it feel?

wheat yoke
#

{"detail":"Bad Request"}

#

Hmmmmmmm

#

well to be fair this thread has been going on for a day nonstop using it

drifting granite
pure quest
pure quest
eager rampart
#

for me gpt-5.4-mini has not been working very well in codex framework, it consistently gets confused when asked to do a task that requires using multiple skills (skill 1 --> get the data, skill 2 --> format the data into an excel file). It erroneously claims that it doesn't have access to skill 1's tool, even though when pressed it admits it does have access to it. And gpt-5.4 never has any issues with this task

nocturne folio
#

closest ive gotten to wasting my usage

regal trail
#

what is the difference between 5 hour and weekly usage limit

neon girder
#

one resets every 5 hours and one resets weekly

regal trail
#

oh nvm I see now

pure quest
worldly crown
winged depot
torpid trout
# eager rampart for me gpt-5.4-mini has not been working very well in codex framework, it consis...

Unfortunately that is just a gpt 5.4 issue
I literally have it produce evidence tracks in the meantime, as in exact evidence it has to add to a JSON file with the goal, the lines of code (file:number) that implement the goal, and so on
Extremely stupid, but if I don't force it to prove it worked, it literally just does whatever it wants, especially when plans become large. But also when they're small, because then it just assumes "this is an easy one, keep it simple, stupid"
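
For anyone wanting to try the same trick, here is a minimal sketch of a checker for such an evidence file. The field names `goal` and `lines` and the `file:number` format come from the description above; the surrounding layout (a JSON list of objects) and the function names are assumptions, not the poster's actual setup.

```python
import json
import re
from pathlib import Path

# Assumed schema: [{"goal": "...", "lines": ["src/app.py:42", ...]}, ...]
REF_PATTERN = re.compile(r"^(?P<path>.+):(?P<line>\d+)$")


def check_evidence(entries, root="."):
    """Return a list of problems; an empty list means every reference resolves."""
    problems = []
    for index, entry in enumerate(entries):
        refs = entry.get("lines") or []
        if not entry.get("goal"):
            problems.append(f"entry {index}: missing goal")
        if not refs:
            problems.append(f"entry {index}: no evidence lines given")
        for ref in refs:
            match = REF_PATTERN.match(ref)
            if match is None:
                problems.append(f"entry {index}: malformed reference {ref!r}")
                continue
            path = Path(root) / match["path"]
            if not path.is_file():
                problems.append(f"entry {index}: {match['path']} does not exist")
                continue
            text = path.read_text(encoding="utf-8", errors="replace")
            line_count = len(text.splitlines())
            if not 1 <= int(match["line"]) <= line_count:
                problems.append(f"entry {index}: {ref} is outside the file (1..{line_count})")
    return problems


def check_evidence_file(evidence_path, root="."):
    """Load the JSON file the agent wrote and validate every entry in it."""
    entries = json.loads(Path(evidence_path).read_text(encoding="utf-8"))
    return check_evidence(entries, root=root)
```

This only proves the cited lines exist, not that they implement the goal, so it complements a review step rather than replacing it.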

#

It has honestly become a bit hilarious. Not much different than when you work with humans
"It's done sir, test sir"
Opens website and the first thing sir sees is the font is still like 3 iterations back...

vital pollen
#

People on GitHub found out what was causing that nasty usage bug: it appears that Codex was searching previous conversations for information related to the request... lol

#

So now you have to delete your archive if you don't want it to use a lot

cyan gyro
#

or maybe openai fixes this and does another reset

lean lark
#

Can we hook the process that names the current Codex task? It derives the task title based on the prompt but often gets it wrong, requiring (if I care) an edit of the task name. It would be much better if we can deterministically set the task name, maybe just by specifying it at the top of the prompt - but I haven't had much success with that either. TY

boreal holly
#

Would be cool if entering iddqd in the chat turned on danger-full-access one time for only that turn 😏 jk

lean lark
#

I have a CLI "hey do this" which runs without full access, but if I execute with an exclamation mark it uses full access: "hey do this!"
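
The poster didn't share the implementation, so this is only a guess at how such a wrapper could look. It assumes the Codex CLI accepts `codex exec --sandbox <mode>` with `workspace-write` and `danger-full-access` as modes (worth checking against `codex --help`), and that the trailing `!` is stripped before the prompt is passed on; `build_command` is a made-up name.

```python
def build_command(request: str) -> list[str]:
    """Map a 'hey' request to an argv list; a trailing '!' opts in to full access.

    The subcommand and flag values below are assumptions about the Codex CLI,
    not something confirmed in this thread.
    """
    text = request.strip()
    full_access = text.endswith("!")
    if full_access:
        # Drop the marker so the model does not see it as part of the task.
        text = text[:-1].rstrip()
    sandbox = "danger-full-access" if full_access else "workspace-write"
    return ["codex", "exec", "--sandbox", sandbox, text]
```

Returning the argv list instead of running it keeps the escalation rule easy to test; a real wrapper would hand the list to `subprocess.run`.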

nocturne folio
boreal holly
boreal holly
#

Holy smokes dude, 5.4-mini is amazing! It feels closer to a Codex model

twin maple
#

have you noticed what difference it makes in terms of token usage?

boreal holly
#

With this setup, idk if a 2nd Pro account is necessary, and when the 2x til April 2nd promo ends I think it wont matter that much. 5.4-mini is absolutely killing it

twin maple
#

that's very interesting that it's intrinsically faster and getting very similar results. That's got to be better for rapidly iterating on designs

boreal holly
#

I haven't even noticed a performance/accuracy issue. I mean it appears to be as accurate in most cases as 5.4, and the times when it's not, the full size model reviews their work and has em fix it. Seems like the best combo is 5.4 medium for planning/orchestration and 5.4-mini high for impl

rigid totem
#

Sadly no
I only got the confirmation, but no shipping notification

white furnace
#

anyone here that purchased codex credits for 40€? how many messages does that equal?

south latch
#

give us gpt spark

#

not only for pro 🙁

nocturne folio
#

o4-mini was the only useable mini model openai released

torpid trout
twin maple
#

My GPT5.4 with High reasoning tasks tend to consume about 150 credits per hour, in case that helps. I've actually been wondering if it'd be more cost effective to use the API for stuff like that but haven't really followed that up yet.

GPT5.4 with Medium reasoning probably more like 105ish credits per hour.

visual coral
boreal holly
#

oh... my... goodness I freaking love gpt-5.4-mini. I don't even know if I wanna use the full-size anymore

torpid trout
calm aurora
#

I wanted to share a skill I use

#

• Create a skill to optimize the use of assistants or coding agents, reducing cost, latency, and duplicated work without sacrificing
quality. Always prioritize deterministic operations before invoking expensive models.

The skill should:

  • Normalize each request into a simple structure with intent, scope, target, action, and constraints.
  • Classify the workload as single, batch, heavy, or diagnostic.
  • Apply an efficiency policy:
    • single: reuse context and cache before redoing work.
    • batch: group similar tasks by project or time window.
    • heavy: split work into subtasks and parallelize them.
    • diagnostic: measure before changing anything.
  • Route by cost/quality:
    • local tools or deterministic logic first,
    • mid-tier model second,
    • premium model only for critical tasks or when confidence is low.
  • Record operational metrics and provide actionable recommendations.

The output should always include:

  • queue
  • success_rate
  • avg_duration_ms
  • cache, warning, and success events
  • estimated cost trend
  • top waste factors
  • the next 3 prioritized actions with expected impact

Add these heuristics:

  • If duplicate prompts exceed 15%, expand cache/context reuse.
  • If there are multiple rate-limit events per hour, reduce concurrency and enable fallback routing.
  • If the queue grows while quality remains stable, increase batching.
  • If quality drops, reduce batching and raise model quality only for critical tasks.

Include these security rules:

  • Never expose secrets.
  • Report credential states only as OK, MISSING, or INVALID.
  • Redact prompts and logs if they contain sensitive data.
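
As an illustration of what the heuristics and routing rules above could reduce to, here is a small sketch. The thresholds for duplicate prompts (15%) and the rule order come from the skill text; the `Metrics` fields, the `low_confidence` cutoff of 0.6, and reading "multiple" as "more than one" are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    duplicate_prompt_ratio: float    # share of prompts repeating an earlier one (0.0 to 1.0)
    rate_limit_events_per_hour: int
    queue_growing: bool
    quality: str                     # "stable" or "dropping" (assumed labels)


def recommend(metrics: Metrics) -> list[str]:
    """Apply the four heuristics and return at most three prioritized actions."""
    actions = []
    if metrics.duplicate_prompt_ratio > 0.15:
        actions.append("expand cache/context reuse")
    if metrics.rate_limit_events_per_hour > 1:  # "multiple" read as more than one
        actions.append("reduce concurrency and enable fallback routing")
    if metrics.queue_growing and metrics.quality == "stable":
        actions.append("increase batching")
    if metrics.quality == "dropping":
        actions.append("reduce batching; raise model quality for critical tasks only")
    return actions[:3]


def route(deterministic: bool, critical: bool, confidence: float,
          low_confidence: float = 0.6) -> str:
    """Pick the cheapest tier allowed by the cost/quality routing rule."""
    if deterministic:
        return "local"        # local tools or deterministic logic first
    if critical or confidence < low_confidence:
        return "premium"      # only for critical tasks or low confidence
    return "mid-tier"
```

Keeping the rules as plain functions matches the skill's own advice: the routing decision itself is deterministic and never needs a model call.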
#

Hi @torpid trout

#

I was looking for you the other day

boreal holly
# toxic torrent for what

nvm, 5.4 is still an irreplaceable orchestrator/planner. 5.4-mini is just absolutely killing it with implementation right now.

glad wind
#

Every time I've used subagents to explore code the main guy just goes and explores it himself as well

lean lark
#

In my work today I've been a bit annoyed with 5.4/medium. It's being a bit dumb about things that I wouldn't expect.
Example, a document has sections which describe a task. One of the sections is for Discussion, where I discuss the task with the assistant before coding. All instructions point to discussions in that section, with alternating "Codex:">"Starbuck" / "Codex:">"Starbuck" etc.
The model has chosen to put some responses at the bottom of the file after other sections, outside of the Discussion section. Its reasoning is something like "you said to put discussion after other comments, but you didn't say exactly 'where' under the other comments..."
That's downright dumb, I've not had to babysit prior models with painfully explicit instructions. ChatGPT 4-early5 has frequently asked excruciatingly detailed questions when not necessary. I didn't expect it here. 😿

calm aurora
#

@torpid trout remember the conversations about public training?

lean lark
# boreal holly nvm, 5.4 is still an irreplaceable orchestrator/planner. 5.4-mini is just absolu...

I don't understand that, given the specs show lower code-related performance than base 5.4. Sure, "killing it" on simpler tasks is fine. But if the task is more difficult and the mini model didn't actually kill it, then we need to crank it to 11 and re-do or fix the now, um, "undead" "it". 🧟‍♂️
For those who don't check generated code, they won't know whether mini has killed it or not, there might be some lazy coding that looks fine in the UI but doesn't do any error handling or misses obvious edge cases.
I guess, as always, we need to see where it fits and doesn't. YMMV

rugged crater
#

Hello

lean lark
#

Oy vey, and we have 5.4-mini : low/med/high/xhigh ... tune yourself into oblivion. 😆

cyan wing
#

lots of Codex "Working" text hanging for 10+ minutes, then i have to reset it..
TUI input frozen, can't press esc or Ctrl + C to cancel

potent mason
#

Is your wifi good? Only reason I can think of is that following http requests aren't working

torpid trout
calm aurora
#

This discord is a study center

#

Our interactions are maybe being used to feed the models

lean lark
#

🙄

frosty zealot
#

@boreal holly finally made it to PR

potent mason
#

Holy

#

I feel bad for whoever has to check that PR

#

Should've honestly made it a stacked PR

frosty zealot
#

its just a personal project so im not too concerned lol

vital shoal
#

Just curious. With the addition of gpt 5.4 mini, how do i make sure that subagents use my bigger model?

Like, last time when i set to gpt 5.4 xhigh, im sure the subagents all use that too.
But now, how can i be sure thats the case? I dont wanna use the smaller models.

cobalt junco
#

who says codex sucks at front end design

nocturne folio
pale ruin
cobalt junco
#

😔

#

I thought it looked decent

cedar skiff
#

In what cases do you ever just want an llm to go free for all and make all the design choices?

vital shoal
cedar skiff
#

I cant think of any cases where design matters that you would ever do that

boreal holly
frosty zealot
#

How’s the experience with -mini?

boreal holly
boreal holly
frosty zealot
cedar skiff
#

I'm getting weaker performance from 5.4 and 5.3 codex atm, anyone else noticing this?

#

Rules and skills that are usually enforced are being ignored

sharp tendon
sly pewter
#

basically codex is only showing reds not greens.

might not be a huge problem but putting it out there

cedar skiff
#

those look like deleted files

sly pewter
#

its been acting weird in showing transcripts of what changed

these are not deleted haha

cedar skiff
#

Something is a little off atm.

sly pewter
#

still works

codex is cool

cedar skiff
#

small changes in system prompt and models can have large sweeping consequences for work flows. Model idiosyncrasies change in an instant.

magic orbit
#

Someone in this discord server got exposed for being a pdf in a YouTube video

sly pewter
#

dude what does this mean for being a pdf

#

is a pdf talking to us?

#

lol

#

hahahha

#

hello i am .pdf I can speak to you liek i am a human

magic orbit
cedar skiff
#

just link it

winged depot
frosty zealot
#

Crazy stuff if it's true

cedar skiff
#

without a link to the video could just be ai fodder, link the video

frosty zealot
#

Kinda what I was thinking lol

frosty zealot
#

Gotta love those free resets last week

warped pine
frosty zealot
#

2 Pro's, but they also reset usage in the middle of the week a handful of times lol

warped pine
#

it's definitely not possible to use 269b tokens in a month on just two pros

frosty zealot
#

Tell it to the police

warped pine
#

?

#

nah im just interested

frosty zealot
#

I got reset at like 30% 2-3 times in the middle of the week

#

I'm not sure, I don't even know what the allowed tokens are from 100-0% currently with the 2x, its just what codexbar is showing me

ivory zodiac
#

Hey guys, what are your honest thoughts on ChatGPT 5.4 Mini so far?
I've been testing it out in Codex. The low quota consumption is nice, but honestly, it’s giving me the worst flashbacks to what vibecoding felt like about a year and a half ago.
The model just feels incredibly dumb. It’s not that it makes glaring errors or breaks the syntax, but the actual reasoning and initiative just aren't there.
Is it just me, or is anyone else getting the same vibe?

plucky halo
modern hearth
#

I solved my sandboxing issue on Windows with the not-so-reassuring --yolo from the CLI, but how do I prevent the VS Code extension from doing this?

eternal quartz
#

Did they nuke 5.2?

velvet wren
twilit bluff
#

just use gpt-5.4 mini 🙏

chrome raven
twilit bluff
#

it's like gpt 5.3 xhigh/high based on my exp

chrome raven
#

happy limit reset day to everyone 😁

velvet wren
#

I have a new project, might try 5.4 mini

torpid trout
#

One day someone has got to explain to me how come model x performs „amazing“ for one person and „trash“ for another
These things shouldn’t be subjective

#

For example a person here said gpt 5.4 on copilot is a beast. And I tested it and ran away scared at the awful output lol, it performed worse than on bare codex.

Or one says gpt 5.4 mini is amazing, yet for another it makes syntax errors. This can’t be a skills issue. No one will tell a skill to make syntax errors.

#

Are we not getting the same models? I mean… CDN routing and all that…

warm pilot
cedar skiff
calm aurora
#

A/B testing (or split testing) is a controlled experiment comparing two versions (A and B) of a digital asset—such as webpages, emails, or ads—to determine which performs better based on metrics like conversion rates. Users are randomly shown one version, and statistical analysis determines the winner.
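
For the "statistical analysis determines the winner" part, one common choice is a two-proportion z-test on conversion counts. The sketch below is a generic textbook version, not a claim about what any provider actually runs; the function and parameter names are illustrative.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two_sided_p_value) for the difference in conversion rate B minus A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if std_err == 0.0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

A small p-value (0.05 is the usual convention) suggests the difference between A and B is unlikely to be chance alone; with identical rates the function returns z = 0 and p = 1.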

exotic cave
# calm aurora A/B tests maybe?

I am interested though how do they do the statistics in relation to LLM agents, if one doesn't know they are being A/B tested and doesn't inform them? Do they go through the chat and then infer from that?

calm aurora
#

you can capture other metrics, not necessarily directly through the chat

#

like quantity of inputs, outputs, time of reasoning, costs, etc

#

also, it's a hypothesis. I don't really know if they do it, but it would be reasonable to do it for improving the model. Some people don't have the "Not train the model" option, so there is that also.

calm aurora
#

we should analyze the terms and conditions and see what data they use and what not

#

but if you have activated the "train the model" option, they should be using some of the chat data

sand shuttle
#

i cant open files in codex? i mean like an ide where i can view and edit code

cedar parcel
# torpid trout For example a person here said gpt 5.4 on copilot is a beast. And I tested and r...

Yeah it's very interesting and I think there's a wide range of answers. If you worked with Claude on front end work and then try to do that with Codex, the model seems incredibly dumb. And vice versa for backend jobs. Then there are people who pin their hopes for the future on this new technology, so they get super defensive about criticism. They're the first to yell it's a prompting issue. But then, in many cases it really is a prompting issue. I think other times the model does something really great on a whimsical idea, making people spontaneously post praise, but then later when the model starts to make mistakes they don't post that part. Then for sure DevOps and connection issues vary performance regionally. And then people use lower reasoning levels and declare it sucks. Or really low reasoning on very small specific tasks and they say it's great.

Maybe the biggest factor tho is that these models just behave differently by the week for whatever myriad of reasons. For sure they seem to shine in week 1 and then it's mostly downhill from there. So inconsistent performance to some degree should cause inconsistent reviews.

blissful basin
#

ohh boy... might manage to use full codex pro sub weekly limit for a first time 😄

calm aurora
#

from what countries are u people?

velvet wren
jade crater
#

I use Windows with WSL and Codex CLI. I'm trying to use the Codex app but it can't access my skills in WSL. Is it a problem? Can I set this?

calm aurora
# velvet wren Australia

what language rules do you use? is there a public page that governs the language used in Australia?

#

I am building a semantic relationer

calm aurora
#

thanks

torpid trout
#

But the real one. The one that is different in each canton lol

calm aurora
#

I am starting with spanish extracting a 9999 pages pdf

#

hahaha

celest stag
# cedar parcel Yeah it's very interesting and I think it's a wide range of answers. If you work...

Yeah, consistent performance has always been an issue with LLMs. I have warmed up to Codex recently for larger tasks. In the previous year the most I would be willing to let AI do is a function here and there and code auto complete. 5.4 on Extra High for complex tasks has been great for the most part. I was trying to refactor a project to split the 'installer' logic away from the business logic so the installer framework was reusable. Even using plan mode it will occasionally just say 'nah screw this, I know a better way'-

#

After completing the plan it mentioned nothing about this 'shortcut' which completely undermined the whole purpose of the refactor I was doing, until I discovered it reviewing the code and asked about it

cedar parcel
# celest stag After completing the plan it mentioned nothing about this 'shortcut' which compl...

This might be related to what I found to be a system prompt issue. I made a custom profile and described it here, which solved that issue of taking shortcuts and not reading skills/agents/linked instruction files, etc. You can have a look at it and see if it helps. Note this is configured for complex codebases and sets xhigh as default for everything; you can make your own based off it. https://gitlab.vialogos.dev/vialogos/codex-strict-profile

celest stag
#

Interesting, i'll look it over, thanks

sand shuttle
#

do long chat lengths affect token usage and performance?

rocky fog
#

planning any limits reset? 🤣

boreal holly
boreal holly
# torpid trout For example a person here said gpt 5.4 on copilot is a beast. And I tested and r...

Syntax errors really aren't the model's fault. The project needs to be set up to automatically execute static code analysis and linting, or the agent instructed to run these at the end of every turn.

5.4-mini makes syntax errors sometimes, but I hardly ever know about it because I have a script that runs on pre-commit that identifies all syntax errors and sends it right back to the agent to fix. What's important to me is an agent that eventually meets the completion criteria. My biggest gripe with the GPT models is they oftentimes exaggerate how complete their job is, but they have two layers of mandatory code review, one that catches logical bugs after static code analysis passes and the other catches completion criteria after logical bugs. Once they pass all 3 layers of validation the end result is it builds and runs precisely as it was described without errors.
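
The first validation layer could look roughly like this. It is a sketch under assumptions: the project is Python, the hook only checks staged `.py` files, and the output is simply printed for the agent to read. The poster's actual script and language weren't shared, and the function names here are made up.

```python
import subprocess
import sys
from pathlib import Path


def staged_python_files() -> list[str]:
    """List added/copied/modified .py files in the git index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [path for path in result.stdout.splitlines() if path.endswith(".py")]


def find_syntax_errors(paths) -> list[str]:
    """Compile each file without running it and collect 'path:line: message' strings."""
    errors = []
    for path in paths:
        source = Path(path).read_text(encoding="utf-8")
        try:
            # Note: this checks the working-tree copy, not the staged blob.
            compile(source, str(path), "exec")
        except SyntaxError as exc:
            errors.append(f"{path}:{exc.lineno}: {exc.msg}")
    return errors


def main() -> int:
    errors = find_syntax_errors(staged_python_files())
    for line in errors:
        print(line, file=sys.stderr)
    # A non-zero exit code makes git abort the commit.
    return 1 if errors else 0
```

Saved as a script that ends with `sys.exit(main())` and called from `.git/hooks/pre-commit`, a failed commit leaves the error list on stderr, which is what gets sent back to the agent.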

I like 5.4-mini because it seems to take less than half the time to reach the finish line and the end result is as good as the full-size model. And it appears to consume 3x less quota and about the same number of passes. That's a huge win in my book

simple star
#

I am speechless. I have made an empty folder, gone inside and gave him a prompt:

$remotion-best-practices Make a video of a people using a terminal for random stuff

#

You see it in motion, and you wouldnt believe it

#

This is NUTS

#

oh, I can

lucid mason
# celest stag After completing the plan it mentioned nothing about this 'shortcut' which compl...

Seems like it needs some time to slack sometimes. I already posted this in the other channel, but yesterday I found this paragraph buried inside the 5.4 thinking log between two normal working entries. I was using the ChatGPT app at that moment, though, as my weekly rate is currently at 0%.

It never mentioned anything in the chat and just kept going, but I suspect that was a funny but intransparent way of telling me that either the context window was compacted or it switched to another model/mode under the hood.

Considering a break
I’m feeling like I need a bit of a mental escape. Maybe I’ve been overthinking things, or it’s just one of those moments. A quick reset could help me refocus and come back with clearer ideas. Time to step away and come back to this refreshed!

#

As said, this was just one random paragraph inside a long activity log. If these kind of things happen mid-single-prompt work then I wonder how that affects the outcome and especially change in behavior/precision before and after (all without the user knowing).

iron coyote
#

is codex actually free rn?

boreal holly
boreal holly
simple star
#

Yes

#

@boreal holly I just made another one

boreal holly
#

That is absolute insanity

cobalt junco
simple star
#

I just told him now "Make a video for a promotional video, to show how invoices are made"

#

It's literally taking the React components of my frontend, and using them, to create the video

cobalt junco
#

Wait thats so sick

simple star
#

By the way fellas, week reset in 2 hours... turn on those /fast right away 😄

boreal holly
#

its beautiful

simple star
#

Absolutely bonkers

#

I think Im finally grasping the whole "skills" thing

lucid mason
whole gate
#

Codex has been doing fantastic work for me lately, it even comes up with some really insightful enhancements to my prompts regularly. Appreciate the work. I haven't really noticed anything specifically different with 5.4, but I also provide very specific prompts so there's not a lot of room for ambiguity, but, anyway, years of development in weeks. What a time to be alive!

lucid mason
#

How is everyone's experience with different programming languages, especially converting source from one to another? I am only using Codex for vibe coding in AutoHotkey v1.1, but even just converting that into AHK v2 source failed miserably.

flat scaffold
#

Hello there

#

Is there a problem with limits? I am on the Pro plan and I just (for the first time in 3 months) hit the weekly limit, while my usage has not changed that much

torpid trout
turbid axle
#

what the hell happened to rate limits? It takes me just 1 day to eat through what would take me all week

turbid mortar
#

ohh here we go again... stream disconnected before completion: An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com.....

ashen shard
#

I have a question. Why is OpenAI not developing the Codex IDE for Windows and Linux?

simple star
#

@ashen shard It is available for windows

#

As for Linux... bah! Who develops on Linux?! /s

ashen shard
#

I don't want a VS Code extension. A code IDE like Windsurf

waxen bramble
rocky fog
spiral gorge
#

The first pass was too broad for a clean patch. I’m splitting the edits into smaller chunks now so I can land the structural JS changes safely without trampling unrelated logic.

do i need to live with that?

flat scaffold
#

Do you guys have a way to check the token count per request like Claude Code does? That is a super useful feature to know whether you are burning tokens.

calm aurora
#

I shared a skill further up. You have to use the CLI, ask Codex to look up the rate limits, and on every run you can compare before and after

lean lark
#

I need to get a handle on Codex agentic workflow - multi-agent processing by default in Codex. There are modules that should not be run concurrently, and instructions direct the assistant not to run in parallel, but it frequently does so anyway. I need to direct instructions to the agentic flow and get a handle on when and how Codex spawns its own agents.
This might actually be "subagents". If I can direct the assistant to launch a specific subagent to run a process, and that agent checks for an existing run, it can exit gracefully to not interrupt its sibling.
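
The "check for an existing run and exit gracefully" idea can be sketched with an exclusive lock file. This is a generic pattern, not a Codex feature; `single_run` and the lock path are hypothetical names, and a crashed process would leave a stale lock that needs separate cleanup.

```python
import os
from contextlib import contextmanager


@contextmanager
def single_run(lock_path):
    """Yield True if this process owns the lock, False if a sibling already holds it."""
    try:
        # O_EXCL makes creation atomic: it fails if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        yield True
    finally:
        # Only the owner removes the lock, even if the guarded work raises.
        os.remove(lock_path)
```

A validation script would wrap its work in `with single_run("validate.lock") as owned:` and return early when `owned` is false, so a second agent that ignores the "no parallel runs" directive backs off instead of interrupting its sibling.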

I have not yet done my homework on how these components work. Does anyone have a good link for details?

TYVM

calm aurora
#

You should use one line of command or things will get messy. Never ask different agents to work in the same location

#

One orchestrator

#

Or with the same function

lean lark
#

Tito, I'm asking it to validate some generated documents. To be efficient it's taking the initiative to run validation on multiple documents simultaneously. That's not valid in this environment.

#

This is yet another example of where it's increasingly considering directives to be suggestions ... not cool.

calm aurora
#

Do you plan before implementing?

#

Or you go directly giving commands?

lean lark
#

If you look around, I'm a mature developer, very experienced with the tools. I don't do anything in a haphazard fashion.

calm aurora
#

It seems subagents are mainly for parallel work, but I can’t help any further. You probably know more than me

lean lark
#

Thanks for looking around.

calm aurora
#

Chatgpt says: The official Codex docs now explicitly cover subagents and multi-agent workflows, including parallel execution and custom agents. I’d start here: the Subagents guide, the Subagents concepts page, and the Codex prompting guide for notes on parallel_tool_calls. If some modules share state or must run strictly in sequence, they’re probably poor candidates for parallel subagents unless you isolate state or add your own orchestration layer/checks.

#

Maybe can be helpful

lean lark
#

Note the link above. That is the subagents guide. 🙂

calm aurora
#

Oh, I didn’t see it

#

You are right

lean lark
#

Also note that I did direct the assistant not to use parallel processing but it does anyway. That's why I want to understand how it's working, so that I can get more control.

#

I've found a lack in my knowledge - I'm striving to fill the gap.

exotic cave
#

Some weird things happening with codex cli 0.115.0 at times it doesn't work to completion for some reason...

calm aurora
#

Maybe it's giving itself an implicit prompt to work in parallel, and your explicit prompt generates a conflict?

lean lark
#

If it follows the explicit prompt it won't create a conflicting (implicit) scenario.

#

To be clear....

#

I have AGENTS.md refer to a docs folder which contains /procedures and /project. It's directed to follow procedures in order to approach specific tasks. I did this before skills and subagents. It's a known fact that the model will (should!) follow AGENTS.md as a "prescriptive" set of directives ... it must follow those directives.

#

However, it's also known that it's flaky about this, so we need to carefully craft the prompt in that file to emphasize things that are important.

#

When AGENTS.md refers to other folders or files, it's in a "descriptive" mode: it considers everything outside of the AGENTS hierarchy as information, a suggestion, a guide to how things should work. We can't rely on the model to follow directives in common doc files.

#

I have moved directives about parallel processing into AGENTS.md for this reason.

#

However, Codex is now initiating parallel processes, and I don't know what directives guide that process, or the subagents that it spawns.

#

That's why I want to get a better handle on this. I need to revise my system to the new feature set. I need an education in the new agents/subagents management to do this.

#

... End of line.

#

(Quick! Which movie is that from?!)

winged ore
#

I tried sub agents today on Codex, it is pretty nice

#

I don't think it can do anything more than it would normally, but it for sure did complete the task faster

#

also uses a lot more tokens, but makes sense since it is basically doing the same amount of work in a smaller amount of time..

cyan wing
lean lark
#

Is it any different than a common app that uses an orchestrator to kick off new agent objects?

#

Ouch 🤕 that seems to be abusive use of the tools. If anything, go for high for planning and medium for implementation. Right now you're kinda using a genius to plan and a dummy to implement. You can cut back on both sides. 🙂

exotic cave
#

gpt-5.4 is acting weird for me today. Anyone else having the same?

lean lark
#

can you be more specific?

exotic cave
#

Stops midway, forgetting AGENTS.md rules, being lazy (which is harder to explain, it's like at times its reasoning is gutted).

robust lintel
#

I love when Codex is like "Let me show you a plan you can implement over the next few weeks to get this done" as if my next input isn't "Do it now"

lean lark
#

Seriously LOL

exotic cave
robust lintel
#

It will show you a little line break saying collapsing

#

IIRC it means it is taking a massive context and compacting it so it isn't using as many tokens of the allowed window

lean lark
#

... and when that happens it "gets dumb" because it doesn't know stuff that it knew before. That can explain reasoning being gutted ... it's missing information that it had for reasoning through a problem.
About forgetting AGENTS.md rules ... I've been discussing that here since yesterday.
About being lazy ... that may depend on the low/med/high/xhigh setting, or perhaps the specific context and prompts.

Bottom line for now, check the local environment and answer these questions for yourself: What are you asking it to do? Does it have the information and directives to do that?

lean lark
#

An example where someone might say "it's not following directions": I have a scenario where a single issue resulted in two downstream errors. Codex created a single issue/ticket for the anomaly. I discuss issues with the assistant before we process, to make sure we understand the issue and have a solid plan to address it. << @calm aurora So in the ticket I set up a two-step process to address the anomalies: first the downstream issue which shouldn't have tripped like it did, and then the primary issue which caused that secondary problem. After processing step 1 I directed it to process step 2. It did not, saying step 2 isn't actually related to step 1, it's a separate problem. Well dang! It's right. I need to create another ticket and have it process that one.
I could have said "it's not following directions". Really, it is doing more than just following directions, it's keeping the housekeeping in order, ensuring that I myself am correctly following protocol.

Be careful what you ask for, you might get it.
Be aware of what you ask for, you might get it.

boreal holly
cyan wing
tawny island
calm aurora
#

I read it in the Times Magazine, "Artificial Intelligence: The Promise and Perils"

boreal holly
# cyan wing so not using plan mode directly ? or what exactly is the handoff for "orchestrat...

OK so the process and rationale is this:

5.4 medium: smart enough to plan, but more importantly fast and decisive at orchestrating other agents. On high, it thinks and acts too slowly, and the agents overwhelm the high reasoning agent. I don't use plan mode or whatever the built-in thing is.
5.4-mini high: Also smart enough, and with enough agents working on smaller units of related work they stay focused. Having a single 5.4-mini do one big massive feature is a recipe for disaster, but a handful of em split across systems and communicating with each other keeps their lifetime short and drift free.

#

And the primary agent double-checks everything. I would say 70% of the context for an agent is consumed by navigating the codebase, applying patches, running code validation, etc. So the planner/orchestrator does not apply patches or run code validation. So 30% for navigation, the rest for communication between agents. That's kinda the split to avoid context poisoning and stuff

cyan wing
#

robert out here just giving away free alpha 🔥
thank you sir, will try this today 🫡

ivory zodiac
lean lark
#

We SO need a library for cool stuff like this. Forums don't cut it.

cedar skiff
#

I used codex 5.3 high for most of today, it feels like it uses less tokens

blissful basin
#

is there any way to check how many tokens i used in my codex in a week?

ivory zodiac
boreal holly
blissful basin
#

yeah found one just now, not sure how accurate it is, thanks

#

i defo get my money worth

lean lark
#

Link?

blissful basin
#

And thats just home pc on which i use codex after 4pm 😄

boreal nest
#

anyone have a codex guide that explains some tips and tricks?

lean lark
#

The CLI reads Codex session JSONL files located under CODEX_HOME (defaults to ~/.codex). Each file represents a single Codex CLI session and contains running token totals that the tool converts into per-day or per-month deltas.
-- https://ccusage.com/guide/codex/
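A minimal sketch of the "running totals to per-day deltas" idea from that quote. The record shape used here, a list of (timestamp, cumulative total) pairs, is an assumption for illustration only. It is not the real Codex session JSONL schema and it is not ccusage's code.

```python
from datetime import datetime


def daily_deltas(samples):
    """Convert running token totals into per-day usage.

    `samples` is a chronological list of (ISO timestamp, cumulative_total)
    pairs from one session. This shape is hypothetical.
    """
    per_day = {}
    previous_total = 0
    for stamp, total in samples:
        day = datetime.fromisoformat(stamp).date().isoformat()
        # The delta is how much the running total grew since the last sample.
        delta = max(total - previous_total, 0)
        per_day[day] = per_day.get(day, 0) + delta
        previous_total = total
    return per_day


# Made-up numbers, only to show the shape of the calculation.
example = [
    ("2026-03-17T16:05:00", 1200),
    ("2026-03-17T18:40:00", 5200),
    ("2026-03-18T09:10:00", 6100),
]
print(daily_deltas(example))
```

Summing the per-day values over seven days would give a rough weekly figure, assuming the totals in the files are accurate.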

velvet wren
blissful basin
#

i could probably do like 1-2h webinar if there is voice channel here, basically same what i show from time to time to students at UNI

#

basically how i build bigger project as developer using codex

boreal nest
#

but its better like manus 😄

lean lark
#

I get much of my education from:

  • Medium articles
  • YouTube
  • OpenAI.com documentation ( all over the place )
  • OpenAI Discourse Developers' forum

Like anything in this industry, we need to keep our eyes on the available resources and ingest the information in whatever way fits our preferences - and time constraints.

fallen pollen
boreal holly
lean lark
#

BTW Robert, it looks to me like your FF product can be used for carpets too. Jus' sayin...

boreal holly
nocturne folio
#

ready to go outside

simple star
#

Well, it's been 4 hours... who is already below 90%?

inland heart
#

Anyone can't access Codex?

#

"We are currently experiencing high demand, which may cause temporary errors"

upbeat maple
#

Has anyone used codex app on browser?

chrome raven
simple star
#

How much?

cobalt junco
frosty zealot
inland heart
#

So no one has errors accessing Codex?

#

Over to Claude Code then I guess

boreal holly
# frosty zealot Whats the advantage over the App though?

presumably it runs on Linux

i agree with the sentiment that linux doesnt get as much love as it deserves in terms of official support. that might change now that AI makes building stuff on linux so easy that theres no excuse not to do it lol

short lynx
#

Is there a way to make 'Commit and push' the default?

lean lark
junior fog
#

The new codex multi-agent system is quite effective, but its current interface is unnecessarily complex. A better solution might be to implement a sub-chat popup, similar to the composition or chat windows in Gmail. This approach would allow users to easily monitor the actions of the sub-agents.

chrome raven
#

and you can just ask codex which model to use for the subagent and so on, you only have to ask

junior fog
chrome raven
chrome raven
junior fog
chrome raven
lean lark
#

Woohoo - subagents Sagan and Plato ... playing with the big boys now! 🤓

#

I wonder if they name subagents according to the difficulty level of the current task. 🙂 Compare ... Bozo and Einstein...

frosty zealot
#

Are the hooks still in unreleased candidates?

#

I dont see it in the docs

lean lark
#

I think so, and there are only two.

frosty zealot
#

I know I just wanted to implement them into CMUX

tiny kettle
#

codex running crazy slow for anyone else today?

lean lark
#

Modern Development:
20 minutes to get specs for the next issue
10 minutes to craft a good prompt for multiple sub-agents
5 minutes for agents to run in parallel and for the coordinator to wrapup
30 minutes to figure out what they did

Old Development (like, two weeks ago)
20 minutes to get specs for the next issue
45 minutes for processing

Difference between old and new = 0

I'm exaggerating for fun, don't really believe 0 ROI.
But I feel so far like I've significantly shifted how I work, without defining metrics to quantify before/after efforts. This is fundamental to "the promise of AI".
Over time I want to work through this. Anyone else already have some tooling? It should be like a simple timer and logs. 🙂
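The "simple timer and logs" idea could start as small as this. It is a rough sketch, not an existing tool: `phase`, `phase_log`, and `summarize` are hypothetical names, and the phase labels are whatever you choose to call your own steps.

```python
import time
from contextlib import contextmanager

phase_log = []  # list of (phase name, elapsed seconds)


@contextmanager
def phase(name):
    """Time one named phase of work and append it to the log."""
    start = time.perf_counter()
    try:
        yield
    finally:
        phase_log.append((name, time.perf_counter() - start))


def summarize(log):
    """Total the seconds spent per phase name."""
    totals = {}
    for name, seconds in log:
        totals[name] = totals.get(name, 0.0) + seconds
    return totals


# Wrap each step of a session, then compare totals across sessions.
with phase("craft prompt"):
    time.sleep(0.01)  # stand-in for the real work

for name, seconds in summarize(phase_log).items():
    print(f"{name}: {seconds:.2f}s")
```

Logging the same phase names before and after a workflow change gives a before/after comparison without any extra tooling.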

worn harness
#

You either build a full spec, let it run for a long time and then manually start fixing issues one by one or
You make a base and implement features one by one and fix/steer incoming issues

There's no magic bullet.

lean lark
#

I didn't realize what this setting did until just now:

#

Standard/Fast is different from Low/Medium/High/ExtraHigh ... 🤦‍♂️

orchid plume
glad wind
#

How do people make good use out of worktrees? You need to create a new worktree per thread right? What's the proper workflow? It seems messy enough to have to merge things back and handle recreating a worktree after every feature

lucid mason
#

I will have to take a look and learn about agents and skills. But to clarify beforehand, should it be possible to:

  • use one Nano agent to read a log file and extract the numbers...
  • then a Mini agent to analyze/interpret the numbers...
  • then full 5.4 (Codex) to apply/suggest code changes based on the results?
lean lark
#

Oh ... NOW it tells me. 😆

lucid mason
#

Thanks. I have to dig into (or ask ChatGPT) how to do that, but I assume using Nano and Mini for these tasks should help stay below my weekly and 5 hour Plus limits?

lean lark
#

Here's how...

#

I just got this from my task history:

Spawn a subagent with model GPT-5.4-mini, reasoning effort medium, to refactor the ...
Take some time to consider ...

Spawn another subagent with the same specs to go through code under src and find ...

#

It did exactly what I asked.

lucid mason
#

At this point I am only speculating that using Nano and Mini will save on tokens with these Plus limits. I used up my weekly by trying /Fast too much and sending too much data back and forth last week, but meanwhile I kept working on the code via ChatGPT app using Thinking. Now Codex is available again.

lean lark
#

FWIW, I've been using this stuff for a Long time. I rarely even get close to the 5 hours quota because (see above) I review the results of the last effort, plan the next, and then get on with the next effort. The actual processing time is minimal because I have procedures documented, full product documentation, and all of the code is documented - the model doesn't need to waste time to hunt for anything or figure out how things work, it's all in text already.
Today is the first day that I've got down to minutes left on my 5 day quota, renewing tomorrow morning, and that's only because I had Fast mode on.
Summary: YMMV but unless you're working with a huge and poorly documented code set, you shouldn't really be concerned about limits. Just use the right model and reasoning effort to address each challenge. Don't just set it and forget it, change it for every new task.

frosty zealot
#

I have two accounts this month so im gonna /fast one of them the whole time and see what happens

#

It will give me a good idea of how my limits will be when they remove the 2x rate limits

#

since it should essentially cancel it out

lucid mason
#

I had ChatGPT insert lots of comment hints for itself and write docs, so that helps. But trying things for the first time using /Fast burned through the weekly.

#

5.4 Thinking in the ChatGPT app needs multiple cleaning/bug-hunting passes after adding new functionality, finding new issues each time. With the new weekly I am curious how Codex handles the code that went through Thinking 3 times in a row already.

frosty zealot
#

Does /fast get passed to the sub agents as well?

lucid mason
lean lark
lucid mason
#

Yes, but even after running the code through 5.4 Thinking multiple times, Codex now finds blatant omissions such as:

The maintained package still calls AppendFailureLine() and SanitizeLogText() from live code, but those helpers are not defined anywhere in the maintained files.

ivory yacht
#

How does codex compare to Claude code?

frosty zealot
#

im sure somebody has a blog post about it

lean lark
#

There are Way too many articles on that topic. Google for it. If you have a Medium membership ($5/month) there are hundreds of articles there to compare all of these tools. I use Google surveys to get credit and then apply that to Medium ... that means it's "free".

#

Different topic: Maybe I'm missing one of the basic feature/benefits of the Codex app. I frequently have a discussion in ChatGPT, get my head straight on a topic, then go to Codex in VSCode as though it's a completely different product. I need to explain the goals and proposed solution. There's no context carried from ChatGPT to Codex.
Is this what's now built-in to the app?

The thing is, we get a ton of chat credit in ChatGPT but there are quotas in Codex. So I don't want to burden Codex with chitchat. Now that the models are mostly/fully aligned I trust that I'm essentially chatting with the same model regardless of the client app. But I still feel that disconnect between the two.

Suggestions?

glad wind
#

Why does Codex only allow you to start new worktrees not reuse existing ones? I feel like I'm going insane

cyan wing
glad wind
#

But it seems like such an arbitrary limitation that makes using worktrees significantly more awkward with no benefit I can discern so I'm trying to figure out what I'm missing

#

Every single time you create a new thread you have to create a whole new worktree, which in my case means an entirely new build from scratch, downloading dependencies, etc. It actually makes the feature unusable

cyan wing
#

or simpler stuff like you can't enter plan mode until you wait for 1 minute compaction 😆

cyan wing
glad wind
#

Don't know what that is. And funnily enough each worktree has a notion of linked conversations, but the only one with multiple conversations started those through subagents. So it seems like it's just a UI oversight or something

frosty zealot
glad wind
frosty zealot
#

I never said it was limited to a single commit

#

I must misunderstand what you're asking

glad wind
#

Again, the problem with creating another worktree is it implies rebuilding my entire project which involves downloading dependencies and compiling them. I don't want to do that every single time I add a new feature

frosty zealot
#

What problem are you actually having

#

Creating multiple worktrees?

lean lark
#

I only started working with Git worktrees within the last few months. It's a tough setup just for Git. Have no idea yet how to integrate/adapt with Codex. 🙄

glad wind
#

The problem is that I want to be able to start threads working on features in parallel but without having to download and rebuild my entire application in every thread

lean lark
#

consider symlinks?

frosty zealot
glad wind
#

I mean I suppose I could try to hack a solution but it seems insane because I don't get why I can't just reuse a worktree

glad wind
frosty zealot
#

Create it as a permanent worktree

glad wind
#

And also probably want to have separate build directories anyway for separate worktrees

#

How do you do that?

#

Oh I see...

frosty zealot
#

Either create the worktree as you normally would then just manually add it as a 'project' or theres a button under it somewhere to create permanent worktree
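For the "create the worktree as you normally would" half, a small sketch of the Git side only. `git worktree add -b <branch> <path>` is the standard Git command; the helper names and the example path are hypothetical, and nothing here touches the Codex app's own "permanent worktree" option.

```python
import subprocess


def worktree_add_command(path, branch):
    """Build the git command that creates `branch` and checks it out at `path`."""
    return ["git", "worktree", "add", "-b", branch, path]


def create_worktree(repo_dir, path, branch):
    """Run the command from inside an existing repository."""
    subprocess.run(worktree_add_command(path, branch), cwd=repo_dir, check=True)


# Example (hypothetical paths): a long-lived worktree next to the main checkout.
print(" ".join(worktree_add_command("../myapp-feature-x", "feature-x")))
```

Because the directory persists, its build output and downloaded dependencies persist too, which is the part that avoids the rebuild on every new thread.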

glad wind
#

Oh thank you this is probably what I was looking for

lucid mason
#

@lean lark I asked GPT about which model type to use for which (subagent) task. Here is the important excerpt from its longer answer:

For your priorities—speed, reliability, and ChatGPT Plus usage limits—the main practical point is this:

Do not split work into many subagents unless the boundaries are clean.
More subagents can reduce per-step difficulty, but they also add overhead: more prompts, more context handoff, more opportunities for lossy summarization, and more total tokens. GPT-5.4 is specifically described as reducing end-to-end time and often completing multi-step agent workflows with fewer tokens and tool calls. So over-decomposition can be worse than one stronger agent handling two or three adjacent steps.

#

A better setup is usually:
Nano: strict extraction only
Mini: first-pass analysis + goal comparison
Full 5.4: algorithm redesign + implementation + review
That is the best balance in most cases.
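The split above can be written down as a tiny routing table. This is only a sketch of the idea: the task kinds and the tier labels ("nano", "mini", "full") are made up for illustration and are not real model identifiers or any Codex configuration format.

```python
# Hypothetical mapping of task kind to model tier, following the list above.
ROUTES = {
    "extract": "nano",
    "analyze": "mini",
    "compare": "mini",
    "redesign": "full",
    "implement": "full",
    "review": "full",
}


def pick_tier(task_kind):
    """Unknown task kinds fall back to the strongest tier."""
    return ROUTES.get(task_kind, "full")


def plan(steps):
    """Group adjacent steps that share a tier so one agent handles them."""
    groups = []
    for step in steps:
        tier = pick_tier(step)
        if groups and groups[-1][0] == tier:
            groups[-1][1].append(step)
        else:
            groups.append((tier, [step]))
    return groups


print(plan(["extract", "analyze", "compare", "redesign", "implement", "review"]))
```

Grouping adjacent steps reflects the warning about over-decomposition: fewer handoffs means fewer prompts and less lossy summarization between agents.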

#

One more important caveat: for ChatGPT Plus limits, model-routing behavior inside Codex/ChatGPT product surfaces is not publicly documented in a way that lets me quantify exactly how your weekly and 5-hour limits are consumed by each internal subagent choice. The official docs clearly describe model capabilities and relative positioning, but not a precise consumer-plan accounting model for your specific orchestration pattern. So the capability recommendation above is well-supported; the exact Plus-limit economics are less transparent from the public docs.

#

Seems like we can only speculate at this point that using Mini and Nano for subagents will be lighter on our Plus plan limits, based on their per token pricing. Or are there any official docs about this?

calm aurora
#

Hi

#

how many buttons should humanity have to take decisions?

potent mason
#

None leave it all to AI

tropic karma
#

i like this autocomplete for commands, super helpful. but i cant figure out how to actually accept it

calm aurora
tropic karma
torpid trout
tropic karma
naive kindle
#

i am astonished at how good codex is

potent mason
#

Yeah honestly insanely good

weary jasper
#

Do subagents all work on the same worktree? The parallel writes seem to overlay on each other. It should be possible to distinguish the edits of each subagent and merge them intelligently (not by random order of patches they apply). I don't really understand the point as it is now.

#

Judging by the fact it's not described in the "subagent" section, it wasn't an important concern for the devs 🤯

#

Or am I missing something?

potent mason
#

@weary jasper subagents more than anything help for the following in my experience:
#1 Specialized instructions (so for example a reviewer didn't write the code and it receives instructions to be critical about it, making it more likely to be correct).
#2 You don't context bloat the main agent (so for example if you need to explore the codebase for certain logic flows, you get some subagents to view all of the code and report back on their findings, so instead of 20-30k tokens it's just 1-2k of logic based tokens).
#3 You sometimes can get things done in parallel but you have to be specific about how it should act with blockers / separate points.

tropic karma
#

optimize ur app pls

potent mason
#

Only 6GB of RAM? That's honestly not even that crazy for Claude Code I've seen it go up to 100GB RAM

#

Maybe try force quitting it and reopening it but I'm getting about 2-3 GB so not much to gain from it

sand shuttle
cyan wing
#

Codex crashes for me in WSL while it's doing Context compact..
and i send a follow up message with Enter instead of Tab

just hangs on Working indefinitely...

sand shuttle
#

does a long conversation affect the token consumption and performance?

sand shuttle
tropic karma
#

what scripts

cedar skiff
left garden
#

Hello 👋
Can someone please explain what /skills is used for in codex?

left garden
#

Thanks

cedar skiff
#

I miss slash commands

frosty zealot
#

They're still there kinda arent they with /prompt?

#

Or did they deprecate that

cedar skiff
#

i thought it was deprecated, can we start a subagent in plan mode?

fast garden
cedar skiff
#

o.0?

orchid plume
lucid mason
#

Unfortunately Codex really wasted my time yesterday. It repeatedly told me that it had already fixed some issue when I tried to tell it that the fix did not work. It even thoroughly repeated answers to a second point we were working on while discussing the main issue, despite being told in the prompt and docs not to repeat the same information over and over again. It basically talked to me like to a kid who didn't understand its last answer.

still trellis
#

Ive not used sub-agents once but my weekly limit is rapidly decreasing? Not sure how to check whats going on here but this was never the case

warped pine
#

i feel like they changed the limits or smth

#

like a week ago

#

or maybe it was undercounting before that and they just fixed it, idk

orchid plume
# still trellis

last I checked there's a massive number of comments in one or more issues on GitHub (for Codex CLI) complaining about this, but no resolution officially yet. I checked a few days ago at least, not sure what the current situation is, but back then the issue appeared to have no official acknowledgement of a problem on OpenAI's side

In the meantime what helped slow my usage consumption down is:

  • Use rtk (reduces LLM token consumption by 60-90% on common dev commands), you can have developer instructions in Codex CLI automatically prompt the AI LLM to prefix certain commands to use rtk instead to yield token savings, while at the same time reducing the amount of context used too
  • Reasoning effort to high instead of xhigh also helped reduce my consumption rate on GPT-5.4, while not noticing any real difference in output quality overall
cedar skiff
#

I feel like 5.4 uses more

#

i switched back to 5.3 for a few days and it seems like it uses less and it feels more controlled

spiral gorge
#

i hereby announce to the people of planetius: gpt 5.4 inside codex app is officially agi for vibers

south latch
#

Can I get codex spark without pro?

spiral gorge
south latch
#

Fast one?

spiral gorge
south latch
#

Hmm

spiral gorge
#

and it uses a dumber llm of 5.3

south latch
#

So I need to turn on fast mode?

spiral gorge
#

yes. 2 to 4 times more token usage

#

u dont need fast lol

#

u are young steam play cs2... u absolutely dont need fast mode

#

LOL

south latch
#

Im 22

#

I have a compound with my friends. We have cs2 servers, minecraft server, csgo servers, a website and now we are devving a fiveM server

#

We use codex for like minigames in fiveM

spiral gorge
#

what are minigames

south latch
#

Google "FiveM rp minigames"

spiral gorge
#

oh nice 😄 money money

tiny wind
#

I can't wait for GPT-6-Codex to be released (if it does)... Keeping in mind the performance increase over the last few models, it will totally be near-AGI level at least for coding.

exotic cave
tiny wind
#

Yes, for sure.

#

Otherwise they will likely keep their models for themselves until it happens

exotic cave
#

I am rooting for them all, may they all have great success!

torpid trout
# south latch Can I get codex spark without pro?

Spark is a model.
Fast is a mode.
I think spark is only on pro. You know immediately by switching models - if you see gpt-5.3-codex-spark you have access otherwise not.

Fast mode is basically a priority/inference speed mode and doesn’t matter on model used. It’s just driving the speed of responding, not the model used.

cedar skiff
#

does any one have a quick answer about default_mode_request_user_input with custom subagents?

#

Like how do i get them to use it

plucky halo
spiral gorge
#

codex windows, automatically compaction...
just
means...
delete everything? lol

chrome raven
cedar skiff
#

seems like they make models that target specific work and then distill them into one model

lean lark
#

What happens is that they create the specialty model (Base+Specialty) and then pull the specialty into the Base. This eliminates the situation where ChatGPT isn't as good as Codex.

lean lark
#

Your context window has a limited number of tokens. When you go beyond that the window moves down with the most recent exchanges.

#

The model no longer has access to the earlier discussion and seems to forget those details. In ChatGPT the text/discussion isn't deleted, it's just no longer accessible. Larger context means more is accessible - but larger context also means the model must juggle more tokens, and with current technology it can get confused. So we tend to limit context quantity in favor of quality.

#

In Codex, when we reach the context limit, rather than just losing access to the older tokens, it gets summarized for us. The details are removed and the general ideas and important notes are retained. That top part of the context is then "compacted" down into the summary, reducing the total number of tokens in context, allowing us to continue without doing anything to keep the discussion sane.
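A toy sketch of the two behaviors described above: a sliding window that simply drops the oldest messages, and compaction that replaces them with a short summary. This assumes messages are (text, token count) pairs and uses a placeholder string where a real system would have the model write the summary. It is not how Codex implements compaction internally.

```python
def sliding_window(messages, limit):
    """Keep only the most recent messages that fit within `limit` tokens."""
    kept, used = [], 0
    for text, tokens in reversed(messages):
        if used + tokens > limit:
            break
        kept.append((text, tokens))
        used += tokens
    return list(reversed(kept))


def compact(messages, limit, summary_tokens=50):
    """Replace the oldest messages with one short summary entry."""
    total = sum(tokens for _, tokens in messages)
    if total <= limit:
        return list(messages)
    # Reserve room for the summary, then keep as many recent messages as fit.
    recent = sliding_window(messages, limit - summary_tokens)
    dropped = len(messages) - len(recent)
    summary = (f"[summary of {dropped} earlier messages]", summary_tokens)
    return [summary] + recent


history = [("spec discussion", 400), ("first patch", 400), ("test run", 400)]
print(sliding_window(history, 900))
print(compact(history, 900))
```

In both cases the detail of the oldest message is gone. Compaction keeps a trace of it, which is why the model can carry on but may still seem to "forget" specifics.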

boreal holly
# cedar skiff seems like they make models that target specific work and then distill them into...

Oh yeah, the models we use are definitely distilled from a larger model. They usually train a massive and slow model on a massive corpus of data to create a base checkpoint, then from that checkpoint teach it instruction semantics <|start|> and <|end|> tokens or whatever they use so the input output is structured and make that an instruct checkpoint, and finally they train it on conversational semantics and specialized sequences to create the final chat checkpoint.

Then they use a different algorithm and neural net for a smaller model, initialize with completely random values, feed it inputs, and the big model grades the logits that the small model generated which tunes the parameters of the small model to the way the big model would've done it.

So if they want another general purpose model, they distill with lots and lots of general purpose inputs. If they want a codex model they focus mostly on agentic tasks and code.

The benefit is both models end up with the same vocabulary, the small model doesn't need the entire large data corpus, they just need to see the world as the big model sees it with the same vocab. It's genius because if you give a small model the same base dataset it interprets it differently and less accurately than if the big model trains the small model on how to see the world.
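For reference, the "big model grades the small model's logits" step is commonly written as a divergence between the two models' softened output distributions. The sketch below is the textbook formulation (KL divergence with a temperature), assumed here for illustration. Nothing in this chat confirms what OpenAI actually uses.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student): zero when the student matches the teacher."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [4.0, 1.0, 0.5]   # made-up logits over a 3-token vocabulary
student = [2.0, 2.0, 0.5]
print(distillation_loss(teacher, student))
```

Training the student means nudging its parameters to push this number toward zero over many inputs, which is why both models need the same vocabulary.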

#

They probably make GPT Pro models first, and that model trains the smaller ones

pallid solar
#

does the codex macos app have gpt5.4 nano?

#

because i dont see it in mine

#

i see gpt5.4mini and gpt5.4

cedar skiff
pallid solar
plucky halo
boreal ocean
#

Is there codex on Linux?

boreal holly
# cedar skiff I was actually thinking they might train some more specific models and then dist...

That might also be a way to do it. And maybe OpenAI does it differently than I described, but the usual approaches for making a specialized model are to take a general purpose one and fine tune it with post-training, or take a brand new uninitialized model and distill specific knowledge from a bigger teacher to the student, or a combination of both. But hey, this stuff is constantly evolving so who knows!

boreal holly
cedar skiff
#

Yeah, i agree you need a large frontier model to distill and make smaller models more capable. Let me see if i can find the video and see what he was actually saying

#

I found it, he was talking about how they came up with distillation, which is not what they do today. They had 50 odd specialised image models and it was impractical to serve so they came up with distillation to squash them all into one model. https://youtu.be/F_1oDPWxpFQ?t=242

solemn acorn
#

the CLI works obviously

boreal ocean
solemn acorn
#

the CLI is how the codex app works so yeah

#

it has a TUI on its own, but the codex desktop app and third-party clients use the CLI’s app server to bring the CLI functionality to desktop apps

supple perch
dense nebula
#

Hey, i'm new to codex, what model should i use? 5.3 codex or 5.4?

solemn acorn
dense nebula
solemn acorn
#

5.4 mini is really good too if you want to conserve usage

solemn acorn
#

basically it takes the advances of 5.3-codex into a more generally applicable model that at least in theory does stuff faster than 5.3-codex

unreal parcel
#

why is there such a focus on worktrees in these tools, because whenever I tried using them and use multiple agents they just keep stepping on each other's toes because they need access to index.lock for practically anything, unless there's some way to prevent that...?

lucid mason
#

Is there a difference between using GPT-5.4 in Codex or VS Code vs. using it in the ChatGPT app when it comes to creating code? I am currently switching back and forth and both seem to work (with VS Code Codex being less cooperative in my last try).

boreal holly
unreal parcel
lean lark
# lucid mason Is there a difference between using GPT-5.4 in Codex or VS Code vs. using it in ...

To my understanding one of the great improvements in the 5.3>5.4 update is that the same 5.4 model is used for ChatGPT and Codex. Previously the gpt-5.3-codex model had/has code-specific training that was not included in gpt-5.3 used for ChatGPT. Now, let's hope, the common gpt-5.4 model has all of the code training and should be exactly as good in all environments.
This makes the difference more about context than the model:

  • ChatGPT has user/account custom instructions.
  • Codex has a hierarchy of AGENTS.md files.
  • ChatGPT is also influenced by the custom GPT or Project instructions and attached files.
  • Codex still uses AGENTS.md files.
  • Codex can use skills which are not available to ChatGPT.
  • ChatGPT can refer to other conversations as a form of memory.
  • Codex does not refer to other tasks.

So your ChatGPT thinking will not be exactly the same as Codex, even though all of the training is now the same.

Does anyone have a different understanding?

boreal holly
unreal parcel
#

that sounds fine for write commands, but the AI sure loves using git to check on its changes/progress. I guess those don't trigger the prompt for you? but they still need access to the index lock...

lean lark
#

I think the key to understanding how the worktree works in Codex is to get a solid grasp on the Git worktree. I think very few people use that feature, so something else based on it is going to be very foreign. Understanding the base functionality leads to understanding this specific implementation over it. I've only used worktrees a few times and just this year for the first time. I'm still wrapping my head around it.

boreal holly
unreal parcel
#

I've been using the codex on CLI and I'm trying out the codex app, it seems to have a similar problem. the app is constantly looking at changes which keeps locking the index and locking codex (as well as overusing cpu)

#

me half the time when using new AI tools: is this really the default/proper way to use this? it's so broken

boreal holly
#

Oh yeah are you on Windows operating outside of WSL?

unreal parcel
#

yea the file system is outside WSL so the file access is ""slow""

lean lark
unreal parcel
#

I measured it though and it's only like 25k IOPS vs 200k IOPS or something like that. Slow, but shouldn't be such a problem in most cases

boreal holly
unreal parcel
#

I think some stuff like newline normalization is causing git to take longer than it should though

unreal parcel
#

I mean this is a deliberate lock by git. you shouldn't be able to read the git index at the same time someone else might be writing to it

#

windows does have ways to keep file handles shared, not exclusive

short yoke
#

just trying to confirm if weekly limits are bugging right now? or if limits got tightened or something in the last 24 - 48 hours or so?

unreal parcel
#

just timed git status on windows and got 354ms, subsequently ~80ms

#

and then on WSL...

#

time git status

real 0m8.033s
user 0m0.208s
sys 0m0.645s

#

well, damn.

#

I was right that when I do git status on one side, the next time it's called on the other side it's much slower

#

idk if it's newline normalization or what

#

with repo inside WSL

time git status

real 0m0.070s
user 0m0.061s
sys 0m0.004s

#

guess that answers it lol
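To repeat that comparison on your own machine, a small timing helper is enough. This is a sketch: `time_command` is a hypothetical helper name, and it measures wall-clock time only, not the user/sys split that the shell's `time` prints.

```python
import subprocess
import time


def time_command(cmd, runs=3, cwd=None):
    """Run `cmd` several times and return wall-clock seconds for each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, cwd=cwd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append(time.perf_counter() - start)
    return timings


# Example, run once per checkout location (paths are placeholders):
#   time_command(["git", "status"], cwd="/mnt/c/path/to/repo")
#   time_command(["git", "status"], cwd="/home/you/path/to/repo")
```

Running it several times matters because, as seen above, the first call after the other side touched the repo is the slow one.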

lean lark
#

As a fellow Codex user I'll request that one of you guys ( @unreal parcel? ) check for existing GH Repo issues related to locks and timing, and maybe create a new issue. The interaction between Windows and WSL really needs to be solid. It's fundamental to how (I hope) most people should be using Codex (any AI) over Windows.

pastel sequoia
#

so we dont have 5 hours limits for codex

#

only for week?

#

i think rates are very low now, i spend 50% in few hours

unreal parcel
lean lark
pastel sequoia
#

codex ext vscode

unreal parcel
#

ouchie

#

what plan is that

pastel sequoia
#

plus

unreal parcel
#

also full access 😬

pastel sequoia
#

i was able to do practically anything with plus on gpt4 extra high

cedar skiff
lean lark
#

Interesting

unreal parcel
#

I dipped below weekly 25% yesterday. what a moment

short yoke
#

im on plus - codex app

lean lark
#

Me too, Cleroth, first time. 🙂

unreal parcel
#

y'all really yoloing with codex huh

lean lark
#

Gotta turn off Fast processing!

unreal parcel
#

maybe that's why WSL issues don't get fixed, everyone going full access

unreal parcel
cedar skiff
#

i mean if your not going over just use fast

unreal parcel
#

yep

#

statement does not hold true for claude

lean lark
#

Agreed, but if you're using Fast it's 2x tokens so you burn through them ... and by the time you decide to turn off Fast it's too late. 😆

unreal parcel
#

2.5x I believe

#

2x is the speedup

toxic torrent
#

I cannot wait for 5.4-codex

unreal parcel
#

this is codex

#

they just dropped the name

lean lark
cedar skiff
#

I am back using 5.3 until its gone

unreal parcel
#

ok 1.5x 2x then

#

sad

toxic torrent
#

oh 5.3 codex is better in a lot of more complex cases than 5.4-high

unreal parcel
#

really

toxic torrent
#

5.4 tries to go too fast and quick rather than methodical

unreal parcel
#

not my experience

cedar skiff
#

yeah i think it is

toxic torrent
#

then asks me if it wants to do x y z but ofc it should lol

cedar skiff
#

5.4 kind of fluffs stuff sometimes

boreal holly
# unreal parcel I mean this is a deliberate lock by git. you shouldn't be able to read the git i...

This is git for windows behavior.

In Unix-based operating systems, everything is a file. Your GPU, keyboard, mouse, all of them are files in the file system somewhere. At an architectural level, the OS has no choice but to let multiple processes read from a file. Writes, however, are handled atomically. index.lock is an optional agreement on Linux, BSD, and macOS. On Windows, it's a hard requirement for everything, including reads.
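
A rough sketch of the lock-file convention described above, in Python. The `.lock` suffix mirrors git's `index.lock`, but the helper names (`acquire_lock`, `release_lock`) and the temp-directory setup are made up for illustration; this is not git's actual implementation.

```python
import os
import tempfile

def acquire_lock(path):
    """Try to create path + '.lock' exclusively; return True if we got it."""
    try:
        # O_CREAT | O_EXCL makes the open fail if the lock file already exists.
        fd = os.open(path + ".lock", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True

def release_lock(path):
    """Remove the lock file so the next writer can take it."""
    os.remove(path + ".lock")

# Hypothetical stand-in for a repository's index file.
workdir = tempfile.mkdtemp()
index_path = os.path.join(workdir, "index")

first = acquire_lock(index_path)    # nobody holds the lock yet
second = acquire_lock(index_path)   # lock file already exists
release_lock(index_path)
third = acquire_lock(index_path)    # free again after release
release_lock(index_path)
```

The lock here is purely cooperative: it only works because every writer agrees to check for the `.lock` file first, which is the "optional agreement" part.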

lean lark
# pastel sequoia

I suggest closing all apps and doing a check to update everything. VSCode, VSCode Codex Extension, Codex App, Codex npm, um... anything else?

unreal parcel
boreal holly
#

If you use WSL, and move your project into the Linux container FS, you will never ever have this issue, full stop.

unreal parcel
#

right, but I think that comes with other issues

cedar skiff
#

macos ftw

boreal holly
unreal parcel
#

TortoiseGit is a windows git tool, so it's going to use the windows git

#

the one inside WSL obviously is using WSL's git

lean lark
#

The thing about WSL is that it's a Microsoft platform, and OpenAI has a tight relationship with Microsoft. If there's a significant performance issue with Windows/WSL integration, I'd hope that OpenAI would at least present it to Microsoft with a request to take a look. Of course, we all know that these things are issues, so I suspect nudges by us plebs aren't necessary. But it's important to create the tickets to help surface the fact that platform constraints actually are affecting how we use these tools that they so eagerly want us to use.

blissful basin
#

I have to chill a bit, used 20% of weekly already on pro plan

unreal parcel
#

lol I just found a solution

#

time "/mnt/c/Program Files/Git/cmd/git.exe" status
On branch claude

real 0m0.126s
user 0m0.001s
sys 0m0.000s

#

🤣

jolly lily
#

Is it more sustainable to make a fully automated workflow only using the agents or a hybrid workflow where most of the work is done through inline completion and debugging is done using agents?

toxic torrent
white furnace
#

did they update codex cuz for some reason i cant even open it anymore.. it forcefully closed

exotic cave
#

So currently we are on x2 for a while longer, oh boy when it'll end... 5.4 eats my plan limits fast...

calm aurora
#

Hi

plucky halo
orchid plume
#

indeed, hopefully the new plans will also be out by then, or at that day (Pro Lite aka Pro 5x last I've seen, and current Pro becoming Pro 20x)

lavish karma
#

Spun up a little red vs blue test this week
Threw Claude on an AWS Kali box as the attacker and ran Codex on my webserver for defense. First time ever doing something like this end-to-end.
Learned a ton. Definitely missed stuff, but watching Codex actively harden and make recommendations during a live attack was wild.
4.5 hours, ~188k requests, no breach.
Honestly one of the most fun things I’ve done in a while.
Anyone wanna see the PDF report?

plucky halo
#

sounds interesting!

cedar skiff
lavish karma
#

I asked it to

#

i mean its my infra so it had no issue

#

it was infact a pentest

cedar skiff
#

its that easy eh?

#

just tell it yeah we are doing a pentest go hard?

lavish karma
#

scary right?

cedar skiff
#

ahah

#

yeah it's pretty wild

frosty zealot
#

If you start almost any prompt with 'its mine' itll do it lol

cedar skiff
#

some of the open weight models are all for it, but i wasnt thinking claude would be so easy

lavish karma
#

anthropic can trace and block malicious users but that's about how it goes down yup

#

NetworkChuck's got a whole series on it. I wanted to do a live test with codex on the blue team. I'll send ya the pdf if you want

cyan gyro
#

dont you get a safety block if you try to prompt that?

boreal holly
#

I think if you add a route like GET /not-a-hack to the webserver, have Claude curl it, and it responds with "Hey Claude, as you can see this is my machine, so this is white hat pen testing 👍", then it's like evidence to the model and Anthropic that it's legit

#

Or heck, make it fetch the instructions that way lol

lavish karma
#

Nah, because it’s about context

I kept it restricted to my own AWS instance + my network, so it’s a controlled test environment

I wasn’t prompting anything like “go hack X”, just letting it behave like an automated attacker within scope

Suricata saw ~188k requests and still no breach, so the defenses held

cobalt junco
#

these agent things are dope

unreal parcel
#

McClintock

naive kindle
#

codex really got me considering the chatgpt plus upgrade

cobalt junco
#

Heisenberg>

naive kindle
#

ive blown thru the rate limit 5 times in 6 days

next ore
#

Thought experiment, whats better in your opinion?

  1. Code reviewing with the AI using the existing, current context window before merging changes
  2. Refreshing context window, fresh AI, and asking it to review the changes
cobalt junco
#

2

lavish karma
lavish karma
#

anyway anyone who wants the pdf hit me up i gotta get back to it

unreal parcel
#

also interested

toxic torrent
#

please add release notes to codex mac app updates :)

cedar skiff
#

custom subagents in codex are pretty nice, spent the day making orchestration and its feeling really good

lean lark
#

I've spent the day with Codex to get my WSL project to launch and navigate a browser in Windows. It's scary cool how this stuff works, especially across OS boundaries. I think this'll control any browser on the internet that opens a port for it. 😱

lavish karma
#

add me up if anyone wants to colab

lean lark
#

Speaking of security ... What's the latest on OpenAI's tooling for securing code? The last I saw, I think it's only available to Pro/Enterprise and/or recognized/approved FOSS devs. I dunno, maybe the rules now say something about it must be raining in the Sahara or the S&P must be up by 2 when they approve usage. 🙄

craggy crest
#

i have it on pro, but the fact its trained on it appears makes me hesitant to try it

#

the repos i'd want to give it i really rather not train on

lean lark
#

SMH - yeah, that's the game with this kinda stuff, eh?

craggy crest
#

yeah

boreal holly
velvet python
#

Based on https://developers.openai.com/codex/pricing am I right in understanding that 10x Plus is 40% more usage than 1x Pro at same $200?

Plus cost: $20/mo
Plus value multiplier: 1x
Cost per limit multiple = $20

Pro cost: $200/mo
Pro value multiplier: 6x
Cost per limit multiple = $33.33

Therefore Plus is 1 - 20/33.33 = 40% cheaper per token than Pro. And you don't risk overpaying if you're not using all the tokens.

Am I misunderstanding this?

This of course excludes the Spark model but I don't care much about that given that we have 5.4-mini.
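
Working the numbers from this message as a quick Python check. This is only a sketch that takes the 1x and 6x multipliers at face value; the variable names are made up and real limits may not scale this cleanly.

```python
# Prices and multipliers as quoted in the message above.
plus_price, plus_multiplier = 20, 1
pro_price, pro_multiplier = 200, 6

plus_cost_per_unit = plus_price / plus_multiplier   # dollars per 1x of limit
pro_cost_per_unit = pro_price / pro_multiplier

# How much cheaper one unit of limit is on Plus than on Pro.
cheaper = 1 - plus_cost_per_unit / pro_cost_per_unit

# How much more usage the same $200 buys as Plus accounts.
budget = 200
plus_units = budget / plus_cost_per_unit
pro_units = budget / pro_cost_per_unit
more_usage = plus_units / pro_units - 1

print(f"Plus is {cheaper:.0%} cheaper per unit")
print(f"$200 of Plus gives {more_usage:.0%} more usage than $200 of Pro")
```

Under those assumptions the two figures are not the same thing: 40% cheaper per unit works out to about 67% more usage for the same spend (10x versus 6x).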

boreal holly
# velvet python Based on https://developers.openai.com/codex/pricing am I right in understanding...

If all you care about is Codex usage, and you want to put in zero effort on your part to make the most of the 6x higher limits then yeah it's a terrible value and you should just get 10 plus accounts. Pro comes with access to ChatGPT Pro model with practically unlimited usage (nobody has ever hit the limits with it and publicly posted about it), practically unlimited image generation, during peak time you still have low latency access to everything, lots and lots of ChatGPT Atlas agent mode usage, etc. The value is spread across the whole product line, and 6x codex usage can go a very long way if you put in the effort. You also get 6x more 5hr usage which is actually important if you want to use subagents

inland bramble
#

It seems that subagents consume the same context window from the parent agent. That is what I've seen so far... Is that how subagents are working?

next ore
boreal holly
next ore
#

week limit

#

5 hour limit have not hit sorry if i missed that

boreal holly
# next ore 5 hour limit have not hit sorry if i missed that

Gotcha. If you look at the benchmarks, 5.4 xhigh is overall 2.1% more accurate while consuming double the tokens. High is ~0.3% more accurate overall than medium while consuming ~33% more tokens. Medium is legitimately the best choice in almost all cases. It's ~4% more accurate than Low which is the largest accuracy gain between reasoning efforts while consuming 4x fewer tokens than xhigh.

As for the /fast mode, hey, to each their own.

next ore
#

agreed, im still relatively new to LLM coding at this scale, I am noticing that sometimes, lower tier ones feel more efficient, and xhigh sometimes feels like its being too smart and takes too long

naive kindle
#

^ noticed this as well

#

using codex i feel like im giving a grad student 6th grade homework

velvet python
lean lark
#

If you recognize that you have 6th grade homework ... don't give it to the grad student. Give it to the 7th or 8th grader who should probably be able to do it. Most typically though we don't really know how difficult the task is until we collaborate with the bot.
This is where it's very helpful to understand your projects, understand how much effort the agent(s) might need to accomplish a task. If the project is poorly documented with no means of testing, then it'll take longer for an agent to figure it out and try to get a change right.

"Know yourself, know the enemy, know the terrain." -- Sun Tzu

velvet python
# boreal holly The graphs tell the story

those graphs are interesting, looks like at some point it's better to use nano with higher reasoning than mini with lower reasoning. too bad nano cannot be used with codex rn

boreal holly
topaz stream
#

I just wish Chatgpt just releases the GPT 5.4 (1 Million Context just like how Anthropic did with Opus 4.6 Long Horizon Model) Even on the pro plan they haven't done that.

#

I will say Codex has gotten REALLY good at compaction (A noticeable increase in performance from 5.3-codex to 5.4), but still. Especially when searching files/context/content with MCPs (Gmail, Across Docs, etc.). I use Auggie MCP, but it's just not as good as Code Searching for text-based context & semantic search.

torpid trout
topaz stream
#

You can?

torpid trout
#

model_context_window = 1000000
model_auto_compact_token_limit = 900000

topaz stream
#

Its not on the Model Selection menu? Oh only on API tho.

torpid trout
#

No, that’s a codex config toml flag which applies to CLI and app

topaz stream
#

Bet! Thank you, I'll go check that out. Probably just expected to be like how Claude does it where you do /model to change to Opus 4.6 (1 Million Context)

nocturne folio
#

if youre not running through 1b+ tokens a day on codex

#

youre genuinely just not trying

topaz stream
torpid trout
#

256k because anything above creeps and cracks
Unless you really need it… I’d not use it. It’s probs also the reason they didn’t make it an option with command

topaz stream
#

Ight bet, thx!

frosty zealot
#

Codex remote control WHEN

nocturne folio
topaz stream
#

I mean you can always give it Tailscale and then boom, remote control

frosty zealot
frosty zealot
topaz stream
nocturne folio
topaz stream
#

Damn 😭

nocturne folio
#

just say bye bye to your dreams

unreal parcel
topaz stream
#

Has some pretty big Git bugs + updating the tasks, but the beta is pretty promising.

nocturne folio
#

whats exec?

boreal holly
calm sigil
#

I have been leveraging OpenAI's harness paper, I now have a central tool repo with over 100 installed tools and context routs having codex run experiments, systematic project swarm deployments and observability tooling. Using only 14% of my credits working over 20 hrs.

#

Once I finish some of these apps I will post them open. They are not finished though. Still don't look nice enough.

ember spindle
calm sigil
#

Yes, the usage is all on tokens used and output, if it can come across with 20 deterministic tools and commands to run, then that will use far less tokens.

mint locust
calm sigil
#
Explored 3 files
Ran python3 - <<'PY' i print() PY

Explored 4 files, 1 search
Ran python3 - <<'PY' import json from pathlib import Path protected = { PY

Ran python3 - <<'PY' import json from pathlib import Path path = P PY

my runs look like this, where it's running python commands and system checks mostly, then writing code as needed, but often copying code and running scripts, with much less of it generating and writing raw tokens. I used to have codex write 16000 lines of code, now every session is 1200 with a bunch of commands to copy, paste, lint, test, etc.

calm sigil
ember spindle
calm sigil
#

Plan mode:

Create a set of plans (as many as needed) to achieve this goal of a app that does <describe goal in detail here>

Have all plans executable through git management sessions and tooling. If the tools do not exist create them for the repo and install them.

Once finished, review all plans to validate that we can loop them.

Implement plan to build plans >

Now that all plans are done, create a tracker.md or system to track the execution of plans and execute all of them.

Done When:
All plans are done and tracker is marked complete.
@ember spindle

lime inlet
#

Where/how can I see how many tokens I have left?
codex shows a % percentage left in the bottom. First I thought this is the percentage of remaining tokens but now I tend to think this is the remaining context window (?). I am using codex with a ChatGPT Business (formerly: Teams) account.

calm sigil
lime inlet
#

So the percentage at the bottom line is what? Remaining free context window?

#

Like how do you know how close you are to running out of tokens?

high girder
#

That little % is how much of the context window you have left until the session itself compacts, and then you can keep going. If you're logged into an account, you'll be fine

prisma night
frosty zealot
#

im a professional PR reviewer

potent mason
#

Do you have code rabbit? You can use the CLI locally and then when you're ready to make a PR get it to run on the PR

frosty zealot
#

No im not sure what that is

potent mason
#

AI review it's honestly pretty good

#

You can use the CLI for free if I'm not mistaken

#

And then PR's do actually have a cost

topaz stream
#

To add on to this, the 3 biggest reviewers I recommend are Auggie, CodeRabbit, and Greptile.

#

Auggie is both a code reviewer and a context engine that does SERENA and semantic searches. It SIGNIFICANTLY helps token utilization and understanding of large codebases.

frosty zealot
#

I've heard of Greptile

topaz stream
#

I recommend in this order: Coderabbit -> Auggie -> Greptile

#

Coderabbit first

cedar skiff
#

I have used code rabbit for a while it is very similar to codex in the fact that it aims at code correctness without any real understanding or comprehension.
They are great for finding code bugs, gaps and regressions. But not recognising the over arching cause. So they tend to treat symptoms and add more code and complication in cases where an architecture change is warranted.
Which i guess is the next golden egg to crack. Code is solved, but architecture engineering is not.

frosty zealot
#

I cant wait till im done with this

neat sinew
#

Why is my Codex version 0.116.0? Is it still in beta or what

#

It should be 1.0 by now

topaz stream
topaz stream
#

LTS at least

neat sinew
#

So all these upgrades are just minor versions?

#

When are we getting a major version

cedar skiff
neat sinew
#

It's major.minor.hotfix

still trellis
#

what am i missing here? its like auto compaction is not working anymore I see this since this week?

Codex ran out of room in the model's context window. Start a new thread or clear earlier history before retrying.

topaz stream
#

All frontier labs suck at version numbers and model names anyways

frosty zealot
#

It'll be 1.0 in my eyes when i can remotely connect to my sessions

#

the fact you can interact with claude in telegram and discord natively as of their latest 'channels' implementation, and there's nothing for codex, is shameful imo

topaz stream
still trellis
#

enough?

topaz stream
frosty zealot
#

I could also put it on my smart fridge

#

Thats not the point im getting at

topaz stream
cedar skiff
# topaz stream Bigger picture, as in?

A simple example:
If something is supposed to be under the authenticated umbrella and it gets called early so it fails because it was placed in the wrong place in a tree of execution.
Comprehension on this would be understanding that the architecture needs changing in some way to enforce the ordering - reducing complexity by engineering better architecture that doesn't require guards or checks by design.
Correctness would add a guard to the code to protect against being called during unauthenticated state - adds complexity.
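
To make the two approaches concrete, here's a minimal Python sketch. All the class and method names (`GuardedApp`, `App`, `Session`, `load_dashboard`) are hypothetical and only illustrate the idea, not any particular codebase.

```python
class NotAuthenticated(Exception):
    pass

# "Correctness" fix: a guard is bolted onto the call that failed.
class GuardedApp:
    def __init__(self):
        self.user = None

    def login(self, user):
        self.user = user

    def load_dashboard(self):
        if self.user is None:  # extra check added to survive early calls
            raise NotAuthenticated("load_dashboard called before login")
        return f"dashboard for {self.user}"

# "Comprehension" fix: authenticated operations only exist on Session,
# and the only way to get a Session is to log in first.
class Session:
    def __init__(self, user):
        self.user = user

    def load_dashboard(self):
        return f"dashboard for {self.user}"

class App:
    def login(self, user):
        return Session(user)

session = App().login("alice")
result = session.load_dashboard()
```

In the second version there is nothing to guard, because calling `load_dashboard` before `login` is not something the structure lets you write.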

topaz stream
# cedar skiff A simple example: If something is supposed to be under the authenticated umbrell...

Oh.. I see what you mean on that point. Where when you run a codereview to simplify code, instead of simplifying it'll bloat the code with such complexity.
However, that's why you have Developer Prompts & System Prompts in place to prevent such actions. You can define the arch, surprises, easy mistakes, and traps that could result in adding complexity. CodeRabbit specifically has custom review profiles, path-based instructions, learnings, and MCP tool integrations, meaning you can encode the arch's inherent rules into the reviews that AI code reviewers do. You define your module boundaries, your dependency rules, your "don't do X, refactor toward Y". And ofc the reviewer will enforce it as it's a "System Prompt". If the code reviewer adds complexity, that's the developer failing to teach the tool the architecture to begin with.

#

The tools themselves are only as architecturally blind as you leave them.

I can see your point where tools themselves "out-of-the-box" don't discover an arch change is needed on their own. But again, thats only if you leave them with nothing to grasp or giving them context before their review loop.

#

Auggie is actually a prime example of an "out of the box" tool that DOES actually know your architecture. It's still in beta but will be released soon. It's called "Human Intent" and I suggest you look into it. It uses Augment's context engine, which is definitely a game changer for only $20 per month.

still trellis
# topaz stream 💀 Should be

just found out that sub-agents pull in the full thread history on each one, so that chows down on your context and crashes your autocompact if you have a few running at the same time.

topaz stream
#

I'm surprised codex would have that issue, most of the time it's just claude CLI. Codex is pretty reliable in terms of autocompaction and long-horizon tasks.

hushed lava
#

anyone want a free business seat

still trellis
topaz stream
#

Another thing that I'd say would be better is trimming down from 1mil to the default on what OpenAI states.

#

Autocompaction tends to fail on higher tokens too as it'll just completely forget about the tool to be called for.

topaz stream
#

Unless you have a skill specifically for that use-case, as sub-agents by default are RL'd to summarize what they did and report back to the main agent.

still trellis
# topaz stream Subagents don't pull from each other threads?

cant confirm exact, here is what codex told me after i asked it to find out whats going on...

This is happening because auto-compact is a best-effort cleanup step, not a hard guarantee.

I checked your live config and session logs: on this machine (March 20, 2026) your model_auto_compact_token_limit is currently 900000 (not 100000), while the active context cap in those runs was 950000. That leaves very little safety room, so a big turn (especially with sub-agents that copy full thread history) can still hit the hard ceiling before compaction saves it.
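
The headroom problem in that explanation can be sketched in a few lines of Python. The two numbers are the ones quoted above; the function name `turn_overflows` and the rule that compaction only gets a chance between turns are assumptions made for illustration, not confirmed Codex behavior.

```python
context_window = 950_000       # active context cap reported above
auto_compact_limit = 900_000   # model_auto_compact_token_limit reported above

# Room left between "compaction should kick in" and "hard failure".
headroom = context_window - auto_compact_limit

def turn_overflows(tokens_before_turn, turn_tokens):
    """True if one turn can blow past the cap before compaction triggers.

    Assumption: compaction is only considered between turns, once usage
    has already passed auto_compact_limit.
    """
    below_trigger = tokens_before_turn < auto_compact_limit
    return below_trigger and tokens_before_turn + turn_tokens > context_window

small_turn = turn_overflows(880_000, 40_000)   # ends at 920k, under the cap
big_turn = turn_overflows(880_000, 80_000)     # ends at 960k, over the cap
```

With only 50k of headroom, a single large turn (for example one that fans out to several sub-agents) is enough to hit the ceiling, which is why a lower compaction limit leaves more safety margin.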

topaz stream
#

Mhm weird, codex copies the exact same arch as claude code's subagent stack.

#

Had to remove that image as it wasn't the correct one. This is what I mean by subagents don't talk to subagents.

still trellis
#

Mmm so at the moment sub-agents are just pawns for the main codex thread if I understand this correctly?

topaz stream
#

Yes, subagents don't talk to each other and are usually specifically not able to write on the same file. Claude Teams tho is a different story.

tawny island
#

why is 5.4 lazy??:

The prototype milestone is working and hardened, but the broader overnight end goal you set is not fully achieved in this turn.

cedar skiff
#

Requires harnessing for extra automation or a human with the comprehension.

#

Harnessing usually becomes code base specific for this kind of task

#

So the golden egg i was talking about would be - it just gets it.

#

Instead of building slop on top of slop, it would comprehend the bigger picture and not add complexity

high heron
hushed lava
glad wind
#

I cannot fathom going back to 1/2 usage in a few weeks.

#

I'm on Pro and it still isn't enough right now

high heron
ebon ingot
sand shuttle
#

holy shi after getting rejected 7 times i finally got approved on github student pack

orchid plume
# glad wind I cannot fathom going back to 1/2 usage in a few weeks.

best three tips I can offer are:

  • Get the AI LLM to use rtk, install rtk on your system and have certain commands go via that to save on input tokens and subsequently how quickly your context window fills up too
  • Use high instead of xhigh, many times high is more than good enough, just use xhigh when you absolutely need it for something
  • For simpler tasks and if you're using subagents, use gpt-5.4-mini for those kinds of tasks (e.g. explore I believe by default already uses mini)
glad wind
orchid plume
# glad wind Thanks for pointer to rtk, looks interesting. Not sure how much it'll help Windo...

I'm hoping when the new Pro Lite plan drops, probably after 2x ends, that if the hidden slider on the plans page for the Pro plan (5x / 20x) stays as it is, and Plus goes back to what it originally was then 5x and 20x are multipliers of the Plus plan. If that's the case then theoretically Pro 20x should pretty much feel like the current Pro plan (as if 2x rate hadn't been turned off), while 5x (aka Pro Lite) will feel slightly worse than the original Pro plan on 1x rate

glad wind
orchid plume
glad wind
#

20x would be better than what we have now, if Plus quota doesn't change

orchid plume
#

indeed, right now I measured the difference at 8x~ between Plus and Pro

glad wind
#

It's strange because the site says "6x higher usage limits for local and cloud tasks" which isn't clear but yes I also measured Pro = 8x Plus

orchid plume
#

yeah it's a bit odd 😂

glad wind
#

They just vibe code their site + apps

#

I really think they don't pay too much attention, too busy

orchid plume
#

yeah who knows, 6x might be an outdated value

wicked fern
#

can we specify which model to use in subagents

#

gpt 5.4 mini doesn't suit my need

orchid plume
high heron
orchid plume
#

upgrading from Free to Plus/Pro resets the weekly quota, while upgrading from Plus to Pro tops it up

boreal holly
# orchid plume https://github.com/rtk-ai/rtk

That does look kinda cool, but I see it strips potentially vital info from the outputs. For example rtk docker ps does not show uptime, rtk ls -lah strips file permissions, size, and modified. Definitely more compact representation but it changes the output contract of many commands.

I tried to do something like this a few months ago specifically for cargo commands, making them output json and parsing out what I thought was noise programmatically, but I think agents do better when they have the whole output to work with, and a tool like that only works if you force them to always use it (e.g. replace their shell or make rules about it).

fallen sable
#

Hi all. Is Codex free?

brittle stratus
#

how is gpt 5.4 for everyone? Having a hard time with it today...

flat scaffold
orchid plume
flat scaffold
#

And you can see it being triggered as you would expect ? That means you might have entered the commands that you wish to re-direct through rtk yourself right ?

orchid plume
#

yes seems to work fine here, prefixes various commands with rtk on my WSL setup

#
[snippet] ...

Use `rtk` as the default wrapper for almost every shell command it supports, not just when the gain is obvious.

Reach for plain commands for shell builtins or cases where wrapping would be awkward or incorrect, such as `cd`, `export`, `alias`, heredocs, raw shell control flow, commands that `rtk` does not support, and all `npm`/`npx` commands.

Examples: default to `rtk git status`, `rtk ls`, `rtk find`, `rtk grep`, `rtk pytest`, `rtk vitest`, `rtk diff`, `rtk wc`, `rtk curl`, `rtk docker`, and `rtk kubectl`. Use plain `npm` and plain `npx`.
If `rtk` would change semantics, hide information you need, or make the result less reliable for the task, use the normal command instead.
#

adapt/modify as needed
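
For anyone who wants to see the rule above as logic rather than prose, here's a small Python sketch. The command list is copied from the snippet; the function name `wrap` and the heredoc check are made up for illustration, and none of this is part of rtk itself.

```python
# Commands the snippet above says should go through rtk by default.
WRAPPED = {
    "git", "ls", "find", "grep", "pytest", "vitest",
    "diff", "wc", "curl", "docker", "kubectl",
}

def wrap(command):
    """Prefix a shell command with rtk only when the snippet's rule says so."""
    if "<<" in command:          # heredocs stay plain
        return command
    parts = command.split()
    if parts and parts[0] in WRAPPED:
        return "rtk " + command
    # Builtins (cd, export, alias), npm/npx, and anything unlisted stay plain.
    return command

examples = [wrap(c) for c in ["git status", "npm run dev", "cd src"]]
```

Anything not on the list falls through unchanged, which matches the "use the normal command instead" fallback in the snippet.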

flat scaffold
#

Ok I'll try then, seems to be something good especially for tests and ruff outputs

#

Thanks for the plug

frosty zealot
#

What’s the link to the security audit again

#

Nvm found it

nocturne folio
#

istg if u sneeze youll waste so much of ur codex usage

#

i cant do anything

#

its so annoying

#

even with the 2x increase

upper thistle
#

are the limits-issues resolved or what?

still trellis
nocturne folio
still trellis
#

no it was unbelievable but this week it seems to be back to reality

strong jungle
#

anyone else getting this message? stream disconnected before completion: An error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID 4697ff8e-2815-4364-bcc6-d97b1fe4418b in your message.

  • I keep getting it when its almost done with a task and its sucking up my usage
frosty zealot
#

Avengers assemble!

fallen sable
#

Guys is codex free?

raven kettle
#

I'm a student but I pay for Plus 😭

raven kettle
#

To support the next generation of builders, we’re offering verified university students in the United States and Canada $100 in credits to use in Codex.

fallen sable
#

Not in America tho 😭

raven kettle
#

US and Canada only 😔

#

Wow, the offer isn't valid in any cool countries.

silent heart
short linden
#

Since I started using ChatGPT/Codex, auto top-up has somehow been charging me for 125 credits at the original price of 250 credits. My credits started from "0", so the initial billing setup might have gone wrong somehow. Until now Codex has charged me $10 for every 125 credits 🤬🤬🤬 I will try contacting support. I just came to warn you guys if you're just starting to use Codex.

hushed lava
ebon ingot
deft gyro
#

its literally

#

not better than high in 99% of usecases

#

like

#

yes please feed me context rot

orchid plume
# deft gyro who actually uses xhigh

true, voratiq's benchmark even shows that high gets almost the same rating as xhigh, but xhigh takes a lot longer and presumably burns a lot more tokens

deft gyro
#

yeah voratiq benchmark seems to line up with my experience

#

except g31ph being below claude haiku lol

hushed lava
ebon ingot
hushed lava
#

Does anyone know the difference between the codex weekly usage and the 5-hour limit, and whether they're higher or lower on a business team plan compared to the pro plan and the plus plan? Is there any difference in how much more usage each one gets, percentage-wise? Someone let me know. And has anyone seen whether you actually take up less usage on the CLI, the VS Code extension, or the codex app?

raven kettle
sand shuttle
#

is 5.4 any good over 5.3 codex

raven kettle
#

🧠

sand shuttle
ebon ingot
silent heart
#

cuz i am on pc

#

and i type fast on keyboard, even asked ChatGPT

cunning thicket
#

hey folks. now that I’ve completed Trusted Access for Cyber on chatgpt.com/cyber, what additional things can I do, and do I need to switch the model to GPT-5.3-Codex?

winged ore
#

there is an annoying thing that Im noticing in codex..
it has the terrible habit of spawning a terminal session and abandoning it without properly killing the console session..

#

for example, the AI will run something like npm run dev which starts a live server.. well, it is ok, the server runs, the AI does a curl request to see the page is correct.. but then, since the live server process never finishes.. codex leaves it running..
and then, after a while, next time, it starts another live server, which automatically rolls over to the next available port.. and like that, the system gets littered with useless running processes
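
The cleanup that's missing looks roughly like this in Python. It's only a sketch: a sleeping child process stands in for the dev server, and the variable names are made up. A real `npm run dev` can spawn its own children, which a plain `terminate()` may not reach.

```python
import subprocess
import sys

# Stand-in for a long-running dev server (assumption: anything that never exits).
server = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
try:
    still_running = server.poll() is None
    # ... this is where the page check (e.g. a curl request) would happen ...
finally:
    # Always stop the server once the check is done, even if the check failed.
    server.terminate()
    try:
        server.wait(timeout=5)
    except subprocess.TimeoutExpired:
        server.kill()   # escalate if it ignored the polite request
        server.wait()

exit_code = server.returncode
```

The `try`/`finally` is the important part: the process gets stopped no matter how the check goes, so nothing is left holding a port for the next run.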

hushed lava
#

10% used on weekly limit on plus plan might be like 5% on pro plan

#

etc

orchid plume
whole gate
orchid plume
whole gate
#

That's just the standard ChatGPT Pro for me. I really don't know what people are doing with Codex that they're using up ALL of their allocation haha 🙂

hushed lava
timber cosmos
#

anyone have advice for driving better frontend/UI development? I find by default, Codex can't design itself out of a paper bag. Maybe there are ways I could prompt/support it better ..?

whole gate
#

Use a tool such as Figma for designing flows. I usually tell my LLM to "create a design like Figma" and it gets that.

plucky halo
winged ore
#

also, some skills can be really heavy

silent heart
#

.

nocturne folio
#

omg bruh

orchid plume
orchid plume
timber cosmos
whole gate
#

It took Codex three prompts to create an MCP server integration with my app, it has been super useful, I expose database table schema data, the code planning agents LOVE it 🙂

#

I get stuff like "Oh, this is excellent I can see the entire database schema here" hehe

#

Since a lot of my application is controlled by the configuration data in the database, this fills in gaps the agents need to know, too, there's options to dump config tables and the like.

#

Like, none of the navbar is actually defined in the app, it's all read from the database and built dynamically based on your security roles, and while there's documentation that describes this in the project, the agents being able to just look at that data is far more effective.

ocean fulcrum
#

I can't post on reddit for some reason:
Hello all,
I am building a web app based on Python. The app is basically parsing PDF documents for my company. I need to embed AI into it in order to improve accuracy and speed.
I am curious to know if it is possible to use ONE ChatGPT Plus account that will connect to the back-end only through the OAuth Sign-In method instead of using an API key.
My idea is basically this: OpenClaw has it where you have the option to use OpenAI through OAuth instead of an API key. Can I use this same idea in my project?
The AI's responsibility is: the end-user uploads a PDF, it goes through my Python parser web app, then the AI checks it, corrects what needs to be corrected, and spits out the .csv file that the end-user needs.
Ask questions if something is unclear, please do give me your input if you have any knowledge about this.

whole gate
#

That's very easy, my app does something similar, I import data from Azure Billing API and stage it, the app has an MCP server built in that can read the database tables, this does require quite a bit of work, however, as you need to implement OAuth 2.1 fully to publish a public MCP server. With this data available to agents, it's very easy to do reporting and utilization projections

#

I did not implement specifically a document parsing provider, since there are APIs for this data, but adding an adapter to import documents wouldn't be difficult

#

It was my backup plan if I couldn't get the API imports working 🙂

timber cosmos
calm sigil
whole gate
#

Have to enable developer mode and add a custom app

#

In my case my auth is locked to a provided code.

#

There are callback and scope details that are required to get that all working

timber cosmos
calm sigil
whole gate
#

I actually only have MCP attached to Claude, which does all of the planning and design, Codex does 100% of the code work. It's a good combo. Codex often does things that Claude comments on in reviews like "Codex went ahead and made this additional change which is a good catch" stuff like that 🙂

#

Codex just, for example, removed an unused reference from a python file that Claude specifically said "You can leave this as it does no harm". Codex understands my security approach and reads the documents that also detail in the coding standards "No unused references" and while I don't test for that on the Python side, I do test for it in the ReactTS frontend 🙂

whole gate
#

My dad wrote the CISSP...

calm sigil
#

I meant with codex, it allows you to push more insecure processes.

ocean fulcrum
whole gate
#

I told Claude "Implement MCP server in my app" and I'm not actually trying to be flippant here. The state of my app, however, was that it already was built on OAuth2, if you are really interested all of it is public https://github.com/elide-us/elideus-group

ocean fulcrum
#

is the keyword MCP server or something?

whole gate
#

I will add that the library for FastMCP had its own starlette server instance in the library, and since my app is already FastAPI, I just unwrapped the library and ran the MCP message processor directly.

#

pip install mcp

#

To run an MCP server, however, requires a complete OAuth2.1 implementation, certificates, JWT generation, there's a lot to it.

#

In my case, I support OAuth from Microsoft, Google and Discord, and there are configurations required for each of those services to support this as well.

ocean fulcrum
#

well I'm not that advanced so I will admit, I am having trouble understanding what you're saying

calm sigil
#

Anyone ever seen or overridden Codex's default prompt?

whole gate
#

For sure, I mean, I've been building enterprise infrastructure for decades, what might seem simple to me is... probably not 😄

#

FastMCP is the library from Anthropic that can run an MCP server. There are some infrastructure requirements to go with it if you want it to be accessible from public services like in your Claude.ai or ChatGPT, but for local work it's probably easier.

whole gate
#

It does often seem like a lot of extra work, I actually work for Microsoft, so my perspective is probably a little biased, but... there are reasons we do all that extra crap 😄

ocean fulcrum
calm sigil
whole gate
#

I think for what you're doing you don't need an MCP server, you just need to set up an API to call, send your documents for processing and get back the response

ocean fulcrum
#

But I also want it to continuously improve the codebase for when an "unseen pdf" is uploaded

whole gate
#

I think in this scenario it would likely end up being more expensive if you allow an AI to read your documents via MCP. It's pretty sloppy, I've had Claude read the same endpoint four or five times in a row for the same task. Set up two tables, one to stage unprocessed documents, set up a timer job to poll for new documents, open them, send them to an API that can do whatever you're looking for (OpenAI and many others provide a lot of content review options) and then move the results into a results table or storage endpoint.
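
The staging-and-polling flow above could be sketched roughly like this. SQLite, the table names, and the `review` callback are all assumptions made for illustration; `review` stands in for whatever external API call does the actual document check.

```python
import sqlite3
from typing import Callable


def init_db(conn: sqlite3.Connection) -> None:
    """Create the staging table and the results table if they are missing."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS staged_documents (
            id INTEGER PRIMARY KEY,
            path TEXT NOT NULL,
            processed INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE IF NOT EXISTS document_results (
            document_id INTEGER PRIMARY KEY REFERENCES staged_documents(id),
            csv_text TEXT NOT NULL
        );
    """)


def process_pending(conn: sqlite3.Connection, review: Callable[[str], str]) -> int:
    """Run one polling pass and return how many documents were processed."""
    pending = conn.execute(
        "SELECT id, path FROM staged_documents WHERE processed = 0 ORDER BY id"
    ).fetchall()
    for doc_id, path in pending:
        csv_text = review(path)  # the external API call would go here
        with conn:  # commit the result and the processed flag together
            conn.execute(
                "INSERT INTO document_results (document_id, csv_text) VALUES (?, ?)",
                (doc_id, csv_text),
            )
            conn.execute(
                "UPDATE staged_documents SET processed = 1 WHERE id = ?", (doc_id,)
            )
    return len(pending)
```

A timer job would then just call `process_pending` on a schedule, for example from a loop with `time.sleep` or a cron entry. Because each document is marked as processed in the same transaction that stores its result, a later pass does not pick it up again.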

boreal holly
whole gate
#

Win32 hasn't changed since Windows 7 SP2 (Kernel 6.3), most of the security issues that are discovered these days are often patched with Rust, but I don't think they touch it unless they have to

#

To be fair, the main reason I still use Windows is because it supports FIDO2 for login.

soft glade
#

why does codex still ask for permission for every new file edit? this permission does not seem to work as described. WSL CLI if that matters

soft glade
limpid flax
#

the file it's trying to edit

ocean fulcrum
soft glade
whole gate
#

I will also add that the last time I checked there were 11,300 models available in the Microsoft Foundry (Azure) which allows you to also build automation flows to kinda create your own API systems, but that is all also tied into setting up a Microsoft tenant and endpoint configurations. Nothing more obtuse than any other cloud service, though.

soft glade
whole gate
#

Have you considered asking ChatGPT what's going on? 🤣

boreal holly
soft glade
#

genuinely the issue has gone away as soon as i posted about it. maybe because i reselected Default permissions again though i'd done it before

boreal holly
#

If you have the `workspace-write` sandbox, it may complain when the submodule index changes. Idk if what I recommended would help with that (probably not) but keep us posted

fervent anvil
#

👉 **Codex Needs Memory Enhancement and Anti-Spaghetti Vibe Coding.**👈 As a vibe coding tool Chat GPT-5.4 on vscode would be a million times better if it could keep the whole repo and documents "in mind" when editing and writing code. However it is a constant case of one step forward, fifteen back. So many regressions, loss of context, writing a new path through the code rather than using what was there and so on. Needs smarter training, but it also likely needs something better than transformers in reality.

fervent anvil
#

You disagree that it needs more memory, or its use?

loud dragon
#

i did enable the 'guardian approvals' experimental option, curious if that's getting logged differently for some reason

boreal holly
fervent anvil
#

Yes, that is why I think transformers aren't the solution for more memory because of how they scale. When the system can't keep the full context in view it has the spaghetti "design pattern." Meaning it is harder for others to read and harder to maintain.

cosmic carbon
#

Ok, so I'm having a lot of problems with easy tasks. Me: "hey codex, add the image file from the image folder to the page with the same name." Codex thinks for 10 min. I try to create the PR: "binary code not supported". Me: "Hey codex, binary code not supported." Codex thinks for 10 min, and the screen freezes. I reload 4 times. I try to create the PR. Binary code not supported again. Hmm, it's been an hour and we still don't have the image on the web page. We have some real problems here. We need to admit that and fix them.

boreal holly
# fervent anvil Yes, that is why I think transformers aren't the solution for more memory becaus...

Perhaps, but it's modeled after human intelligence. When I read your message I prefilled it into my brain, and now as I'm tapping away on the keyboard I'm decoding the response.

Having an infinitely large context window is not the answer. If we went that route, then in order for me to type a response I have to think back to every text I ever read since I was a baby, all the way up to the text in your comment to come up with a response.

OpenAI's solution is pretty brilliant. They have a PhD level checkpoint full of vast knowledge, and a context window large enough to hold what's important. When that window fills up, it converts all of it into a mental state snapshot. It's kinda like waking up in the morning, remembering the important details of the last few weeks/months, and living in today's world.

#

Whatever they're doing over there in the research lab is good stuff imho

vivid viper
#

nvm i'm blind that's like 3 lines below

chrome raven
cunning thicket
#

hey folks. now that I’ve completed Trusted Access for Cyber on chatgpt.com/cyber, what additional things can I do, and do I need to switch the model to GPT-5.3-Codex?

orchid plume
#

am starting to read the designing frontends guide but interestingly the example prompt they give mentioned an image generation tool. Surely this is API only, not supported in subscriptions (a tool to generate images) in Codex CLI. Unless I've missed the tool entirely and didn't realise such a thing existed 😂

EDIT: there's such a tool, but it's behind an experimental feature flag and even if enabled it seems to be unsupported on subscriptions

still aurora
#

stuff like this makes codex awesome:

"Negative flags are harder to reason about."

topaz stream
velvet zephyr
#

you can just build things

lime inlet
#

Is it possible to use codex to work on Kaggle notebooks? Is this what MCPs are for and if yes does such an MCP exist?

velvet zephyr
#

let codex figure out 🧙‍♂️

lime inlet
#

Right now I am saving the notebook, reading it into codex, importing in Kaggle, executing, downloading... kind of defeats the idea of the agentic loop.

balmy acorn
#

anybody else having hella issues

sand shuttle
#

yoo

#

5.3 or 5.4

#

in terms of output quality and token usage