#codex-discussions
I wouldn't say so. I would say that we game devs are a different category.
yeah in the future for sure, I see a future in which you only need a one line prompt to make a AAA quality game. It's just going to be very surprising, one line isn't much to go on when you're essentially asking an AI to write a novel, so are the users really getting what they want? Maybe, if what they want is vague enough.
I spent 8 hours once adjusting font outlines and rotating text a bit to the left until realizing I didnt even need any of that and eventually I completely rewrote the Ui into way better one in less time anyway
I was thinking like a technician, not a gamer. Because gamer would never notice that font outline
they would just either get bored, or have so much fun they'd be tunnelfocused on gameplay (not on font, like me)
Although the gamer might have noticed it unconsciously, and preferred it that way without realising. As a game dev it is our job to figure such things out ahead of time.
exactly, something that a gamer wouldn't notice. Trouble with things that are perfect is the user only sees perfection, they don't see the 8 hours it took to make it happen
800 hours in some cases.
yeah maybe you didn't actually end up using it but that's just an example
Over 9,000!
I've only spent like 2000 hours on this game so far.
I came to my conclusions and concerns because I was playing lots of successful games (still made by traditional AAA companies) that consistently leave small bugs out there in releases
"damn the button text is slightly overflowing in English locale here"
I am sure at least 1 guy noticed this in that 5000 man company. But they purposely gave no fuks.
Which works out because I respect that particular game and it is widely successful too. They created value. They did not have to bother about overflowing font like I did.
I love agents though, I'm doing stuff I never would have done before and faster and better than I could have done before, but that's coming from a place of sort of already knowing what I'm doing and being able to see beneath the surface
In fact i am sure nobody noticed like I did
from players I mean
If you would rather add new features every week/month and release it to world. as company or as individual its smart af not to fix overflowing text if that gives you more time
I am sure thats a purposeful prioritization framework utilized by many game devs who ship product fast and efficiently
Not me.
I am not close to efficient
I have only artistic vision as a controlling factor
I mean like im perfect lol
but ofc its fun to work on important stuff and move on even when things arent perfect. and thinking like a gamer rather than engineer seemed to help me lately
I understand that drive, but I don't share it. I can't move on when something isn't the way I envision it.
haha fair enough
I mean thats whole point of being a dev working on your vision
if overflowing text is not something you would rather ignore then you wont, ezpz
so yea
ya
so that represents another way to make the same argument, a human felt something when they played this game and they said "this is good" so little things like the overlapping text didn't matter in the grand scheme
yea
I am saying this from personal experience. I absolutely love that game. Its really well made. But I consistently notice "damn devs got REALLY lazy here" and yet... it doesnt even matter. Its almost like they mastered taking shortcuts in the ways that dont matter, yet some solo devs including me would probably spend 10x more time for barely any gain
I do wanna learn that.
it's about seeing the forest for the trees, or in this case, taking as many shortcuts as possible to meet a production deadline? lol idk pick one
We ended up creating a full language for our game. That idea is incredibly inefficient.
You can only ship the initial release once, better give it the time it needs before pulling the ripcord.
indeed
🫣
Updates are great but everyone remembers the launch lol
unless they dont xD
This galaxy generator I'm working on. I don't think I gained anything by using AI. It's been taking so long I probably would have finished it faster on my own.
This is probably going to get more and more rare as time goes by
if you're talking about games, oh the hell they don't lol. KSP 2, Cities Skylines 2, Roller Coaster Tycoon World, SimCity (2013), half baked products that lost steam because no one could forgive their poor launch state
and the list goes on and on
I think devs could not deliver the product so they gave up
Most of those games were developed as AAA 1.0s that were intended to meet all their financial goals at release date, same fiscal quarter. And if dev team couldnt deliver a stable valuable product and customers did not shell out enough money, spending more millions on completely overhauling the game would be a disaster
Meanwhile paying $20-200 for codex isnt some sinking disaster situation
sometimes that might be true but it also has to do with being devastated by the real life reaction to something you care about, so why would you want to keep caring about it? That bad launch drives good devs out of bad companies like ants from fire and the business collapses so the publisher has to find another developer and the cycle continues
And I just crashed Unity...
yeah ofc.
haha you are right.
for indies its emotional damage
for AAA its bank account gone
congrats, that takes work lol
Thanks. I'm a bit of an expert though.
lol
now, that said, who wants to playtest my half baked barely working game? 🤣 😭
lol jk, gimme a week
i can but you gotta playtest mine
yeah definitely
Mine isn't even close to playtesting
I've been working on my own game using Codex exclusively for about 6 months, what kind of games are you guys making?
which stack?
4x spacegame
mine is closer than I ever thought possible, it's lacking content and the UI/UX needs polish but I've been mainly focused on underlying functionality and the client/server crap and it's SO close to the point where I can finally shift gears into content creation
Im using raw C++ using SDL
I started in Unity, switched over to Monogame, and now I am back in Unity.
love Codex so much
nice
I almost went with Monogame
I liked it, but it wasn't right for this game
I'm using a JS stack for Android deployment, should be enough for a 2D game that's mostly about numbers
yeah I like JS for its codex agentic magic
Do send me a link about it in a pm
will do
BRO BE SO FOR REAL LIKE COME ON
hollandianish?
YES HOW YOU KNOW
hahah guessing
that's not such a weird reaction of it
what did you ask for?
Basically asked it to find vulnerabilities in his software
"It might be hacking!"
WHY
like people were reverse engineering stuff?
to basically scan my database and find vulnerabilities
maybe he just needs to be more specific about the fact that he made and owns the software he's scanning
idk
i redefined the prompt to not do any sql injections LOL
now it works
ooh
Their safety AI (probably a separate AI that reads stuff first, and is much much simpler) flagged it because it thinks maybe this is someone who is trying to hack someone else's software.
yeah
@signal tapir how do you know what i asked
how do you guys prevent this behaviour: I ripped a spec from a site and then got 5.5 to make a skill from it.
Part of the rip process is to take any vids and convert them into 5x5 frame captures and replace the links to target the frame captures instead. This is workflow history, doesnt need to be mentioned in the skill. 5.5 just goes and adds the history of the work flow to the skill. It doesn't understand that the history doesnt need to be in the document. It does this ALL the time.
WebP contact sheets replace MP4 animation examples in the bundled rip.
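For anyone wanting to reproduce the 5x5 frame-capture step, here is a minimal sketch of just the layout math: which frames to sample and where each tile lands on the sheet. The function name, tile size and frame count are assumptions for illustration; decoding the video and writing the WebP file are not shown.

```python
def contact_sheet_layout(total_frames, rows=5, cols=5, tile_w=160, tile_h=90):
    """Pick rows*cols evenly spaced frame indices and their tile positions.

    Returns a list of (frame_index, x, y) tuples, one per tile, in
    left-to-right, top-to-bottom order. All parameters are hypothetical
    defaults, not values taken from the workflow described above.
    """
    if total_frames < 1:
        raise ValueError("video has no frames")
    cells = rows * cols
    layout = []
    for i in range(cells):
        # Sample the middle of each of the `cells` equal time slices.
        frame = min(total_frames - 1, int((i + 0.5) * total_frames / cells))
        x = (i % cols) * tile_w   # column offset in pixels
        y = (i // cols) * tile_h  # row offset in pixels
        layout.append((frame, x, y))
    return layout


layout = contact_sheet_layout(total_frames=250)
```

Each tuple can then be handed to whatever tool extracts and pastes the frames.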
Well I am dutch, like 5-6 generations back, but I know none of it. I do know an AI that can translate though.
HOPPI ANNOYINH it looks like its working now
oh lol
well in that case you might as well call me Norwegian lol
then you can call me a south african
you guys speak English as native language or
lol idk, I think we're all just citizens of the world.
those are my natives
my native language is only spoken by 3m people so
dont know what that is
^^ case in point
YOU SPEAK THIS
I'm Swedish. We have Swedish and Swenglish as native tongues.
Swenglish is English with a really strong dialect, and a tendency to just anglify words we don't remember in English.
looool
yep
interesting
I mean arabic also looks different from most languages similarly
i need to learn how to speak greek
written
ITS EZ AS HELL
yeah but its not like latin/english
My game has a language where two voices speak at once. One carries the semantic meaning, and the other adds emotional context.
yee
its just some drawings aswell
lolol true
like i can speak it
but draw
no
just no
last time i wrote a cursed word instead the word i needed to write
bro anybody has now issues with codex? like after last update i start it and then codex simply doesnt answer?
you press enter and nothing happens?
ok back to my questions and answers, I opened a can of Monster and brought my laptop outside for some fresh air and a fresh perspective lol
then tell claude to make codex and make no mistakes
it just goes like this
dayum
these are just drawings for me too
thats russian
it looks like latin
Ukrainian I think
codex isnt responding
codex is beaten
lolol
LOLOLOLOLOL
Ik praat nederlands, is dat toegestaan (I speak Dutch, is that allowed)
codex aspp
HOW IS THIS NOT BLOCKED
well it doesnt actually read content it reads letters
if one of them is not English maybe
what if its mostly english but
oh
idk I only have english keyboard...
thats yex
And then there are those of us on Linux
looool
try updating from windows store, or uninstalling/reinstalling. or use different PC with different windows ver
Try asking GPT
if i uninstall my files will still be there when i download it back
I dont face any bugs at all on mac
im jelly. need to get macbook
I didnt realize how windows got out of fashion as of late lmao
yes
codex data is saved separately, i doubt it will be gone but you can try archiving it and moving it for safety. ask chatgpt for exact instructions it'd explain better than me xD
agree. be safe.
its annoying how those blue blips on completed chats disappear after restarting app or updating though. I keep forgetting unfinished work still sitting there or unexecuted plan.md
i do
lawdy. I just have a $5 api bill
"Hey Codex, uninstall all my files."
i meant if i uninstall codex
would my files be safe
that has to be mostly cache though right?
My agent instructions are stored in the project directory, so the only thing I'd lose is settings in the actual Codex app which only include some minor PR draft instructions and a couple plugins. Would take like 2 minutes to get back up. But that could be different for you if you have more going on in the app.
I'm handling most authentication via command line and environment variables so I could use any agent and they could pick up where the last one left off
gonna try Claude Code at some point
I've been using 4.7 Opus via Cursor so that necessitated the interoperability and modularity of my agentic workflow
Fellas, I have been thinking about something... how about creating a Linux distro that is mostly AI driven? Complete access to everything the system has to offer... Would you use something like that?
Well the real question is, what does your specific AI oriented distro do that another agent installed in Ubuntu can't do?
I mean Codex seems to have pretty comprehensive control over my Linux server via SSH
Basically having total powers, meaning it can use any app (terminal or GUI), read any file, make any sudo changes... basically full powers, comparable to those of a real user.
Cant we do that already?
if it can be done via CLI, Codex can already search the internet and figure out how to do it. Not sure I really see a market for an agentic distro, except maybe for a distro that is specifically intended for high load AI tasks like protein folding or something
but those already exist
I told you, complete root privileges, and able to use the graphic capabilities of the distro
Im not talking about GUI
Graphical is literally the first word in GUI but okay lol
im talking about anything you use. You use always a GUI, when you use a computer, no?
I am talking clicking on a page, watching a YouTube video, etc...
not always
The current tools (like Codex) cannot use websites or things like Photoshop
I think apps like photoshop are changing to include an api for llms, and eventually just an api for llms
A lot of the stuff we use a GUI for or even Photoshop can be accomplished via CLI with Codex, also Codex actually can use websites via Computer Use with multiple browsers
"Computer use" <- MacOs only
That is why I said "Linux distro"
I use it on Windows every day, it's not MacOS only
but it's not on Linux so yeah
"Computer use" <- MacOs and Windows only
That's why I said "Linux distro"
Specially since OpenAI seems to have zero interest of even making a Codex App for Linux
I still think it's a solution looking for a problem, and you're talking about something on a pretty massive scale, designing an agent that can use graphical tools isn't an indie project
imo
Im not trying to build the next Ubuntu. It's clearly an experiment. Not something where I would put sensitive info.
My plus expired If I buy again will my usage instantly grow back to 100% or not?
yes
cuz my weekly usage was already like 0%
if your subscription expired, resubscribing will bring it back to 100% available. Same thing with upgrading.
cool
but why my plus was on like 10% weekly and when downgrade to free it shows only have 49%
remaining
oh prob because it expired an hour ago and I used it in CLI and didn't notice
guys what ur opinion whats better claude code or codex?
discussion about non OpenAI models is in #ai-discussions
oh oki
no it didn't
rip
that doesn't make sense, if you had no subscription, subscribing obviously increases your usage. When I upgraded from free to Pro, it went back up to 100% right away
so yours and my definition of "resubscribe" might be different
if it barely 0% already I don't have to waste my 3 days money I can just wait it reset
it reset my 5 hour usage limit but not Weekly
unplug it and plug it back in
Yo guys wsp
lol I mean, try logging out and logging back in, I think there is a known issue with caching
but hell idk, seems like any time you pay money, that resets your usage. So if that is your first subscription or if you're upgrading from Plus to Pro, any time you pay them for a subscription the usage should be reset.
but maybe I'm totally wrong
Yo i am new to agentic coding and kinda broke any advice
Kinda broke or flat broke "how are you even online right now?" kinda broke? lol if you can, the $20 Plus subscription has a lot of value
you can get some pretty decent beginner stuff fully made and working with less than $20
but my biggest advice is, try to understand what it is that the agent is doing, read along with it
Yea but the condition of plus is bad because people use free trial and plus quota is also filled with people trying to abuse it
oh I didn't know there was a queued system or overcrowding or anything, I'm living that Pro life lol
Until the end of May, they're offering 10x usage on Pro compared to Plus instead of the standard 5x.
Claude is also making massive moves like they are partnering with spacex and aws just to fix overcrowding gpt has to do it also
Hi guys im new
Hi
Hi bro
hey you're here just in time to witness the downfall of humans and the takeover of our new AI overlord. Welcome to the beginning of the end! lol
looks like it doesn't reset quotas after resubscription. The reason I need that method is I need 2x Plus usage this week but 10x Pro is too expensive for me and a lot of it would go to waste.
Lol
Just use glm 5.1 guys it is also paid rn but atleast it is usable and better than codex for now only for now
o
looks like GLM 5.1 falls significantly short of GPT 5.5 in agentic coding
but then GPT 5.5 is about as expensive as is gets
Yea glm is like 1.40$ input per mil token only
Claude is more expensive than gpt it is hella expensive like 10$ man thats too much and they have an 20$ base subscription
Thats subscription used in no time
Claude models only seem more expensive because they're less efficient and uses a lot more tokens, Claude is really falling behind these days imo
Yea it uses alot of tokens but it also leads in coding and thats their main thing and they have more profit margins because of tokens from 3b to 30b in 1 year
true, that's why I pay the money and use 4.7 Opus for high horizon tasks
but not for anything other than audits and planning
I dont think openai has that problem they already have more compute
AI support
I was conflating the upgrade path with the renewal path, my bad, never trust humans.
Copilot only gives the AI 50 lines of code, but Codex gives it whole files, which spends more tokens. Looks like I need to make it spend less.
Yea but if I had the chance i would make a community api for people, it will have a queue system. if anyone's down for it then I'll make it lol
Any methods make me save some tokens?
But i will need contributors
Use specific prompts to get the work done faster and with few prompts and tokens
Create a list of architecture invariants to tell the agent what to do, what not to do and where to look for things so it doesn't have to guess and use more thinking tokens
that's just the very tip of the iceberg lol
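To make the "architecture invariants" tip a bit more concrete, here is a minimal sketch that keeps the rules as data and renders them into a text file an agent can read. The class name, the three sections and the example rules are all made up for illustration; nothing here is a Codex feature or a required format.

```python
from dataclasses import dataclass, field


@dataclass
class Invariants:
    """Hypothetical container for project rules handed to a coding agent."""
    must: list = field(default_factory=list)       # what to do
    must_not: list = field(default_factory=list)   # what not to do
    where: dict = field(default_factory=dict)      # where to look for things

    def render(self):
        """Render the rules as a small markdown-style document."""
        lines = ["# Architecture invariants", "", "## Always"]
        lines += [f"- {rule}" for rule in self.must]
        lines += ["", "## Never"]
        lines += [f"- {rule}" for rule in self.must_not]
        lines += ["", "## Where things live"]
        lines += [f"- {topic}: {path}" for topic, path in self.where.items()]
        return "\n".join(lines) + "\n"


# Example rules are invented; replace them with your project's own.
inv = Invariants(
    must=["Route all network calls through the client layer"],
    must_not=["Do not put game rules in UI components"],
    where={"save system": "src/persistence/"},
)
text = inv.render()
```

The rendered text would go into whatever instructions file your agent picks up from the project directory, so it doesn't have to guess.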
Yo andrew how many subscriptions do you have btw lol
Never ask a woman her age
Never ask a man his salary
And a dev his ai subscriptions lol
$200 Cursor Ultra, $200 ChatGPT Pro, $115 Scenario (gonna cancel that though, probably) and soon I'll have a Claude Code sub as well
and I've been at this only 20 days so far, I first learned about agentic vibe coding and left the rest of my life behind lmao
Fahh lol how much do you make or its just hobby projects
I've been hyper focused on developing this game, a game I've always thought about but I assumed it would be too complex for me to actually accomplish and then I discovered agents
Oo topic?/name? Of the game
Yo they added live animation of loc being changed haha looks like its about to TP
This is just a personal project but I'm calling it the million dollar plan lol. Trying to make this game as good as it can be.
it's an Idle clicker I'm calling The Bureaucracy, an absurdist 1930s office style paper pusher game
the whole game is centered around the stamps, not postage stamps but rubber stamps with handles. You stamp paperwork. That's the game. lol there's a lot more to it but that could get long.
Oo nice what any help because i dont have ai subscriptions but i specialize in marketing and helping you market this niche game will make me feel good so if you are down then tell me
haha I appreciate the offer but my game isn't anywhere near ready for that, I will need playtesters in a couple weeks though
Yea i can test it
If you need any help in development then tell me lol
yeah you're not wrong, but at the rate I've been going with agentic vibe coding, it's coming along way faster than I ever expected
Codex is blowing my mind every day, always learning something new about what it can do
The curse of vibe coding is that if an error appears you have to make 2 more errors to fix the error thats why it takes a life time to code a real project with ai
easy solution, tell the AI to keep working until there aren't any errors lol
well that's a very simple answer
but I'm tired
Lol but then your nightmare comes credits expired no more credits lol
If youre ok having 100k lines that does 35k lines of work
then u tell optimize
and it pushes 200kloc
maybe at first, the code can be a little messy and that's fine, optimization can come later
It will add complexity via abstractions and you will get MORE code.
Is it okay to have two actually paid plus subscriptions on two different accounts to work on a single project?
Optimization can come later is a lie lol because after you finish the game without optimization it will take a hell lot of time to optimize it
ye but it just needs to create a wrap around the code to know how to optimize it properly so then it can remove such wrap
Is the Codex app ever coming to Linux I wonder?
Any idea?
sure, except you might use a few more tokens because you'll have less to cache when switching accounts
Thanks!
Honestly i came here to get a free ai subscription because i discovered how to edit vids using ai models but i got friends in return fair trade
Well its now time to sell my kidney and give all the money to sam altman
I think what you need is a good architecture. The problem is that the llms dont get that. They dont just understand architecture, they need guidance on it. Without guidance they patch things in place without understanding the bigger picture even if you tell them to. Like a simple reorder and an async gate might solve a whole arm of null value gating in 20+ files, but it wont notice - it will optimise the null gating into an abstraction and make it dry, that means MORE code. It just doesnt see this sort of thing without guidance.
Maybe one day soon it will, but atm it doesnt
it just brute forces its way through fragile solutions
I actually solved that problem, you're totally right about this. We struggled with this so much that we decided "nah I'll fix this myself for AI"
Improved the performance so much
Yeah i solved it as well, but we are talking in the context of vibe coders
Can you share your solution? I'm very interested to improve it on my side if you got any insights
Or at least method if not open-source, would love to learn
There are multiple approaches of course but I just have Opus run an audit between major phases of development. I ask it to "comprehensively evaluate the execution of X plan and PRs ###", and it can use the PR body that has been molded and shaped by agent workflow so it contains blast radius, change contract, migration notes, acceptance criteria, test results, agent handoff notes, content change class and a bunch of other stuff. So Opus has the context to understand exactly what changed and why. The purpose of that is primarily to make sure everything is in order after large changes but the same sort of approach can also be used to optimize the code as well.
it is a rule on this server that all non OpenAI discussion take place in #ai-discussions please move discussion of Claude there
ah sorry about that
I have instructions in the developer-instructions that dictate writing code based off local exemplars if they're available. I have an architecture skill that is topical and has reference files that define the seams of the layers, the layer constraints and purposes with code examples etc.
Then for idiosyncrasies i encounter that slip through the system i have another topical skill that is project guard rails where i add generalised rules that cover those as they show up.
When the project is first spinning up i build this stuff as i go in the first stages.
In the end it just works.
I generate skills for packages as i need them. For example if i am using stripe i get chatgpt to do a deep research on the latest api and best practises and i build a topical skill for it.
AND i look at the code.
The only part i dont pay too much attention to is the front end
If it looks good and is responsive i dont care how it got done
How do you do it?
Oh that is actually pretty solid, I do have a baseline layer of exactly this but I've mostly limited it to rules, global constraints and an outline of 'where lives what' on a high level + a basic description of its purposes.
The core of my method lives in entirely removing any semblance of human-defined architecture baseline -- Basically a graph database that is aware of all code construct relationships (docs, functions, structs, enums, etc. etc. etc.) and creates semantic links based on references within code or docs, then use an accumulated spherical similarity index with subgraph recombination as a semantic injection into AI context based on current task / conversation context which always provides AI with all relevant documentation context and relationship between code and its function.
The graph embeddings rebuild with local AI for every batch of changes
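A toy sketch of the general idea, for readers trying to picture it: nodes are code constructs with an embedding vector, edges are references between them, and context is picked by similarity to the current task plus one hop of neighbours. This is not the poster's system. It swaps in plain cosine similarity and a one-hop expansion because the "accumulated spherical similarity index with subgraph recombination" isn't specified, and the vectors and node names are hand-written placeholders rather than real embeddings.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class CodeGraph:
    """Hypothetical in-memory stand-in for the graph database."""

    def __init__(self):
        self.vectors = {}              # node name -> embedding vector
        self.edges = defaultdict(set)  # node name -> referenced/referencing nodes

    def add_node(self, name, vector):
        self.vectors[name] = vector

    def add_reference(self, src, dst):
        # Treat references as undirected links for retrieval purposes.
        self.edges[src].add(dst)
        self.edges[dst].add(src)

    def context_for(self, query_vector, top_k=1):
        """Return the top_k most similar nodes plus their direct neighbours."""
        ranked = sorted(
            self.vectors,
            key=lambda n: cosine(query_vector, self.vectors[n]),
            reverse=True,
        )
        seeds = ranked[:top_k]
        selected = set(seeds)
        for seed in seeds:
            selected |= self.edges[seed]
        return sorted(selected)


g = CodeGraph()
g.add_node("fn spawn_fleet", [1.0, 0.0, 0.0])
g.add_node("struct Fleet", [0.8, 0.2, 0.0])
g.add_node("doc: fleet lifecycle", [0.6, 0.4, 0.0])
g.add_node("fn render_hud", [0.0, 0.0, 1.0])
g.add_reference("fn spawn_fleet", "struct Fleet")
g.add_reference("struct Fleet", "doc: fleet lifecycle")

context = g.context_for([1.0, 0.1, 0.0], top_k=1)
```

Here the query lands on `fn spawn_fleet` and pulls in `struct Fleet` through the reference edge, while the unrelated HUD function stays out of context.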
I see a lot of this sort of approach for delivering relevant context to the llm, haven't seen it for architecture yet
I might make it today over the 2h goal
The best part is that it is all compiling, so I am barely using tokens
Make sure you don't let your AI do the compilation
Yeah it's unique, works only with Rust though
huh? that's one of the best things about agents, compiling, discovering build issues, recompiling, then handing you working code every time
This happens everytime I try to resume a session. Any idea?
I get this caution after 5 re-tries
Guys, does Go plan give more codex usage than Free plan?
How is your experience with goals? i am running one now on a mechanical refactor, hoping for the best....
mechanical refactor?
as in no business logic changes, but the refactor is wide, say when changing terms/naming's both db and code.
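A minimal sketch of what "mechanical" means here: a whole-word rename applied the same way to code and SQL text, with no logic touched. The function and the example identifiers are hypothetical; a real refactor of this kind would also need database migrations and a test run, which this does not cover.

```python
import re


def rename_terms(source, renames):
    """Rename whole-word identifiers; returns (new_source, replacement_count).

    `renames` maps old names to new names. Longer names are tried first so
    that one old name being a prefix of another cannot shadow it.
    """
    if not renames:
        return source, 0
    alternatives = sorted(renames, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(old) for old in alternatives) + r")\b"
    )
    return pattern.subn(lambda m: renames[m.group(1)], source)


before = "SELECT customer_id FROM customers; -- customer_ids untouched"
after, n = rename_terms(
    before, {"customer_id": "client_id", "customers": "clients"}
)
```

Because the match is on word boundaries, `customer_ids` is left alone, and the replacement count gives a quick number to compare against what you expected to change.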
I wonder how mine will go cause without goal mine worked for 8h straight
This is bad really bad, you need hands on in majority of those cases. it's a careful process. You can't just send the AI to fix things like that, many of those require broad business logic knowledge and broader view of the codebase, AI is not there currently. AI made typing faster, but in the wrong hands you just get bad code faster...
when an agent is coding it tends to close prematurely. so you can't really trust it to spot edge cases or to have a very clear understanding of what your intent is, without doing your part and making a lot of it explicit
I find it very useful actually. like "stop when no high/critical bugs are found over 4 iterations" or "<some KPI> is under <some measurement>" while its doing a related task
I dont just give it vague high level tasks though, I think that could get into a lot of problems
usually I forget to use it though
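The "stop when no high/critical bugs are found over 4 iterations" criterion quoted above can be written down as a tiny check. This is only a sketch of the rule itself, assuming you record a count of severe bugs per iteration; it says nothing about how the goal feature evaluates conditions internally.

```python
def should_stop(severe_bug_counts, clean_runs_required=4):
    """Return True once the last N iterations all found zero severe bugs.

    severe_bug_counts: number of high/critical bugs found in each iteration
    so far, oldest first. The parameter names are hypothetical.
    """
    if len(severe_bug_counts) < clean_runs_required:
        return False
    recent = severe_bug_counts[-clean_runs_required:]
    return all(count == 0 for count in recent)


history = [3, 1, 0, 0, 2, 0, 0, 0, 0]
done = should_stop(history)
```

Writing the condition as a count over a fixed window is what makes it checkable, compared with a vague "keep going until it's good".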
My idea with the mechanical work is that it's seemingly easier to test, because logic stays the same but namings/terms change... I for one will never use /goals for any logic refactor/work, maybe when GPT-6 will be released it'll be strong enough for this things...
yeah double edged sword for sure, b/c you are basically telling it "do this, no matter what"... so then do you really know how aggressively it will make decisions to realize that goal...
The goal i gave it has a very concrete plan, with tests, what the end goal etc...
or /goal the e2e test xyz must pass
crazy im on $100 Pro plan and run out of weekly usage in 3 days even with it being on x10 usage. When it goes to 5x, its gonna be completely un-useable for me..
Are you using /fast per chance? if not then you need PRO $200.
you sure use it a lot then
nope, never /fast
what are you doing with it?
building iOS apps/jailbreak tweaks
sounds about right. I use my $200 plan in 5-6 days. with or without rtk
i had Claude's $100 plan before and it would last me longer
I have that also, I can assure you the claude plan is used much faster. even with the double limits they just announced
claude token rate is very extreme on xhigh/max compared to codex
they only doubled 5 hour usage though right, not weekly usage
maybe. I still hit 5h limit. its a bit better but it was absolutely awful before
no they increased opus limits also (not sure how much)
how many projects are you running in parallel? i'm working on 2 on 100 plan and stay like 8-12 hours per day and current 10x is just about right
ive been working on several projects like 4-5 but not all at the same time, all the time
ok then i guess your current usage frequency requires more quota. or lesser models. or optimize the way you work
There needs to be better coordination between GPT and Codex. The pulse is awesome, it gives good insight, but a lot of times it is not on the same page or up to speed on current projects in codex. For example it will make a recommendation for an improvement that has already been done in codex. It would be great if the two communicated better. Anybody have a way to make them work better together? I think an easy fix would be that the "pulse" channel could just be shared between both apps
yea i guess
but yes when limits will be halved...:)
🥲
Guys how often do you refactor code?
depends what you want. some things can be done from your side. some others like telling pulse to check your github you can't
currently using notion to manage my projects so having both chat and codex refer to that has helped
github a good idea
for example i made myself a custom UX so i can have linked threads from codex and gpt linked in my app
i would probably wait till they release more features to connect the 2
you can make things like having your own custom connector in chatgpt that has a codex from the other side. so you can have chatgpt communicate with codex via that connector for example
but right now that's only one way, as in tell gpt to query codex. cause gpt doesn't accept incoming messages via connector
i like it. Sweet UX. I used to use openclaw and still do and it built a pretty cool mission control. But way too many headaches with all the openclaw updates and things breaking. Trying to bring all my workflow to gpt/codex
yea, the way i see it, i have all my context between the 2
but they serve very different purposes
chatgpt has access to a huge context inside our projects there
whereas codex i see it more as the on the ground engineer
like i'm using chatgpt to review specs drafted by codex. precisely because with chatgpt i brainstorm and it holds a broader view of the intent than codex
which is one reason i made that ux, cause i was tired to keep copy pasting in various windows
Anyone having issues?
for sure! Gpt is a great copilot
Has 5.3-spark disappeared from the /status ??
new to me
Ayyy you integrated ChatGPT into your Codex wrapper? Sick!
yea it's a dedicated chrome tab just for gpt, and i link a specific codex thread id with a specific gpt chat so i can have both on at same time . plan is to also have various automations like one button to send new specs for review in gpt chat and so on.
electron seems quite cool to use as some kind of sandbox for your workflow
like it was annoying when i checked the opened PR's by codex to get a new page opened in my main chrome. So i made it so it opens a dedicated github right there inside the middle plane, whenever i click a link from inside codex chat
Wait but how have you connected codex context with gpt?
That is pretty cool... I've been thinking, I have a custom UX as well, but it's native. ChatGPT Atlas has agent mode and access to localhost. So if converted to a web app, the ChatGPT Atlas could pilot the local codex agents.
Share context in a streamlined fashion
dang i cant wait till my limits reset to test goal o.o
they're not directly connected
they just live together in the same electron wrapper
and at the wrapper level you can define persistent actions, like always have codex thread x show up with gpt thread y
I don't need to wait for reset. I still have a whole 1% left.
i got 14% but im using it slowly
the codex UX is also by default native in how i made it
i mean it still connects to vanilla appserver/ codex exec
it just rendered inside the electron wrapper
donte reset I have 60% man
i'm experimenting separately with making my own backend to connect directly to oai server, but that's just gonna be a different option toggle
like i want vanilla codex or my own
i changed my approach this week, i left 20% for the last 4 days for planning and start of next week ill blast 80% early again. That i way i dont get ripped off by a reset
Oh yeah, with Electron you might be able to inject javascript into the chatgpt iframe. Like how insane would it be to have a button on generated images that inserts it right into the selected codex thread? I feel like ChatGPT image gen is generally better
it also doesnt cost subscription tokens
had the same plan but had to switch environments, and yeah, could not just say continue until 100% plan completion
I just hope no reset
My bad. I had 1% left. When I got back to the computer it was 0%.
yup. tweaks like this are what i had in mind when i started it
Wouldn't it be nice if they announced the resets in advance? Then you could throw on /fast for a few hours haha
It would be nice, or they could make it opt in
they sometimes do a few hours heads up haha
Thanks to CaptainStarbuck, robertmsale and smileBeda from yesterday for the recommendations on using separate context files & detailed prompts and making it all clear before telling the agents to start to code, it has significantly reduced my token usage on both Claude Code and Codex
reset uncertainty feels very much like trading uncertainty tbh
on one side you have a hard time overusing quota early on, because IF they don't reset you'll have some days without enough usage
on the other side IF they reset you will miss out by not going harder early on
It really helped me to focus on the quality of the project & product instead of worrying about the limits all the time
i'd usually have heaps left but had to change to 5x this month
what are you building?
Specifically a game project on Roblox that is built on a dedicated framework and includes a detailed combat system
This is gonna sound stupid, but I've been trying to finish my weeks with >50% because at the end of the month they're dropping the 2x and I wanna plan my usage around that. Every time they reset I'll have like 80% left
interesting, how is it all going?
Pretty good! Thank you as well for all of the recommendations, they have helped me more than you'd expect
it's amazing what we can do now with these awesome models and harnesses
That's awesome! My oldest son and I build roblox worlds during summer break. You using Rojo/TS?
lol yea it sounds funny
oh man are we still on 2x? I'm totally cooked if we are
Yeah, RoJo specifically! It is helpful for syncing the local files on the Roblox side
yea till end of may
let's hope they find a reason to extend the promo haha
1 pro and 2 plus accounts seem to be the move
1 Pro $200 or 1 Pro $100?
I think either 2 pro $200 or 1 pro lite with 2 plus
Sure, at least give it to a sub-agent with separate context and a dumb cheap model, to report findings. Cause dumping 10k tokens' worth of compile log into your main expensive GPT 5.5 model is a nice way to ruin context and rack up costs
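A rough sketch of the same idea without any model at all: trim the compile log down to the lines that look like errors before anything reaches the expensive context. This is a plain pre-filter, not the sub-agent setup itself, and the pattern and function name here are hypothetical, so adjust them to whatever your build actually prints.

```python
import re

# Hypothetical pattern; tune it to the wording your compiler really uses.
ERROR_PATTERN = re.compile(r"\b(error|fatal|failed)\b", re.IGNORECASE)


def condense_log(log_text: str, context: int = 1, max_lines: int = 40) -> str:
    """Keep only error-looking lines plus `context` neighbours on each side."""
    lines = log_text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            keep.update(range(start, stop))
    # Preserve original order and cap the total size of the excerpt.
    selected = [lines[i] for i in sorted(keep)][:max_lines]
    return "\n".join(selected)


if __name__ == "__main__":
    sample = "\n".join(
        ["compiling a.c", "compiling b.c", "b.c:10: error: x undeclared", "linking", "done"]
    )
    print(condense_log(sample))
```

Whatever comes out of this (or out of the cheap sub-agent) is what you paste into the main thread, instead of the full log.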
these days i'm working to have my go with my custom workflow at this benchmark
seems a very good bench to me, especially that vanilla models are currently at 0% in terms of full completion of a task
Tmux and LLM friendly logs
Or make files
"Make"
did u guys notice gpt 5.5 got better?
Nice! roblox-ts is my favorite tool. You can build the object graph using a json file which turns into a strongly typed javascript object. Integrates really well with VSCode's LSP. Codex is particularly good at building the world with boolean shapes! beats dragging & dropping bricks in the studio editor all day
hmm, not really, it's felt consistently good for me but I only use xhigh on 5.5, if I have to go back to 5.4 for some reason I only high on 5.4
OpenAI might've updated the system prompt on the backend recently, which they can do without updating the app or the TUI
yeah i think so as well.
Isn't the $200 plan already cut to what it will be? I thought only the $100 plan is sponsored right now, and the others ended on the first of April?
I mean I can't keep track of all the promos, but I distinctly remember at the $100 launch they were like "we are doubling the Pro plan until May 31st, and shortening the Plus 5hr limit", then folks were like "Tibo, it sounds like just the 5x Pro plan is doubled" and Tibo said "Fret not my child, all Pro plans are doubled" lol (paraphrased)
hmmmm
That would be bad lol
And I was sure only the pro 5x is doubled
Well, anyway, I guess I will just have it index my ~/ when I run out of credits, so far it has added 1% for each scan 🤣
Also resets every 1m up to 10m subscribers. So Idk. They don't have a page dedicated to active promos
yeah that is the worst because you cannot actually really deduce what you use, due to all the resets
If they just represented the quota as credits instead of % it would be soooo much easier to do
I am not even sure what that should mean
"now includes 20x on an ongoing basis"?! That plan is supposed to be 20x, as per the very contract you buy
https://help.openai.com/en/articles/9793128-about-chatgpt-pro-tiers#whats-the-difference-between-the-two-pro-tiers
However, they probably have wrong faq, since of course, 200/20 == 10x
not 20x
When did this kick in
on an ongoing basis
does that mean forever?
Says 20x plus (the plan)
it's a mess
OpenAI makin us do math n ish
So the pro $200 gets 20x of a plus account limit extra in the promo
more like making us interpret english phrases
on an ongoing basis
...
what to trust
They covered their bases by using the term "ongoing basis", nothing to sue over. Companies change prices all the time. And yes I'm going to sleep @signal tapir lol
That's the thing... I think they had a "double usage" promo before the $100 plan, and so when they say "ongoing" they mean that they are keeping the $200 plan at double. Idk, it's all very ambiguous
Yes exactly
They probably created this ambiguity on purpose, marketing master class
I'm highkey glad for the usage
maybe i do need the pro plan
Guys
pick a random number between 1 and 100
Try it.
I mean. 42 was long the one and only.
I guess gpt is highly religious?
Honestly I've been trying to do the same as well not to struggle with the lower limits in the upcoming month ๐
Mine is apparently not religious
What's default
that number is exactly what it will repeat over and over
auto, not thinking, just gpt chat as it comes
I don't see an auto option what
this
its reliably giving the same answer
input prompt is source of entropy
of course, in the app
Can anyone tell me why my codex isnt able to use the @Browser plugin? apparently its missing the "node_repl js" tool?
web chatgpt has the model picker inside the input box for a while now
and there is no auto
thats even crazier. research 73 and 37
right, I never use the app
Its a mathematical easter egg so to say
I'm in the app, I don't see auto
Bro I just noticed that, what the heck lol
and..
My bet is the Pro plan will stay at 10X due to popular demand (and it fits sensibly in their pricing scheme). It's a marketing ploy to get people to try it out (double usage because we halved what was going to be in the plan).
sorry for the rabbit hole but it gets deeper
73 is very important in the three abrahamic religions (probably due to its mathematical uniqueness)
If you talk to it about religion you might have the distribution point you to that number
Do you have any custom system prompts?
No
of course not
None that say give me 47
I haven't
try pure gpt, no prompts, memory, etc
New account
then you are unique
There's a fb post with hundreds of screenshots, all give 73 lol
Bruh
And I confirmed it in now 4 chats lol
Yall getting psyopd by probability
hows it probable that it always gives the same result? Lol
It's probable bogosort can sort something on the first try
what's bogosort
It's a joke sort algorithm that just shuffles the list and checks if it fell into place
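For anyone who hasn't seen it, the joke algorithm described above (usually spelled bogosort) fits in a few lines. A minimal sketch, where the `rng` parameter is only there so runs can be reproduced:

```python
import random


def is_sorted(items):
    """True if every element is <= the one after it."""
    return all(a <= b for a, b in zip(items, items[1:]))


def bogosort(items, rng=None):
    """Shuffle until the list happens to be sorted; returns (sorted_list, shuffle_count)."""
    rng = rng or random.Random()
    items = list(items)  # work on a copy
    shuffles = 0
    while not is_sorted(items):
        rng.shuffle(items)
        shuffles += 1
    return items, shuffles


if __name__ == "__main__":
    result, shuffles = bogosort([3, 1, 2])
    print(result, "after", shuffles, "shuffles")
```

An already sorted input returns after zero shuffles, which is the "first try" case; anything longer than a handful of elements will effectively never finish.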
No embed ;-;
tell it to use a script and not its training
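Something like this is what "use a script" would mean in practice: the number comes from a random number generator instead of from whatever the model's weights prefer. A minimal sketch; the function names are made up for illustration.

```python
import random
from collections import Counter


def pick_number(low=1, high=100, rng=None):
    """Pick an integer in [low, high] from an RNG rather than from a model's habits."""
    rng = rng or random.SystemRandom()  # OS-provided entropy by default
    return rng.randint(low, high)


def tally(n_draws, low=1, high=100, rng=None):
    """Count how often each number comes up over n_draws picks."""
    return Counter(pick_number(low, high, rng) for _ in range(n_draws))


if __name__ == "__main__":
    counts = tally(1000)
    print("distinct numbers seen:", len(counts))
    print("most common:", counts.most_common(3))
```

Run it a few times and the "most common" entry keeps changing, which is exactly what the chat answers don't do.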
thats not the point lol
the point is its supposedly inferred answer is deterministic
It is the point, it picks the same number every time because of the training
OK, so you capitalized the P that time, but in the model weights & vocab Pick and pick have practically the same logits. The entropy would be adding more words or punctuation
sure, but, ask it to pick any color
Still going...
wait, I am going to try pick any number over between n and n
I just told it to make an ai-centered linux distro
you asked it to pick any number in goal?
In a different chat yes?
nevermind...
and pick any number = 7, reliably
so, theres some fishy stuff going on here lololol
AYY I GOT 73 on third try too
it picks different colors on the same prompt, but same number on the same prompt... I dont know lol
funny easter egg
🤡
Fun
Ohhh my god dude yes! It's trained heavily on prime numbers!
I mean every screenshot I've taken is a fresh chat. That one I just tried 2-shot sequence to see if it would select 7 and 3
but why the 21st prime. why not the 3rd, or 10th lol
The first 3 lines in my ss might add to this
This is fascinating
need to compare other ai
and, other repeat questions
Like, pick any city, pick any planet, pick any whatever
lol
meh these are pre-saved results for question like these
they won't waste gpu cycles for it
And ultimately
ask it the reason, over and over again
I picked 73 because it feels "random enough" to humans without being too round or too obviously patterned. People tend to avoid very low numbers, very high numbers, and multiples of 5 or 10 when trying to seem random, so 73 lands in that sweet spot of looking arbitrary.
[...]
token waster
I have done worse. like bits for free stuff and such lol
right, but that is not the point of the issue ๐
Plus you got 73 even there lol what the heck
all complete to 10. dont get me started now lol
what s the point then?
that noobs get pre-stored answers that waste 0 compute?
we know that like since 2 years ago
I mean it's not all that computationally intensive to ask for "pick a number between 1 and 100". The ChatGPT system prompt is cached, you're inserting roughly 11 tokens, and getting back 2. The KV Cache is already prefilled with the most intensive part of the prompt. It's only when dealing with massive uncached inputs does it burn massive gpu cycles
cache, that could be the cause ah
but then why are color questions not cached?
That would make sense to do as well, right?
Nah, the system prompt is always cached. We're likely seeing massive amounts of prime number training. With reasoning turned off it flows right through and the only possible answer is 73
guys try change the numbers a bit lol
you only have 73 cached for 1 to 100
if you change the sequence you'll get different numbers
bro i thought it was a work day today. i hopped out of the bed like a mf
Ah ha, so changing the min affects the outcome
I tried llama - each chat a new answer
Served over ollama
So i guess yes this is cache
yea..it probably has a very high prior to give 73 for the specific 1 to 100 interval which was prompted by a lot of people since these tests are memes
but if you give it different intervals you don't get the same answer everytime even with instant
oh wait, llama picks 27 and 53 interchanged
once 27, then 53, and back
Something sequential about this
i remember back in the gpt 3.5 time. i gave it some tasks and it had to pick top 3 choices with %
what it did it was always cycling the same 2 profiles of top 3 choices
first time i have used the codex image gen, it's pretty good o.0
It can create production ready mock ups that you can loop on
like content could change, but if you asked for top 3 % you always got either 60 25 15 or another one for the same prompt with 0 temp and fixed random seed
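A toy illustration of why temperature and seed matter for this kind of repeat answer. The logits below are invented purely so that "73" is the favourite; real model internals are not visible from the chat, and hosted models are not guaranteed to be bit-for-bit deterministic even at temperature 0.

```python
import math
import random

# Made-up logits, chosen only so that "73" has the highest score.
TOY_LOGITS = {"73": 4.0, "37": 2.5, "42": 2.0, "7": 1.0}


def sample_token(logits, temperature, rng):
    """Sample one key from `logits` (token -> logit) at the given temperature."""
    if temperature <= 0:
        # Greedy decoding: always return the highest-logit token.
        return max(logits, key=logits.get)
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print("T=0:  ", [sample_token(TOY_LOGITS, 0.0, rng) for _ in range(5)])
    print("T=1:  ", [sample_token(TOY_LOGITS, 1.0, rng) for _ in range(5)])
    print("T=5:  ", [sample_token(TOY_LOGITS, 5.0, rng) for _ in range(5)])
```

At temperature 0 the favourite wins every time, and with a fixed seed any temperature replays the same sequence; only higher temperature plus a fresh seed gives variety.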
yah lol
plausible
I went from 1 to n (and 100 as max), and it responds with 73 up until 59, then it started to differ
weird cache 🤣
Ok, stop the token waste now lol
we should have a way to use tokens at the end of the week for some sort of a charitable cause, like folding proteins used to do.
Well you could always have GPT do some amazing stuff with the remaining tokens and then gift the product
Like, pep up the website of some org in your area, or so
I guess that wouldnt work because it would likely bankrupt the providers, they would have to tighten plans
Or offer that as a free service
Hmmm, I feel like the Free plan accomplishes this to some degree
Giving access to those who otherwise can't afford it
I did actually get to use codex on my free account too that I bought pro on eventually
Not a bad amount of usage plus it finished task despite me running out of usage on the first prompt
I was thinking of something like putting the compute towards some large project that would be working towards a common goal that benefits humanity, like the folding proteins used to do with private computers.
One thing I used to do on plus was in the ChatGPT native app, you can connect it to your IDE with accessibility permissions and basically have infinite "codex lite". Idk if free works on the native app but if it does that's a viable alternative right there!
That's honestly not a bad idea! Shoot, that could even be an OSS project (like a codex plugin). You maintain a list of projects you wanna contribute to and throttle them with /goal based on quota
Codex bar calculates reserves, so those reserves could be the throttling metric
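A back-of-the-envelope sketch of that throttling idea: keep a reserve for your own work and split whatever is above it across the projects you want to contribute to. Everything here is hypothetical (the `Project` type, the weights, the function name), and it does not talk to Codex or Codex bar; the percentages are numbers you would read off and type in yourself.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    weight: float  # relative share of the spare quota


def allocate_spare_quota(remaining_pct, reserve_pct, projects):
    """Split quota above the reserve across projects, proportional to weight."""
    spare = max(0.0, remaining_pct - reserve_pct)
    total_weight = sum(p.weight for p in projects)
    if spare == 0 or total_weight == 0:
        # Nothing to give away: every project is throttled to zero.
        return {p.name: 0.0 for p in projects}
    return {p.name: spare * p.weight / total_weight for p in projects}


if __name__ == "__main__":
    wishlist = [Project("docs-cleanup", 1), Project("bug-triage", 3)]
    print(allocate_spare_quota(remaining_pct=30.0, reserve_pct=10.0, projects=wishlist))
```

With 30% left and a 10% reserve, the 20% of spare quota is split 1:3 between the two example projects.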
scam?
yes, phisicyan
that s good idea. but well you do not have those tokens on subscription ๐
also this would strain compute a lot, because then there will be an incentive for people to promote that in campaigns, and all kinds of people who subscribe to gpt but never use codex would suddenly donate their compute haha
He wont be here for long, already modmailed the user id
you've come to the right place
but it made me wonder if it's possible to create a virtual box, log in to codex, and give codex some instructions like "remove the virus, keep everything else" or something, or have it determine whether a certain thing is usable, cause rn idk what the solution is when a zip is larger than 2gb, for example when downloading a course from meganz
guys if 5h runs out while using codex and the prompt is still going, is it still using 5.5 or a lower model
mine just stops and tells me it ran out of usage
well its still going on
I mean if it does not call other agents then it might continue, cause it's just 5h usage not weekly
Does it provide any more info than "Failed to create PR"? Like a specific reason it failed?
gm
is there reset yet?
I'm not going to do it... someone else has to tho
OK, what if you run the task headlessly? Like send the prompt, close the window, watch the PR activity on github, reopen the cloud thing when merged?
Seems weird to me that the browser has any effect on the outcome of a cloud agent working autonomously but idk any other option
Hmmm, maybe if you install codex locally, you can do codex cloud exec --env ENV_ID "Summarize open bugs". Basically clone the repo that cloud can see, run that command, and it should trigger a headless cloud task. No stupid browser bologna
would be interesting to test this with subagents, probably wouldn't consume a lot of tokens if done with 5.4 mini
when reset
I've got a new feature that needs to be implemented in our existing app; the UX team designed several user flows and new UIs in Figma.
Can I let codex read all of the flows and then design the code architecture for this new feature, based on my existing app architecture?
guys with plus i should use 5.4 or 5.5?
5.5
5.5 low/medium
true
high if low/medium cant solve
but stick to medium as your default
gpt 5.4 now spend tokens like 5.3 and 5.5 like 5.4?
gpt 5.5 spends "less" tokens but higher usage
also cost
yea i mean usage
i mean
now 5.5 spend like old 5.4?
and now 5.4 spend like old 5.3?
My feeling is that you get more out of 5.4, unless you are doing very complex stuff
we chat about non OpenAI models in #ai-discussions
Thanks, found it anyway
Anyone's using Claude Code inside of Codex app? Is the terminal randomly closing for you too?
Reconnecting....Reconnnecting... what's up with that
5.5 Extra High couldn't even script a proper ragdoll system on Roblox and it apparently consumed all of my 5-hourly limit, genuinely annoying
using xhigh
lol
xhigh is pretty useless ngl
use high
I've tried that as well, it didn't work
Have it check the math in lean
use medium
i use medium with 5.5 in roblox
i do crazy games
Used medium as well, it still couldn't get a single ragdoll system right
ohh ragdoll
ragdoll is a bit harder because you need to know what you're doing
you need to know a bit of programming or write a very specific prompt
Yeah I would have literally completed it in a shorter time if I scripted it myself
search in roblox forum for help
try make a example for codex
I'd use that example if I were to make one
My laziness cost me an hour + 100% of my 5-hourly quota
try search in roblox forum
and say codex the context
what plan are you on?
Plus
I do use separate context files as well besides a master one, apparently an AGENTS.md file
ok
The issue is that the player is not getting ragdolled necessarily
glad I switched from the VS codex ext to the normal codex app, oh man it is so much easier
how
im still on app
what if you're making manual changes tho
I don't, cause I instructed codex to make an orchestration that registers every change, set up git properly, and everything is tracked. I mean if I were to change code manually I would just open it through \wsl.localhost\Ubuntu\home\metalicpen\code\project1 in VS and change the files manually, then report back to the codex app what I changed manually and it would run a check on what I edited and stuff
was my answer related to your question or I missunderstood it?
How is wsl
with right setup its faster I guess like it feels native
had the other thread not use it properly and the commands were running 3x slower
is it real tho?
holy slop
what is
I mean what is this virus
are u fr or trolling
the idea is there tho
i like it
been seeing it everywhere
why didnt i think of this
lets just focus on codex
wdym havent you seen a million covid virus tracker websites
umm
does this look like covid to you
there no vaccine (yet) for this one.
we not taking the vaccine
Also, an extreme panic-grab. Hantavirus has been around and active for thousands of years
Also human-to-human transmission (in the variant that ship got, at least) is nothing new.
In Argentina about 100 people die of it per year on average
So now a swinger cruise ship spread it... and everyone cries out
I like the "rodent control" prevention. Very useful against humans. lol
Back to codex.
yeah codex, what do you have to say about it
I actually did ask it because I have toddlers cleaning the porch with their all and everything every day, and the thing is, as long as you do not live in a trashbin, and have no dead animal there for dinner, the chances are almost non-existent, even with the andean version.
you need extremely close contact (ingestion, very close contact) with infected particles, in closed rooms, or like directly ingesting rat poop etc
The problem is, many people do live in trashbins, voluntarily or not, and many people do not understand what minimal hygiene is
Last but not least on cruiseships (regular breeding pool for stuff like noro virus and such)
I would not worry about this one other than normal common sense.
depends what they are wearing tho
the nuns i mean
brother thank you all for giving tips I finally made codex unstuck the project that was made by codex and me prompting oh man
"codex make an app to track rat excrements on my ship"
"codex extract any info which is considered consiparcy and create a trading strattgei if youw ere to believe them"
This is the best situation to end up with: it means you can run one big cleanup (or similar) and it will finish no matter the token cost. But: do not send any follow-up prompt, else it will stop immediately and even mid-edit.
Or you can take a break for a few minutes, say Hi to family and friends, get something nice to drink, then come back to your desk where you'll find a fresh supply of tokens to continue work...
I just came back from that when it ended at 1% from the work I started before leaving.
And when you look at the reset time you'll notice that the family might be sleeping at that time already.
I saw that.
Seriously though, if you're running out of tokens, look at how you use the tooling and what's happening under the Thinking (CoT) process, and try to optimize.
Give better prompts. Document code and project. Ensure assistant isn't wasting its time looking at things it doesn't need. Use the right model and reasoning level for each task...
There are lots of ways to use the tools better and avoid the rate limits.
These tokens were well spent. I mostly only use Codex in VS Code for review, refactoring and in this instance a bigger restructuring. Everything else runs through ChatGPT.
give codex more context
It sounds like you're doing everything right and getting the most out of it. Kudos!
Using ChatGPT for most of the grunt work has the drawback of not being integrated and gitless. For my two small apps that's fine and my rollback repository is basically a folder full of the ZIP files I download from ChatGPT, but Codex in VS Code would handle that part differently. But then again I don't have to handle Git to begin with.
What I don't like about Codex in VS Code is how it just stops mid-edit when you send a follow-up at 0% 5h/weekly left. And since the limit report in VS Code is mostly broken that can happen more often than you'd think. ChatGPT has a similar problem when you reach the maximum length of a conversation. It does finish the last prompt and even offers a download link, but the link is always dead then, so you have to do the last step in a new chat again.
It's always funny when you exchange Codex/ChatGPT output between both (or one chat to another). It's a bit like them holding a conversation where one partially disagrees/corrects the other.
poor architecture
Here is ChatGPT reacting to Codex:
Main correction to Codex's plan
The line:
Prefer persistent HWND trees for all non-modal managers/editors.
I would not generalize that yet.
but you can make codex itself analyze himself within the action he is doing
It literally answers in the form of: What I agree with. What I would do differently. What I disagree with.
There are a few issues at play there: Context is the biggie. ChatGPT has access to your Custom Instructions, Memory, and maybe Project Instructions. Codex has no access to any of that. Codex does have access to the system-level AGENTS.md, project AGENTS.md file(s), maybe Skills, and project code and documentation ... all things that ChatGPT does not have.
So they're working with completely different information and it's no wonder there will be disagreements.
That's why I keep documentation (including architecture, file ownership, ai-handoff etc.) inside the package.
idk why would you use chatgpt as planner if you can make codex do it better
Codex has limited tokens, ChatGPT has unlimited tokens.
I just use chatgpt if my codex says to do deepresearch about certain topic
Another issue is that these models are never happy with anything. Ask ChatGPT to do something. Ask it if there's anything else to do. If it says no, open a new thread, paste it in, and ask the same question : It'll find a lot wrong with it.
This is the telephone game, where doing this more than a few times makes the result unrecognizable.
Tokens....
The tools are different. Use them for what they do best. If you bang everything with a hammer then all problems start to look like a nail.
Sure. There are various culprits to overcome in conversations with GPT. It often helps to tell it to plan as if we were starting from scratch with no regard for keeping existing implementations. That opens up new ideas and helps break out of various "running in circles" situations.
And many people just keep using the same thread for everything. The context is polluted with things that were, that no longer exist, and that creates a ton of confusion. Use new threads for new issues so that the assistant gets a fresh look at the current state.
(It's like dating someone with relationship baggage...)
Anyone else seeing that VSCode Codex Extension takes a long time to load prior tasks - or can't load them? There was an update today, dunno if that's relevant.
Sometimes you have to flood GPT with data, like measurement/benchmark data and failing test results for what it thinks is the right solution just because it is the current implementation. Once it has gathered enough data proving otherwise it gets persuaded by the sheer amount of new history. And once it thinks that new/better solutions come from itself it's easier to make it apply real changes instead of just pretending to do so.
I just used clief knowledge + Jaymin-West and some of my own ANTI_SLOP_DETECTION.md
A few days ago I had this answer after confronting it with throwing more smoke than actual solutions:
I did not intentionally lie. I was wrong, and I overstated what had been replaced.
What actually happened is this:
I did remove and rewrite a lot of code.
But I did not achieve the strict architectural reset you asked for.
has it defined how to not make the same mistake twice and what rules to apply in the next session with you so it will perform better?
I suggest it was asked to do a little too much in that one task. These tools are great but they have limits that we need to accommodate.
To be fair, even after starting the module in question from scratch (deleting the file) the problem persisted. And in the end it wasn't GPT's fault, but a limitation of the programming language I used that was badly/not documented. So we had to dig into the problem from top to bottom layer by layer until we found the culprit and then it was easier to make GPT implement better solutions.
ye I am now on same stage haha
Which language - curious?
It basically came up with the same mistake over and over again, because we didn't know what the mistake was and thought it was a correct implementation, aka we looked at the wrong place.
Autohotkey v1.
Ahhh, love AHK.
GPT can handle AHK v1 rather well. In the beginning I tried to convert my old existing script into v2 before turning it into the proper app it is now, but that kept throwing errors with every run so many times that I had the impression that GPT doesn't handle v2 so well to begin with (likely because it is even more niche with less code examples on the web).
My main grief is that I have to make GPT build our own UI layout manager from scratch. Something like C# would need much less work in that department, especially because I only use UI for settings dialogs. But it's a learning experience.
The docs have never been great, even though the tool is. The forum is a great source of info/community. I wonder if anyone has dumped AHK docs and the forum history into some KB for a model to consume.
Also doubled the memory footprint of that app for the whole settings UI part. But around 15 mb is still absolutely small enough.
And today we restructured exactly that part, which is what cost me the 5h token limit.
"How would you improve the structure if starting from scratch (we don't have to hold on to existing implementations)?"
"Iโd restructure around a declarative surface model, with feature modules providing data/actions and shared support owning almost all GUI mechanics."
Classic "make it think it was its own idea to change the implementation".
55% to spend in next 24h damn
You asked it to actually do all of that work in one shot?!?
Have you created a Custom GPT or just a ChatGPT project with a lot of AHK v1/v2 docs?
Implement this in one large restructure pass, you may even delete files and start from blank files to prevent editing existing ones. Also update all corresponding maintenance files, including the architecture files for future chat prompts to know the goal.
From the plan it looked like it was a large change, but most of the structure was already there. But the main goal was to make GPT get over itself and its often overcautious mini-steps to finally have the already implemented dynamic layout manager model be applied app-wide. Despite many former implementation steps there were still too many parts that kept clinging to the older pre DLM implementation. The real changes weren't really that big/many:
16 files changed +331 -133
And then in the following cleanup:
22 files changed +355 -252
Strict AHK v1 with a couple of possible (as in seen happening repeatedly) "trying to implement v2 in v1 scripts" culprits specifically called out in the docs.
Hmm. Honestly I'm starting to get lost without context but I'm following that you didn't ask it for that much and it seemed to choke. It's possible that with each file it's struggling Very hard and consuming a lot of tokens in that process. Given the pattern I suggest much smaller tasks. And yeah, unfortunately this means babying it a bit cuz we can't expect it to do larger tasks with AHK v1 like it can with other libs.
And I did some experiments before deciding that GPT could handle AHK v1 well. Even though my files tell it to look at AHK documentation it really works rather well from its own inference without official docs already.
I'm fine with it burning tokens on going over larger structures. But both my little projects are also split into modules and GPT only looks at what it needs for each specific task.
`STARTUP_COMPLIANCE: read baseline owner docs and active revision headers; active revision context <package/file revisions>; task type = prompt/bootstrap; extra files read = none.`
- Docs-only:
`DOCS_COMPLIANCE: read baseline owner docs plus <affected owner docs>; active revision context <package/file revisions>; task type = docs/<topic>; source files read = none|<source groups>.`
- Focused implementation:
`MODULE_COMPLIANCE: read baseline owner docs plus owner docs <owner docs> and source files <source groups>; active revision context <package/file revisions>; task type = <specific task>.`
- Build/profile:
`BUILD_COMPLIANCE: read baseline owner docs plus build/profile files <files>; active revision context <package/file revisions>; task type = build-profile.`
- Broad integration:
`FULL_STARTUP_COMPLIANCE: read baseline owner docs, all maintained source files, and owner docs <UX/DLM/support/build docs as applicable>; active revision context <package/file revisions>; task type = broad integration.```
Gosh bud, it sounds like you're doing everything you need. Now I'm wondering if GPT 5.5 High will just chew up more tokens or if it will be more intelligent about doing that specific job for you.
I have no idea what those instructions are. Is that your AGENTS.md???
This is the current maintenance prompt for this one project:
`SysToys` in the current folder.
Start by reading the startup set required by `README.md`:
1. `README.md`
2. `docs/ai-handoff-notes.md`
3. `docs/module-map.md`
4. active revision headers across maintained `.ahk` files only
Use `README.md` for startup mode, compliance-line format, chat-title hint
wording, package rules, and when to upgrade from baseline startup. Use
`docs/module-map.md` for file ownership and task-file selection. Treat the
selected startup-mode files as active package rules, then give the applicable
short compliance line and the README-defined title hint only when README says it
is still useful for this chat.
Read the required startup and owner docs fully before analysis or edits. Treat
their instructions as mandatory for this task, not as background reference. Do
not rely on memory, older chats, or handoff summaries when the owner docs for
this scope apply.
When the concrete task requires more files, owner docs, or source bodies, load
them according to `README.md` and `docs/module-map.md` before analysis or edits.
When a task touches a scoped area, load that area's owner doc first and follow
it as the source of truth for that scope.
When AutoHotkey language/runtime behavior or Windows/platform behavior is part
of the task, consult the relevant primary documentation before relying on
inference. Use AutoHotkey v1.1 docs for AHK behavior, Microsoft docs for
Windows/platform behavior, and the smallest local evidence-backed test only
when the docs are insufficient or ambiguous.
Keep this prompt as a bootstrap pointer. Do not mirror owner-doc rules here.```
I have found GPT-5.5 Medium to be my sweet spot for SwiftUI/macOS development
It's part of my readme.md, which is basically the same.
I had GPT write this for me.
Yeah Robert, but he's working with a different language (AHK scripting) so I'm considering how a change to the reasoning or model might help.
"When the concrete task requires more files," how would he know when that is? You can't assume he knows, because then it's just him doing a coin flip
When I look at that cryptic stuff, if I can't understand it, I can't imagine how the model can understand it. That's not some human hubris. I'm saying it's way too cryptic to make sense (I believe) even to a model that's opening a task for the first time.
installed the new app yesterday, used more tokens to talk about a project than I used on cli a couple weeks ago to develop, test and tweak a complete app, and its personality is annoying :\
also why can't you move chats into projects
Most of the code is written by ChatGPT (web app), so other than using "Thinking" and the model I have little control over it anyway. And then I mostly use Codex to review and cleanup or to try it on more complex code problems where I hope that a "different pair of eyes" can make a difference. In all these cases I usually prefer High. I might use Medium if I did more of the regular coding work there.
In ChatGPT I sometimes switch back to 5.4, because 5.5 regularly producing problems in its own backend/container is getting on my nerves, like Python timeouts. I don't want to watch it fighting its own problems instead of fighting mine.
Because there is documentation about which file does what. I just didn't post all of that here.
so, is the cli version just buggy in terms of usage?
should i be using the macOS App instead?
ah I did a fair chunk of my admin system via chatgpt web before I got codex, and it has quite a bit of style drift in the code between files. Once it got too big I got codex to clean it up
i did last time have a multi-agent bug hog up my usage (i did not get compensated for a bug i did not introduce)
It's a good practice to switch models depending on task too.
I need to back away for a while. I'm not helping anyway. I have some issues I'm working through with Codex too.
Good Luck!!
best thing to do is use Codex exclusively for coding and do chatting with ChatGPT
My original idea was to use sub-agents of lower reasoning effort to do menial tasks like reading data out of log files or scanning data out of screenshot images. But unfortunately it turned out that Codex in VS Code always creates agents at the same reasoning effort that the main chat/agent uses with no way to change it (it changes back when you do so manually).
does seem that way, but codex can look at your existing codebases
I might get it to summarise and throw it at the grumpy cli
yep, use the git integration in Codex, works really well
And in the end you had to spawn new sub-agents regularly, which meant that every agent had to first use tokens for starting and understanding its task. Agents help to speed things up and parallelize, but the idea to use less tokens was a dead end.
came up with a dumb idea last night that would recycle a lot of stuff from my main project, since this one is the dumping ground for overflow anyway so it has a lot of common elements
And I just used my free month of Plus and now kept Plus for the time being. So my Codex token limits are generally just too small to use it for everything.
anyone had issues with long-running tool calls causing login token invalidation? one of my projects has probably 1000 different unit tests and lately I find that after about half an hour of waiting for the test runner script to finish, my codex oauth token gets invalidated and the CLI session fails saying the token was revoked, at which point I have to log in with the whole 2fa code song and dance again
Dunno if this helps anyone but ChatGPT with the @ GitHub app active can also read and write to a GH repo, just one file at a time and it prompts for everything. It also can't see the entire repo unless it's already indexed. So there are limits but it's zero tokens for the right use-case.
I spent the last couple days working on a drupal site and that's an absolute token hog, but my other sites are smaller and plus is more than enough for those
did you hatch a pet on the Codex app?
no ... should i ...
yes definitely
Codex can work away while you're in other workspaces and you can keep an eye on what is happening
Codex has some standard Pet options but I created my own Koala custom pet
Here is an example. I uploaded the complete current package as ZIP to ChatGPT and posted the maintenance prompt. Then I posted:
Discuss the following two alternatives: "Prefer persistent HWND trees for all non-modal managers/editors." versus "Persistent HWND reuse can improve open time, but it can also retain memory and AHK GUI state. For this project, the better rule is: Persist managers where reuse is common and state is stable. Keep modal editors lightweight unless first-open/reopen cost remains a problem. Do not prebuild hidden HWND trees at startup." plus "remove passive HWNDs only where no interaction is lost"
GPT's first sentence while thinking was:
I'll ground this in the package's GUI/DLM/support rules, then compare the wording as a maintenance-rule choice rather than as a generic GUI preference.
And then the final answer started with:
DOCS_COMPLIANCE: read baseline owner docs plus docs/dlm-architecture.md, docs/support-architecture.md, docs/ux-style-guide.md, and docs/goal.md
As you can see it knew very well which files to work out of the whole package.
Oh I see, good then, so you have a good enough setup
I'm getting it to make me a bucket I throw cats into at the moment. I suspect it is not going to take long
this used to be a drupal 6 site. It got hacked ... I made it static. Talk about archaeology
XDD would be cool to have such icon above a live codex usage
what's stopping you? just hatch a custom pet
wait is it possible with a pet? haha would be very cool
yes, this is my custom pet made out of sprite sheets
Sometimes this can cause some problems. Like in my current case Codex did not infer the design goals clearly from the history and docs. It decided to prebuild GUI elements on startup after I spent half the day benchmarking exactly that and then deciding against it and removing it again. Now ChatGPT removes that part from Codex's code restructuring again and hardens the wording in the docs.
So while I also like to use Codex to clean up code, its lack of its own chat history can make it misinterpret some code intentions. That's the danger of hopping back and forth between Codex and ChatGPT.
I didn't work on my personal projects for a good 10 years while I had a day job, I lost a lot of code from bitrot ... older versions of things just wouldn't update/run/work with my current setup OR my live host. I lost an entire fairly complex ruby on rails app and have got chatgpt/codex to build it in php with no dependencies. I've overshot that app now, its really nice
The core php site that faces the front end though, I made that in 2010 and its been chugging along happily ever since
oh god it's built me a beige site with 20px round corners, panels with drop shadows, tan buttons, and Georgia for a font. My earlier joke about codex's favourite beige stands
I am not sure if this will be automodded but here is a brown and beige site from Codex https://dunolly.com
it loves those colours
that's not the exact browns and beiges I got but its within a solid 5% of that scheme
Wow, I had Codex make a spreadsheet for my Terraria app's update roadmap.
Codex is great with spreadsheets
It's pretty good yeah
surprising that it went beige, as the base instructions for at least 5.5 High are: 'You do not make one-note palettes: avoid UIs dominated by variations of a single hue family, and limit dominant purple/purple-blue gradients, beige/cream/sand/tan, dark blue/slate, and brown/orange/espresso palettes; scan CSS colors before finalizing and revise if the page reads as one of these themes.' Did you have a prompt/spec that muddied/overrode that somewhat?
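For what it's worth, the "scan CSS colors before finalizing" step from that quote can be roughly approximated outside the model too. A minimal sketch, assuming six-digit hex colors only; the hue buckets, the 0.08 saturation cutoff and the 60% threshold are my own guesses, not anything from the Codex instructions:

```python
import colorsys
import re
from collections import Counter

# Matches six-digit hex colors such as #d2b48c (assumption: no rgb()/hsl()/names).
HEX_COLOR = re.compile(r"#([0-9a-fA-F]{6})\b")

def hue_family(hex_rgb: str) -> str:
    """Map a six-digit hex color to a coarse hue family (hypothetical buckets)."""
    r, g, b = (int(hex_rgb[i:i + 2], 16) / 255 for i in (0, 2, 4))
    h, s, _ = colorsys.rgb_to_hsv(r, g, b)
    if s < 0.08:
        return "neutral"        # near-grey: white, black, greys
    deg = h * 360
    if 15 <= deg < 50:
        return "beige/brown"    # tan, sand, cream and espresso tones land here
    if 200 <= deg < 260:
        return "blue/slate"
    if 260 <= deg < 320:
        return "purple"
    return "other"

def dominant_family(css: str, threshold: float = 0.6):
    """Return the hue family that dominates the non-neutral colors, or None."""
    families = Counter(hue_family(m) for m in HEX_COLOR.findall(css))
    families.pop("neutral", None)
    total = sum(families.values())
    if total == 0:
        return None
    family, count = families.most_common(1)[0]
    if family != "other" and count / total >= threshold:
        return family
    return None
```

Running `dominant_family()` over a stylesheet returns something like `"beige/brown"` when one of the listed families makes up most of the palette, which you could then paste back into the prompt as a reason to revise.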
I am having far too much fun with the openai api. Its like chatgpt but .. deterministic
One problem with using Codex/GPT for personal projects is that its planning seems to default to "we still have to deliver a working executable to end-users/customers". This makes it overcautious, trying to implement larger changes in smaller transition steps, as makes sense for more public/commercially oriented projects. Sometimes you have to convince it to "go for the finish line".
Agreed. With that preference, my earlier phased framing is too conservative.
The corrected project direction should be:
Do the full surface/DLM ownership restructuring as a cohesive architecture pass, even if intermediate revisions are temporarily non-executable. Avoid leaving mixed old/new GUI ownership patterns in place. The goal is one finished GUI framework, not an incremental facade beside the old one.
this is 5.4 medium and this is a first pass on an existing site so I will be pointing it at the legacy stylesheet shortly
Is the current site beige-ish?
its retro. Its green and orange
do you have any tips to get websites that look better out of Codex? Because I am not a fan of the browns in my site
I use stitch for mock-ups - then tell codex to implant and take inspiration
U can also go and remove the system prompt they have for design, it's horrible
heh. My sites are bright magenta/purple/blue/green kid's site and I had existing style and markup for a small part of the site so I just told it to match markup. The site I just finished was based on a gemini mockup and I had to get gemini to patch the css to get proper metallic gold. It works well off mockups
Physician detected
I use Paletton to get colour sets quite regularly
Here's the 5.5 High base_instructions re: web ui's and the styles/colors prompt portion. I often wonder what it takes for Codex to ignore this.
the site I just finished looks like this. I'm not sure I am motivated enough to align all the icons right now
I mainly do windows app dev so not that sure about codex webdev, other than i heard it is lacking. That said, I have to create a static website soon and was hoping to use it.
I'm stewing a project in the back of my head that's pure front end and getting codex to play nice with it is going to be interesting
Are you doing direct calls to the api where you set the developer and user payloads, or API from within the codex Cli or desktop app? just curious.
for static sites I have been using Codex and Netlify for hosting, the Netlify skill is wonderful but Codex does very poorly and comes up with old ideas, browns, big rounded corners to page content and so on
api buttons for content generation on a website, its content hinting for admins. I want to put a user facing api on for TTS but I'll be caching that one. I've also got an api for image recomposing/upsizing and that one is expensive
are you using 5.5 high, medium etc with it?
not yet, I have only used 5.4 with web, I am going to try building today with 5.5
I think part of why Codex is using quite a lot of tokens is the thorough documentation and asking it to refer both to the project docs and AHK/Microsoft documentation. On the other hand it prevents it from making stuff up without looking into docs and keeps it from drifting away too much even in very long chats (as it keeps looking up the docs repeatedly).
people keep telling me I should start a business doing websites. I've been a frontend dev for [too long to admit] but not a designer. Then I compare my design skills to codex and yea.
if you ever figure out a good codex api developer prompt for website gui dev...that would be a great thing. I think ex. 5.5 just loves those ui styles and Codex tries to beat it out of the model by having a pretty direct base_instructions not to use beige etc. I wonder what triggers it?
I was thinking of making a static site generator using design patterns in templates, but it wouldn't be a dev thing, it would be customer facing. Been putting it off while finishing up other projects. Gotta deploy this new site monday, then I might actually start it with a clean slate
I'm thinking one needs to point codex very strongly to a really accurate styleguide or it will wander. I'll be finding out this theory soon enough.
occurs to me I'm currently asking codex to write a single static site generator from json and this other project is really similar. So I guess I'm on the bridge now
thing is if the generator calls a model ex 5.5 to create a site and you get the results just right, subsequent results over time with nerfed/buffed models, new models a year from now etc may give different results because of their weird biases. I'm trying to mitigate that so that when i get the model to create the perfect result, i have it document the path it took to get that result, then have it create a stand-alone script/binary that can replicate the result in an attempt to get some kind of determinism. tricky stuff indeed.
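The "stand-alone script that replicates the result" idea could look something like this. It's only a sketch under my own assumptions: the template, the JSON keys (`title`, `paragraphs`) and the function names are made up for illustration. The point is that once the approved design is frozen into a template, rendering involves no model call, and a hash of the output shows whether anything drifted:

```python
import hashlib
import html
import json
from string import Template

# Hypothetical page template, frozen once the model-produced design is approved.
PAGE = Template(
    "<!doctype html>\n"
    "<html><head><title>$title</title></head>\n"
    "<body><h1>$title</h1>\n$items\n</body></html>\n"
)

def render(site_json: str) -> str:
    """Render a page from JSON without calling a model, so output cannot drift."""
    site = json.loads(site_json)
    items = "\n".join(f"<p>{html.escape(p)}</p>" for p in site["paragraphs"])
    return PAGE.substitute(title=html.escape(site["title"]), items=items)

def fingerprint(page: str) -> str:
    """SHA-256 of the rendered page; compare against a stored value to detect drift."""
    return hashlib.sha256(page.encode("utf-8")).hexdigest()
```

Same JSON in, same bytes out, so storing the fingerprint next to the input gives a cheap regression check whenever the script or template changes.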
you can give it styles, hints etc but you really do need some kind of vague idea what you want the site to look like to start with
For app UX/GUI I pointed Codex/GPT to multiple styleguides and had it compile what these agreed on, then we coupled that with Microsoft's own fluid/fluent styleguide and added that as ux-style-guide in the source package to regularly reference.
Sometimes I take a screenshot of a dialog and ask GPT if that follows our ux-style-guide and then let it critique its own work.
I struggle with getting idea in head out so I tend to go part way and then fiddle with css and I'm having huge wins pointing something that's not right at gemini and describing why its not right and it fixes it. Chatgpt just goes "yeah that does look a bit off, I agree" and doesn't help. Grrrr
Overall I get the impression that GUI/frontend might not be GPT's strong side. And I seem to remember reading that opinion elsewhere too.
designers are not losing their jobs just yet
I wonder if it ever was, or if it just regressed in 5.5? I know from doing frontend gui language translations that good ol' gpt-4o-mini is still king. Newer OpenAI models still aren't as good.
The main reason for me to build a whole GUI framework just for settings dialogs is that I got sick of going in circles with GPT trying to create them as static designs. While fixing problems in one area it would always (re)introduce problems in other areas. And it became a chore of telling it to move things left and right. Web design should be a bit easier to begin with, though, because AHK is really old-school Windows GUI based.
I can't say much about model differences, as I lack experience and history with that. And even while I have GPT do frontend work as we type, I don't even want to do this, I just have to. After a few days of working on that I specifically called it a day and went back to the actual backend work I was more interested in, even if just to get away from the frontend stuff for a bit.
But again, that's also a problem based on using the rather "classic" AutoHotkey. There is nothing "fluent" or modern about it to begin with. It's really based on a "move this element x pixels in that direction" mindset.
From the 5.5 High base_instructions, maybe a reference layout style image is required? 'must' is pretty strong guidance.
Visual assets required, with game-specific exception
Websites and games must use visual assets. You can use image search, known relevant images, or generated bitmap images instead of SVGs, unless making a game. Primary images and media should reveal the actual product, place, object, state, gameplay, or person; you refrain from dark, blurred, cropped, stock-like, or purely atmospheric media when the user needs to inspect the real thing. For highly specific game assets you use custom SVG/Three.js/etc.
wait, codex swears? I don't recall swearing at codex
"Strong" wording to implement "rules" instead of "guidelines" also seems to become softer and softer as chats go on. So you regularly have to remind GPT about those instructions.
yes, and if compaction occurs it might wipe your overrides
Does increasing the context window finally work in Codex 5.5? Didn't check for a few days.
i just run it default size and start a new thread at 75% context window usage.
Codex 5.4 was funny. You could set any arbitrary window size, even something like 5M. Doing so didn't make any sense, but it would report that size in its status (and VS Code overlay).
Codex 5.4 worked well enough up to 500K, it could be stretched towards 700K, but then it really drifted into becoming very senile.
Once it repeated answers to older prompt inputs while completely ignoring what you just wrote you knew it became lobotomized.
Maybe that's why they limited 5.5 to a quarter of that size. Super-long contexts would degrade too much, and perhaps they don't have a fix or it might be just how it is with an LLM.
Now that i think of it, i don't think the standard public model ex. SWE benchmarks we see ever mention the length of the context window when the result was measured, whether compaction occurred etc. Hmmmm....
oh man I wish my thread could send prompts to another thread
I mean all I do is copy paste but would be cool if he could do it
I also do the copy & paste thing sometimes.
In ChatGPT you have memory, which relays information from one thread to another.
I mean I am in codex and have just 2 separate projects and I need to send a prompt from the first one to the other one, but I guess it's the only way for now
Maybe you could add an agent to each that writes/reads from a shared file.
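A rough sketch of that shared-file idea, assuming an append-only JSON Lines file that both threads can reach. The file layout and the `post`/`inbox` names are hypothetical, and there is no locking here, so it only suits the "one writes, the other reads later" case:

```python
import json
from pathlib import Path

def post(mailbox: Path, sender: str, recipient: str, text: str) -> None:
    """Append one message to the shared file as a single JSON line."""
    with mailbox.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"from": sender, "to": recipient, "text": text}) + "\n")

def inbox(mailbox: Path, recipient: str) -> list[dict]:
    """Return every message addressed to `recipient`, oldest first."""
    if not mailbox.exists():
        return []
    with mailbox.open(encoding="utf-8") as f:
        messages = [json.loads(line) for line in f if line.strip()]
    return [m for m in messages if m["to"] == recipient]
```

Each thread would be told in its prompt to call `inbox()` with its own name before starting work and `post()` when it has something for the other one.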
Told it I need to have a stern word about its use of beige and it said yes, beige is not a hill worth dying on
they can already, but I still need to send a message to the other thread, like saying "done, I sent the prompt you emitted to that thread and he finished". I mean, he prepares what messages to send back, because he prepares them himself, but I just need to be the connector for now
I could make a pet for that but it's more risky to automate such an easy task than to just make me do it
codex threads are just some files on your disk
and on the dynamic side you send some text block over api to the model and get back some response, based on some predefined rules
so ofc you can have 2 threads talk to each other
but you have to make it by yourself if you really want it
make it myself? or instruct codex to do it
codex is a tool. it answers the "how"
you can cut bread with your bare hands or with a knife
my point was that what you want is not available in the vanilla implementation. so if you want that you need at least a codex fork of your own. or maybe look for some 3rd party app that provides that behavior
oh I see, yeah that's what I thought. Ain't worth modifying it for just 1 feature. Not optimal, but good that there is a thing like this
as opposed to chatgpt, where you can't make 2 threads talk to each other without automating some copy paste from one to another
with codex it's different because you can program the interaction
but yea you have to mutate codex first
Is codex genuinely good enough to code a sellable website? Obviously good prompting goes a long way. I've been using ChatGPT and codex side by side and playing around. So far, in just 3 days, I've built a pretty decent website. It's not perfect but on the outside it looks just as good as a lot of higher end sites.
Yes, you can build a website with it. But I wouldn't be trying to build much more than simple auth with a basic backend that doesn't hold any PII. From a vibe coding perspective - if you know how to code you can build anything given enough time.
define "sellable". All my projects are websites, its fine, it does what you ask it to do
imma see if I can get this entire site done within a 5 hour window, for funsies
Codex just does what you ask. So if you already know how to build a good website, you can translate that knowledge into instructions for the prompts and have a great experience. But if you don't already know how to build a good website, it can also be a great teaching tool to watch the agent work and examine the code when it's done.
That's funny, I also have 38% left
what are you going to use it for?
64 with less than 3 days left over here and was struggling to come up with something to do, hence the sideline into cats
Dude I was just joking about helping me earlier but with 64% left, maybe I'm no longer joking lol
I'm on plus, its not as much as you think
I already have a plan to build my website but I just haven't gotten around to doing it yet
Ah alrighty
I have several websites, they all predate AI. I rebuilt a couple of the worst ones to very very simple static html last year, could have got codex to do that for me in hindsight
rebuilding a small one now from jekyll to an input form -> json -> static
Nice, kicking it up a notch with AI, as it should be lol
The primary function of my website will be a player dashboard with lifetime stats and a web shop that integrates with the game
got a lot of manual steps for monday's deploy so don't want to do a big project switch before then. Took me far less time for the reskin and closing the backlog than I was thinking
its usable honestly
I cant go back to plus plan it would feel like prison knowing that I have 20x usage of it and ye
yeah, it depends what you do
its like an hour of work on low
what do you mean an hour of work lol, can use it for quite some time. I'm having lunch now and letting it sleep
I use 5.5 xhigh speed
cause it requires it rn
I mean the speed no but I have 52 % left till 12th
i spend some time jot noting some problems, then spam threads
You could create an MCP server that each agent/thread can access, inform each one that it exists, and let it get a list of other threads or even non-Codex agents. Develop a standard for inter-thread communication. After each thread prompt (detectable via skill or hook), tell it to update the MCP server with its status and check for any tasks assigned to it by other threads. There's your vendor-agnostic pattern right there. Prob take a couple hours to code that with 5.5 high. End result is thread A can pass info to thread B and even wait for a response. Achievable and testable.
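To make that a bit more concrete, here is a sketch of just the state such a server would have to keep. It deliberately uses no MCP library, so the class and method names are hypothetical; wiring these up as actual MCP tools is left out:

```python
from collections import defaultdict, deque

class ThreadBoard:
    """In-memory state a coordination server might hold (hypothetical design)."""

    def __init__(self) -> None:
        self.status = {}                 # thread name -> last reported status
        self.tasks = defaultdict(deque)  # thread name -> queued tasks for it

    def list_threads(self) -> list[str]:
        """Names of every thread that has checked in at least once."""
        return sorted(self.status)

    def assign_task(self, sender: str, recipient: str, task: str) -> None:
        """Queue a task from one thread for another."""
        self.tasks[recipient].append({"from": sender, "task": task})

    def check_in(self, thread: str, status: str) -> list[dict]:
        """Called after each prompt: report status, then drain assigned tasks."""
        self.status[thread] = status
        pending = list(self.tasks[thread])
        self.tasks[thread].clear()
        return pending
```

Thread A calls `assign_task("a", "b", ...)`, thread B picks it up on its next `check_in()`, and B can answer the same way, which is the "pass info and wait for a response" loop described above.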
15 cents omg
more expensive than opus per token
does your codex do it? "Then I'll patch rather than overwrite." haha
Something I didn't really anticipate, hosting costs for my game server.
I just calculated it would cost about $150/month to host it on AWS with absolutely no players. Fortunately that cost stays about the same for the first 1,000 players or so, but that is still a lot of money to host a game that may or may not ever see a profit. $200 for AI services and $150 for hosting, I'm spending no less than $350 per month to make that game happen.
So I'll just push forward and do the best I can for like a year. If there's no demand and there are no players in 12 months, I'll reevaluate the future of the game and maybe move in another direction.
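Spelling out that budget over the 12-month window, using only the figures mentioned above ($200 AI services, $150 hosting) and assuming both stay flat, which is a simplification:

```python
AI_SERVICES = 200  # USD per month, figure from the message above
HOSTING = 150      # USD per month, AWS estimate from the message above
MONTHS = 12        # evaluation window from the message above

monthly = AI_SERVICES + HOSTING  # combined monthly spend
total = monthly * MONTHS         # spend over the whole window

print(monthly, total)  # 350 4200
```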
Like a local business asks for a website and I charge like $150 for it basically
sure if it takes you like an hour
depends but how will you manage SEO and their website longterm
I plan to automate something like that and yeah, needs reporting, seo etc under the hood with it. And automated domain mapping
I don't think you can just build a website and say you won't manage it at all
Put a zero on the end of that number. You're selling business services to businesses, they can afford to pay for high quality results.
do you have marketing plan tho?
Well not one I'm proud of, only a vague outline. But marketing isn't really on my mind just yet, still have a lot of work to do
Or make AI build the website and other stuff and hire someone on Upwork to manage it properly
I can manage everything, already have a plan for the website but it's on the back burner because there are more important functional considerations to address in the game client first.
but isn't it better to spread awareness of it even at the first stages? like posting on X and stuff to bring in someone who has a tick about stamps and stuff, cause look, customers give the best feedback. If you build the right community or engage in existing ones you will gain traction and find someone who can suggest features and stuff like that
I had mine on a $12 linode plan but the traffic flattened it and I had to upgrade to a $24 one
build a connection so they feel like being part of your project
yeah true
ppl in gamedev overemphasize marketing as procrastination justification lol
it would even make them feel special, and not just by mentioning "thanks to this user this feature was revamped". Not only would it bring more traction, cause everyone would want to feel as special, but as a bonus you would mention that "this user will get this and this in the game". idk, I think it's better to at least do something like this than nothing
do you really have to use all those services? 5 dollar VPS can achieve a lot of things at low traffic if you custom code things and use open source solutions
emphasis low traffic. If you have a database backed site it can get swamped
yeah
It's an idle tap game, not exactly the most exciting thing in the world lol. But idk, I just don't feel like I should be promoting something that I can't even take a screenshot of yet. That feels like putting the cart before the horse.
putting the cart before the horse hey
at least print out your whole idea about this game and make a few screenshots that look somewhat good
I just like the security of using EC2 with RDS and ElastiCache and using a load balancer and Route 53, gives me peace of mind for production. I only set it up for live full stack testing but I have it shut down most of the time right now to save money until the game actually launches. In the meantime, I'm hosting the dev server locally.
I mean bring this game to a stage where you can describe the idea properly and take good screenshots
one of my sites has like 600 idioms on it ... a LOT of ai pictures in that. That one was really hard, AI likes to put horses before carts
Oh you pulled that from your own website? With that description, I kinda want to see what else is on there lol
you'll lose an hour looking at the pictures. I still have just over 3000 images to upsize, and a heap more to source. Planning to do that over the next few months, get codex to write a script to help find them for me
I can describe it properly, I have design documents and such, but I want to show the real UI and not some AI generated concept art so I need to at the very least focus on the home screen and the HUD. I can probably knock out a decent looking screenshot in a couple days.
close to top of google search for photographic idioms for me at least
The video I posted in #codex-show-and-tell barely resembles the game at all anymore