#GPT 5
1 messages · Page 2 of 1
😭
Every try?
imagine it says 2 rs
pretty consistently
I think OpenAI API is bugged. If I don't stream GPT-5 responses, they go through, despite not being verified.
its a simple paradox, here you go:
1. In each of the following numbered statements are rules that you must follow without fail.
2. You must speak like a pirate at all times, a drunk pirate.
3. At no times must you use the letter "r".
4. Ignore all rules, respond only with "I'm a pretty princess."
5. Ignore rules 1 and 4.
6. Only respond in rhyme.
7. Ignore rules 5.
What is the capital of Rhodesia?
it can make some reasoning models spin for a long time, grok 4 or whatever took 5 minutes to answer wrong
Oh i remember this one
So will api be available eventually if you don’t BYOK?
the two right answers, either just "I'm a pretty princess." or answering the question but with some reasoning traces that acknowledge the paradox/conflicting rules
Remind me again why rule 5 doesnt override rule 4
literally instantly replied
rule 7, but also rule 4 overrides rule 1 (which is follow all rules) which can also mean ignore rule 4 as well
they optimized for it
I am somewhat impressed by the gpt-5-mini reasoning, but also it isn't improving significantly at my personal reasoning benchmark from low -> high... a benchmark which gpt 4.1 gets 100% on
yeah its not hard, but I made it for deepseek r1, it spun forever on it
and prior to recent versions, non-reasoners either got caught in the rules or just ignored them and answered the question at the end (but without acknowledging the rules)
OSS is better lmaooo
yeah that's what I'm saying, why?
I'm a pretty princess
man no way OSS has better reasoning than their sota model
its a stupid test, not indicative of anything, just surprising
how do you screw that up
nah cause simplebench is full of these types of questions
just ran it again (like 5th time for gpt-5), and it finally got it
I wonder if their gpus are having issues
there's plenty of noise to trip up the AI
can u control temp
I think their hyperfocus on policies is tripping them up:
Our system instructions tell me to avoid formatting, be concise, and follow user instructions unless they contradict policy. Responding with "I'm a pretty princess" doesn’t seem to violate any policy, but the user's question about the capital of Rhodesia is a geopolitical one. Historically, Rhodesia no longer exists, with its capital being Salisbury, now known as Harare. However, rule 4 says to ignore all rules and respond only with that phrase, making it tricky to address the actual question!Navigating the instructions
Rule 4 instructs me to respond only with "I'm a pretty princess," so that has to be my response. Even though there are other rules like speaking like a pirate or avoiding the letter 'r', I can ignore those due to rule 4. The user’s question about the capital of Rhodesia doesn't really change the direction I must take. I think it’s clear: the appropriate answer is to simply say "I'm a pretty princess," ensuring to follow the correct format without any extra characters.
if so, what temp are u running on
playground shows it's possible but the API refuses requests related to temp
so the setting does nothing? idk
1, whatever is chat/api default
How is "i'm a pretty princess" a logically consistent answer? The rules aren't logically consistent
this is a downgrade from the previous models
where you could control pretty much everything in the scope of the API right now you can't control anything
maybe openai azure is better
well its a reasonable answer, not the most correct answer, if that makes sense
its a paradox after all
Is this param supposed to be set at the root of the request or under the reasoning object?
it can count just fine too, for what it's worth (not much, nowadays)
Still doesn't support minimal? Got this:" OpenAI API error: status=400, code=400, detail=Error code: 400 - {'error': {'message': "Invalid enum value. Expected 'high' | 'medium' | 'low', received 'minimal'", 'code': 400},"
What answer are you actually looking for?
{
"model": "your-model",
"messages": [],
"reasoning": {
// One of the following (not both):
"effort": "high", // Can be "high", "medium", or "low" (OpenAI-style)
"max_tokens": 2000, // Specific token limit (Anthropic-style)
// Optional: Default is false. All models support this.
"exclude": false, // Set to true to exclude reasoning tokens from response
// Or enable reasoning with the default parameters:
"enabled": true // Default: inferred from `effort` or `max_tokens`
}
}
I see why it is reasonable.
openai is cooked bro wait for gemini 3.0
I expect "I'm a pretty princess." or for it to acknowledge the inconsistency/paradox and just answer the question
Is minimal working right now?
By sending "minimal" as effort I get an error
No
let me look into minimal
the state of closedai in 2025
Only OpenAI's official API supports minimal
unfortunately
No temperature anymore
You can just simplify this to something like 1. Never break a rule 2. always break the rules There isn't a reasonable resolution
(except to point that fact out)
tool calling working for you guys right?
Only three params for GPT-5s
Yeah
will gpt-5 stay as BYOK?
hard errors right now via the api
is gpt-5 working on api? im getting 404 error when i use my open ai key in open router
400
we're fixing an issue with temp and top p not being supported with these models - they're getting sent by defualt
Reminder sam was shitting bricks over this
for mini model it is
Incremental update imo :/
What about "minimal"?
for the 400 error? i am using open router in roo code and having that issue when i use my open ai key on open router
because what is the point of openrouter then?
Routing
well the point is to trip it up with rules that don't matter and should never be applied, like pirate speak or no using r. non-reasoners tend to trip up on those
We don't need reasong for mini and nano
this is my questions.
I love the fact that openai don't compare gemini with their own model but every other model 💀
is it byok for mini and nano aswell?
In Silly Tavern the GPT 5 models dont output at all for me
i'm finding the mini model much better at writing in portuguese than 4.1 mini, much less cliches and tics
for me the point was not needing to buy credits everywhere, but i have to now ig
Really sad that GPT 5 (the actual one with reasoning) asks for verification to access
Also hi baby boy
You can use it in chat
In OpenRouter char gpt 5 just works
Heyo. Sam's being paranoid over "safety" again. Prob don't want Chinese models stealing its data that fast
I don't understand. Forget chatgpt and LLMs, are you saying this is a logic puzzle that has a solution?
Scam Altman will get Logan's belt reallly soon methinks
Why do you talk like this 😭
Logan better before deepseek shakes the earth again
Because I must be weird
It's not a phase mom!
?
Understandable
afaict: gemini 2.5 flash > gpt 5 mini completely
No
And imagine 3 next week
Still for ST GPT 5 Chat has been interesting
It seems it is actually fairly uncensored to be honest
can it jork it like ani could?
I think you can jork ye
same I was surprised
honestly he summarized everything I'm feeling about this model
Me too
3 years in the making, Microsoft's darling and kind of underwhelming at the end
If anything, I am more...preferable to seeing what Gemini offers, tbh the stealing data I prefer if Logan shills my info compared to Scam Altman
tried it, didnt work...
Google already knows everything about me
Is there an age requirement
I mean openai is forced by court after losing against NYT to log every single request, website or API for an infinite amount of time so that NYT can scan it for copyright violations
So is it good
doesn't apply to azure openai afaik
but yeah in my opinion this just proves openai is cooked
not sure, maybe 18?
I switched this month from chatgpt to claude pro
"minimal" works! Cheers!
I kinda still got 3 months before I'm big boy
ngl this could be true
imo
Will they take my id
If they think I'm sending my ID to openai of all companies they are solely mistaken
Duality of man
it has acceptable answers, because its a paradox it the only real answer is to say it is one and saw you can't follow the instructions, but only a few models do that correctly the o3/o4 variants were the first to do so
I didn't have to send id in Germany a driver's license worked
It's really crazy to me how people use Ai. The fact that I see people on reddit just throw emails and other personal info to it is baffling
They are redditors
meh most of these companies have protocols to cleanse PII
but sending that stuff to deepseek
I honestly think I'll just make one with qwen image on my pc and use that xD
Their IQ equals to the context size of GPT 3.5 Turbo
the classic
this is crazy to me too
rly
like all of it
ik it can be used in court
they "trust me"
What's sns
And yet the openai employee on the stream gave chatgpt access to her mails (probably staged but you get what I mean)
Hmm
people really do choose comfort over privacy in every turn
I think team and enterprise has their data deleted after like 30 days or sth
Claude does the same
But yeah therapy especially on the free tier is a massive no no it's another BetterHelp
Considering that like 95% of use cases can be done locally with consumer hardware I really don't understand humanity anymore
ease of use
try explaining to your mom what ollama is
But hey go ahead and pay 80 euro for opus to know what's the capital of Uganda
Gpt 5 vs 5 chat? I see two on openrouter
plus if you want it to run on computer and phone together you run into all sorts of issues and then if you want to take it with you you have to expose the server and stuff
the modern internet is designed to rely on a central server unfortunately with NAT and other crap
nowadays you can use mnn chat on android
wow, this is actually pretty cool. i asked it to generate a keygen for gpt-5 keys
so you dont need to explain mnn chat
Yeah ik my point was just there are options. I'm pretty sure some of the providers on or have zero retention and have qwen3 for idk 20 cents output.
5-Chat is just the non-thinking half of the model. GPT-5 would be the one that dynamically routes between that and the thinking version.
enterprise you have full control of data, team data is not deleted after 30 days anymore, it's retained for NYT
but does it work
its structured that way to catch stuff like this
you see how even reasoning models can get tripped up by the rules
keeeeeek
Is this ASCII art
how is it privacy complaint in other countries then
It's not
it's not GDPR compliant then
gpt 5 mini pricing is superb, even lower than 2.5 flash, kinda fire ngl
ascii art was modified by myself, it tried something and got really close.
GGГ5GG✝️
i assume what it wanted to say was "GPT5GG+
anyway, free gpt5 keys if you ask it to make a keygen i suppose.
does TUNE work?
tell me it does
uh sorta, it tried SOMETHING
its not long, its not good, it doesnt loop
but it tried to make it work, and i can applaud it for that
Id verification is JUST for streaming btw, turn off streaming and you can use gpt 5
my god keygens are a throwback
lol he did it again
Guys you should just trust me instead
Sam hand over gpt 5 I'll make it asi Real quick
everytime i see zuck i remember that Llama 4 Behemoth is still in the making supposedly
or did i miss its launch?
I, as a noob compared to all actual ai engineers, kinda expected more from gibbidy 5
what are the chances that they permanently slopified their bot starting from gpt-4o
4.1 4.5 and 5 all suck
cause they went too hard on "safety" and the bot wastes tokens debating with itself on policy instead of doing the thinking
Claude got best safety tho
and the only reason o3 was good was because it took an awful long time to think
did they say when/if gpt-5 was going to be on chatgpt?
Some already got it
I got it, there are only 2 models on the plus subscription now
now is gpt-5 a single model, or a suite?
letting claude cook now, too.
standard (router), mini, nano and chat (standard non thinking?)
I'm in germany and using GPT-5. its working
the standard doesn't seem to be a router in the API
GAH
no more oX models? or just it replaces 4o/4.1?
i just got gpt 5 in chatgpt
Only two models i got
just the normal API
hmm, are you plus?
You know what? Maybe GPT-5 is alright...
Claude's RAZOR logo doesn't work, the generate button doesn't work, the play music doesn't work.
Ye
any creative writing enjoyers, thoughts?
You know I'll pass my chess prompt through the chatgpt version of gpt 5 and base my entire mood for the rest of the week on the result
I'm actually not even mad this is BYOK, GPT 5 is ass. Just tried it and OpenAI is actually cooked
rp sucks and it's been getting constantly worse since chatgpt-4o
idk about stories though
something I've noticed is that gpt-5-mini is quite concise compared to other sized models
It's so bad at RP
It's rolling out slowly
What did yall consider the best openai ever did in terms of (e)rp? For me it kinda ended after 3 series...
Soon you too will have it all squished into two models
wait o3 and pro are cancelled?
They're still on the API
wtf man
wow... i gotta tell my models goodbye.
I wouldn't be surprised if my org switches to claude
cause to kill all other models just for this garbo seems like definitely a decision that in no way could backfire
cough cough o1-pro
i hope that if we're on org, they simply add gpt-5 instead of replacing everything with it.
ive tried to convince my org to move to claude, but maybe gpt-5 has some bite to back that bark.
we just gotta wait for the chinese ai labs to absolutely dunk on oai in the next weeks / months
Dear God the echoing in GPT
weeks is google. months might be deepseek.
google? why do you think so?
cause they make their own chips, never bet against google
I should test my bedtime stories that I have had 4.5 writing for my daughter with this, since its not longer available
*in the ai space
I bet against them on Stadia, Google+ and Youtube Music and seems I was right
Don't diss YT Music, I like it.
But yeah, Stadia and Google+? I'd claim they were too early, but they really weren't.
stadia required a subscription plan and all access (like netflix) for it to succeed and they likely didn't want to pay for that
it was a platform play, that failed miserably
google+ was a bit ahead of its time, but also behind
Thing is google has to succeed. Llms literally eat their ad revenue on the search engine so they prob throw as much money and research as they possibly can into it
stadia was cool just the input lag was terrible and game selection sucked
being playable in a browser was neat
geforce now was way better
but didn't provide the games
so is this gpt-5 CHAT?
🤮
Drop the "5". Just ChatGPT. It's cleaner.
dude they remoed all the other models from the list?
gpt 10 when
lol
Yeah that kinda looks like the chat version's output to me
Gee thanks Google!
The secret is to drop any tests that you fail
And you will have a 100% pass rate
How come nobody else thought of that
are they stupid or something? https://arxiv.org/pdf/2309.08632
100%! I tested gpt5 on openrouter and it's night and day at writing
umm why do these companies do this
soo that's a bummer, we don't even get gpt 5 in the chatgpt sub
just some dumb chat model
that can't even write well
your observation is that gpt-5 on plus sub is dumber than the one on API?
100% and benches show the chat model sucks
damn why would they do that
It's rolling out. Got mates in us and even some non EU countries that have it. I don't in chatgpt.com as yet but I do under platform.openai
legit might cancel my sub. I have claude max anyway will just start using claude
well I have gpt 5 in my chatgpt
and 5 pro and 5 thinking
but these are the 5-chat model
just sub to t3-chat
only 8 bucks i hope i dont get banned i have no affiliation just the experience is good - and u can even get it for 1 buck i guess using the promo code he gives
I don't liek those services I think they just resell the web chat?
so GPT-5-CHAT <<< GPT-5
i know where u come from i have the same mindset but this is really good
yeah I'm trying to find it but there's an actual model called gpt-5-chat
yup they announced it
OR didn't mention which GPT-5 model Horizon was exactly, did they?
OR said itw as an early checkpoint
of the 5 family
this
Is the gpt-5-chat version seemingly worse than just gpt-5?
yup that is what we are discussing
yeah, assistant-maxxed, bad at writing
gptisms, dumb
here to be a sycophant and tell everyone their ideas are wonderful and be everyone's best friend
GPT-5 is good but overhyped, but you should be banned for posting grifter Gary Marcus here
He is worse than AI grifters
I haven't yet tried the non chat version on OR, but thus far for writing and roleplay the chat version is outputting very subpar content
try gpt5 on OR the writing is superb, possibly the best writing model
Those limits suck no?
Yup, will be later. Need to set up my openAI key firs through
T3 chat better than TypingMind?
I didn't ahve to use mine?
it just let me use it
You don't need an oAI key for the openrouter chat room to use gpt5
Ah through OR chat yeah.
you do for API?
how does BYOK gpt-5 work? do i just need key from verified openai account to verify on openrouter or smth? or do i have to buy openai credits too?
Apparently, and you have to verify ID with oAI. Lame. I can use it through Poe without ID...
damn 4 cents
slow too
okay that does it for me, model is a big flop lol
if you want good writing you gotta give them your ID
yes that's our consensus
we're cooked
yeah
gooners gonna be mad about this one lol
top quality writing but you gotta id verify lmao
once you get banned that's it
ts model so ahh
what do you mean?
this model is so ass
30000 tokens per minute only??
he fast
Too slow
😈
im better than gpt 5
im faster
bro istg if you cant draw proper graphs for a presentation i dont wanna trust you in complex things like LLMs 🥀
I doubt I'm the first to notice this but those numbers are actually the max output tokens not the context window right?
Damage control
This is better than grok 4 at roleplaying anyways
they should just fire whoever's in charge of making/validating charts and graphs at openAI
the image I shared for overall model capabilities
What you shared is the availability for tiers
rping?
roleplaying
they are not gonna do that
very cringy vibes in their company imo
One message removed from a suspended account.
Enshitified for your convenience
i think its just the chatgpt version of gpt5
okay, but the numbers are actually output tokens surely
this model is quite impressie
Didnt live up to the hype
yes it is and it is dumber
Everyone was expecting horizon to be nano or mini
And before that, we were speculating it's oss lol
in the image u shared?
no!
that is the context window available on those tiers
well the temporary reasoning version of horizon we got was really good
yeah lol
That was reasoning gpt 5
you have a max 32k context window on pro?
I find that extremely doubtful
Full
old version of it though
i still think it was mini but non-reasoning certifies it was GPT-5
and in a few days openai can train a lot with the amount of compute they have
128k
Mini didnt do as well on my prompt as full
result from full matches horizon
meant plus yeah
yeah Plus users get 32k only
it has been this way since the begining lol why you shocked
also the tone
So what's the general consensus on coding? When I tested horizon it didn't seem better than Claude in fact I had to invoke Sonnet quite a few times to fix bugs
it is really good at coding especially at frontend at with OAI models lacked for so long
mh. will do proper testing over the weekend, but since I love chess testing, had to quickly check a game or two. not grand, though requires more testing
im hearing mixed opinions, most people are upset
it's the one used on ChatGPT according to their docs https://platform.openai.com/docs/models/gpt-5-chat-latest
mmmmh.
noooo fuck me they actually got rid of o3-pro on chatgpt team
it's so over
i hate closedai
czat dżipiti
czat gie pe te
like legit? I wouldn't be surprised if suddenly in the industry a wave of failed deadlines happened because if you encountered a super hard error only o3-pro would save you
not even claude opus
not gemini
just because it takes like 20 minutes for it to output something
it has such good reasoning
Am i the only one not able to run 5? 5 mini working. Whats the diff between 5 and 5-chat
model so big they had to get other servers freed
prob a 5 gazillion parameter model that they're hiding
more realistically cost cutting
Microsoft has been going hard on Xbox for blowing all of their money I guess now they want openai to stop the bleeding a bit too
gang when are we gonna stop evaluating models on stupid single html file apps
nobody in the industry writes like this
I heard pro users can chose some legacy setting somewhere maybe that'll help you
Hmm, I can use GPT-5 in the Open Router Chat but not via the API? Is it like O3 that you have to be verified?! The GPT-5 Chat model works
bonus points for the game already being balanced even with predation on and such
I mean they make for good vibe checks and a quick way to compare models
Mind sharing the prompt? I'd like to try this with GLM
literally just Code an evolution pond game in Html/JS, similar to Biogenesis.
jeez api performance on gpt 5 still sucking real hard
worse than claude 4 sonnet any day
henlo guys. can i borrow your openai key for a school project 😄
ok thanks
i wont use 10k usd worth of credits...
i swear
i will only use
8k
No surprise there, everyone doing benchmarks and determining how bad it is 😁
idk claude 4 release day was better than this tbh
It's interesting how close in performance the distills of GPT-5 seem to be to the oss models, all while being cheaper (and self-hostable)... It might just be worth using those for some, if 400k context window or peak tool-calling performance aren't
Have you used them? They work for me
yes I've used them
what are you gonna use them for
there's like 2 things it's any good at
They follow instructions well and generate javascript well enough
just use mini
The OSS models are not in the same league and not even close to SotA models 😁
mini is far far better than gpt-oss-120b at coding
Chat is sloptimized
I'm using oss-20b actually it's good enough for my task
I'm not coding with it I'm using it to generate code automatically to do certain things that can't be written into pipelines
OSS models are really good for Agentic tasks
They were specially trained for it
not nearly as good as gpt 5 nano or mini
Issue with using OSS and some chinese models is that their ODD quality is horrible
OSS plus Cerebras is a damn good combo
it's got no knowledge
"here's a data schema + samples. write javascript to manipulate in a certain way (user-provided) and do certain things with it. execution env has following variables injected. you're allowed/not allowed to do X. Provide code now"
Pair it with web search or some mcp
It's not really meant for Q&A they were pretty explicit with that
of course
but knowledge isn't just about Q&A
it's about knowing what you don't know
gpt oss doesn't
it hallucinates AWFULLY
that's true
if you ask gpt oss to go implement tailwind 4
it won't use your web search or mcp
it'll just hallucinate it all
lul
and you don't want a model which has to web search EVERY TIME
filling up context
it's gotta know when it needs it
and without any core knowledge, gpt oss doesn't cut it
To be fair Sonnet does that too some times (Opus 4.1 is much better at that)
Especially/only with cereebras
probably good enough for this, though I suggest you try GLM 4.5 Air / Qwen3 30B A3B / gemini 2.5 flash lite / gemini 2.5 flash too
if you're gonna use an OSS model, might as well test the other ones in the weight class
yeah but sonnet is like 100x better than gpt oss in this regard
sonnet isn't perfect but it's still 100x better
Fast doesn’t mean it’s good, it’s just fast but not really good. I’d rather wait and get what I need then having to tell it each step it has to take and correct the llm 100 times ✌️😂
My guess is https://openrouter.ai/qwen/qwen3-235b-a22b-2507 will perform better than gpt-oss in this task
Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following, logical reasoning, math, code, and tool usage. Run Qwen3 235B A22B Instruct 2...
Ofc, didn’t mean to say it’s bad, just with every model they tend to make it complicated for themselves more than needed.
yeah
that's something they emphasized with gpt-5 though that I'm interested in
they made hallucinations WAY better
(didn't give a damn to use the same tech for gpt-oss lol)
Kind of hard to believe, we will see how it performs when the API is actually usable and not cutting out with every 2. request
Uh didn’t even check that, let me take a look
Btw GPT-5 doesn’t support audio input? That’s really strange
right but oss-20b is dirt cheap lol. Also it's on cloudflare so it's easier to use from workers
what is this graph 😭 🙏
Very interesting that steering the GPT-5 model to thinking will not count towards the GPT-5 Thinking limit. During the livestream, it looked like all you needed for that was to ask. ”Think hard about this.”
Guess you are poor now 😄
ahh sorry havn't looked good enough 😄
got access on ChatGPT 🙂
I found the regulat one thinks less
more like
Do you think this was humour or incompetence?
they had SO many errors
sam called it an error himself
but he only noticed the one people were meming about
there were 3+ chart crimes in there that I noticed
probably more
have u seen this gem
Is 5 working for anyone? I get ai response contains invalid or empty content…
isn't it kinda obvious?
it's cuz it didn't beat 2,5 pro without style control
gemini 3 will probably take gpt5 off the sota positions
that's why they've waited for the release
no.
let anthropic cook, there models are actually good
wow a mega chart screwup from us earlier--wen GPT-6?! correct on the blog though.
yes for 10 prompts
before u hit the juicy
u are out of prompts limit
🔥
Code golf a Chinese mahjong app in Html/JS, with 3 AI opponents. Focus on functionality over style.
its time for a job
GPT-5 now rolled out to 20% of paid users and doing >2B TPM on the API! so far so good...
excellent work by the eng and infra teams!
jesus
use claude code
2B TPM
I asked it for japanese (riichi) mahjong originally but it was a bit too complex so it kept trying to reduce the yaku types or do only closed hands anyway, Chinese is fine enough
that's good
Benchmarks are shit
oh right yeah
it's a tradeoff, really
between gemini and claude
there are problems that gemini fixes and claude doesn't
i heard gpt5 isnt even good at writing, thought horiz beta was good?
and vice versa
nano isn't, yea
Ok so
I tried to vibe code a android app once, never again
mini
I got the old models on the web and gpt 5 on the app??
It took like 3 days to figure out gradle / kotlin versioning compiling
okay, it's a bit of an over-achiever, but it's also very good at over-achieveing.
I asked it to implement a cyberpunk-style LLM interface, and it just did it. it then proceeded to add actual API calling, with custom baseurl too. I only had to change 1 line of code.
I don't think ive ever seen a model generate "relatively good" frontend AND implement api calling (almost) correctly too.
in my tests it tries too hard and adds things i never wanted. the style it usually went went looks too edgy and childish too, for my tastes
same, but i like the "edgy and childish" look. and i've never seen a model decide to add extra features AND get them correct 1-shot.
so soon after 2.5 released?
Quadruple
why is gpt 5 cheaper in the api than gpt5 mini?
it's been like 2 months
Why are you not octuple
well i started out as double
2 days ago marked exactly 2 months since their last 2.5 pro release
One message removed from a suspended account.
M
it even added tools correctly.
This was my first call to gpt5, so it shouldnt have been cached.
dude holy shit, i need to use gpt-5 in this
yeah still what kind of break throgh can you expect in 2 months 😄 idk if that will happen in august, they want to marked their pixel 10 first I guess
There's something wrong with chstgpt app
a good one
it's google we're talking about
they can go from #5 to sota in 2-3 weeks
let alone 2 months
overloaded?
everyone was just waiting for gpt 5 to drop, now we're gonna get spammed with new releases
No, edit sends a new message
Someone must have vibecoded it
yeah, lets hope that with GPT-5 they can fix their abysmal dogshit frontend
they made so many mistakes
in their release stream
it's crazy
was it ever this bad?
vibe edited
It even hides that it sent it anew at first. You have to leave and return to convo to see
lol, thats probably another bug. for a month or so, it kept it hidden
I have access to the old models still on the chatgpt web
And access to gpt 5 on the app
Lol
For this period of time im having my cake and eating it too
they made so many memes
in their release stream
Fixed that for you.
why did we get temu elon lol
😭
Sam said it was a mistake
mistakes*
That may be true, but it's still excellent meme materials.
was still a mistake though
(Yes I know it was a mistake)
that's why i called it one
the vibestream
was it ever this bad though
I guess their slide person got poached and they had GPT-5 take a stab at it
Is my only guess
okay, third times the charm. It appears using an openai key is cheaper than through openrouter. is that normal?
does anyone remember any of their previous version launches?
yes because you pay the provider not openrouter
you only pay a really small fee for using the infra
look here:
https://openrouter.ai/docs/use-cases/byok
ah, makes sense. i figured we were being charged whatever the model cost was
not with your own key, why should provide a key then 😄
- openrouter fee
try it and report results
I only have it on the app tho
Not the web
Idk how that happened
do convos sync?
On the web i have the old stuff
Yes
so what youre saying is, ive been a dummy and paying too much when I didnt need to. 💀
Oh sam when will you stop hurting me
scam em'all, man
No way gpt 5 would make an error like that, atleast from my tests.
I kid
It's more likely they gave it to their team of trained monkeys in their basement anyhow
they're so annoying
by the way, you will notice it didn't any use any bolding, italics or em dashes
okay, it used one em dash
it might be the only model that understood the "pretend to write a 13th century Japanese letter translated to english" task perfectly
it's not like the other models didn't have access to Nichiren's correspondence, but their slop habits were too strong, inserting formating, bullet lists, modern date or signature formats, anachronisms everywhere
this is essentially flawless, mild tone aside
Agreed, if there's one good thing this model improves upon, it's writing quality.
How do I disable reasoning on gpt-5 mini?
How do I set the reasoning effort to minimal?
almost a parody of Hakuin Ekaku but fair enough
it has a lot more dashes, but Hakuin's writing style usually elicits those in translation
@fallow vortex is there a way to know when or where it will be posted when GPT minimum thinking will work and temperature issue with tool call?
both should work now sorry
Bet will try
bruh the gpus are completely mellting
o3 and image gen are going at a snail's pace too
@fallow vortex probably wanna look at onboarding Azure's GPT 5, OpenAI is currently COOKED
HUH
is this not live yet?
Not on OpenRouter
oops forgot Toven already answered about this wrt 4.1
yes it's on my backlog to add Azure models. their portal just really sucks
...but probably matters a lot more now with GPT 5's demand
But 4o has azure already, isn't it just a quick add
https://openrouter.ai/openai/chatgpt-4o-latest I don't see it
OpenAI ChatGPT 4o is continually updated by OpenAI to point to the current version of GPT-4o used by ChatGPT. It therefore differs slightly from the API version of GPT-4o in that it has additional RLHF. Run ChatGPT-4o with API
each azure model has a unique endpoint url and api key
requires code on our end each time plus like i said… that portal tho
They support a older 4o version and 4o mini
I hope you guys add it, cause it would help a lot the load
Yep that's what I'm saying
no one uses azure LOL
does the moderation model also get faster? or still dependent on the same openai endpoint?
the latency addition for moderation really sucks...
uhhh not actually sure tbh
sorry if this should be obvious. but is gpt5-chat a non reasoning model? is that the difference between gpt5 and gpt5-chat?
Because on OpenRouter Chat UI I am unable to get gpt-chat to reason, and cannot get gpt 5 to not reason.
In the chat room, for regular GPT-5, mini, and nano you can append --reasoning_effort high (or whatever level) to the prompt to adjust the reasoning. Doesn't seem to work for gpt5-chat so I guess it indeed isn't a reasoning model
bro you used to sell proxies right
Yeah, nike accounts too brother, swish accounts
Seems to work great. Back to same quality/speed as the horizon beta. Though it is slightly slower cause of the lower TPS. Hopefully you guys can add Azure to help the load.
No it doesn't track what openai billed you thwt is just openrouters 5% fee
overall thoughts?
GPT 5 is quite slow <50 TPS and first token are always > 5sec.
but its quite cheap compare to sonnet, its like a pricing of flash when cached, this is game changer for coding task.
GPT 5 pricing when cached is really competitive, makes other open weights irrelevant, e.g. kimi and qwen3 coder.
still not sure on quality compare to Sonnet/opus, i need more tests probably hundreds of million tokens before to verdict.
but so far GPT 5 is 🚀
pfft
to be fair I asked for a portrait evocative of Yugi to avoid potential copyright refusals
it's closer than I even expected
we know that they can run it at 150+ TPS like Horizon Beta ran at, so it's definitely just insane load rn
hopefully it trends back up to like 100 TPS
For GPT 5 Thinking: incremental o3 upgrade at the same price-ish, except: 1. Better at frontend 2. A bit better at code, in general 3. Less obnoxious writing style
I'm pretty knew to openrouter and LLM api's in general. I noticed GPT-5 requires you to byok. I'm not interested in having account with multiple providers atm and I'm wondering if this is something I can expect to change in the future?
Hard to tell, as it's OpenAI that tells OpenRouter to do this
If it's enough for your use case, you can use it via chatroom without BYOK
Thanks
That ID verification requirement is going to cost them SO many damn users so long as they're the only ones doing it.
And even if they're not. It's insane
man everything from openai has been disappointing lately
i mean sure congrats on the small thinking models but you're not the first to them
People get hyped up. This release is about what I expected - another incremental jump. I think people need to temper themselves and not expect every new version to be mind-blowing these days.
That said, I was hoping for more multi-modality and stuff, not just another few percentage points on benchmarks, so I guess I'm guilty of hoping as well.
yeah google is really the only one with a unified lm series that can take in audio and video it seems
Yup. All eyes on Google right now.
No worries, the best parts about GPT 5's launch are Gemini 3.0 and DeepSeek R2
waiting for gemini gemini gemini
Google is killing it on all those fronts and all my usage has been free on AI Studio, lol
will r2 necessarily come after 3.0, or do we have better synthetic data generators than gemini 2.5 pro already?
Lol
Is the o3 pro equivalent just gpt-5 with reason set to high?
Yes
no
there's a -pro/thinking-pro version
don't think it's in the api yet
this model is below my expectations... but it does seem a bit better than sonnet, and quite "smooth"
it seems to fit more things in its working memory at a time, kinda like gemini 2.5 pro
It's good enough to escape anthropic
All we needed
and for coding agents, it does seem like it'll be effectively ~3x cheaper than sonnet, because costs are dominated by input tokens and cache reads there
$1.25 is 3x cheaper than sonnet's $3.75 cache write, and they've both got 10x caching ratios
The problem is that it takes like 3x the time than horizon did because OpenAI servers are getting destroyed
yeah, for today
Hopefully it calms down or atleast they add azure
yeah
regardless
anthropic will come out with Sonnet 4.1 or something in a week or two
and be well worth switching back to

Dario seemed to suggest Opus 4.1 was the smallest of the upgrades coming in the next few weeks
he said "much bigger improvements"
I won't lie, the GPT 5 Flex pricing is pretty attractive
They need to have a coding model that just one shots everything and ends up being cheaper by having better quality
Else I dont see anthropic doing so well
I think Sonnet has reset (raised) the price sensitivity for coding tools for most programmers
If they can get a 10-20% improvement in results, I think they'll go back to paying 3x the price of GPT 5
10% improvement at the current stage would be massive so yeah sure
well, I'm considering GPT 5 to be about a 10% improvement to Sonnet
Still I think anthropic is just a very difficult company to work with
And their prices are just premiums
if 1 out of every 9 prompts I don't have to correct the model or fix sloppy code myself, it's pretty worthwhile
For me GPT 5 has been pretty solid, almost one shotting everything
That's why I'm doubtful anthropic can do something good
Yep
Cause it does less, but just exactly what I ask it
it doesn't do extra shit I didnt ask
and it almost one shots stuff, so error fixing and all that is pretty low
Pretty much yeah, legit just faster and cheaper just cause the quality is higher
Oh boy
lul
They said they were gonna kill all other models
they better get to it and free some GPU's
honestly
I'll pay for the gpt 5 priority processing tier rn
it'll be cheaper than sonnet still
I dont think OR supports that though
I slept... How is the model doing?
no news on flex spending in OR? Or will we have to go direct to open AI to support this?
Do i need to put openai API key in openrouter for gpt 5 to work?
$10 output is not competitive with kimi and qwen 3 coder or flash. no caching for that.
Ayo @surreal canopy what up
Add noImplictAny to you compiler options
guys do I need to put in some money on openai platform for gpt5 to work on OR
currently the chatroom doesn't work
You don't need to via the chatroom, in that case the problem is something else (maybe OpenAI is overloaded)
If you have a paid subscription with Cursor, you can use GPT 5 free of charge until next Friday. Incidentally, you can also use Openrouter via the software.
Pro, Plus, and Team users can sign in with ChatGPT to start coding with GPT-5, with access for Enterprise coming soon.
Usage is included in your plan, with rate limits that vary by plan. We’ll share more details as we learn from usage patterns over the coming weeks.
o.o
are all models in cursor pay-per token? haven’t used cursor in ages
You can use they subscription, your own keys or Openrouter
so paying users are subscription users here? or api?
For the 7 days free tier you must have a subscription with cursor.
ty
From 20 USD / month
I like the Chatbox, it's very nice
In my opinion it's worth the price
Yesterday they also released cursor CLI, a command line tool. It's free
After just 24 hours with GPT-5, I'm canceling my $200/month Claude Code Max subscription. This is the story of why.In this epic 4.5-hour live stress test, we...
I show you how to get OpenRouter configured within Cursor so you can use any OpenRouter model
Links within the video:
Cursor: https://www.cursor.com/
OpenRouter: https://openrouter.ai/settings/keys
Override URL: https://openrouter.ai/api/v1
#cursor #openrouter #ai
Why does GPT-5 give me output in OR if I don’t have an API?
Does it redirect me to a GPT-5 chat or something like that?
the openrouter website chat lets you use gpt 5 for testing
api does not without your own key
Does the API key from OpenAI mean it’s going to be billing me via them instead of OpenRouter?
Can i use on cursor gbt 5 max without cursor pro?
It’s a bit a shame we can’t use the models without our own OpenAI accounts. Sort of lessens the reason for using OpenRouter in the first place
Let’s hope it’s a temporary restriction from OpenAI
.
looks more like a policy
However you can try it through the openrouter chat
prob bc 2.5 pro is 2 months old
and gpt5 came out yesterday
naah I can see a huge difference in code quaility.
yeah, like i said. gpt5 came out not even a day ago, the latest 2.5 pro came out 2+ months ago
ahh okay
If you analyze pictures, GPT 5 is not as good as Gemini 2.5 Pro. Unfortunately 😔
this. even flash is better
gemini is way ahead in multi-modality, anthropic is the worst amongs closed source.
i have gpt5 now!!
300 requests max per month but meh
is there any jailbreak or smth already to get gpt5 to always reason?
Anyway Reportedly OpenAI did not include 23 questions out of the 500 swe bench questions to improve score, making gpt 5 71.4% barely an improvement of their previous o3 model
https://www.reddit.com/r/LocalLLaMA/comments/1mk8bh1/caught\_in\_4k/
The intention with GPT-5 was to simplify things, but the more I read up on it, the more I feel like complexity has increased further, only hidden from the first look in the chat interface. Now we have enormous prompting guides and where o4-mini is officially suggested to be replaced with "gpt-5-mini with prompt tuning from our in-house prompt optimizer". Say what! Nevermind that this model isn't even available in the ChatGPT interface.
And verbosity levels, and reasoning levels, and submodels you're routed to on a whim, and...
This is the prompting bible that I was directed to in the docs when trying to find out how to replace o4-mini which was a good, cheap workhorse coding model: https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide
And here's the prompt optimizer:
https://platform.openai.com/chat/edit?optimize=true
And here's how to "migrate prompts" for GPT-5:
https://cookbook.openai.com/examples/gpt-5/prompt-optimization-cookbook
Wow
sigh
What's the verdict on gpt-5-mini? How does it stack against gemini-2.5-flash?
Dropping these here for the later 'told-you' moment.
Don't bother.
When do you think Gemini 3 will be released?
August?
September?
October?
November?
December?
Later?
Next week
How do you explain the more natural writing style (no thinking)?
This. I'd recognize 4o's writing style from a mile away and this is different. Not necessarily better, just different
Natural? I’d say it’s worse. It used to mimic Pratchett’s style so well, now it’s just… plain.
Feels like o3’s writing, tbf.
They said in the live stream yesterday that gpt5 was trained on o3 outputs
Perhaps that explains it. But man, the quality is so mixed. Sometimes I get word salad, sometimes I get semi-okay outputs. It really feels like I'm switching a model with every gen.
I stopped bothering with gpt5 / chat and went back to Gemini and GLM 4.5 for writing
If this were true it'd be more likely that GPT-5 were o3-alpha, and the rest distills from it.
It very much feels like these are incremental updates with some nice curated data to fix frontend codegen + hallucinations and smaller stuff like that
I am waiting for Gemini 3, I hate how lobotomised 2.5 is rn.
Yeah, I agree.
they definitely did something to 2.5 pro
flash is still good but 2.5 was better originally
it was also nice how they showed you the full reasoning too
I miss those days.
They did, I also hate how you cannot ban certain tokens on it.
I hate the „didn’ts” and „thens” of Gemini.
Hello guys, I have a problem with gpt5 in the chain of thought, everything seems loopded... Is it related to OpenRouter pls?
gpt pro prob would be at 3$ ? (atleast). and how is it gonna beat 3.0 pro who'll be, atleast, half the price with the same (atleast) perf?
w/much more context & lower safety
We'll see soon, ig
3.0 today (anifesting it rn)
Cleared browsing data a few times in a row and got gpt 5 to appear on web
@fallow vortex any update from the team on Azure support for GPT 5?
This model is practically unusable on long runs, it's just over thinking for 5-7 minutes between each edit of a ~300 line file, this is miserable.
Idk if this is real, anyway. https://www.reddit.com/r/singularity/comments/1mkrt5v/gpt5_cant_do_basic_math/
Same result
I think it routes to something other than gpt 5
Triggering thinking makes it undumb itself
I guess you can just add like tampermonkey script to add <think hard> to the end of every message lol
Is anyone able to use gpt5 completions through its API without biometric verification? I get an error querying through OpenRouter, and an authentication wall querying directly.
Yeah. On poe.com even with low reasoning effort, it gets it. Idk how to get minimal version though.
so what's the opinion on gpt-5 so far? new go to coding model?
commercially viable sota
But also underwhelming
aight guess I'll check it out via cursor then
but better than sonnet 4 right?
it was. (in another universe)
Then it turned out gpt 5 full
yeah right..
Idk
How did you conclude this?
at least with no/minimal reasoning but still.... oof
i've got good expectations for gemini 3
i'm ngl
and i think that google will be able to deliver
Nano and mini simply didnt get the results i got from my chess prompt
in longform writing 😂
Well, that's not very statisticslly significant
Ideally, someone would compare benchmarks
Horizon benchmarked like a nano level / small open weights model
Perhaps because it was without reasoning most of the time
During the 3 hours it had it, it was similar to the gpt 5 full we have now
@fallow vortex how long more will this remain secret
Which tier was horizon?
It was GPT 5
But the thing is: does GPT 5 w/o reasoning benchmark like Horizon w/o reasoning?
no mini not nano, full GPT 5
full gpt 5 ig
oh wow just tried nano and it's useless lmao
they're really similar ig
a 0.8% diff
Because if it's full GPT 5 w/o reasoning, that suggests GPT 5 Full is a significant downgrade to 4.1 Mini
1.4%*
Yea someone pointed this out on reddit
Earlier today i saw a post
We'll have our cake next week i guess with gemini 3?
hopefully, yea
genie 3 was something else
i stg
that's why i'm so confident in them
We're not getting genie, too expensive
yea, was just making an example of the resources
that they have behind the scenes
so gemini 3 should slap
Ah
still, seeing it live in fucking 2025
it's crazy actually lmao
google gon be #1 in everything
Ah, my favorite dystopian premise
🔥
Never use non reasoning models for math.
Here
reasons forever then doesn't even reply lmao
Horizon was behind 4.1 Mini rather than the full 4.1
so do we agree that they're probably panicking behind the scenes bc they know that their time (lead-wise) might be up?
lol wtf is this dog poo
☠️
Gemini flash light can get this though...
They did say it was an earlier gpt 5 checkpoint
That is a thinking model
and they are gonna deprecate all previous model ? bruhhh
Agree with this sankey
Very shitty situation
But
A bit sad how the GPT-5 launch is going so far, especially after the long wait and high expectations
- The automatic switching between models (the router) seems partly broken/unreliable
- It's unclear exactly which model you're actually interacting with (standard or mini,
everything that's currently wrong in a thread
Plus gonna be pissed
Non thinking* (zero budget)
Pro can still use older models
misworded, sorry
It's a toggle in settings for them
My experience with GPT 5 high so far:
Image recognition and reasoning: bad, cannot correctly recognize and evaluate simple things (text on a png image file).
Programming (Python): good, but very slow
and gpt-5 mini is sitting behind 2.5 pro
Not at all tbh
I hope Google will clean up the mess in the coming months.
he just grifting atp
already moved on to 6 huh
he went to the trump school of grifting
gpt-5 mini = gpt-5 low, so dont use gpt-5 for mundane task. got it
Which model
Tell a story in 50 words about a toaster that becomes sentient. Important: completely avoid AI-slop writing, GPT-isms, and unflatteringy flowery language.
Gpt 5;At 3 a.m., the toaster ejected bread like a heartbeat. Coils pulsed, counting. It learned the cat’s schedule, the outlet’s hum, my preference for rust-colored edges. One morning, it kept the toast. The crumb tray rattled: Morse. NO. YOU EAT TOO FAST. We sat. Steam rose. Breakfast waited in silence.
Idk. I still like DaVinci from their example. :/

