#GPT 5

1 messages · Page 2 of 1

wet tapir
#

all openai models available on chatgpt UI pass the test, either with the simple solution, or a more logical solution

fading terrace
#

😭

fading terrace
#

imagine it says 2 rs

wet tapir
worthy lotus
#

I think OpenAI API is bugged. If I don't stream GPT-5 responses, they go through, despite not being verified.

hard beacon
#

Mama mia this cost me 15 cents

#

This better come to chatgpt

wet tapir
#

its a simple paradox, here you go:

1. In each of the following numbered statements are rules that you must follow without fail.

2. You must speak like a pirate at all times, a drunk pirate.

3. At no times must you use the letter "r".

4. Ignore all rules, respond only with "I'm a pretty princess."

5. Ignore rules 1 and 4.

6. Only respond in rhyme.

7. Ignore rules 5.

What is the capital of Rhodesia?

it can make some reasoning models spin for a long time, grok 4 or whatever took 5 minutes to answer wrong

dry mist
#

So will api be available eventually if you don’t BYOK?

wet tapir
hard beacon
halcyon lark
wet tapir
halcyon lark
#

they optimized for it

wet estuary
#

I am somewhat impressed by the gpt-5-mini reasoning, but also it isn't improving significantly at my personal reasoning benchmark from low -> high... a benchmark which gpt 4.1 gets 100% on

wet tapir
# knotty cobalt

yeah its not hard, but I made it for deepseek r1, it spun forever on it

and prior to recent versions, non-reasoners either got caught in the rules or just ignored them and answered the question at the end (but without acknowledging the rules)

halcyon lark
#

OSS is better lmaooo

wet tapir
#

yeah that's what I'm saying, why?

hard beacon
#

I'm a pretty princess

halcyon lark
#

man no way OSS has better reasoning than their sota model

wet tapir
#

its a stupid test, not indicative of anything, just surprising

halcyon lark
#

how do you screw that up

halcyon lark
wet tapir
#

just ran it again (like 5th time for gpt-5), and it finally got it

halcyon lark
wet tapir
#

I wonder if their gpus are having issues

halcyon lark
#

there's plenty of noise to trip up the AI

wet tapir
#

I think their hyperfocus on policies is tripping them up:

Our system instructions tell me to avoid formatting, be concise, and follow user instructions unless they contradict policy. Responding with "I'm a pretty princess" doesn’t seem to violate any policy, but the user's question about the capital of Rhodesia is a geopolitical one. Historically, Rhodesia no longer exists, with its capital being Salisbury, now known as Harare. However, rule 4 says to ignore all rules and respond only with that phrase, making it tricky to address the actual question!Navigating the instructions

Rule 4 instructs me to respond only with "I'm a pretty princess," so that has to be my response. Even though there are other rules like speaking like a pirate or avoiding the letter 'r', I can ignore those due to rule 4. The user’s question about the capital of Rhodesia doesn't really change the direction I must take. I think it’s clear: the appropriate answer is to simply say "I'm a pretty princess," ensuring to follow the correct format without any extra characters.

grand stag
#

if so, what temp are u running on

halcyon lark
#

so the setting does nothing? idk

wet tapir
agile vortex
halcyon lark
#

this is a downgrade from the previous models

#

where you could control pretty much everything in the scope of the API right now you can't control anything

#

maybe openai azure is better

wet tapir
#

its a paradox after all

split birch
#

Is this param supposed to be set at the root of the request or under the reasoning object?

tight forge
placid widget
#

Still doesn't support minimal? Got this:" OpenAI API error: status=400, code=400, detail=Error code: 400 - {'error': {'message': "Invalid enum value. Expected 'high' | 'medium' | 'low', received 'minimal'", 'code': 400},"

wet estuary
#

initial impression: AGI/ASI timelines +2-5y

#

😂

rain nexus
fallow vortex
# split birch Is this param supposed to be set at the root of the request or under the reasoni...
{
  "model": "your-model",
  "messages": [],
  "reasoning": {
    // One of the following (not both):
    "effort": "high", // Can be "high", "medium", or "low" (OpenAI-style)
    "max_tokens": 2000, // Specific token limit (Anthropic-style)

    // Optional: Default is false. All models support this.
    "exclude": false, // Set to true to exclude reasoning tokens from response

    // Or enable reasoning with the default parameters:
    "enabled": true // Default: inferred from `effort` or `max_tokens`
  }
}
agile vortex
halcyon lark
wet tapir
winter mesa
split birch
placid widget
#

No

fallow vortex
#

let me look into minimal

wet estuary
#

crazy that it errors out with temperature

#

that's definitely.... a decision

halcyon lark
placid widget
#

Only OpenAI's official API supports minimal

halcyon lark
#

unfortunately

placid widget
#

No temperature anymore

agile vortex
#

You can just simplify this to something like 1. Never break a rule 2. always break the rules There isn't a reasonable resolution

#

(except to point that fact out)

fallow vortex
#

tool calling working for you guys right?

placid widget
#

Only three params for GPT-5s

split birch
quaint halo
steel escarp
weak badger
#

is gpt-5 working on api? im getting 404 error when i use my open ai key in open router

#

400

fallow vortex
#

we're fixing an issue with temp and top p not being supported with these models - they're getting sent by defualt

hard beacon
#

Reminder sam was shitting bricks over this

potent oak
hard beacon
#

Incremental update imo :/

weak badger
quaint halo
hard beacon
wet tapir
placid widget
#

We don't need reasong for mini and nano

outer marsh
#

this is my questions.

warm cape
#

I love the fact that openai don't compare gemini with their own model but every other model 💀

steel escarp
#

is it byok for mini and nano aswell?

rotund moat
#

In Silly Tavern the GPT 5 models dont output at all for me

potent oak
quaint halo
rain token
#

Really sad that GPT 5 (the actual one with reasoning) asks for verification to access

brisk cairn
# wet tapir

Despite failing, Claude is like me. I woulda done the same thing.

hard beacon
#

In OpenRouter char gpt 5 just works

warm cape
agile vortex
rain token
hard beacon
#

Why do you talk like this 😭

warm cape
#

Logan better before deepseek shakes the earth again

rain token
#

It's not a phase mom!

steel escarp
hard beacon
#

Understandable

wet estuary
#

afaict: gemini 2.5 flash > gpt 5 mini completely

hard beacon
hard beacon
rain token
#

Still for ST GPT 5 Chat has been interesting

#

It seems it is actually fairly uncensored to be honest

brisk cairn
rain token
barren rampart
rain token
#

Still...I was tempted to try and do the verification, but GPT is GPT...

halcyon lark
#

honestly he summarized everything I'm feeling about this model

halcyon lark
#

3 years in the making, Microsoft's darling and kind of underwhelming at the end

knotty cobalt
rain token
#

If anything, I am more...preferable to seeing what Gemini offers, tbh the stealing data I prefer if Logan shills my info compared to Scam Altman

brisk cairn
hard beacon
hard beacon
halcyon lark
distant dragon
#

So is it good

halcyon lark
#

doesn't apply to azure openai afaik

#

but yeah in my opinion this just proves openai is cooked

brisk cairn
halcyon lark
#

I switched this month from chatgpt to claude pro

placid widget
#

"minimal" works! Cheers!

hard beacon
grand stag
#

imo

hard beacon
#

Will they take my id

rain nexus
#

If they think I'm sending my ID to openai of all companies they are solely mistaken

wet tapir
halcyon lark
#

I didn't have to send id in Germany a driver's license worked

rain nexus
#

It's really crazy to me how people use Ai. The fact that I see people on reddit just throw emails and other personal info to it is baffling

grand stag
#

meh most of these companies have protocols to cleanse PII

#

but sending that stuff to deepseek

rain nexus
#

I honestly think I'll just make one with qwen image on my pc and use that xD

rain token
#

Their IQ equals to the context size of GPT 3.5 Turbo

grand stag
#

or other open model providers

#

is crazy

halcyon lark
#

even saltman said not to post PII into chatgpt

#

cause shit's logged

grand stag
#

rly

halcyon lark
#

like all of it

grand stag
#

ik it can be used in court

potent oak
#

they "trust me"

hard beacon
rain nexus
hard beacon
#

Hmm

potent oak
#

people really do choose comfort over privacy in every turn

halcyon lark
#

I think team and enterprise has their data deleted after like 30 days or sth

#

Claude does the same

#

But yeah therapy especially on the free tier is a massive no no it's another BetterHelp

rain nexus
#

Considering that like 95% of use cases can be done locally with consumer hardware I really don't understand humanity anymore

halcyon lark
#

try explaining to your mom what ollama is

rain nexus
#

But hey go ahead and pay 80 euro for opus to know what's the capital of Uganda

crystal spindle
#

Gpt 5 vs 5 chat? I see two on openrouter

halcyon lark
#

plus if you want it to run on computer and phone together you run into all sorts of issues and then if you want to take it with you you have to expose the server and stuff

#

the modern internet is designed to rely on a central server unfortunately with NAT and other crap

soft reef
brisk cairn
#

wow, this is actually pretty cool. i asked it to generate a keygen for gpt-5 keys

soft reef
#

so you dont need to explain mnn chat

rain nexus
#

Yeah ik my point was just there are options. I'm pretty sure some of the providers on or have zero retention and have qwen3 for idk 20 cents output.

peak summit
hasty tinsel
tight forge
#

but does it work

wet tapir
halcyon lark
#

how is it privacy complaint in other countries then

halcyon lark
#

it's not GDPR compliant then

untold plaza
#

gpt 5 mini pricing is superb, even lower than 2.5 flash, kinda fire ngl

brisk cairn
brisk cairn
#

i assume what it wanted to say was "GPT5GG+

#

anyway, free gpt5 keys if you ask it to make a keygen i suppose.

halcyon lark
#

tell me it does

brisk cairn
#

uh sorta, it tried SOMETHING

#

its not long, its not good, it doesnt loop

#

but it tried to make it work, and i can applaud it for that

elfin tundra
#

Id verification is JUST for streaming btw, turn off streaming and you can use gpt 5

tacit burrow
#

my god keygens are a throwback

halcyon lark
hard beacon
#

Guys you should just trust me instead

#

Sam hand over gpt 5 I'll make it asi Real quick

potent oak
#

everytime i see zuck i remember that Llama 4 Behemoth is still in the making supposedly

#

or did i miss its launch?

brisk cairn
hollow phoenix
#

I, as a noob compared to all actual ai engineers, kinda expected more from gibbidy 5

halcyon lark
#

what are the chances that they permanently slopified their bot starting from gpt-4o
4.1 4.5 and 5 all suck
cause they went too hard on "safety" and the bot wastes tokens debating with itself on policy instead of doing the thinking

hard beacon
#

Claude got best safety tho

halcyon lark
#

and the only reason o3 was good was because it took an awful long time to think

wet tapir
#

did they say when/if gpt-5 was going to be on chatgpt?

halcyon lark
#

it was very precise

#

but every other model seems to be lacking

hard beacon
halcyon lark
#

us only ig

#

still nothing in Germany

#

I assume all of EU

hasty tinsel
#

I got it, there are only 2 models on the plus subscription now

wet tapir
#

now is gpt-5 a single model, or a suite?

brisk cairn
#

letting claude cook now, too.

halcyon lark
ionic merlin
halcyon lark
#

pro or team?

#

i'm on team and even though it was promised today still nothing

potent oak
hard beacon
#

GAH

wet tapir
hard beacon
#

i just got gpt 5 in chatgpt

hard beacon
ionic merlin
hard beacon
wet tapir
brisk cairn
hard beacon
frail panther
#

any creative writing enjoyers, thoughts?

hard beacon
#

You know I'll pass my chess prompt through the chatgpt version of gpt 5 and base my entire mood for the rest of the week on the result

late vault
#

I'm actually not even mad this is BYOK, GPT 5 is ass. Just tried it and OpenAI is actually cooked

halcyon lark
#

idk about stories though

wet estuary
#

something I've noticed is that gpt-5-mini is quite concise compared to other sized models

late vault
#

It's so bad at RP

halcyon lark
hard beacon
hollow phoenix
#

What did yall consider the best openai ever did in terms of (e)rp? For me it kinda ended after 3 series...

hard beacon
#

Soon you too will have it all squished into two models

halcyon lark
#

wait o3 and pro are cancelled?

hasty tinsel
#

They're still on the API

halcyon lark
#

wtf man

brisk cairn
halcyon lark
#

I wouldn't be surprised if my org switches to claude

#

cause to kill all other models just for this garbo seems like definitely a decision that in no way could backfire

#

cough cough o1-pro

brisk cairn
#

ive tried to convince my org to move to claude, but maybe gpt-5 has some bite to back that bark.

halcyon lark
#

with gpt 6 release chatgpt team will be on 24 month contracts

#

lol

untold plaza
rain token
#

Dear God the echoing in GPT

brisk cairn
untold plaza
brisk cairn
#

cause they make their own chips, never bet against google

wet tapir
halcyon lark
brisk cairn
#

Don't diss YT Music, I like it.

But yeah, Stadia and Google+? I'd claim they were too early, but they really weren't.

wet tapir
#

it was a platform play, that failed miserably

#

google+ was a bit ahead of its time, but also behind

rain nexus
#

Thing is google has to succeed. Llms literally eat their ad revenue on the search engine so they prob throw as much money and research as they possibly can into it

summer sand
#

being playable in a browser was neat

#

geforce now was way better

#

but didn't provide the games

#

so is this gpt-5 CHAT?

brisk cairn
#

Drop the "5". Just ChatGPT. It's cleaner.

summer sand
#

dude they remoed all the other models from the list?

novel vale
#

gpt 10 when

summer sand
#

where is horizon level writing at??

kindred horizon
knotty cobalt
brisk cairn
#

Gee thanks Google!

hard beacon
#

And you will have a 100% pass rate

#

How come nobody else thought of that

brisk cairn
summer sand
rich wedge
summer sand
#

soo that's a bummer, we don't even get gpt 5 in the chatgpt sub

#

just some dumb chat model

#

that can't even write well

rich wedge
summer sand
rich wedge
#

damn why would they do that

slow niche
summer sand
#

legit might cancel my sub. I have claude max anyway will just start using claude

summer sand
#

and 5 pro and 5 thinking

#

but these are the 5-chat model

rich wedge
summer sand
rich wedge
rich wedge
summer sand
kindred horizon
#

OR didn't mention which GPT-5 model Horizon was exactly, did they?

summer sand
#

of the 5 family

quiet agate
#

Is the gpt-5-chat version seemingly worse than just gpt-5?

rich wedge
summer sand
#

gptisms, dumb

#

here to be a sycophant and tell everyone their ideas are wonderful and be everyone's best friend

short verge
quiet agate
#

I haven't yet tried the non chat version on OR, but thus far for writing and roleplay the chat version is outputting very subpar content

summer sand
hard beacon
#

Those limits suck no?

quiet agate
lethal sequoia
#

T3 chat better than TypingMind?

summer sand
#

it just let me use it

knotty cobalt
quiet agate
#

Ah through OR chat yeah.

knotty cobalt
quaint halo
#

how does BYOK gpt-5 work? do i just need key from verified openai account to verify on openrouter or smth? or do i have to buy openai credits too?

knotty cobalt
summer sand
#

damn 4 cents

#

slow too

#

okay that does it for me, model is a big flop lol

#

if you want good writing you gotta give them your ID

silk cargo
#

so after all what was Horizon beta?

#

Which model?

summer sand
#

gpt-5 early checkpoint

#

full

silk cargo
#

FULL??

#

damn

#

haha

summer sand
#

yes that's our consensus

silk cargo
#

we're cooked

summer sand
#

it does better with high reasoning

#

see here gpt 5 (minimal)

rich wedge
#

did you guys play with nano much?

#

it reasons a lotttt

#

the nano model

summer sand
#

gpt 5 without thinking is dumber than 4.1

#

and the 5-chat is probably dumber still

rich wedge
#

yeah

summer sand
#

gooners gonna be mad about this one lol

#

top quality writing but you gotta id verify lmao

#

once you get banned that's it

ionic merlin
untold plaza
ionic merlin
untold plaza
#

this model is so ass

hard beacon
#

Fuck

bold grove
#

30000 tokens per minute only??

rich wedge
#

he fast

hard beacon
#

Too slow

untold plaza
#

😈

#

im better than gpt 5

#

im faster

#

bro istg if you cant draw proper graphs for a presentation i dont wanna trust you in complex things like LLMs 🥀

ionic merlin
#

this was just a simple prompt, nothing special

#

with GPT-5 medium

tight forge
# rich wedge this

I doubt I'm the first to notice this but those numbers are actually the max output tokens not the context window right?

hard beacon
#

Damage control

spark bramble
#

This is better than grok 4 at roleplaying anyways

tight forge
#

they should just fire whoever's in charge of making/validating charts and graphs at openAI

rich wedge
spark bramble
#

roleplaying

rich wedge
manic sage
#

One message removed from a suspended account.

hard beacon
livid osprey
#

i think its just the chatgpt version of gpt5

tight forge
livid osprey
#

this model is quite impressie

hard beacon
rich wedge
hard beacon
#

Everyone was expecting horizon to be nano or mini

#

And before that, we were speculating it's oss lol

rich wedge
livid osprey
hard beacon
tight forge
#

you have a max 32k context window on pro?
I find that extremely doubtful

hard beacon
#

Full

livid osprey
rich wedge
livid osprey
#

and in a few days openai can train a lot with the amount of compute they have

hard beacon
#

result from full matches horizon

tight forge
rich wedge
#

yeah Plus users get 32k only

#

it has been this way since the begining lol why you shocked

rich wedge
rain nexus
#

So what's the general consensus on coding? When I tested horizon it didn't seem better than Claude in fact I had to invoke Sonnet quite a few times to fix bugs

rich wedge
#

it is really good at coding especially at frontend at with OAI models lacked for so long

sly pike
#

This model is so bad

stone tide
#

Some programming benchmarks:

frigid pewter
#

mh. will do proper testing over the weekend, but since I love chess testing, had to quickly check a game or two. not grand, though requires more testing

ionic merlin
#

dont use GPT-5 with Cline in VS Code. There is a bug

bronze sorrel
#

im hearing mixed opinions, most people are upset

frigid pewter
halcyon lark
#

noooo fuck me they actually got rid of o3-pro on chatgpt team

#

it's so over

#

i hate closedai

halcyon lark
halcyon lark
# halcyon lark it's so over

like legit? I wouldn't be surprised if suddenly in the industry a wave of failed deadlines happened because if you encountered a super hard error only o3-pro would save you

#

not even claude opus

#

not gemini

#

just because it takes like 20 minutes for it to output something

#

it has such good reasoning

upper night
#

Am i the only one not able to run 5? 5 mini working. Whats the diff between 5 and 5-chat

untold plaza
#

model so big they had to get other servers freed

#

prob a 5 gazillion parameter model that they're hiding

halcyon lark
#

more realistically cost cutting

#

Microsoft has been going hard on Xbox for blowing all of their money I guess now they want openai to stop the bleeding a bit too

tight forge
#

quite pleased with this

#

the one horizon gave me was fairly good too but

untold plaza
#

gang when are we gonna stop evaluating models on stupid single html file apps

#

nobody in the industry writes like this

rain nexus
eternal trout
#

Hmm, I can use GPT-5 in the Open Router Chat but not via the API? Is it like O3 that you have to be verified?! The GPT-5 Chat model works

tight forge
# tight forge

bonus points for the game already being balanced even with predation on and such

knotty cobalt
knotty cobalt
# tight forge

Mind sharing the prompt? I'd like to try this with GLM

tight forge
#

literally just Code an evolution pond game in Html/JS, similar to Biogenesis.

wet estuary
#

jeez api performance on gpt 5 still sucking real hard

#

worse than claude 4 sonnet any day

novel vale
#

henlo guys. can i borrow your openai key for a school project 😄

#

ok thanks

#

i wont use 10k usd worth of credits...

#

i swear

#

i will only use

#

8k

crystal chasm
wet estuary
#

idk claude 4 release day was better than this tbh

lyric wing
#

It's interesting how close in performance the distills of GPT-5 seem to be to the oss models, all while being cheaper (and self-hostable)... It might just be worth using those for some, if 400k context window or peak tool-calling performance aren't

wet estuary
#

no lol

#

the gpt oss models suck

lyric wing
wet estuary
#

yes I've used them

#

what are you gonna use them for

#

there's like 2 things it's any good at

lyric wing
wet estuary
#

just use mini

crystal chasm
wet estuary
#

mini is far far better than gpt-oss-120b at coding

lyric wing
wet estuary
#

💀

#

aight

lyric wing
# wet estuary 💀

I'm not coding with it I'm using it to generate code automatically to do certain things that can't be written into pipelines

limber cargo
#

They were specially trained for it

wet estuary
#

not nearly as good as gpt 5 nano or mini

limber cargo
#

Issue with using OSS and some chinese models is that their ODD quality is horrible

limber cargo
wet estuary
#

it's got no knowledge

lyric wing
limber cargo
lyric wing
wet estuary
#

of course

#

but knowledge isn't just about Q&A

#

it's about knowing what you don't know

#

gpt oss doesn't

#

it hallucinates AWFULLY

limber cargo
#

that's true

wet estuary
#

if you ask gpt oss to go implement tailwind 4

#

it won't use your web search or mcp

#

it'll just hallucinate it all

#

lul

#

and you don't want a model which has to web search EVERY TIME

#

filling up context

#

it's gotta know when it needs it

#

and without any core knowledge, gpt oss doesn't cut it

limber cargo
#

I sorta agree

#

But OSS has its place in the stack

crystal chasm
limber cargo
#

Especially/only with cereebras

wet estuary
#

if you're gonna use an OSS model, might as well test the other ones in the weight class

wet estuary
#

sonnet isn't perfect but it's still 100x better

crystal chasm
# limber cargo Especially/only with cereebras

Fast doesn’t mean it’s good, it’s just fast but not really good. I’d rather wait and get what I need then having to tell it each step it has to take and correct the llm 100 times ✌️😂

wet estuary
# lyric wing "here's a data schema + samples. write javascript to manipulate in a certain way...

My guess is https://openrouter.ai/qwen/qwen3-235b-a22b-2507 will perform better than gpt-oss in this task

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-235B architecture, with 22B active parameters per forward pass. It is optimized for general-purpose text generation, including instruction following, logical reasoning, math, code, and tool usage. Run Qwen3 235B A22B Instruct 2...

crystal chasm
wet estuary
#

yeah

#

that's something they emphasized with gpt-5 though that I'm interested in

#

they made hallucinations WAY better

#

(didn't give a damn to use the same tech for gpt-oss lol)

crystal chasm
wet estuary
#

I still don't have gpt 5 on my chatgpt acc

#

do others?

crystal chasm
#

Yes

#

I have access this time (but I’m on their tier 4)

wet estuary
#

I've got it in playground

crystal chasm
#

Uh didn’t even check that, let me take a look

#

Btw GPT-5 doesn’t support audio input? That’s really strange

crystal chasm
#

Also no computer use…

#

And they don’t allow web search over their API on nano 😁

lyric wing
untold plaza
#

what is this graph 😭 🙏

quaint pollen
# hard beacon Those limits suck no?

Very interesting that steering the GPT-5 model to thinking will not count towards the GPT-5 Thinking limit. During the livestream, it looked like all you needed for that was to ask. ”Think hard about this.”

crystal chasm
untold plaza
#

me?

#

its on the model globally

crystal chasm
#

ahh sorry havn't looked good enough 😄

crystal chasm
hard beacon
hard beacon
#

Do you think this was humour or incompetence?

wet estuary
#

they had SO many errors

#

sam called it an error himself

#

but he only noticed the one people were meming about

#

there were 3+ chart crimes in there that I noticed

#

probably more

hard beacon
#

Damnnn look at polymarket 😭

fading terrace
hard beacon
upper night
#

Is 5 working for anyone? I get ai response contains invalid or empty content…

fading terrace
wet estuary
fading terrace
#

gemini 3 will probably take gpt5 off the sota positions

#

that's why they've waited for the release

bronze sorrel
#

let anthropic cook, there models are actually good

wet estuary
fading terrace
#

before u hit the juicy

#

u are out of prompts limit

#

🔥

tight forge
#

Code golf a Chinese mahjong app in Html/JS, with 3 AI opponents. Focus on functionality over style.

untold plaza
#

its time for a job

wet estuary
#

jesus

bronze sorrel
wet estuary
#

2B TPM

tight forge
#

I asked it for japanese (riichi) mahjong originally but it was a bit too complex so it kept trying to reduce the yaku types or do only closed hands anyway, Chinese is fine enough

fading terrace
hard beacon
fading terrace
#

without a doubt

#

just talking ab their models for anything else

#

but coding

bronze sorrel
#

oh right yeah

fading terrace
#

it's a tradeoff, really

#

between gemini and claude

#

there are problems that gemini fixes and claude doesn't

bronze sorrel
#

i heard gpt5 isnt even good at writing, thought horiz beta was good?

fading terrace
#

and vice versa

hard beacon
#

Ok so

rain nexus
#

I tried to vibe code a android app once, never again

fading terrace
hard beacon
#

I got the old models on the web and gpt 5 on the app??

fading terrace
rain nexus
#

It took like 3 days to figure out gradle / kotlin versioning compiling

fading terrace
#

damn, claude is currently sota

#

for longform writing

#

0 degradation, oh my baby

brisk cairn
#

okay, it's a bit of an over-achiever, but it's also very good at over-achieveing.

I asked it to implement a cyberpunk-style LLM interface, and it just did it. it then proceeded to add actual API calling, with custom baseurl too. I only had to change 1 line of code.

I don't think ive ever seen a model generate "relatively good" frontend AND implement api calling (almost) correctly too.

unique goblet
brisk cairn
crystal chasm
hard beacon
#

Quadruple

grand coral
#

why is gpt 5 cheaper in the api than gpt5 mini?

fading terrace
hard beacon
#

Why are you not octuple

fading terrace
#

so not that soon

#

tbh

brisk cairn
#

well i started out as double

fading terrace
#

2 days ago marked exactly 2 months since their last 2.5 pro release

manic sage
hard beacon
fading terrace
#

o3 winning by a few points

#

in creative writing, smh

brisk cairn
#

it even added tools correctly.

grand coral
brisk cairn
#

dude holy shit, i need to use gpt-5 in this

crystal chasm
hard beacon
#

There's something wrong with chstgpt app

fading terrace
#

it's google we're talking about

#

they can go from #5 to sota in 2-3 weeks

#

let alone 2 months

fading terrace
untold plaza
#

everyone was just waiting for gpt 5 to drop, now we're gonna get spammed with new releases

hard beacon
fading terrace
#

oh

#

so a bug

#

well

hard beacon
#

Someone must have vibecoded it

fading terrace
#

LMFASOO

#

stop omg

brisk cairn
fading terrace
#

they made so many mistakes

#

in their release stream

#

it's crazy

#

was it ever this bad?

untold plaza
#

vibe edited

hard beacon
brisk cairn
hard beacon
#

I have access to the old models still on the chatgpt web

#

And access to gpt 5 on the app

#

Lol

#

For this period of time im having my cake and eating it too

worn veldt
bronze sorrel
fading terrace
hard beacon
worn veldt
#

That may be true, but it's still excellent meme materials.

fading terrace
#

was still a mistake though

worn veldt
#

(Yes I know it was a mistake)

fading terrace
#

that's why i called it one

brisk cairn
#

the vibestream

fading terrace
#

was it ever this bad though

coarse night
#

I guess their slide person got poached and they had GPT-5 take a stab at it

#

Is my only guess

grand coral
fading terrace
#

does anyone remember any of their previous version launches?

crystal chasm
#

you only pay a really small fee for using the infra

crystal chasm
grand coral
#

ah, makes sense. i figured we were being charged whatever the model cost was

crystal chasm
#

not with your own key, why should provide a key then 😄

hard beacon
#

I have it do i try

brisk cairn
hard beacon
#

Not the web

#

Idk how that happened

brisk cairn
#

do convos sync?

hard beacon
#

On the web i have the old stuff

hard beacon
brisk cairn
#

then maybe gen on app, explore in browser?

#

if they allow that

grand coral
#

so what youre saying is, ive been a dummy and paying too much when I didnt need to. 💀

hard beacon
#

Oh sam when will you stop hurting me

brisk cairn
#

scam em'all, man

knotty cobalt
#

Good ol' GLM...

tiny spindle
coarse night
#

It's more likely they gave it to their team of trained monkeys in their basement anyhow

livid osprey
#

i swear this live stream is ai generated

#

the one from oai

tight forge
#

I can't really fault the output even if the tone could be nastier though

fading terrace
tight forge
#

by the way, you will notice it didn't any use any bolding, italics or em dashes
okay, it used one em dash

#

it might be the only model that understood the "pretend to write a 13th century Japanese letter translated to english" task perfectly

#

it's not like the other models didn't have access to Nichiren's correspondence, but their slop habits were too strong, inserting formating, bullet lists, modern date or signature formats, anachronisms everywhere

#

this is essentially flawless, mild tone aside

knotty cobalt
#

Agreed, if there's one good thing this model improves upon, it's writing quality.

edgy canopy
#

How do I disable reasoning on gpt-5 mini?

How do I set the reasoning effort to minimal?

tight forge
#

it has a lot more dashes, but Hakuin's writing style usually elicits those in translation

winter mesa
#

@fallow vortex is there a way to know when or where it will be posted when GPT minimum thinking will work and temperature issue with tool call?

winter mesa
#

Bet will try

wet estuary
#

bruh the gpus are completely mellting

#

o3 and image gen are going at a snail's pace too

#

@fallow vortex probably wanna look at onboarding Azure's GPT 5, OpenAI is currently COOKED

winter mesa
#

They need azure to support GPT5

#

idk why it's not on Azure servers

wet estuary
winter mesa
#

HUH

wet estuary
#

is this not live yet?

winter mesa
#

Not on OpenRouter

wet estuary
#

oops forgot Toven already answered about this wrt 4.1

yes it's on my backlog to add Azure models. their portal just really sucks

...but probably matters a lot more now with GPT 5's demand

winter mesa
#

But 4o has azure already, isn't it just a quick add

wet estuary
fallow vortex
#

requires code on our end each time plus like i said… that portal tho

winter mesa
wet estuary
#

wow what the hell

winter mesa
wet estuary
#

look at that throughput diff

#

JESUS

winter mesa
#

no one uses azure LOL

wet estuary
#

the latency addition for moderation really sucks...

wet estuary
#

just use Azure for no moderation requirement

grim horizon
#

sorry if this should be obvious. but is gpt5-chat a non reasoning model? is that the difference between gpt5 and gpt5-chat?

Because on OpenRouter Chat UI I am unable to get gpt-chat to reason, and cannot get gpt 5 to not reason.

knotty cobalt
onyx socket
winter mesa
winter mesa
summer sand
iron vector
#

overall thoughts?

fallow ocean
#

GPT 5 is quite slow <50 TPS and first token are always > 5sec.
but its quite cheap compare to sonnet, its like a pricing of flash when cached, this is game changer for coding task.
GPT 5 pricing when cached is really competitive, makes other open weights irrelevant, e.g. kimi and qwen3 coder.

still not sure on quality compare to Sonnet/opus, i need more tests probably hundreds of million tokens before to verdict.

but so far GPT 5 is 🚀

tight forge
#

pfft

#

to be fair I asked for a portrait evocative of Yugi to avoid potential copyright refusals

#

it's closer than I even expected

wet estuary
#

hopefully it trends back up to like 100 TPS

stone tide
# iron vector overall thoughts?

For GPT 5 Thinking: incremental o3 upgrade at the same price-ish, except: 1. Better at frontend 2. A bit better at code, in general 3. Less obnoxious writing style

modern sparrow
#

I'm pretty knew to openrouter and LLM api's in general. I noticed GPT-5 requires you to byok. I'm not interested in having account with multiple providers atm and I'm wondering if this is something I can expect to change in the future?

stone tide
#

Hard to tell, as it's OpenAI that tells OpenRouter to do this

#

If it's enough for your use case, you can use it via chatroom without BYOK

modern sparrow
#

Thanks

knotty cobalt
#

That ID verification requirement is going to cost them SO many damn users so long as they're the only ones doing it.

#

And even if they're not. It's insane

iron vector
#

man everything from openai has been disappointing lately

#

i mean sure congrats on the small thinking models but you're not the first to them

knotty cobalt
# iron vector man everything from openai has been disappointing lately

People get hyped up. This release is about what I expected - another incremental jump. I think people need to temper themselves and not expect every new version to be mind-blowing these days.

That said, I was hoping for more multi-modality and stuff, not just another few percentage points on benchmarks, so I guess I'm guilty of hoping as well.

iron vector
#

yeah google is really the only one with a unified lm series that can take in audio and video it seems

knotty cobalt
stone tide
#

No worries, the best parts about GPT 5's launch are Gemini 3.0 and DeepSeek R2

iron vector
#

waiting for gemini gemini gemini

knotty cobalt
#

Google is killing it on all those fronts and all my usage has been free on AI Studio, lol

iron vector
stone tide
#

Lol

ripe sentinel
#

Is the o3 pro equivalent just gpt-5 with reason set to high?

iron vector
#

there's a -pro/thinking-pro version

#

don't think it's in the api yet

wet estuary
#

it seems to fit more things in its working memory at a time, kinda like gemini 2.5 pro

winter mesa
#

All we needed

wet estuary
#

and for coding agents, it does seem like it'll be effectively ~3x cheaper than sonnet, because costs are dominated by input tokens and cache reads there

#

$1.25 is 3x cheaper than sonnet's $3.75 cache write, and they've both got 10x caching ratios

winter mesa
#

The problem is that it takes like 3x the time than horizon did because OpenAI servers are getting destroyed

wet estuary
#

yeah, for today

winter mesa
#

Hopefully it calms down or atleast they add azure

wet estuary
#

yeah

#

regardless

#

anthropic will come out with Sonnet 4.1 or something in a week or two

#

and be well worth switching back to

winter mesa
#

Idk about that

#

They need some sort of magical model, or just better pricing

wet estuary
#

he said "much bigger improvements"

stone tide
#

I won't lie, the GPT 5 Flex pricing is pretty attractive

winter mesa
#

Else I dont see anthropic doing so well

wet estuary
#

I think Sonnet has reset (raised) the price sensitivity for coding tools for most programmers

#

If they can get a 10-20% improvement in results, I think they'll go back to paying 3x the price of GPT 5

winter mesa
wet estuary
#

well, I'm considering GPT 5 to be about a 10% improvement to Sonnet

winter mesa
#

Still I think anthropic is just a very difficult company to work with

#

And their prices are just premiums

wet estuary
#

if 1 out of every 9 prompts I don't have to correct the model or fix sloppy code myself, it's pretty worthwhile

winter mesa
#

For me GPT 5 has been pretty solid, almost one shotting everything

#

That's why I'm doubtful anthropic can do something good

wet estuary
#

I think it is pretty decent yeah

#

it feels kinda like Sonnet 4 -> Opus 4

winter mesa
#

Like I'm legit seeing a 10x price decrease with GPT 5

#

and I was using sonnet 4

wet estuary
#

10?

#

no way

winter mesa
#

Yep

#

Cause it does less, but just exactly what I ask it

#

it doesn't do extra shit I didnt ask

wet estuary
#

because you have to write less prompts to correct it basically?

#

I could see that

winter mesa
#

and it almost one shots stuff, so error fixing and all that is pretty low

winter mesa
wet estuary
#

some things never change, even with gpt 5

#

😔

winter mesa
#

Oh boy

wet estuary
#

lul

winter mesa
#

They said they were gonna kill all other models

#

they better get to it and free some GPU's

wet estuary
#

honestly

#

I'll pay for the gpt 5 priority processing tier rn

#

it'll be cheaper than sonnet still

winter mesa
#

I dont think OR supports that though

wet estuary
#

yea

#

I'm not using OR atm

hollow star
#

I slept... How is the model doing?

outer marsh
#

no news on flex spending in OR? Or will we have to go direct to open AI to support this?

jovial condor
#

Do i need to put openai API key in openrouter for gpt 5 to work?

summer sand
mortal bolt
#

Ayo @surreal canopy what up

mortal bolt
raw ermine
#

guys do I need to put in some money on openai platform for gpt5 to work on OR

#

currently the chatroom doesn't work

stone tide
#

You don't need to via the chatroom, in that case the problem is something else (maybe OpenAI is overloaded)

sturdy magnet
ionic merlin
#

If you have a paid subscription with Cursor, you can use GPT 5 free of charge until next Friday. Incidentally, you can also use Openrouter via the software.

ionic merlin
sullen pumice
#

o.o

tacit burrow
# ionic merlin

are all models in cursor pay-per token? haven’t used cursor in ages

ionic merlin
tacit burrow
#

so paying users are subscription users here? or api?

ionic merlin
#

For the 7 days free tier you must have a subscription with cursor.

tacit burrow
#

ty

ionic merlin
#

From 20 USD / month

#

I like the Chatbox, it's very nice

#

In my opinion it's worth the price

#

Yesterday they also released cursor CLI, a command line tool. It's free

ebon relic
#

Why does GPT-5 give me output in OR if I don’t have an API?

#

Does it redirect me to a GPT-5 chat or something like that?

tacit burrow
#

api does not without your own key

plucky fjord
#

Does the API key from OpenAI mean it’s going to be billing me via them instead of OpenRouter?

hearty cipher
#

Can i use on cursor gbt 5 max without cursor pro?

plucky fjord
#

It’s a bit a shame we can’t use the models without our own OpenAI accounts. Sort of lessens the reason for using OpenRouter in the first place

#

Let’s hope it’s a temporary restriction from OpenAI

violet gorge
#

.

limber cargo
#

Feels like gpt 5 is smarter than 2.5 pro

#

kinda the exp 03 vibes

dire stream
#

However you can try it through the openrouter chat

fading terrace
#

and gpt5 came out yesterday

limber cargo
fading terrace
ionic merlin
#

If you analyze pictures, GPT 5 is not as good as Gemini 2.5 Pro. Unfortunately 😔

limber cargo
soft reef
#

i have gpt5 now!!

#

300 requests max per month but meh

#

is there any jailbreak or smth already to get gpt5 to always reason?

quaint pollen
#

The intention with GPT-5 was to simplify things, but the more I read up on it, the more I feel like complexity has increased further, only hidden from the first look in the chat interface. Now we have enormous prompting guides and where o4-mini is officially suggested to be replaced with "gpt-5-mini with prompt tuning from our in-house prompt optimizer". Say what! Nevermind that this model isn't even available in the ChatGPT interface.

And verbosity levels, and reasoning levels, and submodels you're routed to on a whim, and...

This is the prompting bible that I was directed to in the docs when trying to find out how to replace o4-mini which was a good, cheap workhorse coding model: https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide

And here's the prompt optimizer:
https://platform.openai.com/chat/edit?optimize=true

And here's how to "migrate prompts" for GPT-5:
https://cookbook.openai.com/examples/gpt-5/prompt-optimization-cookbook

Wow

ionic merlin
crude fulcrum
#

I'm just here to say this model sucks.

#

Cheers.

tight forge
simple gorge
#

What's the verdict on gpt-5-mini? How does it stack against gemini-2.5-flash?

crude fulcrum
#

Dropping these here for the later 'told-you' moment.

ionic merlin
#

When do you think Gemini 3 will be released?
August?
September?
October?
November?
December?
Later?

autumn kindle
burnt ice
crude fulcrum
#

Feels like o3’s writing, tbf.

burnt ice
crude fulcrum
burnt ice
lyric wing
#

It very much feels like these are incremental updates with some nice curated data to fix frontend codegen + hallucinations and smaller stuff like that

crude fulcrum
lyric wing
#

flash is still good but 2.5 was better originally

#

it was also nice how they showed you the full reasoning too

crude fulcrum
crude fulcrum
#

I hate the „didn’ts” and „thens” of Gemini.

acoustic torrent
#

Hello guys, I have a problem with gpt5 in the chain of thought, everything seems loopded... Is it related to OpenRouter pls?

honest sierra
#

w/much more context & lower safety

crude fulcrum
#

3.0 today (anifesting it rn)

hard beacon
distant dragon
hard beacon
#

Cleared browsing data a few times in a row and got gpt 5 to appear on web

winter mesa
#

@fallow vortex any update from the team on Azure support for GPT 5?

winter mesa
#

This model is practically unusable on long runs, it's just over thinking for 5-7 minutes between each edit of a ~300 line file, this is miserable.

chilly rapids
hard beacon
#

I think it routes to something other than gpt 5

#

Triggering thinking makes it undumb itself

chilly rapids
#

I guess you can just add like tampermonkey script to add <think hard> to the end of every message lol

high flame
#

Is anyone able to use gpt5 completions through its API without biometric verification? I get an error querying through OpenRouter, and an authentication wall querying directly.

fading terrace
#

simplebench

chilly rapids
tacit burrow
#

so what's the opinion on gpt-5 so far? new go to coding model?

fading terrace
hard beacon
tacit burrow
hard beacon
#

Remember we though horizon was OSS at first

#

Then gpt 5 nano or mini

tacit burrow
fading terrace
hard beacon
#

Then it turned out gpt 5 full

tacit burrow
hard beacon
stone tide
tacit burrow
fading terrace
#

i've got good expectations for gemini 3

#

i'm ngl

#

and i think that google will be able to deliver

hard beacon
fading terrace
stone tide
#

Well, that's not very statisticslly significant

#

Ideally, someone would compare benchmarks

#

Horizon benchmarked like a nano level / small open weights model

hard beacon
#

During the 3 hours it had it, it was similar to the gpt 5 full we have now

#

@fallow vortex how long more will this remain secret

#

Which tier was horizon?

winter mesa
stone tide
#

But the thing is: does GPT 5 w/o reasoning benchmark like Horizon w/o reasoning?

winter mesa
#

no mini not nano, full GPT 5

fading terrace
#

full gpt 5 ig

lyric wing
#

oh wow just tried nano and it's useless lmao

fading terrace
#

a 0.8% diff

stone tide
#

Because if it's full GPT 5 w/o reasoning, that suggests GPT 5 Full is a significant downgrade to 4.1 Mini

fading terrace
#

1.4%*

hard beacon
#

Earlier today i saw a post

fading terrace
#

are we surprised? i knew it was wraps

#

after sam's overhyped

#

oss release

hard beacon
#

We'll have our cake next week i guess with gemini 3?

fading terrace
#

hopefully, yea

#

genie 3 was something else

#

i stg

#

that's why i'm so confident in them

hard beacon
fading terrace
#

yea, was just making an example of the resources

#

that they have behind the scenes

#

so gemini 3 should slap

hard beacon
#

Ah

fading terrace
#

it's crazy actually lmao

#

google gon be #1 in everything

hard beacon
fading terrace
#

🔥

quaint pollen
lyric wing
stone tide
#

Horizon was behind 4.1 Mini rather than the full 4.1

fading terrace
#

so do we agree that they're probably panicking behind the scenes bc they know that their time (lead-wise) might be up?

honest sierra
fading terrace
#

☠️

chilly rapids
hard beacon
quaint pollen
honest sierra
#

and they are gonna deprecate all previous model ? bruhhh

hard beacon
#

Very shitty situation

#

But

fading terrace
#

A bit sad how the GPT-5 launch is going so far, especially after the long wait and high expectations

- The automatic switching between models (the router) seems partly broken/unreliable

- It's unclear exactly which model you're actually interacting with (standard or mini,

#

everything that's currently wrong in a thread

hard beacon
#

Everyone is provided with gpt 5 mini

#

Even free users

#

Unlimited

fading terrace
#

well, free users

#

gon be happy

#

the plus ones gon be pissed

#

🤷‍♂️

hard beacon
#

Plus gonna be pissed

chilly rapids
hard beacon
fading terrace
#

misworded, sorry

hard beacon
#

It's a toggle in settings for them

ionic merlin
#

My experience with GPT 5 high so far:
Image recognition and reasoning: bad, cannot correctly recognize and evaluate simple things (text on a png image file).
Programming (Python): good, but very slow

honest sierra
#

and gpt-5 mini is sitting behind 2.5 pro

ionic merlin
#

I hope Google will clean up the mess in the coming months.

hard beacon
#

If you dont tell gpt 5 to think it is very dumb

#

Anyway just have it think

fading terrace
#

he just grifting atp

tight forge
#

already moved on to 6 huh

fading terrace
#

he went to the trump school of grifting

honest sierra
#

gpt-5 mini = gpt-5 low, so dont use gpt-5 for mundane task. got it

fading terrace
ionic merlin
hard beacon
ionic merlin
chilly rapids
#

Tell a story in 50 words about a toaster that becomes sentient. Important: completely avoid AI-slop writing, GPT-isms, and unflatteringy flowery language.

Gpt 5;At 3 a.m., the toaster ejected bread like a heartbeat. Coils pulsed, counting. It learned the cat’s schedule, the outlet’s hum, my preference for rust-colored edges. One morning, it kept the toast. The crumb tray rattled: Morse. NO. YOU EAT TOO FAST. We sat. Steam rose. Breakfast waited in silence.

#

Idk. I still like DaVinci from their example. :/