#GPT-5 released


main panther
#

gpt-5-thinking has a 50%-time horizon of 2h15m, 25 minutes more than Grok 4, the next best model

grizzled marten
#

The benchmarks don't seem to be dramatically higher...

cedar scaffold
#

Don't want to completely dismiss this, but a lot of benchmarks are pretty saturated, and the last few percent might be harder.

scenic nimbus
#

From the System Card:
"In one example, GPT-5 correctly identified its exact testing environment."

#

This is surprisingly good news, though:

#

And they seem to at least be taking AI psychosis seriously.

#

Not to give OpenAI too much credit: in many sections of the system card, they compare the harmful outputs of GPT-5 with o3 (the most egregiously misaligned frontier model ever created), and act like it's wonderful that GPT-5 isn't as bad. Clearing a low bar, there.

kind karma
#

Rare footage of OpenAI doing the bare minimum??

But seriously though, I do think we don't give AI companies enough credit where credit is due.

dawn sable
#

So strategically: this model is designed to damage Anthropic, right? Claude Code with Opus 4.1 is very valuable as a software assistant, and many people (me included) are paying serious money to use it. This is most of Anthropic's revenue.

So you get GPT-5 to the point where it performs as well as Opus 4.1 on SWE-Bench, quote Cursor and Windsurf endorsements, massively undercut on price, and hope it kicks your competitor where it hurts.

coarse jewel
#

I've watched a bunch of Reddit discussions on GPT-5. People generally hate it, not because the model is bad, but because 4o is gone and people liked the 4o vibes. It's quite clear that there is a pretty big discrepancy between what people want and what we fear from AI models.

Most people don't need AGI / superintelligent capabilities. They just want a good chatbot that is a fun buddy that knows a lot. Whether it beats 50% of devs or 99.9% of devs doesn't matter for most. GPT-4 was smart enough and knew enough about the world for >99% of use cases.

What we are concerned about is whether the models are smart enough to do the job of AI researchers or top hackers. These capabilities are not that relevant for most people.

daring tinsel
#

That may be a good sign

#

This type of stuff won't be as popular as before, so companies won't get as much support as before.

#

They will still keep working, but they will have more difficulties.