#general | Arena | Page 84

ocean vortex Aug 7, 2025, 2:30 PM

#

disagree

#

😇

solid brook Aug 7, 2025, 2:30 PM

#

guys competition is good.

eternal niche Aug 7, 2025, 2:31 PM

#

US were created by Europe lol

solid brook Aug 7, 2025, 2:32 PM

#

eternal niche US were created by Europe lol

yeah "americans" are just bunch of europeans

#

only came to the land few hundred years ago

stray aspen Aug 7, 2025, 2:32 PM

#

why is this thing not working

ocean vortex Aug 7, 2025, 2:33 PM

#

What is currently unfolding in US, Europe (or EU) has already lived through and hopefully put behind...

echo aurora Aug 7, 2025, 2:34 PM

#

lets not go down this road pls

solid brook Aug 7, 2025, 2:35 PM

#

echo aurora lets not go down this road pls

I get it here is not the place for political stuff

stray aspen Aug 7, 2025, 2:36 PM

#

google needs to lock in

devout vault Aug 7, 2025, 2:37 PM

#

google will prob win again against gpt-5

#

with their new gemini 3 model

red tangle Aug 7, 2025, 2:37 PM

#

lol gemini 2.5 deepthink is probably better than gpt5

stray aspen Aug 7, 2025, 2:37 PM

#

how do you know that

red tangle Aug 7, 2025, 2:37 PM

#

i have

#

it's miles better than o3 pro

#

i have completely switched from chatgpt to gemini usage because o3 has been such a poor showing

devout vault Aug 7, 2025, 2:38 PM

#

gemini is completely free with almost no usage at all

red tangle Aug 7, 2025, 2:38 PM

#

o3 pro thinks for 20 minutes and is like at max 5% better than o1 pro

stray aspen Aug 7, 2025, 2:38 PM

#

devout vault Aug 7, 2025, 2:38 PM

#

gemini 3 is waiting for gpt-5 to release

#

so they can compare

#

their models

solid brook Aug 7, 2025, 2:38 PM

#

red tangle i have completely switched from chatgpt to gemini usage because o3 has been such...

the deepthink is very limited

devout vault Aug 7, 2025, 2:38 PM

#

gemini 3 is already done training

#

a month ago

ocean vortex Aug 7, 2025, 2:39 PM

#

stray aspen google needs to lock in

they added nice feature to google gpt5 launch time conveniently. If you google for "time 10am pt" this will convert to your local time. How nice of them 😇
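The conversion that search shortcut performs can be sketched in Python. This is only a minimal sketch: it assumes the launch is at 10:00 AM Pacific on Aug 7, 2025, and that Pacific Daylight Time is UTC-7 on that date. The variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Pacific Daylight Time (UTC-7) applies on Aug 7, 2025.
PDT = timezone(timedelta(hours=-7), "PDT")

# Assumed launch time taken from the "10am pt" mentioned in the chat.
launch_pt = datetime(2025, 8, 7, 10, 0, tzinfo=PDT)

# Same instant expressed in UTC and in the machine's local time zone.
launch_utc = launch_pt.astimezone(timezone.utc)
launch_local = launch_pt.astimezone()

print("UTC:  ", launch_utc.strftime("%Y-%m-%d %H:%M %Z"))
print("Local:", launch_local.strftime("%Y-%m-%d %H:%M %Z"))
```

The local line depends on the time zone of whoever runs it, which is the point of the shortcut.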

stray aspen Aug 7, 2025, 2:39 PM

#

i love google

obsidian shell Aug 7, 2025, 2:39 PM

#

gpt5 in 20 minutes?

solid brook Aug 7, 2025, 2:39 PM

#

only 2 hours and 20 minutes left

stray aspen Aug 7, 2025, 2:39 PM

#

obsidian shell gpt5 in 20 minutes?

no

warm fulcrum Aug 7, 2025, 2:40 PM

#

guys why it feel like openai slowed time

red tangle Aug 7, 2025, 2:40 PM

#

gpt5 isn't even out lol

solid brook Aug 7, 2025, 2:40 PM

#

stray aspen i love google

i did but they nerfed the hell out of their pro model

stray aspen Aug 7, 2025, 2:40 PM

#

solid brook i did but they nerfed the hell out of their pro model

yes i noticed that

devout vault Aug 7, 2025, 2:40 PM

#

gemini 2.5 pro was so powerful before when it first released which shows google true strength

red tangle Aug 7, 2025, 2:41 PM

#

doubt

devout vault Aug 7, 2025, 2:41 PM

#

gpt 5 - pricy
gemini 3 - free

solid brook Aug 7, 2025, 2:41 PM

#

good competition is always good for us

stray aspen Aug 7, 2025, 2:41 PM

#

gpt 5 50 euro per token lol

red tangle Aug 7, 2025, 2:41 PM

#

openai has already lost on API and enterprise

white hatch Aug 7, 2025, 2:41 PM

#

stray aspen gpt 5 50 euro per token lol

lol wtf

red tangle Aug 7, 2025, 2:42 PM

#

anthropic

devout vault Aug 7, 2025, 2:42 PM

#

claude is like a human when it comes to coding

red tangle Aug 7, 2025, 2:42 PM

#

why would you use a 10x more expensive model

devout vault Aug 7, 2025, 2:42 PM

#

gpt-5 will be more smarter than claude opus 4 for sure

#

but gemini 3? idk

red tangle Aug 7, 2025, 2:43 PM

#

um

#

im pretty sure they will drop something this month

#

they've been hyping it

solid brook Aug 7, 2025, 2:43 PM

#

claude is so expensive. someone needs to beat them at coding to challenge them

blazing bison Aug 7, 2025, 2:43 PM

#

I got access to gpt 5, the one that plus users will receive

#

Too much hype

red tangle Aug 7, 2025, 2:43 PM

#

the markets are idiots

devout vault Aug 7, 2025, 2:43 PM

#

blazing bison I got access to gpt 5, the one that plus users will receive

how is it?

stray aspen Aug 7, 2025, 2:43 PM

#

blazing bison I got access to gpt 5, the one that plus users will receive

show us

red tangle Aug 7, 2025, 2:43 PM

#

lmarena no style control google is dominating

#

#

it's not even close

#

lol no they arent

solid brook Aug 7, 2025, 2:44 PM

#

blazing bison I got access to gpt 5, the one that plus users will receive

Source: Trust me bro

devout vault Aug 7, 2025, 2:44 PM

#

i think animated is lying

#

lol

red tangle Aug 7, 2025, 2:44 PM

#

theres somehow a 75% chance openai has best model by end of august on polymarket 😂 😂 😂

based on lmarena no style control

#

i did

#

$1k

blazing bison Aug 7, 2025, 2:44 PM

#

Well, no one need to believe me 🤷‍♂️

red tangle Aug 7, 2025, 2:44 PM

#

it's my first time betting on polymarket

devout vault Aug 7, 2025, 2:45 PM

#

blazing bison Well, no one need to believe me 🤷‍♂️

thats what a liar would say bro

#

no proof at all

stray aspen Aug 7, 2025, 2:45 PM

#

devout vault Aug 7, 2025, 2:45 PM

#

stray aspen

LMFAO

blazing bison Aug 7, 2025, 2:45 PM

#

What do you want, prints?

#

Share screen?

devout vault Aug 7, 2025, 2:45 PM

#

blazing bison Share screen?

you cant do none of those

blazing bison Aug 7, 2025, 2:45 PM

#

??

red tangle Aug 7, 2025, 2:46 PM

#

wall street is better than polymarket traders

#

HF analysts actually know how to do math

#

a lot of smart money has googl rn

#

like a loooot

white hatch Aug 7, 2025, 2:46 PM

#

When gpt 5 was announced?

patent aspen Aug 7, 2025, 2:46 PM

#

It's a small enough market that I could move the market

ocean vortex Aug 7, 2025, 2:47 PM

#

lol

solid brook Aug 7, 2025, 2:47 PM

#

ocean vortex lol

imagine if it does not

ocean vortex Aug 7, 2025, 2:47 PM

#

solid brook imagine if it does not

impossible

devout vault Aug 7, 2025, 2:48 PM

#

what is st

stray aspen Aug 7, 2025, 2:48 PM

#

wall street

solid brook Aug 7, 2025, 2:48 PM

#

yes impossible. that is the reason i used the term "imagine"

patent aspen Aug 7, 2025, 2:48 PM

#

ocean vortex lol

I wouldn't read that as "the market thinks it's a 94% chance"

red tangle Aug 7, 2025, 2:48 PM

#

all of the good HFs are subscribed to semianalysis

ocean vortex Aug 7, 2025, 2:49 PM

#

I think it's gonna be made avail immediately. Too big of a release not to. At the very least to their Pro subs same day. But likely more

red tangle Aug 7, 2025, 2:49 PM

#

and have really smart analysts constantly looking at compute & energy that google vs openai are building

#

whereas polymarket is people who invest based on twitter vibes

solid brook Aug 7, 2025, 2:49 PM

#

ocean vortex I think it's gonna be made avail immediately. Too big of a release not to. At th...

with this much hype they have to release a version to free users

patent aspen Aug 7, 2025, 2:50 PM

#

Semi analysis is good if you want invest in the supply chain of the relevant companies

warm fulcrum Aug 7, 2025, 2:51 PM

#

blazing bison I got access to gpt 5, the one that plus users will receive

i believe u

devout vault Aug 7, 2025, 2:51 PM

#

warm fulcrum i believe u

@blazing bison is saying a lot of baloney

warm fulcrum Aug 7, 2025, 2:51 PM

#

devout vault @blazing bison is saying a lot of baloney

hoW?

blazing bison Aug 7, 2025, 2:51 PM

#

devout vault @blazing bison is saying a lot of baloney

Proof?

warm fulcrum Aug 7, 2025, 2:51 PM

#

he would never do such thing

devout vault Aug 7, 2025, 2:51 PM

#

blazing bison Proof?

show a screenshot

#

rn

blazing bison Aug 7, 2025, 2:52 PM

#

Of copilot?

#

🤓

solid brook Aug 7, 2025, 2:52 PM

#

this guy is such a troll

blazing bison Aug 7, 2025, 2:53 PM

#

Im not the only one with access btw

stray aspen Aug 7, 2025, 2:53 PM

#

theres no gpt 5 on copilot yet

devout vault Aug 7, 2025, 2:53 PM

#

stray aspen theres no gpt 5 on copilot yet

ye

warm fulcrum Aug 7, 2025, 2:53 PM

#

blazing bison Im not the only one with access btw

do u have the smart mode thing on copilot?

stray aspen Aug 7, 2025, 2:53 PM

#

i dont

blazing bison Aug 7, 2025, 2:53 PM

#

stray aspen theres no gpt 5 on copilot yet

You need to change things in frontend to access

blazing bison Aug 7, 2025, 2:53 PM

#

warm fulcrum do u have the smart mode thing on copilot?

Yes

warm fulcrum Aug 7, 2025, 2:53 PM

#

blazing bison Yes

ye

#

i saw some other people get it

#

was it just random or did u have to do something?

blazing bison Aug 7, 2025, 2:54 PM

#

Random

eternal niche Aug 7, 2025, 2:54 PM

#

guys i have gpt6

devout vault Aug 7, 2025, 2:54 PM

#

i have gpt-6 on copilot rn

warm fulcrum Aug 7, 2025, 2:54 PM

#

blazing bison Random

how does it perform

#

does it perform like summit or zenith

blazing bison Aug 7, 2025, 2:54 PM

#

Idk too much limited

solid brook Aug 7, 2025, 2:54 PM

#

eternal niche guys i have gpt6

bro you're far behind I have gpt 7

devout vault Aug 7, 2025, 2:54 PM

#

blazing bison Of copilot?

ye

#

do that

blazing bison Aug 7, 2025, 2:54 PM

#

I can't send images

#

For example

stray aspen Aug 7, 2025, 2:55 PM

#

guys ive just been granted access to gpt 8.5 pro high max reasoning 1 billion context

devout vault Aug 7, 2025, 2:55 PM

#

blazing bison I can't send images

and why not LOOL

blazing bison Aug 7, 2025, 2:55 PM

#

Idk

devout vault Aug 7, 2025, 2:55 PM

#

u can send photos here

warm fulcrum Aug 7, 2025, 2:55 PM

#

wait wdym

#

too much limited?

#

rate limit?

blazing bison Aug 7, 2025, 2:55 PM

#

They added a limit of output or something

#

Maybe it's not really gpt 5

ocean vortex Aug 7, 2025, 2:55 PM

#

devout vault and why not LOOL

he sent too many naughty ones

warm fulcrum Aug 7, 2025, 2:56 PM

#

blazing bison Maybe it's not really gpt 5

it is

#

but finetuned

devout vault Aug 7, 2025, 2:56 PM

#

ocean vortex he sent too many naughty ones

ewwww

wicked ingot Aug 7, 2025, 2:56 PM

#

warm fulcrum but finetuned

finetuned?

warm fulcrum Aug 7, 2025, 2:57 PM

#

wicked ingot finetuned?

ye

#

they did the same for gpt 4

devout vault Aug 7, 2025, 2:57 PM

#

i like these ngl

warm fulcrum Aug 7, 2025, 2:57 PM

#

devout vault i like these ngl

price wont be likeable tho

solid brook Aug 7, 2025, 2:59 PM

#

Guys what would you think will be the free tier model with no limits?

torn bison Aug 7, 2025, 2:59 PM

#

red tangle a lot of smart money has googl rn

a lot of insiders has bought openai

#

given that summit and zenith have already been tested in the arena, they have enough confidence to do so

warm fulcrum Aug 7, 2025, 3:00 PM

#

lmarena so greedy

#

they didnt keep zenith

eternal niche Aug 7, 2025, 3:00 PM

#

warm fulcrum lmarena so greedy

lol

whole wagon Aug 7, 2025, 3:00 PM

#

why didnt lm arena give us the sota model for free smh

solid brook Aug 7, 2025, 3:00 PM

#

2 hours left

whole wagon Aug 7, 2025, 3:00 PM

#

Kappa

warm fulcrum Aug 7, 2025, 3:00 PM

#

whole wagon why didnt lm arena give us the sota model for free smh

exactly

#

so greedy

red tangle Aug 7, 2025, 3:02 PM

#

torn bison a lot of insiders has bought openai

no one is deciding to buy or sell openai secondaries based on summit and zenith

#

guarantee 99% of polymarket traders can't even buy secondaries

#

it's very few individuals

#

it's mostly funds that are buying up secondaries

stray aspen Aug 7, 2025, 3:05 PM

#

i think horizon beta has given me the best plug and play main menu for roblocks out of all the LLMs ive tried

rapid merlin Aug 7, 2025, 3:07 PM

#

they having a livestream in about 2 or so hours right

stray aspen Aug 7, 2025, 3:08 PM

#

yes

quartz light Aug 7, 2025, 3:11 PM

#

blazing bison Im not the only one with access btw

`how 2 get

stray aspen Aug 7, 2025, 3:11 PM

#

quartz light `how 2 get

your best bet is using horizon beta

quartz light Aug 7, 2025, 3:11 PM

#

warm fulcrum they didnt keep zenith

whats zenith

warm fulcrum Aug 7, 2025, 3:12 PM

#

quartz light whats zenith

an ai model that was in battle mode (best gpt 5 version)

#

they said it was really good

#

but i never got the chance to try it

hollow imp Aug 7, 2025, 3:14 PM

#

Actual chatgpt plus and pro sora?

molten cipher Aug 7, 2025, 3:17 PM

#

Hey @echo aurora , I was wondering if I could talk with you through DMS or a ticket in this server?

patent aspen Aug 7, 2025, 3:18 PM

#

I think they know but also don't have models that deal well with it. No AI company is profitable and wall street models generally are focused on 6-18 months in the future at best

echo aurora Aug 7, 2025, 3:18 PM

#

molten cipher Hey @echo aurora , I was wondering if I could talk with you through DMS...

Yeah, my DMs are open or you can DM @oak python

quartz light Aug 7, 2025, 3:18 PM

#

warm fulcrum an ai model that was in battle mode (best gpt 5 version)

bruh

wintry tinsel Aug 7, 2025, 3:19 PM

#

China is in actual flames now, just because they open source good models doesn’t mean the country is holding itself together lol

molten cipher Aug 7, 2025, 3:19 PM

#

echo aurora Yeah, my DMs are open or you can DM @oak python

ModMail isn't properly setup

quartz light Aug 7, 2025, 3:19 PM

#

warm fulcrum an ai model that was in battle mode (best gpt 5 version)

why'd they snatch it

warm fulcrum Aug 7, 2025, 3:19 PM

#

quartz light why'd they snatch it

idk

#

greedy mfs

molten cipher Aug 7, 2025, 3:19 PM

#

warm fulcrum Aug 7, 2025, 3:19 PM

#

why would pineapple take away zenith from us??!?

echo aurora Aug 7, 2025, 3:20 PM

#

molten cipher

Hmm okay thanks I’ll look into, very odd. You can DM me

molten cipher Aug 7, 2025, 3:24 PM

#

echo aurora Hmm okay thanks I’ll look into, very odd. You can DM me

Sent a DM.

raven helm Aug 7, 2025, 3:41 PM

#

GPT-5 benchmarks just leaked (It might be fake; take it with a grain of salt.)

#

#

#

#

rapid merlin Aug 7, 2025, 3:42 PM

#

where did you grab this from?

raven helm Aug 7, 2025, 3:42 PM

#

https://x.com/iruletheworldmo/status/1944375085478904118

🍓🍓🍓 (@iruletheworldmo)

excited for gpt5 july 31

i leaked grok’s bench scores so. this feels fair.

hot ai summer.

we’re so back.

lfg.

wild.

it’s a good model sir.

#

But might be fake

#

take it with a grain of salt

rapid merlin Aug 7, 2025, 3:44 PM

#

most likely yeah

zinc ore Aug 7, 2025, 3:44 PM

#

It's strawberry

#

He probably made those this morning

rapid merlin Aug 7, 2025, 3:45 PM

#

excited for gpt5 july 31

#

strawberry?

zinc ore Aug 7, 2025, 3:45 PM

#

Okay I mean on the 13th*

rapid merlin Aug 7, 2025, 3:45 PM

#

wasnt strawberry o1

zinc ore Aug 7, 2025, 3:45 PM

#

Look at the strawberries in his username

raven helm Aug 7, 2025, 3:45 PM

#

He also posted this

#

whole wagon Aug 7, 2025, 3:46 PM

#

bruh

stray aspen Aug 7, 2025, 3:46 PM

#

fake

raven helm Aug 7, 2025, 3:46 PM

#

But again; take it with a grain of salt

whole wagon Aug 7, 2025, 3:46 PM

#

he could have at least attempted to make it look realistic

rapid merlin Aug 7, 2025, 3:46 PM

#

yeah no

raven helm Aug 7, 2025, 3:46 PM

#

whole wagon he could have at least attempted to make it look realistic

What makes it look unrealistic? The insane scores?

whole wagon Aug 7, 2025, 3:47 PM

#

you obviously do not get a base model performing like that on arc agi 2

raven helm Aug 7, 2025, 3:47 PM

#

yea

#

From brockman

#

https://x.com/i/status/1953456219433455898

OpenAI (@OpenAI)

Dropping soon.

stray aspen Aug 7, 2025, 3:49 PM

#

gpt 4.5

#

no way

raven helm Aug 7, 2025, 3:50 PM

#

https://x.com/chetaslua/status/1953420691967016989

Chetaslua (@chetaslua)

🚨 GPT-5 ONE SHOTTED 🚨

SNAKE IN HEXAGON GAME , i gave it difficulty of hexagon and multiple snakes and it handle gracefully one shotted .

If anyone want to test GPT-5 send your prompts

rapid merlin Aug 7, 2025, 3:50 PM

#

gpt 4-p

#

4 pixels

astral jetty Aug 7, 2025, 3:52 PM

#

Very curious to see if the creative writing is less repetitive than even something like Gemini

raven helm Aug 7, 2025, 3:52 PM

#

Yea, i think that will be getting an upgrade

#

(Hopefully there'll be less/no em-dashes)

stray aspen Aug 7, 2025, 3:55 PM

#

horizon beta is so good at lua

solid brook Aug 7, 2025, 3:57 PM

#

raven helm

WTF

#

i feel fake

raven helm Aug 7, 2025, 3:58 PM

#

Yea, no way a base model scores that high

#

or maybe it does

#

idk

solid brook Aug 7, 2025, 3:58 PM

#

that is just way too big

#

way too big

astral jetty Aug 7, 2025, 3:58 PM

#

I can kinda believe it, but I don’t think the gap will be that big between Gemini 2.5 and base gpt 5

solid brook Aug 7, 2025, 3:59 PM

#

if this is true

#

gpt 5 is AGI?

stray aspen Aug 7, 2025, 3:59 PM

#

no

pure anvil Aug 7, 2025, 4:00 PM

#

agi is when spinny hexagon+snake

solid brook Aug 7, 2025, 4:00 PM

#

i guess gpt 5.5 then?

#

I feel we are so close

stray aspen Aug 7, 2025, 4:02 PM

#

we are close to AGI

eternal niche Aug 7, 2025, 4:03 PM

#

agi is fake

torn mantle Aug 7, 2025, 4:05 PM

#

raven helm https://x.com/iruletheworldmo/status/1944375085478904118

not this guy again

#

stop reposting that idiot

raven helm Aug 7, 2025, 4:06 PM

#

Ok, sure. Sorry.

torn mantle Aug 7, 2025, 4:06 PM

#

is it you?

#

strawberry guy = you?

#

😡

raven helm Aug 7, 2025, 4:06 PM

#

No No,

torn mantle Aug 7, 2025, 4:06 PM

#

ok good

raven helm Aug 7, 2025, 4:06 PM

#

I'm just saying i'll stop posting it here.

torn mantle Aug 7, 2025, 4:06 PM

#

thanks

raven helm Aug 7, 2025, 4:06 PM

#

No problem!

eternal niche Aug 7, 2025, 4:09 PM

#

it is

jade egret Aug 7, 2025, 4:10 PM

#

raven helm

0_0

#

btw gpt-5 today right

#

YO

stray aspen Aug 7, 2025, 4:12 PM

#

its out

#

47 minutes

#

for launch

obsidian shell Aug 7, 2025, 4:16 PM

#

i dont think there will be a huge difference

probably 5-9%

#

they are just capitalizing on the hype they have been building

storm needle Aug 7, 2025, 4:17 PM

#

https://github.com/openai/gpt-5-coding-examples/blob/main/apps/webcam-filter-playground/index.html

GitHub

gpt-5-coding-examples/apps/webcam-filter-playground/index.html at m...

GPT-5 coding examples. Contribute to openai/gpt-5-coding-examples development by creating an account on GitHub.

keen beacon Aug 7, 2025, 4:19 PM

#

Ahhhhhhhhhh

jade egret Aug 7, 2025, 4:19 PM

#

better be much better at coding than 4.1 opus (:

quartz light Aug 7, 2025, 4:21 PM

#

https://websim.com/@rat/gpt-5-countdown-party-2

GPT-5 Countdown Party

#

38 minutes

coarse glade Aug 7, 2025, 4:23 PM

#

Guys quick question how do we do text to video on LMArena.ai

echo aurora Aug 7, 2025, 4:25 PM

#

coarse glade Guys quick question how do we do text to video on LMArena.ai

More info can be found in #1397655624103493813 , but the TLDR is use /video in #video-arena-1 #video-arena-2 #video-arena-3

coarse glade Aug 7, 2025, 4:25 PM

#

So I can’t do it on the website

echo aurora Aug 7, 2025, 4:26 PM

#

Video Arena is currently only available through our Discord

torn mantle Aug 7, 2025, 4:33 PM

#

https://github.com/openai/gpt-5-coding-examples/tree/main

GitHub

GitHub - openai/gpt-5-coding-examples: GPT-5 coding examples

GPT-5 coding examples. Contribute to openai/gpt-5-coding-examples development by creating an account on GitHub.

#

some cool demos

torn mantle Aug 7, 2025, 4:33 PM

#

storm needle https://github.com/openai/gpt-5-coding-examples/blob/main/apps/webcam-filter-pla...

oops

blazing bison Aug 7, 2025, 4:33 PM

#

Gpt 5 on copilot feels like 4o v2

stray aspen Aug 7, 2025, 4:33 PM

#

30 minutes

stray aspen Aug 7, 2025, 4:33 PM

#

blazing bison Gpt 5 on copilot feels like 4o v2

show us

blazing bison Aug 7, 2025, 4:33 PM

#

I hope it's not the same

astral jetty Aug 7, 2025, 4:41 PM

#

blazing bison Gpt 5 on copilot feels like 4o v2

Copilot is also kind of mid to begin with ngl

stray aspen Aug 7, 2025, 4:41 PM

#

it is

quartz light Aug 7, 2025, 4:45 PM

#

echo aurora Aug 7, 2025, 4:45 PM

#

Reminder we have our Staff AMA tomorrow with the dev behind our Video Arena bot, if you have any specific questions be sure to add them here

quartz light Aug 7, 2025, 4:46 PM

#

quartz light

please vote :)

echo aurora Aug 7, 2025, 4:46 PM

#

https://discord.com/events/1340554757349179412/1400149736027328623

quartz light Aug 7, 2025, 4:47 PM

#

quartz light

👀

#

12 minutes!

#

vibe coded, lol
https://websim.com/@rat/gpt-5-countdown-party-2

GPT-5 Countdown Party

astral jetty Aug 7, 2025, 4:48 PM

#

quartz light

256k sounds right, I don’t want to get my hopes too high

blazing bison Aug 7, 2025, 4:50 PM

#

From my first impressions, it's not good, but I had good results with zenith

spring rune Aug 7, 2025, 4:50 PM

#

I have this

blazing bison Aug 7, 2025, 4:51 PM

#

https://x.com/OpenAI/status/1953498900230250850?t=JiAJn7rnprqOOM4ou91JqQ&s=19

OpenAI (@OpenAI)

wen GPT-5? In 10 minutes.

https://t.co/EOvjGJrHbj

#

Lesgoo

#

Hype or sota

#

Answer in 10 minutes

wintry hamlet Aug 7, 2025, 4:53 PM

#

how do I use the bot in direct messages

echo aurora Aug 7, 2025, 4:54 PM

#

wintry hamlet how do I use the bot in direct messages

You're unable to, it only works in #video-arena-1 #video-arena-2 #video-arena-3

wintry hamlet Aug 7, 2025, 4:54 PM

#

oh ok. It would be cool if you could

spring rune Aug 7, 2025, 4:54 PM

#

Hey pineapple when gpt-5 added in lmarena?

echo aurora Aug 7, 2025, 4:54 PM

#

wintry hamlet oh ok. It would be cool if you could

Be sure to share feedback about the bot in #bot-feedback

proven raft Aug 7, 2025, 4:54 PM

#

spring rune Hey pineapple when gpt-5 added in lmarena?

Is gpt 5 released?

wintry hamlet Aug 7, 2025, 4:54 PM

#

Ok

astral jetty Aug 7, 2025, 4:55 PM

#

blazing bison https://x.com/OpenAI/status/1953498900230250850?t=JiAJn7rnprqOOM4ou91JqQ&s=19

Is the demo in 5 mins or is the full release in 5 mins

storm needle Aug 7, 2025, 4:58 PM

#

https://platform.openai.com/docs/models/gpt-5

OpenAI Platform

Explore developer resources, tutorials, API docs, and dynamic examples to get the most out of OpenAI's platform.

stray aspen Aug 7, 2025, 4:59 PM

#

gpt 5 is out

#

400k context

proven raft Aug 7, 2025, 4:59 PM

#

Ooo

stray aspen Aug 7, 2025, 4:59 PM

#

@echo aurora add gpt 5 to the arena

astral jetty Aug 7, 2025, 4:59 PM

#

stray aspen 400k context

What is that compared to Gemini

astral jetty Aug 7, 2025, 4:59 PM

#

stray aspen @echo aurora add gpt 5 to the arena

Give him time

rapid merlin Aug 7, 2025, 4:59 PM

#

1 minute yall

void elm Aug 7, 2025, 4:59 PM

#

blazing bison https://x.com/OpenAI/status/1953498900230250850?t=JiAJn7rnprqOOM4ou91JqQ&s=19

the stream is in 10 minutes but when will it be released actually

proven raft Aug 7, 2025, 5:00 PM

#

Yea it's cheaper than gpt 4.5

clever estuary Aug 7, 2025, 5:00 PM

#

stray aspen Aug 7, 2025, 5:00 PM

#

clever estuary Aug 7, 2025, 5:00 PM

#

400k context

void elm Aug 7, 2025, 5:00 PM

#

clever estuary 400k context

how much was o4

#

4o*

proven raft Aug 7, 2025, 5:00 PM

#

void elm how much was o4

200

clever estuary Aug 7, 2025, 5:00 PM

#

4o
2.5 dollar input
10 output
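To put those two numbers in context, here is a rough sketch of how a request cost would be worked out. It assumes the quoted 4o prices are US dollars per one million tokens (the message does not state the unit), and the token counts are made-up example values.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one request, given prices in USD per 1,000,000 tokens (assumed unit)."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000

# Prices quoted in the chat for 4o; the per-million unit is an assumption.
PRICE_IN, PRICE_OUT = 2.5, 10.0

# Hypothetical request: 10,000 input tokens and 1,000 output tokens.
cost = request_cost_usd(10_000, 1_000, PRICE_IN, PRICE_OUT)
print(f"${cost:.3f}")
```

Output tokens cost four times as much as input tokens at these rates, so long answers dominate the bill faster than long prompts.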

proven raft Aug 7, 2025, 5:01 PM

#

I think it was 200k for 4o

#

Most likely

#

o3 deep research is still the smartest one, tho with huge latency

rapid merlin Aug 7, 2025, 5:05 PM

#

even a free tier? goddamn

tame horizon Aug 7, 2025, 5:06 PM

#

clever estuary

What is the name of this site?

whole wagon Aug 7, 2025, 5:06 PM

#

what the hell is this graph

#

makes no sense lmao

brittle tiger Aug 7, 2025, 5:06 PM

#

still don't have audio or video input which is annoying

raven oracle Aug 7, 2025, 5:06 PM

#

yes it does, bottom is no think top is with think

whole wagon Aug 7, 2025, 5:06 PM

#

o3 is at 69.1

#

like the scale of the graph is impossible

#

lmao

void elm Aug 7, 2025, 5:07 PM

#

whole wagon what the hell is this graph

5 increase is so disappointing

keen beacon Aug 7, 2025, 5:07 PM

#

Very happy to hear about it

tame horizon Aug 7, 2025, 5:07 PM

#

tame horizon What is the name of this site?

@clever estuary What is the name of this site?

rapid merlin Aug 7, 2025, 5:07 PM

#

whole wagon what the hell is this graph

https://tenor.com/view/better-call-saul-faint-dead-nope-bye-gif-12483755

Tenor

faint


clever estuary Aug 7, 2025, 5:07 PM

#

tame horizon @clever estuary What is the name of this site?

openAI api site

rapid merlin Aug 7, 2025, 5:07 PM

#

graph made by chatgpt

willow grail Aug 7, 2025, 5:08 PM

#

byebye anthropic

tame horizon Aug 7, 2025, 5:08 PM

#

Thankyou

willow grail Aug 7, 2025, 5:08 PM

#

byebye opus 4.x

#

BYEBYE

#

say hello to gpt5

barren prairie Aug 7, 2025, 5:09 PM

#

Let s test and see if it is great or as always just a hype

willow grail Aug 7, 2025, 5:09 PM

#

were.

#

case closed.

#

you are fanboying billy

brittle tiger Aug 7, 2025, 5:09 PM

#

The disparity between GPT-5 Thinking score (incredible) and no-think (awful) is pretty crazy

willow grail Aug 7, 2025, 5:10 PM

#

wait thats a crow?

whole wagon Aug 7, 2025, 5:10 PM

#

how many thinking tokens is it gonna use kek

willow grail Aug 7, 2025, 5:10 PM

#

omg we both likes crows

rapid merlin Aug 7, 2025, 5:10 PM

#

brittle tiger The disparity between GPT-5 Thinking score (incredible) and no-think (awful) is ...

yeah idk what to think about that

#

barely better than 4o

willow grail Aug 7, 2025, 5:11 PM

#

is it not a crow?

devout vault Aug 7, 2025, 5:11 PM

#

void elm Aug 7, 2025, 5:11 PM

#

brittle tiger The disparity between GPT-5 Thinking score (incredible) and no-think (awful) is ...

not even a 1% increase over 4o is disgusting

brittle tiger Aug 7, 2025, 5:11 PM

#

Thinking score is amazing and I don't know why you would use it without but definitely interesting

void elm Aug 7, 2025, 5:11 PM

#

thats pathetic

keen beacon Aug 7, 2025, 5:11 PM

#

devout vault

Daaaaamn

#

Daamn

void elm Aug 7, 2025, 5:11 PM

#

devout vault

playground? what site

stray aspen Aug 7, 2025, 5:12 PM

#

its out on lmarena

#

gpt 5

hardy pecan Aug 7, 2025, 5:12 PM

#

HOLY

stray aspen Aug 7, 2025, 5:13 PM

#

that was quick

devout vault Aug 7, 2025, 5:13 PM

#

hardy pecan HOLY

damn

astral jetty Aug 7, 2025, 5:13 PM

#

brittle tiger The disparity between GPT-5 Thinking score (incredible) and no-think (awful) is ...

I mean I’m pretty sure thinking is automatic now

stray aspen Aug 7, 2025, 5:13 PM

#

is the gpt 5 in arena think

blazing bison Aug 7, 2025, 5:14 PM

#

hardy pecan HOLY

Lmao

rapid merlin Aug 7, 2025, 5:14 PM

#

hardy pecan HOLY

patent aspen Aug 7, 2025, 5:14 PM

#

GPT-5 seems to be weaker or comparable to Deep Think on all benchmarks without tools

blazing bison Aug 7, 2025, 5:14 PM

#

But it answers faster

#

Its important

primal orbit Aug 7, 2025, 5:14 PM

#

GPT 5 live direct arena!

stray aspen Aug 7, 2025, 5:14 PM

#

gpt 5 crushed the webdev by far

warm fulcrum Aug 7, 2025, 5:14 PM

#

yeehaw

blazing bison Aug 7, 2025, 5:15 PM

#

And you can send more than 5 prompts per day

#

Also important

obsidian cargo Aug 7, 2025, 5:15 PM

#

summit was GPT-5? Then what was zenith?

warm fulcrum Aug 7, 2025, 5:15 PM

#

obsidian cargo summit was GPT-5? Then what was zenith?

beast

hoary plaza Aug 7, 2025, 5:15 PM

#

Ig it's time google will release something 😂

spring rune Aug 7, 2025, 5:15 PM

#

FINALLY YAY!!!

misty star Aug 7, 2025, 5:15 PM

#

@echo aurora I love you

patent aspen Aug 7, 2025, 5:15 PM

#

Only a 21-point ELO lead over 2.5 Pro. We're good
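For a sense of scale, a rating gap can be turned into an expected head-to-head win rate. This is a rough sketch, assuming the arena scores behave like classic Elo ratings on the usual 400-point logistic scale; the leaderboard's actual scoring method may differ, and the function name is illustrative.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected score of the higher-rated side under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))

# The 21-point lead mentioned in the chat.
lead = 21
p = elo_expected_score(lead)
print(f"Expected win rate with a {lead}-point lead: {p:.1%}")
```

Under that assumption a 21-point lead works out to roughly a 53% expected win rate, only slightly better than a coin flip.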

warm fulcrum Aug 7, 2025, 5:15 PM

#

hoary plaza Ig it's time google will release something 😂

its time for everyone to release something

spring rune Aug 7, 2025, 5:15 PM

#

Love you guys

stray aspen Aug 7, 2025, 5:15 PM

#

crazy

hoary plaza Aug 7, 2025, 5:15 PM

#

Since they want #1

exotic tartan Aug 7, 2025, 5:15 PM

#

WOW

whole wagon Aug 7, 2025, 5:16 PM

#

LMAO all the bettors lose with openai

#

it didnt win without style control

prime mulch Aug 7, 2025, 5:16 PM

#

Hell yeah

primal orbit Aug 7, 2025, 5:16 PM

#

lmarena first with gpt5 on the web, gratz

whole wagon Aug 7, 2025, 5:16 PM

#

kekw

clever estuary Aug 7, 2025, 5:16 PM

#

they are deprecating everything???

meager harbor Aug 7, 2025, 5:16 PM

#

what a joke, an AI pioneer not even able to make a proper graph, embarrassing

prime mulch Aug 7, 2025, 5:16 PM

#

@echo aurora thanks a lot

stray aspen Aug 7, 2025, 5:16 PM

#

cooked

patent aspen Aug 7, 2025, 5:17 PM

#

I'm feeling pretty good ngl

stray aspen Aug 7, 2025, 5:17 PM

#

the hype was a marketing stunt

spring rune Aug 7, 2025, 5:17 PM

#

Hey brian! Looks like gpt 5 is indeed becoming underrated right?

elder rapids Aug 7, 2025, 5:17 PM

#

dawg they're deprecating all the previous models

blazing bison Aug 7, 2025, 5:17 PM

#

Much marketing little results

#

Hahahahahahha

clever estuary Aug 7, 2025, 5:17 PM

#

elder rapids dawg they're deprecating all the previous models

crazy
that's actually crazy

keen ferry Aug 7, 2025, 5:17 PM

#

gpt 5 worse than Gemini 2.5 pro?

blazing bison Aug 7, 2025, 5:18 PM

#

Yes

soft river Aug 7, 2025, 5:18 PM

#

I have a question, as soon as they started the live stream, they already published GPT 5 on the official website?

blazing bison Aug 7, 2025, 5:18 PM

#

No

ember sentinel Aug 7, 2025, 5:18 PM

#

guys, when? 🔥

spring rune Aug 7, 2025, 5:18 PM

#

Well animated you know gpt 5 is way better than gemini 2.5 pro right?

barren prairie Aug 7, 2025, 5:18 PM

#

Now Google can have a good sleep 😆

hardy pecan Aug 7, 2025, 5:18 PM

#

no ones interested about non style control

ornate agate Aug 7, 2025, 5:18 PM

#

patent aspen I'm feeling pretty good ngl

I think this is the "google won its over" moment.

blazing bison Aug 7, 2025, 5:18 PM

#

spring rune Well animated you know gpt 5 is way better than gemini 2.5 pro right?

Im gonna be sure of it after testing

primal orbit Aug 7, 2025, 5:18 PM

#

https://i.snipboard.io/mT7P1H.jpg

sleek stump Aug 7, 2025, 5:19 PM

#

what video model this AI is using and how many videos i can generate in a day?

indigo hazel Aug 7, 2025, 5:19 PM

#

spring rune Well animated you know gpt 5 is way better than gemini 2.5 pro right?

i'd like to see the hallucination rate of gemini before saying this, i dont know

meager harbor Aug 7, 2025, 5:19 PM

#

patent aspen Only a 21-point ELO lead over 2.5 Pro. We're good

I predicted a 50 elo jump max above o3, I was right, AI is hitting a plateau

fleet ocean Aug 7, 2025, 5:19 PM

#

sleek stump what video model this AI is using and how many videos i can generate in a day?

8 videos

sleek stump Aug 7, 2025, 5:19 PM

#

fleet ocean 8 videos

daily?

astral jetty Aug 7, 2025, 5:19 PM

#

ornate agate I think this is the "google won its over" moment.

I’m gonna wait until Gemini 3 to fully say that

stray aspen Aug 7, 2025, 5:19 PM

#

cant wait for gemini 3 to destroy OAI

fleet ocean Aug 7, 2025, 5:19 PM

#

sleek stump daily?

Yes

sleek stump Aug 7, 2025, 5:20 PM

#

fleet ocean Yes

and which video model is it?

primal orbit Aug 7, 2025, 5:20 PM

#

primal orbit https://i.snipboard.io/mT7P1H.jpg

Let's wait for gpt5 thinking on arena

fleet ocean Aug 7, 2025, 5:20 PM

#

sleek stump and which video model is it?

You can't select a model. It will independently select.

stray aspen Aug 7, 2025, 5:20 PM

#

horizon beta chat was disabled

split kayak Aug 7, 2025, 5:20 PM

#

ok

sleek stump Aug 7, 2025, 5:20 PM

#

fleet ocean You can't select a model. It will independently select.

ohh ,thanks for the info bro

fleet ocean Aug 7, 2025, 5:21 PM

#

No problem

astral jetty Aug 7, 2025, 5:21 PM

#

primal orbit Let's wait for gpt5 thinking on arena

Isn’t thinking incorporated onto the main model itself

split kayak Aug 7, 2025, 5:21 PM

#

ok

barren prairie Aug 7, 2025, 5:21 PM

#

stray aspen cant wait for gemini 3 to destroy OAI

With a flash 😆

spring rune Aug 7, 2025, 5:21 PM

#

Uh?

civic flame Aug 7, 2025, 5:21 PM

#

@echo aurora can you shed some light on what zenith was? there wasn't just summit

leaden meteor Aug 7, 2025, 5:22 PM

#

They can't. It's against their policy.

storm needle Aug 7, 2025, 5:23 PM

#

civic flame @echo aurora can you shed some light on what zenith was? there wasn't j...

gpt 5 pro

stray aspen Aug 7, 2025, 5:23 PM

#

so summit was gpt 5

cyan zodiac Aug 7, 2025, 5:23 PM

#

does anyone know what makes gpt 5 better than opus 4 at coding?

keen talon Aug 7, 2025, 5:23 PM

#

what is the limit for video arena?

primal orbit Aug 7, 2025, 5:24 PM

#

astral jetty Isn’t thinking incorporated onto the main model itself

Dunno, there have to be different versions still.

stray aspen Aug 7, 2025, 5:24 PM

#

eight

indigo hazel Aug 7, 2025, 5:24 PM

#

i cant wait to see gpt 5 failing at your tests lmao, send images pls

hoary plaza Aug 7, 2025, 5:24 PM

#

I remember seeing opus 4.1 in direct chat

#

Was it removed?

primal orbit Aug 7, 2025, 5:24 PM

#

hoary plaza Was it removed?

yes

torn mantle Aug 7, 2025, 5:24 PM

#

As I've guessed summit = gpt5

#

It wasnt that obvious

stray aspen Aug 7, 2025, 5:25 PM

#

what was zenith

brittle tiger Aug 7, 2025, 5:25 PM

#

system card
https://cdn.openai.com/pdf/8124a3ce-ab78-4f06-96eb-49ea29ffb52f/gpt5-system-card-aug7.pdf

proven raft Aug 7, 2025, 5:25 PM

#

stray aspen Aug 7, 2025, 5:26 PM

#

is the gpt 5 in lmarena reasoning?

tidal schooner Aug 7, 2025, 5:26 PM

#

torn mantle As I've guessed summit = gpt5

what do you think zenith is?

stray aspen Aug 7, 2025, 5:26 PM

#

im guessing it is because it took forever to answer my prompt

proven raft Aug 7, 2025, 5:26 PM

#

Gpt 5 has got 50 percent in hfe

tidal schooner Aug 7, 2025, 5:26 PM

#

tidal schooner what do you think zenith is?

big question atm

proven raft Aug 7, 2025, 5:26 PM

#

That's crazy

fleet lintel Aug 7, 2025, 5:26 PM

#

Am I reading the evals incorrectly? GPT-5 looks underwhelming to me.. What am I missing?

dreamy sparrow Aug 7, 2025, 5:26 PM

#

is every model in lmarena even real

dreamy sparrow Aug 7, 2025, 5:27 PM

#

dreamy sparrow is every model in lmarena even real

is it actually the real model

stray aspen Aug 7, 2025, 5:27 PM

#

dreamy sparrow is every model in lmarena even real

yes not sure for grok 4 tho

meager harbor Aug 7, 2025, 5:27 PM

#

fleet lintel Am I reading the evals incorrectly? GPT-5 looks underwhelming to me.. What am...

it's an improvement, just not that significant.

rapid merlin Aug 7, 2025, 5:27 PM

#

stray aspen is the gpt 5 in lmarena reasoning?

it looks like it, it took like 2 minutes to answer an image examination question

fleet lintel Aug 7, 2025, 5:28 PM

#

Things are slowing down but GPT-5 was supposed to be like multiple levels better ..

proven raft Aug 7, 2025, 5:28 PM

#

rapid merlin it looks like it, it took like 2 minutes to answer an image examination question

Traffic

#

Many ppl are using it

dreamy sparrow Aug 7, 2025, 5:28 PM

#

stray aspen yes not sure for grok 4 tho

ye it's kinda real

#

most apps use it

echo aurora Aug 7, 2025, 5:28 PM

#

dreamy sparrow is it actually the real model

yes

dreamy sparrow Aug 7, 2025, 5:28 PM

#

echo aurora yes

ok

stray aspen Aug 7, 2025, 5:28 PM

#

is it gpt 5 think or no think

leaden meteor Aug 7, 2025, 5:28 PM

#

Why didn't lmarena add gpt 5 thinking on the leaderboard yet? If it is zenith, it should have had enough votes, like summit/gpt5main?

dreamy sparrow Aug 7, 2025, 5:29 PM

#

stray aspen is it gpt 5 think or no think

thino

#

think

rapid merlin Aug 7, 2025, 5:29 PM

#

proven raft Traffic

that's possible too, and i think the image you sent is fake

keen beacon Aug 7, 2025, 5:29 PM

#

People are already complaining about gpt 5, lol

#

Just enjoy it

eternal niche Aug 7, 2025, 5:29 PM

#

so gpt5 is crap

dreamy sparrow Aug 7, 2025, 5:29 PM

#

keen beacon People are already complaining about gpt 5, lol

gemini 2.5 thinks less than it

#

tho idk

#

maybe Gemini IS better

stray aspen Aug 7, 2025, 5:30 PM

#

gemini 3 will destroy it

red sluice Aug 7, 2025, 5:30 PM

#

eternal niche so gpt5 is crap

Barely better than Gemini

keen beacon Aug 7, 2025, 5:30 PM

#

dreamy sparrow maybe Gemini IS better

It's too positive/sycophantic like 4o

#

Don't like it

brittle tiger Aug 7, 2025, 5:30 PM

#

I don't think all of GPT-5 is thinking. It might do some to determine which to route to but they wouldn't differentiate between main and thinking if both were thinking.

fleet lintel Aug 7, 2025, 5:30 PM

#

stray aspen gemini 3 will destroy it

i think next version of gemini 2.5 is going to be better than gpt-5

dusky aurora Aug 7, 2025, 5:30 PM

#

fleet lintel i think next version of gemini 2.5 is going to be better than gpt-5

I can't wait

fleet lintel Aug 7, 2025, 5:30 PM

#

I thought OAI was 3-4 months ahead of Google. I think it's behind now

rapid merlin Aug 7, 2025, 5:31 PM

#

brittle tiger I don't think all of GPT-5 is thinking. It might do some to determine which to r...

i wonder how strong the thinking one is

primal orbit Aug 7, 2025, 5:31 PM

#

Gemini 2.5 was released in March. Google has to have much better version internally now.

barren prairie Aug 7, 2025, 5:31 PM

#

primal orbit Gemini 2.5 was released in March. Google has to have much better version interna...

But it was improved at coding over time

leaden meteor Aug 7, 2025, 5:32 PM

#

brittle tiger I don't think all of GPT-5 is thinking. It might do some to determine which to r...

They have benchmarks for main and thinking. If they are same, why different benchmarks?

fleet lintel Aug 7, 2025, 5:32 PM

#

polymarket suddenly moved hugely in favor of gemini models for Aug

primal orbit Aug 7, 2025, 5:32 PM

#

barren prairie But it was improved at coding over time

it may have improved in coding, but it became sycophantic like 4o

meager harbor Aug 7, 2025, 5:32 PM

#

primal orbit Gemini 2.5 was released in March. Google has to have much better version interna...

the gemini guy kilpatrick said this week was an exciting week so maybe gemini 3 experimental tomorrow?

patent bane Aug 7, 2025, 5:33 PM

#

oh no it's more censored

fleet lintel Aug 7, 2025, 5:33 PM

#

meager harbor the gemini guy kilpatrick said this week was an exciting week so maybe gem...

gemini 3 might drop at the end of Aug but most likely in Sept or Oct. I initially thought it would be Dec but they are accelerating the pace as per some sources

barren prairie Aug 7, 2025, 5:33 PM

#

meager harbor the gemini guy kilpatrick said this week was an exciting week so maybe gem...

Google is now focusing on releasing some features on the gemini app

stray aspen Aug 7, 2025, 5:33 PM

#

LMAO

dusky aurora Aug 7, 2025, 5:34 PM

#

patent bane oh no it's more censored

Gemini is much less censored than the opposition

leaden meteor Aug 7, 2025, 5:34 PM

#

meager harbor the gemini guy kilpatrick said this week was an exciting week so maybe gem...

Didn't we already get genie 3?

meager harbor Aug 7, 2025, 5:34 PM

#

leaden meteor Didn't we already get genie 3?

he said exciting week, not exciting day

jade egret Aug 7, 2025, 5:34 PM

#

is gpt-5 better than gemini

stray aspen Aug 7, 2025, 5:34 PM

#

slightly

barren prairie Aug 7, 2025, 5:35 PM

#

meager harbor he said exciting week, not exciting day

But the story book too

whole wagon Aug 7, 2025, 5:35 PM

#

small haven Aug 7, 2025, 5:35 PM

#

stray aspen LMAO

Wait thats a steal

fleet lintel Aug 7, 2025, 5:35 PM

#

I think wolfstride model is better than gpt-5 ... this is very disappointing 🙁

small haven Aug 7, 2025, 5:35 PM

#

whole wagon

Gemini 3 aint coming in aug tho

meager harbor Aug 7, 2025, 5:36 PM

#

whole wagon

gemini 3 100%

void elm Aug 7, 2025, 5:36 PM

#

what is this creature talking

feral lichen Aug 7, 2025, 5:36 PM

#

Best ai for

#

Rbx studio

#

?

void elm Aug 7, 2025, 5:36 PM

#

shut up

stray aspen Aug 7, 2025, 5:36 PM

#

feral lichen Rbx studio

gpt 5

void elm Aug 7, 2025, 5:36 PM

#

nah

#

best model is https://www.codecademy.com/learn/learn-lua

Codecademy

Learn Lua Programming: Tutorial | Codecademy

Learn the basics of Lua, a general-purpose programming language used for building games, web apps, and developer tools.

keen beacon Aug 7, 2025, 5:36 PM

#

Yes. They are surely updating the gpt image model

civic flame Aug 7, 2025, 5:36 PM

#

storm needle gpt 5 pro

it literally wasn't though

rapid merlin Aug 7, 2025, 5:36 PM

#

fr

feral lichen Aug 7, 2025, 5:37 PM

#

void elm shut up

I use them to help me, not just copy.

civic flame Aug 7, 2025, 5:37 PM

#

it thought for less time than summit on 90% of tasks

#

lmao

#

it also had less juice

#

that was NOT pro

eternal niche Aug 7, 2025, 5:37 PM

#

@deep adder hi

civic flame Aug 7, 2025, 5:37 PM

#

they just picked the checkpoint that won in elo even though it was less performant facepalm

#

openai never help themselves do they

whole wagon Aug 7, 2025, 5:37 PM

#

Oh dear....

dreamy sparrow Aug 7, 2025, 5:37 PM

#

is the gpt in lmarena the pro thinking and not normal one

jade egret Aug 7, 2025, 5:37 PM

#

is gpt-5 good? (better than claude 4.1 opus?)

civic flame Aug 7, 2025, 5:37 PM

#

whole wagon Oh dear....

in logan we trust

meager harbor Aug 7, 2025, 5:37 PM

#

so gpt 5 is 200 elo better than gpt 4 0314 (the og one)

fleet lintel Aug 7, 2025, 5:38 PM

#

whole wagon

who is the only voter in favor of GPT-5 ? :). who is the crazy person? Reveal yourself

red sluice Aug 7, 2025, 5:38 PM

#

Damn so GPT-5 was on lmarena since the 27th of July 🤯

rapid merlin Aug 7, 2025, 5:38 PM

#

whole wagon Oh dear....

LMAOOOOOOOOOO

eternal niche Aug 7, 2025, 5:38 PM

#

well...

jade egret Aug 7, 2025, 5:38 PM

#

whole wagon Oh dear....

what does this mean, people dont like gpt-5?

stray aspen Aug 7, 2025, 5:38 PM

#

eternal niche @deep adder hi

craig eating his words again

whole wagon Aug 7, 2025, 5:38 PM

#

jade egret what does this mean, people dont like gpt-5?

underperformed expectations

red sluice Aug 7, 2025, 5:38 PM

#

...

barren prairie Aug 7, 2025, 5:38 PM

#

fleet lintel who is the only voter in favor of GPT-5 ? :). who is the crazy person? Reveal y...

Screenshot_2025-08-07-18-38-28-221_com.discord.jpg

stray aspen Aug 7, 2025, 5:39 PM

#

craig stop licking openai's boots

jade egret Aug 7, 2025, 5:39 PM

#

fleet lintel who is the only voter in favor of GPT-5 ? :). who is the crazy person? Reveal y...

lol

native flame Aug 7, 2025, 5:39 PM

#

Hii, this gpt-5 has the thinking mode activated by default? Like the o3??

dreamy sparrow Aug 7, 2025, 5:39 PM

#

barren prairie

Gemini 3 100%

#

lmao

jade egret Aug 7, 2025, 5:39 PM

#

dreamy sparrow Gemini 3 100%

fr

mental briar Aug 7, 2025, 5:39 PM

#

native flame Hii, this gpt-5 has the thinking mode activated by default? Like the o3??

no

dreamy sparrow Aug 7, 2025, 5:39 PM

#

it's google mate

fleet lintel Aug 7, 2025, 5:39 PM

#

jade egret lol

it's @deep adder 🙂 not surprised

wheat onyx Aug 7, 2025, 5:39 PM

#

Absolutely huge

dreamy sparrow Aug 7, 2025, 5:39 PM

#

mental briar no

it has thinking bruv

mental briar Aug 7, 2025, 5:40 PM

#

but not by default

dreamy sparrow Aug 7, 2025, 5:40 PM

#

mental briar but not by default

well

#

yeah i guess?

brittle tiger Aug 7, 2025, 5:40 PM

#

https://x.com/AiBattle_/status/1953508582927778188?t=xXydAMuMP2I9boJd_8QmqA&s=19

AiBattle (@AiBattle_)

GPT-5 (High) scores 9.9% on ARC-AGI-2, Grok 4 (Thinking) scored 16.0%

keen beacon Aug 7, 2025, 5:40 PM

#

wheat onyx Absolutely huge

yes

meager harbor Aug 7, 2025, 5:40 PM

#

meager harbor so gpt 5 is 200 elo better than gpt 4 0314 (the og one)

2 and a half years to get a 200 elo boost. we'll see where we're at in another 2 and a half years, but I predict max 100-120 elo if we keep the same pace. but yeah, sometimes people dont realise when a model hallucinates, especially on hard tasks, so an elo plateau is to be expected

lavish orchid Aug 7, 2025, 5:40 PM

#

damn the context windows 😭 https://x.com/ahmetbuilds/status/1953511311175737370

Ahmet Dedeler (@ahmetbuilds)

GPT-5 context windows in ChatGPT:

8k for free users, 32k for Plus, 128k for Pro

same numbers as it was before with gpt-4o, bummer they didn't increase it since the model now has 400k context window

keen beacon Aug 7, 2025, 5:41 PM

#

Can we people stop complaining already? Take a deep breath

fleet lintel Aug 7, 2025, 5:41 PM

#

OMG!

civic flame Aug 7, 2025, 5:41 PM

#

jesus

steady vale Aug 7, 2025, 5:41 PM

#

grok 4 honestly kinda sucks

for any type of question that isn't just benchmark maxxing types

fleet lintel Aug 7, 2025, 5:41 PM

#

If I remove style control... 2.5 > Gpt-5.. WTF

civic flame Aug 7, 2025, 5:41 PM

#

1 point better

#

💀

stray aspen Aug 7, 2025, 5:41 PM

#

lavish orchid damn the context windows 😭 https://x.com/ahmetbuilds/status/195351131117573737...

8k are you serious

verbal nimbus Aug 7, 2025, 5:41 PM

#

It's crazy and kinda scary how much Gemini 2.5 Pro hallucinates.

rapid merlin Aug 7, 2025, 5:41 PM

#

32K? 🥀 🥀

#

yeah, go google go go go

verbal nimbus Aug 7, 2025, 5:42 PM

#

rapid merlin 32K? 🥀 🥀

Yeah, although if you look at the system prompt, I think the model doesn't enforce a limit most of the time on auto.

void elm Aug 7, 2025, 5:42 PM

#

gpt 5 is a disappointment

#

gemini 3 last hope

fleet lintel Aug 7, 2025, 5:42 PM

#

This release is breaking my heart :(. I had so much hope

fleet lintel Aug 7, 2025, 5:42 PM

#

void elm gemini 3 last hope

What if Google also fails 🙁

astral jetty Aug 7, 2025, 5:42 PM

#

steady vale grok 4 honestly kinda sucks for any type of question that isn't just benchmark ...

Yeah, I didn’t like grok 4 that much when I tried it

stray aspen Aug 7, 2025, 5:42 PM

#

if its gonna have an 8k window for free users ill just use google ai studio

#

that sucks

verbal nimbus Aug 7, 2025, 5:42 PM

#

verbal nimbus It's crazy and kinda scary how much Gemini 2.5 Pro hallucinates.

For example, I told it to use Google Search. It told me it did. But you can see that there are no citations returned, which means it didn't.

void elm Aug 7, 2025, 5:43 PM

#

fleet lintel What if Google also fails 🙁

idk ask yourself why ai progress is slowing down all of a sudden

#

so much hype but so little progress

rapid merlin Aug 7, 2025, 5:43 PM

#

verbal nimbus Yeah, although if you look at the system prompt, I think the model doesn't enfor...

I really hope it doesn't get dementia past the said limit, would be pretty damn hard to use for anything that isnt a simple task

astral jetty Aug 7, 2025, 5:43 PM

#

fleet lintel What if Google also fails 🙁

We’ve probably hit a plateau

whole wagon Aug 7, 2025, 5:43 PM

#

Google has incredible stuff upcoming there is no worries trust me. I am a bit shocked by this gpt5 release

#

i thought they would have cooked for sure

#

Tbh i tried gpt5 and the vibes are great

stray aspen Aug 7, 2025, 5:43 PM

#

i hope google doesnt mess up

verbal nimbus Aug 7, 2025, 5:43 PM

#

verbal nimbus For example, I told it to use Google Search. It told me it did. But you can see ...

Then I told it to explicitly include the link from each website it found, and a quote from each. And it just started making non-existent links and quotes up. Like what the heck.

empty stump Aug 7, 2025, 5:43 PM

#

Hi how good is gpt 5

eternal niche Aug 7, 2025, 5:43 PM

#

craig still coping

civic flame Aug 7, 2025, 5:43 PM

#

michelle please stop i can't see you through the tears in my eyes

stray aspen Aug 7, 2025, 5:43 PM

#

empty stump Hi how good is gpt 5

its mid

whole wagon Aug 7, 2025, 5:43 PM

#

like i tried actual help for real world coding and it was pretty good

#

maybe dont judge pure off benchmarks just yet

verbal nimbus Aug 7, 2025, 5:44 PM

#

verbal nimbus Then I told it to explicitly include the link from each website it found, and a ...

It's completely lying to the user about having used the search tool during thinking.

empty stump Aug 7, 2025, 5:44 PM

#

stray aspen its mid

4.5 type of thing?

keen beacon Aug 7, 2025, 5:44 PM

#

empty stump Hi how good is gpt 5

A lot less hallucinations which is a big thing

eternal niche Aug 7, 2025, 5:44 PM

#

empty stump Hi how good is gpt 5

keen beacon Aug 7, 2025, 5:45 PM

#

eternal niche

Needs more votes

native flame Aug 7, 2025, 5:45 PM

#

Well then how can I make the GPT-5 on lmarena think?? Like the reasoning models?

feral lichen Aug 7, 2025, 5:45 PM

#

You know, the human brain can reach further than an ai

barren prairie Aug 7, 2025, 5:45 PM

#

keen beacon Needs more votes

To be more down

keen beacon Aug 7, 2025, 5:45 PM

#

barren prairie To be more down

?

verbal nimbus Aug 7, 2025, 5:45 PM

#

native flame Well then how can I make the GPT-5 on lmarena think?? Like the reasoning...

Is it not thinking?

keen beacon Aug 7, 2025, 5:45 PM

#

You guys are so negative

indigo hazel Aug 7, 2025, 5:45 PM

#

guys i honestly think the hallucination improvement is still really good even if it's not the best model in every task

void elm Aug 7, 2025, 5:45 PM

#

AI is NOT replacing jobs 💀

quartz light Aug 7, 2025, 5:45 PM

#

Poll result: GPT-5 Context Window Predictions. Winning answer: 256k 😕 (7 of 14 votes)

void elm Aug 7, 2025, 5:45 PM

#

1 year wait for this trash btw

iron meadow Aug 7, 2025, 5:45 PM

#

So

blazing bison Aug 7, 2025, 5:45 PM

#

1 year?

#

2 years

rapid merlin Aug 7, 2025, 5:45 PM

#

void elm Aug 7, 2025, 5:45 PM

#

more yea

fleet lintel Aug 7, 2025, 5:45 PM

#

void elm AI is NOT replacing jobs 💀

that's one good thing about this release 😄 😄

blazing bison Aug 7, 2025, 5:45 PM

#

2 years for a mid model

iron meadow Aug 7, 2025, 5:46 PM

#

How does gpt 5 compare to opus 4

rapid merlin Aug 7, 2025, 5:46 PM

#

saved to my gifs

iron meadow Aug 7, 2025, 5:46 PM

#

Does it have more sophistication?

whole wagon Aug 7, 2025, 5:46 PM

#

its not trash cmon lol. it is still SOTA, its just not the jump expected

verbal nimbus Aug 7, 2025, 5:46 PM

#

verbal nimbus It's completely lying to the user about having used the search tool during think...

This is Gemini 2.5 Pro btw, not GPT-5.

tall summit Aug 7, 2025, 5:46 PM

#

gpt 5 seems... better at translation than any other from openai

rapid merlin Aug 7, 2025, 5:46 PM

#

censorship update

blazing bison Aug 7, 2025, 5:46 PM

#

people will stick with claude, believe me

void elm Aug 7, 2025, 5:46 PM

#

this jump is the same amount of improvement from claude 4.0 to 4.1

leaden palm Aug 7, 2025, 5:46 PM

#

what is my timeline 😭

zealous panther Aug 7, 2025, 5:46 PM

#

What was zenith

blazing bison Aug 7, 2025, 5:46 PM

#

Dario did nothing and won

#

🤑

native flame Aug 7, 2025, 5:46 PM

#

verbal nimbus Is it not thinking?

I was told it doesn't have thinking mode on by default, so I don't know how to activate it :'v cuz there are no buttons

stray aspen Aug 7, 2025, 5:47 PM

#

zealous panther What was zenith

we dont know

rapid merlin Aug 7, 2025, 5:47 PM

#

leaden palm what is my timeline 😭

LMAOOOOOOO

void elm Aug 7, 2025, 5:47 PM

#

HAHA

#

alr bro

fleet lintel Aug 7, 2025, 5:47 PM

#

whole wagon its not trash cmon lol. it is still SOTA, its just not the jump expected

agreed.. still very disappointing .

whole wagon Aug 7, 2025, 5:47 PM

#

void elm this jump is the same amount of improvement from claude 4.0 to 4.1

not true

empty stump Aug 7, 2025, 5:47 PM

#

What is airline

zealous panther Aug 7, 2025, 5:47 PM

#

stray aspen we dont know

It said it was trained on openai data so

fleet lintel Aug 7, 2025, 5:47 PM

#

I fell for OAI hype again... this is like 5th time I fell for it

barren prairie Aug 7, 2025, 5:47 PM

#

void elm this jump is the same amount of improvement from claude 4.0 to 4.1

Maybe it was a gpt 4o update but they wanted to call it gpt5

iron meadow Aug 7, 2025, 5:47 PM

#

leaden palm what is my timeline 😭

lawsuit incoming?

blazing bison Aug 7, 2025, 5:47 PM

#

gpt 5 is really a 4o v2

zealous panther Aug 7, 2025, 5:47 PM

#

fleet lintel I fell for OAI hype again... this is like 5th time I fell for it

Literally any AI hype ever…except google though

keen beacon Aug 7, 2025, 5:48 PM

#

Boys, "15.9% for Grok 4 vs 9.9% for GPT-5." it sucks. sad face emoji

#

arc agi

verbal nimbus Aug 7, 2025, 5:48 PM

#

Is GPT-5 a router model?

whole wagon Aug 7, 2025, 5:48 PM

#

keen beacon Boys, "15.9% for Grok 4 vs 9.9% for GPT-5." it sucks. sad face emoji

sheesh

rapid merlin Aug 7, 2025, 5:48 PM

#

gpt-4oo

whole wagon Aug 7, 2025, 5:48 PM

#

that aint great ngl

wintry tinsel Aug 7, 2025, 5:48 PM

#

Is there a gpt 5 heavy or a gpt 5 thinking yet?

keen beacon Aug 7, 2025, 5:48 PM

#

This is the worst "upgrade" ever!

fleet lintel Aug 7, 2025, 5:48 PM

#

50 ELO improvement over 2.5 gemini.. that was my mid-level expectation.. i was hoping for 65+

verbal nimbus Aug 7, 2025, 5:48 PM

#

jade egret Aug 7, 2025, 5:48 PM

#

so gpt-5 suck?

whole wagon Aug 7, 2025, 5:48 PM

#

wheat onyx Aug 7, 2025, 5:49 PM

#

Why are people thinking this is bad? This looks fantastic to me

glad perch Aug 7, 2025, 5:49 PM

#

brittle tiger I don't think all of GPT-5 is thinking. It might do some to determine which to r...

Maybe it's a hybrid model like claude 3.7 and qwen3

red sluice Aug 7, 2025, 5:49 PM

#

jade egret so gpt-5 suck?

It doesn't suck, it's just overly underwhelming and isn't the revolution everyone wished for, far from it

whole wagon Aug 7, 2025, 5:49 PM

#

wheat onyx Why are people thinking this is bad? This looks fantastic to me

they hyped it as agi and beyond

keen beacon Aug 7, 2025, 5:49 PM

#

whole wagon

Simple-bench needed right now! can't wait for those results

jade egret Aug 7, 2025, 5:49 PM

#

red sluice It doesn't suck, it's just overly underwhelming and isn't the revolution everyon...

oh

#

so they hype it up way too much lol

whole wagon Aug 7, 2025, 5:49 PM

#

whole wagon they hyped it as agi and beyond

"a team of phd experts on demand" yeah no it is not that lol

brittle tiger Aug 7, 2025, 5:49 PM

#

wheat onyx Why are people thinking this is bad? This looks fantastic to me

Solely expectations. o3 was a bigger jump. people rightly or wrongly expected a big jump with the 5 series

wheat onyx Aug 7, 2025, 5:49 PM

#

whole wagon they hyped it as agi and beyond

looks like it kills on all tests, and 1/5 the hallucination

keen beacon Aug 7, 2025, 5:49 PM

#

jade egret so they hype it up way too much lol

Thats what sam hype man does!

jade egret Aug 7, 2025, 5:49 PM

#

guys please tell me is it even better than opus 4.1?

weary flint Aug 7, 2025, 5:50 PM

#

Hello, how do I make videos in 9:16 size?

fleet lintel Aug 7, 2025, 5:50 PM

#

wheat onyx Why are people thinking this is bad? This looks fantastic to me

it is not fantastic.. remove style control and 2.5 pro is better than gpt-5. that's how disappointing the model is

deft vigil Aug 7, 2025, 5:50 PM

#

Wow finally Sam hit the wall

steady vale Aug 7, 2025, 5:50 PM

#

this long context performance is really good

actually game changing tbh

whole wagon Aug 7, 2025, 5:50 PM

#

i really like gpt5 ngl. i actually used it

zealous panther Aug 7, 2025, 5:50 PM

#

Use it bruh

meager harbor Aug 7, 2025, 5:50 PM

#

wheat onyx Why are people thinking this is bad? This looks fantastic to me

it's the same improvement from o3 to gpt 5 as from gemini 2.5 experimental 03-25 to gemini experimental 05-06

keen beacon Aug 7, 2025, 5:50 PM

#

Google DID "out accelerate" Sam

weary flint Aug 7, 2025, 5:50 PM

#

Hello, how do I make videos in 9:16 size?

keen beacon Aug 7, 2025, 5:50 PM

#

lol

whole wagon Aug 7, 2025, 5:50 PM

#

the benchmarks arent capturing smth it feels

deft vigil Aug 7, 2025, 5:50 PM

#

Any coding benchmark guys

void elm Aug 7, 2025, 5:51 PM

#

gemini 3.0 deep thinking is gonna leave gpt 5 in the dust

wheat onyx Aug 7, 2025, 5:51 PM

#

looks like GPT5 is much better at writing, coding, and overall knowledge, while having 1/5 the hallucinations. Pretty huge imo

zealous panther Aug 7, 2025, 5:51 PM

#

deft vigil Any coding benchmark guys

Its topping webdev so idk why people are pissed

fleet lintel Aug 7, 2025, 5:51 PM

#

whole wagon the benchmarks arent capturing smth it feels

Sama is saying the same on twitter...

echo aurora Aug 7, 2025, 5:51 PM

#

weary flint Hello, how do I make videos in 9:16 size?

You're unable to set the size of the output

wintry tinsel Aug 7, 2025, 5:51 PM

#

GPT 5 is confirmation this crap has hit a wall and we’ve been grifted into thinking it hasn’t for over a year now

zealous panther Aug 7, 2025, 5:51 PM

#

void elm gemini 3.0 deep thinking is gonna leave gpt 5 in the dust

Disappointed again

indigo hazel Aug 7, 2025, 5:51 PM

#

wheat onyx looks like GPT5 is much better at writing, coding, and overall knowledge, while ...

that's why i think is really good

keen beacon Aug 7, 2025, 5:51 PM

#

whole wagon i really like gpt5 ngl. i actually used it

Ran a writing task (albeit a meme one) and it loses to Gemini 2.5

stray aspen Aug 7, 2025, 5:51 PM

#

wheat onyx looks like GPT5 is much better at writing, coding, and overall knowledge, while ...

its good at lua

feral lichen Aug 7, 2025, 5:51 PM

#

echo aurora You're unable to set the size of the output

Hi

jade egret Aug 7, 2025, 5:51 PM

#

wheat onyx Aug 7, 2025, 5:51 PM

#

verbal nimbus Aug 7, 2025, 5:51 PM

#

whole wagon

I don't think ARC-AGI is reliable for models released after the benchmark. Just look at how o3 (High) scores 68.8% on ARC-AGI-1 vs. Opus 4's 35.7%, but for ARC-AGI-2 they score about the same (Opus scores higher now). The current models will probably do badly on ARC-AGI-3.

keen beacon Aug 7, 2025, 5:51 PM

#

Slop maxed model

#

No way

wintry tinsel Aug 7, 2025, 5:52 PM

#

AI has hit a wall guys

void elm Aug 7, 2025, 5:52 PM

#

wintry tinsel Aug 7, 2025, 5:52 PM

#

They don’t know how to scale or improve it anymore it’s all investor hype

wheat onyx Aug 7, 2025, 5:52 PM

#

keen beacon Aug 7, 2025, 5:52 PM

#

This is what Sam commented on the chart crime: "wow a mega chart screwup from us earlier--wen GPT-6?! correct on the blog though."

fleet lintel Aug 7, 2025, 5:53 PM

#

jade egret

@deep adder is again the only voter in favor of gpt-5... are you an OAI employee?? reveal to us

feral lichen Aug 7, 2025, 5:53 PM

#

Ai has limits, the brain does not

keen beacon Aug 7, 2025, 5:53 PM

#

"GPT-5 is here - and it’s #1 across the board.

🥇#1 in Text, WebDev, and Vision Arena
🥇#1 in Hard Prompts, Coding, Math, Creativity, Long Queries, and more

Tested under the codename “summit”, GPT-5 now holds the highest Arena score to date." THEN WHAT WAS ZENITH???

#

context: zenith was better

stray aspen Aug 7, 2025, 5:53 PM

#

its gpt 5.5

wheat onyx Aug 7, 2025, 5:53 PM

#

keen beacon context: zenith was better

GPT5Pro?

blazing bison Aug 7, 2025, 5:53 PM

#

reveal zenith

feral lichen Aug 7, 2025, 5:53 PM

#

Does anyone know anything about monitors?

whole wagon Aug 7, 2025, 5:54 PM

#

summit > zenith

blazing bison Aug 7, 2025, 5:54 PM

#

no

red sluice Aug 7, 2025, 5:54 PM

#

Yup if zenith is google damn son we're gonna have fun

blazing bison Aug 7, 2025, 5:54 PM

#

no one agrees that summit >zenith bro

#

NO ONE

keen beacon Aug 7, 2025, 5:54 PM

#

GUYS! GPT-6 CONFIRMED! (gone wrong!)

whole wagon Aug 7, 2025, 5:54 PM

#

red sluice Yup if zenith is google damn son we're gonna have fun

bruh

#

no way

white hatch Aug 7, 2025, 5:54 PM

#

we'll see how gpt 5 will fix the bug in my project

wheat onyx Aug 7, 2025, 5:54 PM

#

wheat onyx

This crushes Opus on SWE Bench. 75% vs 67.6%

fleet lintel Aug 7, 2025, 5:54 PM

#

red sluice Yup if zenith is google damn son we're gonna have fun

it's an OAI model for sure

red sluice Aug 7, 2025, 5:54 PM

#

oh

primal orbit Aug 7, 2025, 5:54 PM

#

openai is not the same without Ilya.

keen beacon Aug 7, 2025, 5:54 PM

#

We need ilya back tbh

verbal nimbus Aug 7, 2025, 5:54 PM

#

wheat onyx

Would be nice if there was a graph with Claude, and the price on the x-axis.

blazing bison Aug 7, 2025, 5:55 PM

#

imagine if zenith is gpt 5 -mini

#

fun

keen beacon Aug 7, 2025, 5:55 PM

#

His balding head made the company bold

wheat onyx Aug 7, 2025, 5:55 PM

#

verbal nimbus Would be nice if there was a graph with Claude, and the price on the x-axis.

we'll see these from 3rd party

void elm Aug 7, 2025, 5:55 PM

#

openai is over

#

not even gpt 6 will save them

#

actual bs

stray aspen Aug 7, 2025, 5:56 PM

#

at this point deepseek r2 will cook them

verbal nimbus Aug 7, 2025, 5:56 PM

#

wheat onyx This crushes Opus on SWE Bench. 75% vs 67.6%

Opus 4.1 scores 74.1% without thinking btw

keen beacon Aug 7, 2025, 5:56 PM

#

Bro this is pure trash

#

coding specialized model

fleet lintel Aug 7, 2025, 5:56 PM

#

nah.. OAI will remain the top company for at least a couple of years... but the direction is not great

keen beacon Aug 7, 2025, 5:56 PM

#

Opus 4.2 will cook this

wheat onyx Aug 7, 2025, 5:56 PM

#

narrow dawn Aug 7, 2025, 5:56 PM

#

lmarena doesn't work

echo aurora Aug 7, 2025, 5:56 PM

#

narrow dawn lmarena doesn't work

how so?

wheat onyx Aug 7, 2025, 5:56 PM

#

verbal nimbus Opus 4.1 scores 74.1% without thinking btw

good to know, it's not on the leaderboard I saw

narrow dawn Aug 7, 2025, 5:56 PM

#

i need to select the model and there is no model

#

._.

deft vigil Aug 7, 2025, 5:57 PM

#

Lmarena is dxomark 2.0

unborn lantern Aug 7, 2025, 5:57 PM

#

guys, how can i use gpt 4.5 in lmarena?

whole wagon Aug 7, 2025, 5:57 PM

#

Admittedly I am extremely confused how they are expecting to reach AGI

stray aspen Aug 7, 2025, 5:57 PM

#

is lmarena down

fleet lintel Aug 7, 2025, 5:57 PM

#

lol https://www.reddit.com/r/singularity/comments/1mk70xx/summary_of_the_livestream_for_those_that_couldnt/

From the singularity community on Reddit: Summary of the livestream...

whole wagon Aug 7, 2025, 5:57 PM

#

whole wagon Admittedly I am extremely confused how they are expecting to reach AGI

This does not seem the path to do so

void elm Aug 7, 2025, 5:57 PM

#

whole wagon Admittedly I am extremely confused how they are expecting to reach AGI

google will be the first to reach agi

verbal nimbus Aug 7, 2025, 5:57 PM

#

wheat onyx good to know, it's not on the leaderboard I saw

Opus 4.1 was just released yesterday I think. Only a small improvement on SWE-bench though.

meager harbor Aug 7, 2025, 5:57 PM

#

SCAM HYPEMAN

pulsar rain Aug 7, 2025, 5:57 PM

#

gpt-5 tends to go straight to the point compared to gemini 2.5 pro

unborn lantern Aug 7, 2025, 5:57 PM

#

narrow dawn lmarena doesn't work

Yes. It's down

void elm Aug 7, 2025, 5:57 PM

#

google has infinite money

#

everyone shitted on bard then they cooked

keen beacon Aug 7, 2025, 5:57 PM

#

whole wagon Admittedly I am extremely confused how they are expecting to reach AGI

They think they can declare AGI by the end of the year (reports mention this in relation to the Microsoft deal) HOW???

wheat onyx Aug 7, 2025, 5:58 PM

#

verbal nimbus Opus 4.1 was just released yesterday I think. Only a small improvement on SWE-be...

ah so it's basically on par for this test

narrow dawn Aug 7, 2025, 5:58 PM

#

unborn lantern Yes. It's down

how long?

echo aurora Aug 7, 2025, 5:58 PM

#

narrow dawn i need to select the model and there is no model

uh oh, I'm seeing the same, escalating now.

opal juniper Aug 7, 2025, 5:58 PM

#

narrow dawn i need to select the model and there is no model

I have same problem

unborn lantern Aug 7, 2025, 5:58 PM

#

narrow dawn how long?

Maybe 3 mins

tired herald Aug 7, 2025, 5:58 PM

#

Yeah model selector be having some problems

narrow dawn Aug 7, 2025, 5:58 PM

#

bruh :/

opal juniper Aug 7, 2025, 5:58 PM

#

However the battle mode works

wheat onyx Aug 7, 2025, 5:58 PM

#

Anthropic should have no problem for staying ahead in coding then, they said big improvements in coming weeks

feral lichen Aug 7, 2025, 5:58 PM

#

Max tokens for gpt 5?

narrow dawn Aug 7, 2025, 5:58 PM

#

i was just making my homework with it

unborn lantern Aug 7, 2025, 5:58 PM

#

guys, how can i use gpt 4.5 in lmarena?

keen beacon Aug 7, 2025, 5:58 PM

#

Guys! have we hit a wall??? first claude opus 4.1 scores 3 percent more and now this?

stray aspen Aug 7, 2025, 5:58 PM

#

lmarena model selection is not working

unborn lantern Aug 7, 2025, 5:58 PM

#

narrow dawn i was just making my homework with it

Lmao

sleek crow Aug 7, 2025, 5:58 PM

#

finally a model that can generate a Minecraft clone

verbal nimbus Aug 7, 2025, 5:58 PM

#

wheat onyx ah so it's basically on par for this test

The subset of problems and the framework is a bit different, I think. And one has thinking whereas the other doesn't. SWE-bench bash-only mode would be more fair: https://www.swebench.com/. They'll probably show up soon.

fleet lintel Aug 7, 2025, 5:58 PM

#

wheat onyx Anthropic should have no problem for staying ahead in coding then, they said big...

100%

feral lichen Aug 7, 2025, 5:59 PM

#

Max tokens for gpt 5?

white hatch Aug 7, 2025, 5:59 PM

#

sleek crow finally a model that can generate a Minecraft clone

looool

stray aspen Aug 7, 2025, 5:59 PM

#

sleek crow finally a model that can generate a Minecraft clone

a chinese model also did it

narrow dawn Aug 7, 2025, 5:59 PM

#

unborn lantern Lmao

i missed one thing

opal juniper Aug 7, 2025, 5:59 PM

#

sleek crow finally a model that can generate a Minecraft clone

What model is that

fleet lintel Aug 7, 2025, 5:59 PM

#

keen beacon Guys! have we hit a wall??? first claude opus 4.1 scores 3 percent more and now ...

i think OAI is definitely hitting the wall... let's see if google is in the same boat.. gemini 3 will tell us

keen beacon Aug 7, 2025, 5:59 PM

#

"Grok 5 will be out before the end of this year and it will be crushingly good" ELON ON TWITTER!!!

sleek crow Aug 7, 2025, 5:59 PM

#

gpt-5

unborn lantern Aug 7, 2025, 5:59 PM

#

narrow dawn i missed one thing

Which model?

verbal nimbus Aug 7, 2025, 5:59 PM

#

sleek crow finally a model that can generate a Minecraft clone

GPT-5?

pulsar rain Aug 7, 2025, 5:59 PM

#

sleek crow finally a model that can generate a Minecraft clone

that's quite impressive. try asking it about all the problems with chunk generating, glass, water, etc. to see how it would solve them

narrow dawn Aug 7, 2025, 6:00 PM

#

unborn lantern Which model?

something with gpt

sleek crow Aug 7, 2025, 6:00 PM

#

it's available on lmarena

torn mantle Aug 7, 2025, 6:00 PM

#

is lmarena down?

#

cant select models

stray aspen Aug 7, 2025, 6:00 PM

#

keen beacon Aug 7, 2025, 6:00 PM

#

fleet lintel i think OAI is definitely hitting the wall... let's see if google is in the same...

So the entire hypetrain depends on Gemini 3 since both Anthropic and OpenAI hit a wall

feral lichen Aug 7, 2025, 6:00 PM

#

Max tokens for gpt 5?

verbal nimbus Aug 7, 2025, 6:00 PM

#

pulsar rain that's quite impressive. try asking it about all the problems with chunk gen...

Or try asking it to solve old-fashioned JS script-order loading issues. Claude had a seizure, lol.

exotic tartan Aug 7, 2025, 6:00 PM

#

feral lichen Max tokens for gpt 5?

API said 128K

keen beacon Aug 7, 2025, 6:00 PM

#

WE WAITED 2 YEARS FOR THIS???

lone vector Aug 7, 2025, 6:00 PM

#

whole wagon Aug 7, 2025, 6:00 PM

#

Even the december odds are shifting...

feral lichen Aug 7, 2025, 6:00 PM

#

exotic tartan API said 128K

How much is it? Idk

echo aurora Aug 7, 2025, 6:00 PM

#

torn mantle is lmarena down?

Yeah, we're looking into it

meager harbor Aug 7, 2025, 6:00 PM

#

exotic tartan API said 128K

yeah that's crap, gemini is 1 million

stray aspen Aug 7, 2025, 6:01 PM

#

waited so long for this garbage

torn mantle Aug 7, 2025, 6:01 PM

#

echo aurora Yeah, we're looking into it

thanks

keen beacon Aug 7, 2025, 6:01 PM

#

stray aspen waited so long for this garbage

2 years!

tired herald Aug 7, 2025, 6:01 PM

#

Model selector back on again

exotic tartan Aug 7, 2025, 6:01 PM

#

whole wagon Even the december odds are shifting...

why is it down?

wheat onyx Aug 7, 2025, 6:01 PM

#

The AIDS chart

unborn lantern Aug 7, 2025, 6:01 PM

#

guys, how can i use gpt 4.5 in lmarena?

echo aurora Aug 7, 2025, 6:01 PM

#

tired herald Model selector back on again

:ablobcheer: I'm seeing the same

barren prairie Aug 7, 2025, 6:01 PM

#

I hope that DeepSeek won't do the same thing, waiting ages for garbage

pulsar rain Aug 7, 2025, 6:01 PM

#

1 million token point is barely enough if you want it to read all the text in a book 🤣

thorn ore Aug 7, 2025, 6:01 PM

#

is chatgpt-5 a joke model

#

i think its fake

steady vale Aug 7, 2025, 6:02 PM

#

JUST IN: GPT-4.5 got removed from chatgpt's website

thorn ore Aug 7, 2025, 6:02 PM

#

Oh

#

NOOOOOO

stray aspen Aug 7, 2025, 6:02 PM

#

barren prairie I hope that DeepSeek won't do the same thing, waiting ages for garbage

let's see what they give us. those people never say a word

void elm Aug 7, 2025, 6:02 PM

#

steady vale JUST IN: GPT-4.5 got removed from chatgpt's website

bro is not a twitter bot

fleet lintel Aug 7, 2025, 6:02 PM

#

keen beacon So the entire hypetrain depends on Gemini 3 since both Anthropic and OpenAI hit ...

anthropic is still great but only for coding.. i still have hopes on them in this area

whole sundial Aug 7, 2025, 6:02 PM

#

unborn lantern guys, how can i use gpt 4.5 in lmarena?

gpt 4.5's dead, dead from api and dead from chatgpt web

warm fulcrum Aug 7, 2025, 6:02 PM

#

steady vale JUST IN: GPT-4.5 got removed from chatgpt's website

great

verbal nimbus Aug 7, 2025, 6:02 PM

#

lone vector

Gemini hallucinates so terribly though. It sometimes doesn't even tell me I forgot to attach a document; it just makes one up.

void elm Aug 7, 2025, 6:02 PM

#

its not even removed

prime mulch Aug 7, 2025, 6:02 PM

#

steady vale JUST IN: GPT-4.5 got removed from chatgpt's website

Backfired? Lmaoooo

void elm Aug 7, 2025, 6:02 PM

#

4.5 is still there

#

lies

wicked root Aug 7, 2025, 6:02 PM

#

How much better is gpt5?

pulsar rain Aug 7, 2025, 6:02 PM

#

steady vale JUST IN: GPT-4.5 got removed from chatgpt's website

it's just a cheap ass model for general uses anyway

whole wagon Aug 7, 2025, 6:02 PM

#

wheat onyx The AIDS chart

bros had to make the most misleading graph ever

keen beacon Aug 7, 2025, 6:02 PM

#

This is deeply saddening.

warm fulcrum Aug 7, 2025, 6:02 PM

#

@echo aurora which gpt-5 version is it that's displayed on lmarena?

whole wagon Aug 7, 2025, 6:02 PM

#

thats crazy work ngl

unborn lantern Aug 7, 2025, 6:03 PM

#

whole sundial gpt 4.5's dead, dead from api and dead from chatgpt web

Thanks for your info

pulsar rain Aug 7, 2025, 6:03 PM

#

wicked root How much better is gpt5?

first impression: concise, straight to the point

stray aspen Aug 7, 2025, 6:03 PM

#

wicked root How much better is gpt5?

its great for lua

verbal nimbus Aug 7, 2025, 6:03 PM

#

pulsar rain first impression: concise, straight to the point

Try pasting Claude's system prompt into other LLMs. Might help.

wicked root Aug 7, 2025, 6:03 PM

#

Is gemini 2.5 screwed?

warm fulcrum Aug 7, 2025, 6:03 PM

#

stray aspen its great for lua

who uses lua noob

tired herald Aug 7, 2025, 6:03 PM

#

Lmfao

hoary elbow Aug 7, 2025, 6:04 PM

#

I overslept and I woke up realizing that GPT five is out

stray aspen Aug 7, 2025, 6:04 PM

#

warm fulcrum who uses lua noob

roblocks

stray aspen Aug 7, 2025, 6:04 PM

#

wicked root Is gemini 2.5 screwed?

no

tired herald Aug 7, 2025, 6:04 PM

#

Theres two models called "gpt oss 120b" on lmarena rn

keen beacon Aug 7, 2025, 6:04 PM

#

Why did they release this? If they did it for the normies, why did they hype it so much!?

stray aspen Aug 7, 2025, 6:04 PM

#

tired herald Theres two models called "gpt oss 120b" on lmarena rn

yeah thats trash

warm fulcrum Aug 7, 2025, 6:04 PM

#

keen beacon Why did they release this? If they did it for the normies, why did they hype it ...

whats wrong with it

hoary elbow Aug 7, 2025, 6:04 PM

#

GPTOSS 120 B is a open source model made by GPT. ChatGPT says it’s just as powerful as 4o

keen beacon Aug 7, 2025, 6:04 PM

#

Everything!

#

it does not match the hype!

stray aspen Aug 7, 2025, 6:04 PM

#

hoary elbow GPTOSS 120 B is a open source model made by GPT. ChatGPT says it’s just as power...

no its bad

pulsar rain Aug 7, 2025, 6:04 PM

#

hoary elbow I overslept and I woke up realizing that GPT five is out

go back to sleep again then 🤣

stray aspen Aug 7, 2025, 6:04 PM

#

its the worst open source model ever

unborn lantern Aug 7, 2025, 6:04 PM

#

Same or bugs?

Screenshot_2025-08-08-00-03-59-779_com.android.chrome-edit.jpg

stray aspen Aug 7, 2025, 6:04 PM

#

plus it has north korean level censorship

#

like what were they thinking

hoary elbow Aug 7, 2025, 6:04 PM

#

pulsar rain go back to sleep again then 🤣

Why

blazing rune Aug 7, 2025, 6:04 PM

#

They didn't release any benchmarks for the Mini and Nano versions of GPT-5.

echo aurora Aug 7, 2025, 6:05 PM

#

unborn lantern Same or bugs?

hmm will take a look

keen beacon Aug 7, 2025, 6:05 PM

#

GUYS! GPT-5 supposed to be the best model for cost to performance though.

blazing rune Aug 7, 2025, 6:05 PM

#

probably means they are about at the level of 4.1 Mini and Nano

hoary elbow Aug 7, 2025, 6:05 PM

#

Is it better than Grok though?

pulsar rain Aug 7, 2025, 6:05 PM

#

hoary elbow Why

it barely beat gemini 2.5 pro in benchmarks.

torn mantle Aug 7, 2025, 6:05 PM

#

its working now @echo aurora

hoary elbow Aug 7, 2025, 6:05 PM

#

Ok

stray aspen Aug 7, 2025, 6:05 PM

#

hoary elbow Is it better than Grok though?

no

eternal niche Aug 7, 2025, 6:05 PM

#

hoary elbow Why

keen beacon Aug 7, 2025, 6:05 PM

#

ITS NOT A REVOLUTIONARY MODEL! its an efficient model

rapid merlin Aug 7, 2025, 6:05 PM

#

wheat onyx The AIDS chart

good old graphs made by chatgpt

stray aspen Aug 7, 2025, 6:05 PM

#

it didnt beat grok in arc agi 2

wicked root Aug 7, 2025, 6:05 PM

#

Polymarket’s saying gemini’s over

torn mantle Aug 7, 2025, 6:05 PM

#

so far gpt5 is good

verbal nimbus Aug 7, 2025, 6:05 PM

#

wicked root Is gemini 2.5 screwed?

Google has the efficiency advantage because of their custom TPUs. It's kinda crazy that you get free unlimited use of Gemini 2.5 Pro on AIStudio with 1M context.

rapid merlin Aug 7, 2025, 6:05 PM

#

are yall testing it from lmarena?

#

dont think they put it on their site yet

echo aurora Aug 7, 2025, 6:06 PM

#

torn mantle its working now @echo aurora

I'm still seeing a few things off, but yeah overall should be back :blobthumbsup:

wicked root Aug 7, 2025, 6:06 PM

#

Wait so is it over for gemini?

keen beacon Aug 7, 2025, 6:06 PM

#

"GPT-5 results on ARC-AGI 1 & 2!

Top line:

65.7% on ARC-AGI-1
9.9% on ARC-AGI-2
" IT DOESN'T EVEN MATCH o3 FROM DECEMBER AT ARC AGI 1!!!

stray aspen Aug 7, 2025, 6:06 PM

#

pulsar rain Aug 7, 2025, 6:06 PM

#

Nothing will replace 1M context. just paste the whole book and it knows everything

hoary elbow Aug 7, 2025, 6:06 PM

#

wicked root Wait so is it over for gemini?

No Gemini still better than GPT five

echo aurora Aug 7, 2025, 6:06 PM

#

warm fulcrum @echo aurora which gpt-5 version is it that's displayed on lmarena?

the standard version

hoary elbow Aug 7, 2025, 6:06 PM

#

I saw the benchmarks

#

But at least GPT five is good

warm fulcrum Aug 7, 2025, 6:07 PM

#

echo aurora the standard version

ok thx

keen beacon Aug 7, 2025, 6:07 PM

#

LOGAN! say it! say the damn words! "Gemini Gemini Gemini"!!!

astral jetty Aug 7, 2025, 6:07 PM

#

echo aurora the standard version

Does that automatically include thinking

hoary elbow Aug 7, 2025, 6:07 PM

#

directchat3d

unborn lantern Aug 7, 2025, 6:07 PM

#

They Didn't increase their knowledge cut off parameters

Screenshot_2025-08-08-00-07-03-516_com.android.chrome.png

verbal nimbus Aug 7, 2025, 6:08 PM

#

pulsar rain Nothing will replace 1M context. just paste the whole book and it knows everythi...

GPT-5 kinda falls short on the context window size and knowledge cutoff date.

stray aspen Aug 7, 2025, 6:08 PM

#

this one?

wintry tinsel Aug 7, 2025, 6:08 PM

#

Gemini may be the best now but google will neuter it once they have market monopoly we need heavy competition to keep Gemini good

tired herald Aug 7, 2025, 6:08 PM

#

unborn lantern They Didn't increase their knowledge cut off parameters

Damn

tired herald Aug 7, 2025, 6:08 PM

#

stray aspen this one?

Where is this picture from

hoary elbow Aug 7, 2025, 6:08 PM

#

Is there another way to get GPT five to search

fleet lintel Aug 7, 2025, 6:08 PM

#

verbal nimbus GPT-5 kinda falls short on the context window size and knowledge cutoff date.

why that old?

hoary elbow Aug 7, 2025, 6:08 PM

#

Because Joe Biden is not the president

torn mantle Aug 7, 2025, 6:08 PM

#

nah its actually so good

hoary elbow Aug 7, 2025, 6:08 PM

#

Not anymore

torn mantle Aug 7, 2025, 6:08 PM

#

i have many things to say

whole wagon Aug 7, 2025, 6:08 PM

#

verbal nimbus GPT-5 kinda falls short on the context window size and knowledge cutoff date.

bruh

#

this is literally stale model wth

stray aspen Aug 7, 2025, 6:09 PM

#

tired herald Where is this picture from

from openai models

tired herald Aug 7, 2025, 6:09 PM

#

Dem

devout vault Aug 7, 2025, 6:09 PM

#

Why is the cut off day 2024 oct

verbal nimbus Aug 7, 2025, 6:09 PM

#

fleet lintel why that old?

Bad data? Idk. I thought Sam Altman said it was a router model though. (few months ago)

patent bane Aug 7, 2025, 6:09 PM

#

gpt-5 on chatgpt is dumber than the one in API????

whole wagon Aug 7, 2025, 6:09 PM

#

devout vault Why is the cut off day 2024 oct

thats how long it took them to get this release out 💀

#

crazy

echo aurora Aug 7, 2025, 6:09 PM

#

unborn lantern Same or bugs?

Should be fixed now :blobthumbsup:

tired herald Aug 7, 2025, 6:10 PM

#

Openai logins are broken lol

keen beacon Aug 7, 2025, 6:10 PM

#

"gpt-5 fast facts:

hits sota on pretty much every eval
way better than claude 4.1 opus at swe
5× cheaper than opus
40% cheaper than sonnet
best writing quality of any model
way less sycophantic" - OpenAI employee

#

Roon failed us.

#

Trash writing still

verbal nimbus Aug 7, 2025, 6:11 PM

#

The mini version has an even older knowledge cutoff date

void elm Aug 7, 2025, 6:11 PM

#

barely any better

indigo flax Aug 7, 2025, 6:11 PM

#

“Black and white vector-style silhouette of a confident bearded man wearing sunglasses, modern hairstyle

tired herald Aug 7, 2025, 6:11 PM

#

They ought to make a good 1m parameter model

void elm Aug 7, 2025, 6:11 PM

#

openai has to retire and give its compute to google

stray aspen Aug 7, 2025, 6:11 PM

#

tired herald They ought to make a good 1m parameter model

google deepmind has to

solid brook Aug 7, 2025, 6:11 PM

#

tired herald They ought to make a good 1m parameter model

the nano

keen beacon Aug 7, 2025, 6:11 PM

#

Secret Gemini models have better writing!!!

tired herald Aug 7, 2025, 6:11 PM

#

solid brook the nano

Those are garbage

whole wagon Aug 7, 2025, 6:11 PM

#

verbal nimbus The mini version has an even older knowledge cutoff date

WTF

#

how the hell

tired herald Aug 7, 2025, 6:11 PM

#

stray aspen google deepmind has to

Google already have good ones, OpenAI dont at all

whole wagon Aug 7, 2025, 6:12 PM

#

its a mini model bruh it should not take a year to train

verbal nimbus Aug 7, 2025, 6:12 PM

#

patent bane gpt-5 on chatgpt is dumber than the one in API????

The one in ChatGPT is a different model with a different cutoff date. https://platform.openai.com/docs/models/gpt-5-chat-latest vs. https://platform.openai.com/docs/models/gpt-5

devout vault Aug 7, 2025, 6:12 PM

#

Gemini 3 and grok 5 will win 100%

stray aspen Aug 7, 2025, 6:12 PM

#

the chatgpt 5 is dumber than the API

tired herald Aug 7, 2025, 6:12 PM

#

Not even nano has 1m

quiet moss Aug 7, 2025, 6:13 PM

#

is chatGPT 5 on the website yet?

whole wagon Aug 7, 2025, 6:13 PM

#

patent bane Aug 7, 2025, 6:13 PM

#

verbal nimbus The one in ChatGPT is a different model with a different cutoff date. <https://p...

that"s what I said...

keen beacon Aug 7, 2025, 6:13 PM

#

rapid merlin Aug 7, 2025, 6:13 PM

#

so the one in chatgpt is lobotomized?

#

💀

stray aspen Aug 7, 2025, 6:13 PM

#

didnt someone from google say this was gonna be an exciting week

barren prairie Aug 7, 2025, 6:14 PM

#

keen beacon

I am not a google employee

tired herald Aug 7, 2025, 6:14 PM

#

keen beacon

Of course they will, they have to release a "competitor" to this new trashai model

sour spindle Aug 7, 2025, 6:14 PM

#

What model has had the biggest positive reception here?

stray aspen Aug 7, 2025, 6:14 PM

#

gemini

verbal nimbus Aug 7, 2025, 6:14 PM

#

devout vault Gemini 3 and grok 5 will win 100%

Claude is the most self-aware and agentic still. Gemini doesn't even tell me I passed in the exact same attachment twice, whereas Claude is like "HoLdUp"

hoary elbow Aug 7, 2025, 6:14 PM

#

Can I send a video here real quick?

devout vault Aug 7, 2025, 6:14 PM

#

i was the first to say that LOL

devout vault Aug 7, 2025, 6:15 PM

#

verbal nimbus Claude is the most self-aware and agentic still. Gemini doesn't even tell me I p...

true

tired herald Aug 7, 2025, 6:15 PM

#

Gemini 2.5 Pro already has many things better than gpt 5

#

Gemini 3 Pro will be groundbreaking

stray aspen Aug 7, 2025, 6:15 PM

#

tired herald Gemini 2.5 Pro already has many things better than gpt 5

yeah its pretty full model

#

gpt 5 is garbage

keen beacon Aug 7, 2025, 6:15 PM

#

hoary elbow Aug 7, 2025, 6:15 PM

#

Gemini three pro might be better than Grok four I mean it has a chance to be better

#

Since 2.5 pro is better than GPT five

stray aspen Aug 7, 2025, 6:15 PM

#

keen beacon

kimi k2 are you serious ?

keen beacon Aug 7, 2025, 6:15 PM

#

keen beacon

HOW IS GPT 5 WINNING???

hoary elbow Aug 7, 2025, 6:15 PM

#

I wonder how Gemini will be

analog bone Aug 7, 2025, 6:16 PM

#

Where's GLM ? 🙁

pulsar rain Aug 7, 2025, 6:16 PM

#

default gemini 2.5 pro praises all your questions no matter how stupid they are. It gets very annoying

hoary elbow Aug 7, 2025, 6:16 PM

#

Wait, it’s winning

#

I can’t believe it

stray aspen Aug 7, 2025, 6:16 PM

#

analog bone Where's GLM ? 🙁

glm aint sota

void elm Aug 7, 2025, 6:16 PM

#

its such a minor upgrade

keen beacon Aug 7, 2025, 6:16 PM

#

Stop with the A = A arguments. its a trash model sir.

tired herald Aug 7, 2025, 6:16 PM

#

pulsar rain default gemini 2.5 pro praises all your questions no matter how stupid they are. ...

Now make a simple prompt against that behaviour and put it into system prompt

verbal nimbus Aug 7, 2025, 6:16 PM

#

pulsar rain default gemini 2.5 pro praises all your questions no matter how stupid they are. ...

Try pasting the Claude system prompt into it

#

Gemini is definitely sycophantic, lol

keen beacon Aug 7, 2025, 6:16 PM

#

What do you mean?

tired herald Aug 7, 2025, 6:16 PM

#

Well, at least gpt 5 nano accepts pictures

sour spindle Aug 7, 2025, 6:17 PM

#

Also does anyone have access to to 5 right now lol

steady vale Aug 7, 2025, 6:17 PM

#

gemini is the most sycophantic model

eternal niche Aug 7, 2025, 6:17 PM

#

whole wagon Aug 7, 2025, 6:17 PM

#

this guy on the livestream is just bsing

tired herald Aug 7, 2025, 6:17 PM

#

sour spindle Also does anyone have access to to 5 right now lol

Lmarena

whole wagon Aug 7, 2025, 6:17 PM

#

literally saying nothing

stray aspen Aug 7, 2025, 6:17 PM

#

sour spindle Also does anyone have access to to 5 right now lol

yes

keen beacon Aug 7, 2025, 6:17 PM

#

It literally is not! arc agi is only one of the benchmarks where it falters!

stray aspen Aug 7, 2025, 6:17 PM

#

on lmarena

void elm Aug 7, 2025, 6:18 PM

#

filler words

stray aspen Aug 7, 2025, 6:18 PM

#

now the livestream is just pure yapping

whole wagon Aug 7, 2025, 6:18 PM

#

selling

slow sail Aug 7, 2025, 6:18 PM

#

He tried his best

whole wagon Aug 7, 2025, 6:18 PM

#

we will get back to selling kek

verbal nimbus Aug 7, 2025, 6:18 PM

#

eternal niche

They're actually tied, as the confidence interval overlaps.

pulsar rain Aug 7, 2025, 6:18 PM

#

verbal nimbus Try pasting the Claude system prompt into it

is it this one? https://www.reddit.com/r/ClaudeAI/comments/1ixapi4/here_is_claude_sonnet_37_full_system_prompt/

whole wagon Aug 7, 2025, 6:18 PM

#

verbal nimbus They're actually tied, as the confidence interval overlaps.

thats not how the betting works

keen beacon Aug 7, 2025, 6:18 PM

#

Will they do the pokemon Bench with GPT-5?

whole wagon Aug 7, 2025, 6:18 PM

#

it is the actual value only

#

I find it funny they copied gemini 2.5 pro pricing exactly lol

#

like $1.25/$10

hollow imp Aug 7, 2025, 6:19 PM

#

Wait is gpt5 a reasoning model?

short adder Aug 7, 2025, 6:19 PM

#

Could we get gemini 3 this month？

stray aspen Aug 7, 2025, 6:19 PM

#

hollow imp Wait is gpt5 a reasoning model?

yes

keen beacon Aug 7, 2025, 6:19 PM

#

Roon did not fail us

verbal nimbus Aug 7, 2025, 6:19 PM

#

pulsar rain is it this one? https://www.reddit.com/r/ClaudeAI/comments/1ixapi4/here_is_claud...

The official one is here: https://docs.anthropic.com/en/release-notes/system-prompts. I removed unnecessary lines (Anthropic product info).

void elm Aug 7, 2025, 6:19 PM

#

did you even watch the stream?

balmy mist Aug 7, 2025, 6:20 PM

#

why some people hating on GPT-5?

void elm Aug 7, 2025, 6:20 PM

#

because its like upgrading claude 4.0 to 4.1

#

negligible upgrades

hollow imp Aug 7, 2025, 6:20 PM

#

verbal nimbus The official one is here: <https://docs.anthropic.com/en/release-notes/system-pr...

Bro can I dm you
I want to talk a lot about gemini 2.5 and stuff. Not the news or about the model but about the uses of the model and my personal experience and cases

hollow imp Aug 7, 2025, 6:20 PM

#

void elm did you even watch the stream?

I didn't

stray aspen Aug 7, 2025, 6:20 PM

#

balmy mist why some people hating on GPT-5?

they spit in our faces again

#

first with this garbage open source model

#

and now with gpt 5

hollow imp Aug 7, 2025, 6:20 PM

#

stray aspen they spit in our faces again

What happened