#general | Arena | Page 87

reef pawn Aug 7, 2025, 10:54 PM

#

What was the prompt?

errant cave Aug 7, 2025, 10:55 PM

#

reef pawn Aug 7, 2025, 10:55 PM

#

Ohh

golden ocean Aug 7, 2025, 10:59 PM

#

LMAO

wicked root Aug 7, 2025, 10:59 PM

#

If GPT5 didn't live up to expectations, why is it doing so well on the leaderboard? And why is polymarket acting like Google won the race this month?

ocean vortex Aug 7, 2025, 11:00 PM

#

wicked root If GPT5 didn't live up to expectations, why is it doing so well on the leaderboa...

Who said it didn't live up to expectations? I think it did live up to them

#

It's a great model

#

the best we have

#

3.5 gonna destroy 3.0

wicked root Aug 7, 2025, 11:00 PM

#

ocean vortex the best we have

why is polymarket giving the odds at 80% in favor of Gemini?

#

Oh, because of Gemini 3.0?

keen beacon Aug 7, 2025, 11:01 PM

#

gem 3.0 isnt coming this month apparently

wicked root Aug 7, 2025, 11:01 PM

#

gap seems to be narrowing though

keen beacon Aug 7, 2025, 11:01 PM

#

but new 2.5 update

ocean vortex Aug 7, 2025, 11:01 PM

#

wicked root why is polymarket giving the odds at 80% in favor of Gemini?

because it's a market for lmarena leaderboard with style control disabled. 2.5Pro still leads in this specific one leaderboard

candid storm Aug 7, 2025, 11:02 PM

#

#

I dodged a bullet lol

wicked root Aug 7, 2025, 11:02 PM

#

ocean vortex because it's a market for lmarena leaderboard with style control disabled. 2.5Pr...

yes, but if gpt5 is doing so well in everything else, wouldn't it take over soon?

candid storm Aug 7, 2025, 11:02 PM

#

Fortunately i could switch teams just in time

ocean vortex Aug 7, 2025, 11:03 PM

#

wicked root yes, but if gpt5 is doing so well in everything else, wouldn't it take over soon...

it can but it's close, 2.5Pro is benchmaxxed on lmarena

wicked root Aug 7, 2025, 11:03 PM

#

@candid storm I don't see your name.

ocean vortex Aug 7, 2025, 11:03 PM

#

and since 2.5Pro is technically leading at this moment on leaderboard with their settings, the odds are that

wicked root Aug 7, 2025, 11:03 PM

#

hm I see

candid storm Aug 7, 2025, 11:04 PM

#

Im RIC25

wicked root Aug 7, 2025, 11:04 PM

#

candid storm Im RIC25

ah I see you. Grinch127 here

whole wagon Aug 7, 2025, 11:04 PM

#

Smth that really annoyed me about openai powerpoint is a lot is just totally incorrect. Like here, 50 is the number of tests they ran not the percent. Imagine getting paid 8 figures and messing up smth so trivial

candid storm Aug 7, 2025, 11:04 PM

#

wicked root ah I see you. Grinch127 here

Nice profit/loss!

whole wagon Aug 7, 2025, 11:04 PM

#

it is supposed to be this

#

How do you mess up trivial bar graphs

golden ocean Aug 7, 2025, 11:05 PM

#

https://cdn.discordapp.com/attachments/1094462710474883242/1307400845327073362/E59321BC-A727-4EC4-9204-4DD36F4ABEEF.gif

candid storm Aug 7, 2025, 11:05 PM

#

wicked root ah I see you. Grinch127 here

#

Crazy good trade you made!

barren prairie Aug 7, 2025, 11:06 PM

#

whole wagon Smth that really annoyed me about openai powerpoint is a lot is just totally inc...

And 50 is lower than 47.5

wicked root Aug 7, 2025, 11:06 PM

#

🫡 I'm a bit worried if Google will be able to keep up the lead until the end of this month

whole wagon Aug 7, 2025, 11:06 PM

#

whole wagon How do you mess up trivial bar graphs

it makes me wonder what other bugs and stuff they have. entire team stayed up all night prepping and cant even spot errors on a bar chart

fickle venture Aug 7, 2025, 11:06 PM

#

candid storm

"Afk money grind" ahh

golden ocean Aug 7, 2025, 11:07 PM

#

It's large
It loves language
and it's a model

fickle venture Aug 7, 2025, 11:07 PM

#

Ok

candid storm Aug 7, 2025, 11:09 PM

#

#

I think google 69% september is a good deal

whole wagon Aug 7, 2025, 11:10 PM

#

anyways i have pro already

#

Not in api yet i think. or anywhere even lol

golden ocean Aug 7, 2025, 11:11 PM

#

are all other models gone

whole wagon Aug 7, 2025, 11:11 PM

#

i like im always the very first for literally everything for some reason lol

whole wagon Aug 7, 2025, 11:11 PM

#

golden ocean are all other models gone

yup

fickle venture Aug 7, 2025, 11:11 PM

#

whole wagon anyways i have pro already

My website still hasn't updated, same with the app

keen beacon Aug 7, 2025, 11:12 PM

#

whole wagon anyways i have pro already

What are you gonna ask the pro model?

whole wagon Aug 7, 2025, 11:12 PM

#

no idea

errant cave Aug 7, 2025, 11:16 PM

#

whole wagon anyways i have pro already

I really like that "introducing GPT-5" screen

#

It could be way better but it's a step in the right direction

#

Hope we can reach a level of aesthetics similar to Web 2 Gloss again

brittle tiger Aug 7, 2025, 11:23 PM

#

whole wagon anyways i have pro already

Damn I have Pro and only see thinking

rapid merlin Aug 7, 2025, 11:23 PM

#

whole wagon anyways i have pro already

damn nerds

#

im a plus user and i still only have it on my phone

#

:((

whole wagon Aug 7, 2025, 11:24 PM

#

ye they give me stuff early. sometimes it gives a popup to ask if i want to be early tester for features or whatever

blazing bison Aug 7, 2025, 11:26 PM

#

gpt-5 is so good for code oh my god

#

i'm changing my mind completely

whole wagon Aug 7, 2025, 11:27 PM

#

.

blazing bison Aug 7, 2025, 11:27 PM

#

ok maybe craig was right

#

all the time

stray aspen Aug 7, 2025, 11:27 PM

#

yes

#

he was

#

i was wrong too

whole wagon Aug 7, 2025, 11:28 PM

#

people called it crap without trying it? thats kinda weird

blazing bison Aug 7, 2025, 11:30 PM

#

bro it's one shotting things that opus was sad trying to accomplish

#

it's bcs i'm trying with real world shi* now instead of some random create a game or site

brittle tiger Aug 7, 2025, 11:31 PM

#

The negativity is due to it not matching the gains seen in previous gpt 2 -> gpt 3 and gpt 3 -> gpt 4 and hype posting leading up to release. It's clearly v good. Sentiment might get better

little narwhal Aug 7, 2025, 11:31 PM

#

candid storm

There’s treatment centers for gambling addiction

blazing bison Aug 7, 2025, 11:31 PM

#

people when start using gpt-5 for real world tasks will notice the difference

#

wtf

rapid merlin Aug 7, 2025, 11:31 PM

#

what about the lmarena rating, though

#

I haven't tested it thoroughly yet as i dont have it on my computer yet

stray aspen Aug 7, 2025, 11:32 PM

#

rapid merlin what about the lmarena rating, though

its still SoTA wdym

rapid merlin Aug 7, 2025, 11:32 PM

#

it was below gemini for a bit which is a bit concerning

blazing bison Aug 7, 2025, 11:32 PM

#

brittle tiger The negativity is due to it not matching the gains seen in previous gpt 2 -> gpt...

bro i think it's matching, atleast for code

stray aspen Aug 7, 2025, 11:32 PM

#

anyways gpt 5 is a great model and its better than 2.5 pro

rapid merlin Aug 7, 2025, 11:33 PM

#

it was some benchmark i believe

hollow ocean Aug 7, 2025, 11:33 PM

#

rapid merlin Aug 7, 2025, 11:33 PM

#

but yeah, i will see y'all tomorrow as i'm heading off to sleep

blazing bison Aug 7, 2025, 11:33 PM

#

yeah it's better

#

it's sota

#

for me is underwhelming codex cli using gpt 5 on a project with 600k + tokens and identifying right files and doing modifications without creating any bug

#

and my prompt was not even good, it just did

whole wagon Aug 7, 2025, 11:35 PM

#

blazing bison it's sota

obviously its sota lol

#

even in the benchmarks it was usually

blazing bison Aug 7, 2025, 11:36 PM

#

i was expecting more from benchs

#

but maybe they are really saturated rgn

stray aspen Aug 7, 2025, 11:36 PM

#

lets see if it is SoTA in livebench

blazing bison Aug 7, 2025, 11:36 PM

#

idc livebench bro, it do code good

#

i'm happy

whole wagon Aug 7, 2025, 11:36 PM

#

who cares about livebench anyways

#

it had 4o as better at coding than o3 for ages

blazing bison Aug 7, 2025, 11:37 PM

#

true

#

o3 was lazy

#

it's not

#

well, i didnt try it on chatgpt interface yet

#

for now i'm burning some dollars on the api

#

i think it saved me 5 hours of work for $2

#

🤓

alpine osprey Aug 7, 2025, 11:40 PM

#

Can someone unmute my vc

#

@echo aurora

#

thank u

echo aurora Aug 7, 2025, 11:40 PM

#

alpine osprey @echo aurora

all set

#

np

alpine osprey Aug 7, 2025, 11:41 PM

#

echo aurora all set

When we adding Genie 3 ? 😼

stray aspen Aug 7, 2025, 11:43 PM

#

that must be awfully expensive

echo aurora Aug 7, 2025, 11:44 PM

#

alpine osprey When we adding Genie 3 ? 😼

Upvote it! #1403076345138774066

hoary elbow Aug 7, 2025, 11:52 PM

#

Is genie two at least public?

solar hollow Aug 7, 2025, 11:56 PM

#

still not able to solve simple chess puzzles with human words unfortunately

wheat onyx Aug 7, 2025, 11:57 PM

#

still no GPT-5 on my plus account....

#

5pm PT

fickle venture Aug 8, 2025, 12:02 AM

#

echo aurora Upvote it! <#1403076345138774066>

is GPT-5 Thinking and GPT-5 Pro coming to lmarena?

fickle venture Aug 8, 2025, 12:02 AM

#

wheat onyx still no GPT-5 on my plus account....

Same bro I having trouble wait it will take a while to get to your place

#

It's just rolling out slowly

#

Since people live in the US and openai is there, they got it fast

blazing bison Aug 8, 2025, 12:03 AM

#

fickle venture is GPT-5 Thinking and GPT-5 Pro coming to lmarena?

gpt-5-pro is not even api available

#

so no

#

and thinking i think already is?

fickle venture Aug 8, 2025, 12:04 AM

#

Oh I see

fickle venture Aug 8, 2025, 12:04 AM

#

blazing bison gpt-5-pro is not even api available

So the pro version aren't on the api?

#

Like o3-pro

exotic nebula Aug 8, 2025, 12:05 AM

#

fickle venture So the pro version aren't on the api?

Nope

fickle venture Aug 8, 2025, 12:05 AM

#

Dang

blazing bison Aug 8, 2025, 12:05 AM

#

it's not necessary btw

wheat onyx Aug 8, 2025, 12:05 AM

#

fickle venture Same bro I having trouble wait it will take a while to get to your place

they said EoD for Plus accounts

blazing bison Aug 8, 2025, 12:05 AM

#

this model takes more than 10 minutes to answer

flint skiff Aug 8, 2025, 12:07 AM

#

damn is it just me or is gpt5 pretty meh at UX design?

#

claude does it a lot better for me

blazing bison Aug 8, 2025, 12:07 AM

#

yeah it's meh

flint skiff Aug 8, 2025, 12:07 AM

#

so better for backend stuff?

blazing bison Aug 8, 2025, 12:07 AM

#

but if you give a example he is really good copycat

flint skiff Aug 8, 2025, 12:09 AM

#

seems like cc is still king

#

atleast until weekly limits

blazing bison Aug 8, 2025, 12:09 AM

#

idk

#

i can use a example of a design that i like and gpt-5 can create using that

#

claude is not so good with this

flint skiff Aug 8, 2025, 12:10 AM

#

trying it now

blazing bison Aug 8, 2025, 12:10 AM

#

but if you don't have idea of what you want, then claude is better

flint skiff Aug 8, 2025, 12:10 AM

#

yeah I mean gpt 5 is useful its just not as versatile as I expected

#

where are you using it?

blazing bison Aug 8, 2025, 12:11 AM

#

i think it's the best code modifier model rgn

#

i'm using api directly

flint skiff Aug 8, 2025, 12:12 AM

#

its only medium reasoning on cursor rn

#

why do people even like cursor

#

I feel like they nerf models hard

#

everytime

blazing bison Aug 8, 2025, 12:12 AM

#

it nerfs

#

you are not seeing the true potential

#

bcs they use like 3k tokens and rag

flint skiff Aug 8, 2025, 12:13 AM

#

whats ur pipeline

blazing bison Aug 8, 2025, 12:13 AM

#

rgn i'm using codex cli

flint skiff Aug 8, 2025, 12:13 AM

#

codex is good?

#

I heard its eh

blazing bison Aug 8, 2025, 12:15 AM

#

they updated it

#

the reason i'm testing

#

claude code still better i think, but i'm not paying for claude anymore so

flint skiff Aug 8, 2025, 12:15 AM

#

does it work like cc

#

where u can login?

#

or do u just use api credits

blazing bison Aug 8, 2025, 12:15 AM

#

i think yes

#

i'm using api credits rgn bcs it was not working

#

the login thing

#

but they added it

flint skiff Aug 8, 2025, 12:16 AM

#

I stopped my max 20x sub on claude cuz I thought gpt 5 was gonna be way better lol

#

might sub again

blazing bison Aug 8, 2025, 12:16 AM

#

well i think it is better

#

cursor is not reference

candid storm Aug 8, 2025, 12:16 AM

#

blazing bison and thinking i think already is?

I don't see it on the leaderboard?

flint skiff Aug 8, 2025, 12:16 AM

#

it felt so garbage on cursor

blazing bison Aug 8, 2025, 12:16 AM

#

candid storm I don't see it on the leaderboard?

well idk

flint skiff Aug 8, 2025, 12:16 AM

#

holy fkkkk

blazing bison Aug 8, 2025, 12:17 AM

#

flint skiff it felt so garbage on cursor

even sonnet feels garbage

flint skiff Aug 8, 2025, 12:17 AM

#

im sure they have it super nerfed

blazing bison Aug 8, 2025, 12:17 AM

#

on cursor

flint skiff Aug 8, 2025, 12:17 AM

#

cuz they made it free

blazing bison Aug 8, 2025, 12:17 AM

#

bro they made it free, what you think?

flint skiff Aug 8, 2025, 12:17 AM

#

for a week or smth

#

yeah

#

but like thats dumb no?

#

people that use it gets a bad taste

#

like this makes me wanna stay away from cursor more lol

hoary elbow Aug 8, 2025, 12:17 AM

#

When do you think GPT 5 pro will come out as an api

blazing bison Aug 8, 2025, 12:18 AM

#

they didnt talk about that

#

so maybe never?

hoary elbow Aug 8, 2025, 12:18 AM

#

All right

blazing bison Aug 8, 2025, 12:18 AM

#

i'm not saying never ,but idk

hoary elbow Aug 8, 2025, 12:18 AM

#

directchat

blazing bison Aug 8, 2025, 12:18 AM

#

there is not any info

#

about it

flint skiff Aug 8, 2025, 12:19 AM

#

probably a few months

#

but it kinda seems like openai is cooked now if this is all they got

blazing bison Aug 8, 2025, 12:19 AM

#

bro

#

how do you say that without even testing

flint skiff Aug 8, 2025, 12:20 AM

#

I mean ill try it yeah

#

setting it up right now

#

are u impressed?

#

like actually

#

how does it fare to opus

blazing bison Aug 8, 2025, 12:20 AM

#

i'm

#

opus failed 5 tries and gpt-5 one shotted a task on 600k tokens project

#

i didnt send 600k tokens btw, it was using cc and now codex cli

flint skiff Aug 8, 2025, 12:21 AM

#

hmm

blazing bison Aug 8, 2025, 12:21 AM

#

gpt-5 was able to identify the correct files

#

and do the modifications

flint skiff Aug 8, 2025, 12:22 AM

#

ur using vs code?

blazing bison Aug 8, 2025, 12:22 AM

#

bro it was like 8k lines modifications without creating any bug

blazing bison Aug 8, 2025, 12:22 AM

#

flint skiff ur using vs code?

yes

flint skiff Aug 8, 2025, 12:22 AM

#

yeah that sounds crazy

blazing bison Aug 8, 2025, 12:22 AM

#

i never see an AI do that much modifications without break anything before

flint skiff Aug 8, 2025, 12:23 AM

#

its way cheaper than opus too

blazing bison Aug 8, 2025, 12:23 AM

#

it's C# btw

flint skiff Aug 8, 2025, 12:23 AM

#

ok thats impressive

blazing bison Aug 8, 2025, 12:24 AM

#

i ran the test just for fun, i didnt believe in it bcs my parameter was: if opus and gemini 2.5 can't, no model can

stray aspen Aug 8, 2025, 12:29 AM

#

livebench benchmark is out

#

gpt-5 is SoTA

sacred quail Aug 8, 2025, 12:30 AM

#

i liked too

#

Still sad that O3 will disappear

#

Btw in free version on android, you can select reasoning or non reasoning

#

So which reasoning mode running in mobile app for free version ?

#

Medium reasoning ?

quartz light Aug 8, 2025, 12:31 AM

#

guys

#

i figured out the true release date of gpt 5

#


```
gpt-5-nano: 2025-08-05T20:39:44 UTC
gpt-5: 2025-08-05T20:29:37 UTC
gpt-5-mini-2025-08-07: 2025-08-05T20:31:07 UTC
gpt-5-mini: 2025-08-05T20:32:08 UTC
gpt-5-nano-2025-08-07: 2025-08-05T20:38:23 UTC
gpt-5-chat-latest: 2025-08-01T18:35:06 UTC
gpt-5-2025-08-07: 2025-08-01T19:09:20 UTC
```
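
To compare those creation times, here is a minimal sketch. It assumes the values are copied as posted (the `gpt-5-nano` entry comes from the quoted copy of this message). How the timestamps were obtained is not stated in the chat. `CREATED`, `parse_utc` and `gap_days` are hypothetical names made up for illustration.

```python
from datetime import datetime, timezone

# Creation timestamps as posted in the chat (all UTC).
CREATED = {
    "gpt-5-nano": "2025-08-05T20:39:44",
    "gpt-5": "2025-08-05T20:29:37",
    "gpt-5-mini-2025-08-07": "2025-08-05T20:31:07",
    "gpt-5-mini": "2025-08-05T20:32:08",
    "gpt-5-nano-2025-08-07": "2025-08-05T20:38:23",
    "gpt-5-chat-latest": "2025-08-01T18:35:06",
    "gpt-5-2025-08-07": "2025-08-01T19:09:20",
}


def parse_utc(stamp: str) -> datetime:
    """Parse an ISO timestamp that has no offset and mark it as UTC."""
    return datetime.fromisoformat(stamp).replace(tzinfo=timezone.utc)


def gap_days(earlier: str, later: str) -> float:
    """Days between the creation times of two listed models (hypothetical helper)."""
    delta = parse_utc(CREATED[later]) - parse_utc(CREATED[earlier])
    return delta.total_seconds() / 86400


# Model names ordered from oldest to newest creation time.
by_age = sorted(CREATED, key=lambda name: parse_utc(CREATED[name]))

for name in by_age:
    print(f"{CREATED[name]} UTC  {name}")

print(f"gpt-5-chat-latest -> gpt-5-nano: "
      f"{gap_days('gpt-5-chat-latest', 'gpt-5-nano'):.2f} days")
```

Sorting puts the two August 1 entries first and the five August 5 entries after them, with a gap of roughly four days between the oldest and newest.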

jade egret Aug 8, 2025, 12:35 AM

#

stray aspen anyways gpt 5 is a great model and its better than 2.5 pro

ofc but how long it would take gemini to catch up? gpt 5 is newer after all.

#

nah..

quartz light Aug 8, 2025, 12:36 AM

#

quartz light ```gpt-5-nano: 2025-08-05T20:39:44 UTC gpt-5: 2025-08-05T20:29:37 UTC gpt-5-mi...

gpt-5-chat-latest was made on august 1st lol

#

so

#

yall

stray aspen Aug 8, 2025, 12:36 AM

#

quartz light ```gpt-5-nano: 2025-08-05T20:39:44 UTC gpt-5: 2025-08-05T20:29:37 UTC gpt-5-mi...

so on lmarena we are using a version thats 2 days older

quartz light Aug 8, 2025, 12:36 AM

#

connect the dots of the models which released like horizon beta n stuff

flint skiff Aug 8, 2025, 12:37 AM

#

@blazing bison is it gpt 5 or gpt 5 2025-08-07

#

in the model list

wicked root Aug 8, 2025, 12:38 AM

#

stray aspen gpt-5 is SoTA

What about gemini?

quartz light Aug 8, 2025, 12:38 AM

#

quartz light ```gpt-5-nano: 2025-08-05T20:39:44 UTC gpt-5: 2025-08-05T20:29:37 UTC gpt-5-mi...

can someone connect the dots and using this info figure out which of these models zenith, horizon beta etc could be

stray aspen Aug 8, 2025, 12:38 AM

#

wicked root What about gemini?

its way below

quartz light Aug 8, 2025, 12:38 AM

#

stray aspen livebench benchmark is out

wasn't livebench laughable

#

is livebench legit now

stray aspen Aug 8, 2025, 12:39 AM

#

bro deepseek is above gemini no max think

#

gpt-5

quartz light Aug 8, 2025, 12:40 AM

#

really?

whole wagon Aug 8, 2025, 12:40 AM

#

stray aspen gpt-5

💀

stray aspen Aug 8, 2025, 12:40 AM

#

what does this mean

#

why is gpt-5 not on top

#

its greater than these models

whole wagon Aug 8, 2025, 12:41 AM

#

Not in all aspects

#

Clearly

#

The benchmark is private

#

There are 10 public questions and 400 private ones. The public ones are not used in the testing

quartz light Aug 8, 2025, 12:42 AM

#

grok 4 is dogshit

#

💔

#

however

#

i do have hope

#

in the new grok model

stray aspen Aug 8, 2025, 12:43 AM

#

grok 4 is actually an amazing model

quartz light Aug 8, 2025, 12:43 AM

#

releasing this month

#

grok 4 coder

sacred quail Aug 8, 2025, 12:43 AM

#

Guys, in mobile gpt 5 thinking using which reasoning mode ? Medium reasoning or high ? Or do we need to buy plus for able to select high reasoning

quartz light Aug 8, 2025, 12:44 AM

#


stray aspen Aug 8, 2025, 12:44 AM

#

sacred quail Guys, in mobile gpt 5 thinking using which reasoning mode ? Medium reasoning or ...

idk how the reasoning levels work

#

the api just says gpt-5

#

what about the gpt-5 in lmarena

quartz light Aug 8, 2025, 12:44 AM

#

oh, I thought thinking isnt available on free plan at all since they removed model selection

limpid schooner Aug 8, 2025, 12:45 AM

#

hey guys, so did anyone figure out here what zenith was?

quartz light Aug 8, 2025, 12:45 AM

#

limpid schooner hey guys, so did anyone figure out here what zenith was?

still waitin

#

fine ill try it

zealous panther Aug 8, 2025, 12:45 AM

#

No one can

stray aspen Aug 8, 2025, 12:45 AM

#

it was deepseek r2

quartz light Aug 8, 2025, 12:45 AM

#

zealous panther No one can

no, I posted this:

quartz light Aug 8, 2025, 12:45 AM

#

quartz light ```gpt-5-nano: 2025-08-05T20:39:44 UTC gpt-5: 2025-08-05T20:29:37 UTC gpt-5-mi...

a

zealous panther Aug 8, 2025, 12:45 AM

#

Oh ok

quartz light Aug 8, 2025, 12:45 AM

#

this way we can find out what zenith could be

limpid schooner Aug 8, 2025, 12:46 AM

#

quartz light still waitin

gotchui

quartz light Aug 8, 2025, 12:46 AM

#

i just need to find first mention of zenith

#

get the time

#

aaand

limpid schooner Aug 8, 2025, 12:46 AM

#

holy

sacred quail Aug 8, 2025, 12:46 AM

#

was horizon beta on openrouter gpt 5

#

?

quartz light Aug 8, 2025, 12:46 AM

#

^

quartz light Aug 8, 2025, 12:47 AM

#

limpid schooner holy

i think he kiddin

#

lol

limpid schooner Aug 8, 2025, 12:47 AM

#

quartz light i think he kiddin

i know xd

quartz light Aug 8, 2025, 12:48 AM

#

limpid schooner hey guys, so did anyone figure out here what zenith was?

gpt-5-chat-latest

limpid schooner Aug 8, 2025, 12:48 AM

#

quartz light gpt-5-chat-latest

whaaaaaaaat

#

are we sure

quartz light Aug 8, 2025, 12:48 AM

#

yeah pretty sure

#

created 1st of august

#

likely just an edit of zenith

#

dingdingding

limpid schooner Aug 8, 2025, 12:49 AM

#

so zenith was not a thinking model?

whole wagon Aug 8, 2025, 12:49 AM

#

How do people fall for the simple bench 90% GPT5 "leak". They must be incredibly dumb

quartz light Aug 8, 2025, 12:49 AM

#

limpid schooner so zenith was not a thinking model?

thinking models are not separate

quartz light Aug 8, 2025, 12:50 AM

#

quartz light ```gpt-5-nano: 2025-08-05T20:39:44 UTC gpt-5: 2025-08-05T20:29:37 UTC gpt-5-mi...

^

candid storm Aug 8, 2025, 12:50 AM

#

Is zenith gonna be on the leaderboard?

whole wagon Aug 8, 2025, 12:50 AM

#

whole wagon How do people fall for the simple bench 90% GPT5 "leak". They must be incredibly...

Actually makes me lose faith in humanity to see the sheer number of people blindly believing it

#

Without even a second of critical thoughts

quartz light Aug 8, 2025, 12:50 AM

#

candid storm Is zenith gonna be on the leaderboard?

why

limpid schooner Aug 8, 2025, 12:51 AM

#

quartz light thinking models are not separate

wow so when normal gpt 5 thinks in chatgpt it's zenith?

#

zenith was so good

quartz light Aug 8, 2025, 12:51 AM

#

so, zenith was probably gpt 5, summit could be mini and lobster could be nano

quartz light Aug 8, 2025, 12:52 AM

#

limpid schooner wow so when normal gpt 5 thinks in chatgpt it's zenith?

there are 2 separate gpt 5 models though

limpid schooner Aug 8, 2025, 12:52 AM

#

they said summit is gpt 5

brisk helm Aug 8, 2025, 12:52 AM

#

how do we select how much we want gpt 5 to think in lmarena

limpid schooner Aug 8, 2025, 12:52 AM

#

lmarena mods

quartz light Aug 8, 2025, 12:52 AM

#

gpt-5-nano was made 4 days later

stray aspen Aug 8, 2025, 12:52 AM

#

you dont

brisk helm Aug 8, 2025, 12:53 AM

#

limpid schooner lmarena mods

?

#

oh nvm

quartz light Aug 8, 2025, 12:53 AM

#

limpid schooner they said summit is gpt 5

o

flint skiff Aug 8, 2025, 12:53 AM

#

is anyone using gpt 5 codex cli rn

quartz light Aug 8, 2025, 12:53 AM

#

flint skiff is anyone using gpt 5 codex cli rn

no mone

#

💔

#

give api

#

fr

stray aspen Aug 8, 2025, 12:53 AM

#

fr yes cap

quartz light Aug 8, 2025, 12:54 AM

#

ms paint

sour kiln Aug 8, 2025, 1:11 AM

#

A heroic white police dog with shiny blue eyes, wearing a full police uniform, is bravely rescuing a brown rabbit from drowning in a fast-flowing river. The dog is standing in the water, strong and determined, holding the frightened rabbit gently in his mouth while people watch from the riverbank with admiration and awe. The scene is realistic and emotional, with splashing water, dramatic lighting, and a clear sky. The dog is the hero of the town, and everyone loves and respects him

patent aspen Aug 8, 2025, 1:23 AM

#

capacity crunch issues

#

It sure is

wheat onyx Aug 8, 2025, 1:37 AM

#

alright assuming I was lied to, and not getting GPT-5 on Plus today

#

civic flame Aug 8, 2025, 1:44 AM

#

well yikes

rare python Aug 8, 2025, 1:45 AM

#

🤔

stray aspen Aug 8, 2025, 1:55 AM

#

6 months is a lot

#

we still have gemini 3 and grok 5

quartz light Aug 8, 2025, 2:08 AM

#

stray aspen we still have gemini 3 and grok 5

grok 5?

#

i thought it would be grok 4.1 or 4.5

#

since its just a coding version

quartz light Aug 8, 2025, 2:09 AM

#

quartz light

i just realised the placement of the lines might actually represent the exact date of release

hollow ocean Aug 8, 2025, 2:09 AM

#

@deep adder question 10 is solved

stray aspen Aug 8, 2025, 2:10 AM

#

quartz light grok 5?

thats what musk said

hollow ocean Aug 8, 2025, 2:11 AM

#

simple bench

#

yeah

#

90% on public questions

#

mint relic Aug 8, 2025, 2:16 AM

#

Brilliant

stray aspen Aug 8, 2025, 2:18 AM

#

how does genspark have gpt 5 pro

#

gpt 5 on bing is out

quartz light Aug 8, 2025, 2:21 AM

#

quartz light i just realised the placement of the lines might actually represent the exact da...

got this result with gpt 5 thinking twice! i guess using that logic, gpt is really good at figuring out the date but seems like my theory is either wrong or the model is delayed

quartz light Aug 8, 2025, 2:22 AM

#

stray aspen how does genspark have gpt 5 pro

its probably just gpt 5 high

quartz light Aug 8, 2025, 2:24 AM

#

stray aspen gpt 5 on bing is out

hey thats cool

#

no signup too

#

#

deep research pops up on signin

#

cool

stray aspen Aug 8, 2025, 2:25 AM

#

gpt 5 in microsoft copilot sucks

quartz light Aug 8, 2025, 2:25 AM

#

stray aspen gpt 5 in microsoft copilot sucks

really?

stray aspen Aug 8, 2025, 2:25 AM

#

yes

quartz light Aug 8, 2025, 2:25 AM

#

you should ask it to think deeply

#

stray aspen Aug 8, 2025, 2:26 AM

#

lets try

#

yes

#

#

it worked

quartz light Aug 8, 2025, 2:26 AM

#

stray aspen it worked

nice

#

#

LOL

stray aspen Aug 8, 2025, 2:26 AM

#

lol

#

this one nailed it first try

quartz light Aug 8, 2025, 2:27 AM

#

stray aspen this one nailed it first try

which site?

stray aspen Aug 8, 2025, 2:27 AM

#

yupp.ai

#

its a great website

#

but its limited

quartz light Aug 8, 2025, 2:28 AM

#

stray aspen yupp.ai

woah

stray aspen Aug 8, 2025, 2:28 AM

#

you basically get points from rating the AIs

#

and you spend them points to use the ais

#

gpt-5 on microsoft copilot is great

#

but you have to ask it to think deeply first

quartz light Aug 8, 2025, 2:30 AM

#

stray aspen and you spend them points to use the ais

so, https://obl.dev

sacred quail Aug 8, 2025, 2:31 AM

#

also in poe app you can select high reasoning mode

blazing bison Aug 8, 2025, 2:31 AM

#

so apparently zenith was gpt-5 too, but another version, and summit won for some reason and they killed zenith

#

😢

stray aspen Aug 8, 2025, 2:31 AM

#

quartz light so, https://obl.dev

whats that

jade egret Aug 8, 2025, 2:31 AM

#

blazing bison so apparently zenith was gpt-5 too, but another version, and summit won for som...

RIP zenith

#

plz gemini 3

stray aspen Aug 8, 2025, 2:34 AM

#

how does the reasoning effort of gpt-5 work?

#

the gpt-5 of microsoft copilot is smarter than the one in lmarena

verbal nimbus Aug 8, 2025, 2:38 AM

#

stray aspen yupp.ai

Interesting

quartz light Aug 8, 2025, 2:38 AM

#

stray aspen whats that

same thing

#

but worse

stray aspen Aug 8, 2025, 2:38 AM

#

yeah that website sucks

#

yupp.ai is miles better

#

if we talk about using paid models for free of course

jade egret Aug 8, 2025, 2:43 AM

#

if i use gpt-5 to prompt engineer a prompt for gpt-5 🤔

sacred quail Aug 8, 2025, 2:45 AM

#

stray aspen yupp.ai

man, thanks for this. I just used for o3 pro. Its nice to able test this

#

btw on poe app, you can use gpt 5 high reasoning with free for multiple prompts

stray aspen Aug 8, 2025, 2:45 AM

#

sacred quail man, thanks for this. I just used for o3 pro. Its nice to able test this

dont use it too much tho

sacred quail Aug 8, 2025, 2:45 AM

#

stray aspen dont use it too much tho

why

stray aspen Aug 8, 2025, 2:45 AM

#

you got about 30 prompts worth of points

#

for that model

exotic gust Aug 8, 2025, 2:45 AM

#

I’m scared of what the future might bring for ai

stray aspen Aug 8, 2025, 2:46 AM

#

sacred quail why

its too expensive

#

also one thing

#

if you go here and make your chat public it reduces the cost by half

verbal nimbus Aug 8, 2025, 2:46 AM

#

stray aspen yupp.ai

Interesting

stray aspen Aug 8, 2025, 2:47 AM

#

yes very interesting

sacred quail Aug 8, 2025, 2:47 AM

#

stray aspen if you go here and make your chat public it reduces the cost by half

another nice information, ty !

stray aspen Aug 8, 2025, 2:47 AM

#

microsoft launched a 3d model website

sacred quail Aug 8, 2025, 2:47 AM

#

LM arena is still my beloved but, its interesting to see someone trying to be competitor

stray aspen Aug 8, 2025, 2:48 AM

#

dude what

#

is this genie 3

stray aspen Aug 8, 2025, 2:48 AM

#

sacred quail LM arena is still my beloved but, its interesting to see someone trying to be co...

lm arena still wins

#

its free

#

no sign up

#

what i like about yupp is that it has a lot of models

quartz light Aug 8, 2025, 2:49 AM

#

DUUUDE WHAT THE ####
https://123.nekoweb.org/ai/GPT5/COPILOTMARIO.html

#

THE ASSETS LOOK SO GOOD

#

even though jump doesn't work

stray aspen Aug 8, 2025, 2:50 AM

#

where did you make it

quartz light Aug 8, 2025, 2:50 AM

#

copilot!!!

stray aspen Aug 8, 2025, 2:50 AM

#

thats crazy

#

it looks pretty decent

quartz light Aug 8, 2025, 2:51 AM

#

stray aspen Aug 8, 2025, 2:53 AM

#

@quartz light yo bro

#

do you have deep research

#

in copilot

quartz light Aug 8, 2025, 2:53 AM

#

yep

stray aspen Aug 8, 2025, 2:53 AM

#

thats crazy

#

i dont yet

quartz light Aug 8, 2025, 2:54 AM

#

quartz light cool

e

quartz light Aug 8, 2025, 2:54 AM

#

quartz light deep research pops up on signin

e

#

@stray aspen

quartz light Aug 8, 2025, 2:56 AM

#

quartz light deep research pops up on signin

@stray aspen did you sign in?

stray aspen Aug 8, 2025, 2:56 AM

#

yes

#

it hasnt rolled out for my account yet

#

probably because i created it outside of canada

quartz light Aug 8, 2025, 2:58 AM

#

stray aspen probably because i created it outside of canada

ohhh interesting

#

im in ireland

#

stray aspen Aug 8, 2025, 2:59 AM

#

thats great

#

google has to lock in

#

openai and their partners hit the industry hard

quartz light Aug 8, 2025, 2:59 AM

#

hey lneduo

#

have you been able to generate long code on yupp

stray aspen Aug 8, 2025, 3:00 AM

#

yes

quartz light Aug 8, 2025, 3:00 AM

#

does it cut off

stray aspen Aug 8, 2025, 3:00 AM

#

no

#

it works fine

#

but the website is laggy

verbal nimbus Aug 8, 2025, 3:01 AM

#

stray aspen but the website is laggy

It's faster than LMArena for me

stray aspen Aug 8, 2025, 3:01 AM

#

that fancy UI makes it laggy

stray aspen Aug 8, 2025, 3:01 AM

#

verbal nimbus It's faster than LMArena for me

thats crazy

verbal nimbus Aug 8, 2025, 3:01 AM

#

I'm on mobile, Web Dev Arena throws an error like 50% of the time

#

Sandbox fails to appear, voting button disappears, lol

#

ChatGPT app can't copy

#

Claude input bar still buggy

#

Why do all these AI apps have so buggy frontends

stray aspen Aug 8, 2025, 3:03 AM

#

idk

#

i dont use web dev arena

verbal nimbus Aug 8, 2025, 3:03 AM

#

Gemini is so funny

#

It manages to recreate an entire component from minified and obfuscated React code

#

But then gets stuck trying how to make an inner div fill its parent

quartz light Aug 8, 2025, 3:05 AM

#

i just caught a FATASS MOTH

#

i mean uh

#

ai ai ai ai

verbal nimbus Aug 8, 2025, 3:05 AM

#

verbal nimbus But then gets stuck trying how to make an inner div fill its parent

All it has to do was remove max-width

thorn valley Aug 8, 2025, 3:09 AM

#

stray aspen what i like about yupp is that it has a lot of models

yupp has the o3-pro model to use. i found it interesting, I'll try it later.

rare python Aug 8, 2025, 3:14 AM

#

https://fixupx.com/dylan522p/status/1953646457908547979

Dylan Patel (@dylan522p)

GPT 5 is dissapointing ngl

**💬 44 🔁 21 ❤️ 318 👁️ 16.4K **

quartz light Aug 8, 2025, 3:14 AM

#

stray aspen is this genie 3

no lol

#

its old

#

@stray aspen @thorn valley @verbal nimbus check this out

#

opus 4.1

#

https://123.nekoweb.org/ai/GPT5/yuppmario.html

verbal nimbus Aug 8, 2025, 3:19 AM

#

quartz light https://123.nekoweb.org/ai/GPT5/yuppmario.html

Whoa cool, it runs on mobile haha

#

But I can't interact with it

quartz light Aug 8, 2025, 3:19 AM

#

verbal nimbus Whoa cool, it runs on mobile haha

really?

verbal nimbus Aug 8, 2025, 3:19 AM

#

quartz light really?

Yeah, no lag

quartz light Aug 8, 2025, 3:19 AM

#

i'll try gpt 5 nano to implement nipplejs and a circle button for jump for mobile controls

solar galleon Aug 8, 2025, 3:20 AM

#

just yesterday i could upload images in direct chat. i used gemini 2.5 pro but now it doesn't let me and only shows an error. anyone know why?

verbal nimbus Aug 8, 2025, 3:21 AM

#

IDK what's up with Web tech, nowadays stuff loads faster on mobile than my gaming PC

thorn valley Aug 8, 2025, 3:23 AM

#

quartz light really?

no lag here too

quartz light Aug 8, 2025, 3:24 AM

#

thorn valley no lag here too

:D

thorn valley Aug 8, 2025, 3:24 AM

#

well optimized, to tell the truth

quartz light Aug 8, 2025, 3:24 AM

#

its a whole game with sprites, sounds and levels in a single html file!

thorn valley Aug 8, 2025, 3:24 AM

#

but I can't interact either

quartz light Aug 8, 2025, 3:24 AM

#

incredible

quartz light Aug 8, 2025, 3:24 AM

#

thorn valley but I can't interact either

i'm trying to get nano to implement mobile support

#

it might be janky but

#

@thorn valley @verbal nimbus can yall check if there are mobile controls now? https://123.nekoweb.org/ai/GPT5/yuppmario2.html

jolly raven Aug 8, 2025, 3:34 AM

#

How can I use picture to picture.

worthy thunder Aug 8, 2025, 3:34 AM

#

Context Arena Update: Added GPT-5 (Thinking, 08-07) to 2needle (#1 @ 128k AUC), 4needle (#1 @ 128k AUC), and 8needle (#1 @ 128k AUC) leaderboards! Also added GPT-5-Mini and GPT-5-Nano. (https://x.com/DillonUzar/status/1953660295559192919)

More model results at: http://contextarena.ai

Overall GPT-5 is great for <=128k! Only exception is 8needle, Grok 4 still performs much better at <=32k compared to GPT-5, but GPT-5's performance at higher context wins out.

2needle: Top results (AUC @ 128k):

GPT-5 (Thinking, 08-07): 96.7% (#1)
GPT-5-Mini (Thinking, 08-07): 92.6% (#2)
Gemini 2.5 Flash (Thinking, 06-17): 91.5% (#3)
Gemini 2.5 Pro (Thinking, 06-05): 89.6% (#4)
Gemini 2.5 Flash (Non-thinking, 06-17): 81.7% (#5)
Grok 4 (Thinking, 07-09): 79.5% (#6)
o4-mini (Thinking, 04-16): 76.0% (#7)
...
GPT-5-Nano (Thinking, 08-07): 44.2% (#34)

8needle: Top results (AUC @ 128k):

GPT-5 (Thinking, 08-07): 50.3% (#1)
Grok 4 (Thinking, 07-09): 48.4% (#2)
GPT-5-Mini (Thinking, 08-07): 44.7% (#3)
Gemini 2.5 Pro (Thinking, 06-05): 43.9% (#4)
Gemini 2.5 Flash (Thinking, 06-17): 33.5% (#5)
o4-mini (Thinking, 04-16): 30.8% (#6)
o3 (Thinking, 04-16): 27.9% (#7)
...
GPT-5-Nano (Thinking, 08-07): 11.9% (#22)

quartz light Aug 8, 2025, 3:38 AM

#

worthy thunder Context Arena Update: Added GPT-5 (Thinking, 08-07) to 2needle (#1 @ 128k AUC), ...

niiice

stray aspen Aug 8, 2025, 3:38 AM

#

thats great

quartz light Aug 8, 2025, 3:39 AM

#

I hope they bump it up to 1M or higher

#

^

wicked root Aug 8, 2025, 3:44 AM

#

anyone know when the next set of votes will be added to the overall rankings?

quartz light Aug 8, 2025, 4:00 AM

#

gpt 5 yaps too much

obsidian cargo Aug 8, 2025, 4:11 AM

#

Anyone know what was up with Zenith? It seemed even better than Summit, which was GPT-5

stray aspen Aug 8, 2025, 4:16 AM

#

obsidian cargo Anyone know what was up with Zenith? It seemed even better than Summit, which wa...

haughty siren Aug 8, 2025, 4:44 AM

#

When do you guys think Gemini 3.0 is coming?

oblique needle Aug 8, 2025, 4:49 AM

#

yeah, I dunno... my guess is 3.0 is pretty much dead in the water unless someone else launches a more powerful model

brisk helm Aug 8, 2025, 4:58 AM

#

quartz light its a whole game with sprites, sounds and levels in a single html file!

how many prompts did it take?

#

or did u make it ur self?

quartz light Aug 8, 2025, 4:58 AM

#

brisk helm how many prompts did it take?

one

brisk helm Aug 8, 2025, 4:59 AM

#

craazy with opus 4.1 or gpt 5?

quartz light Aug 8, 2025, 5:00 AM

#

brisk helm craazy with opus 4.1 or gpt 5?

4.1

#

idk why i uploaded it to the gpt 5 folder lmao

brisk helm Aug 8, 2025, 5:01 AM

#

yeah thats why i was asking

quartz light Aug 8, 2025, 5:01 AM

#

https://123.nekoweb.org/ai/GPT5/mario2.html

brisk helm Aug 8, 2025, 5:01 AM

#

quartz light 4.1

is opus better for coding than gpt 5?

quartz light Aug 8, 2025, 5:01 AM

#

trying this

#

5 minutes ago https://www.youtube.com/watch?v=boJG84Jcf-4

YouTube

OpenAI

Introducing GPT-5

Our smartest, fastest, most useful model yet, with built-in thinking that puts expert-level intelligence in everyone’s hands.

#

LOL

#

wait wtf thats literally my generation what?

#

#

LOL

#

DUDE

steady vale Aug 8, 2025, 5:09 AM

#

hit my gpt-5 limit lol

#

oh 80 every 3hrs

#

damn thats low

quartz light Aug 8, 2025, 5:11 AM

#

quartz light wait wtf thats literally my generation what?

nvm i think..

lilac nimbus Aug 8, 2025, 5:29 AM

#

In my text GPT5 code is next level

hasty rock Aug 8, 2025, 5:36 AM

#

how often does lm update?

verbal nimbus Aug 8, 2025, 5:38 AM

#

quartz light @thorn valley @verbal nimbus can yall check if there are mobile c...

Yup, but it's more buggy than the previous version. If you're on PC, you can enable mobile preview on Dev Tools.

verbal nimbus Aug 8, 2025, 5:39 AM

#

oblique needle yeah, I dunno... my guess is 3.0 is pretty much dead in the water unless someone...

DeepSeek R2 is supposed to come out this month

keen beacon Aug 8, 2025, 5:41 AM

#

What is the difference between chat gpt 5 nano and gpt 5 mini?

floral comet Aug 8, 2025, 5:47 AM

#

Is the gpt5 on lmarena a thinking model or not?

little narwhal Aug 8, 2025, 5:48 AM

#

verbal nimbus DeepSeek R2 is supposed to come out this month

Source?

terse shuttle Aug 8, 2025, 5:49 AM

#

floral comet Is the gpt5 on lmarena a thinking model or not?

thinking

floral comet Aug 8, 2025, 5:51 AM

#

terse shuttle thinking

Cool thanks! I'm trying it right now I'm very impressed, anyways sorry for many question but do you know what variant of gpt5 is this?

terse shuttle Aug 8, 2025, 5:51 AM

#

floral comet Cool thanks! I'm trying it right now I'm very impressed, anyways sorry for many ...

lmarena have basic gpt-5, gpt-5-mini and gpt-5-nano

floral comet Aug 8, 2025, 5:52 AM

#

Oh cool! I thought there's only 1 gpt 5 model.. Thanks !

echo aurora Aug 8, 2025, 5:53 AM

#

keen beacon What is the difference between chat gpt 5 nano and gpt 5 mini?

I see them described as

gpt-5-mini: cost-optimized reasoning and chat; balances speed, cost, and capability

gpt-5-nano: high-throughput tasks, especially simple instruction-following or classification

echo aurora Aug 8, 2025, 5:53 AM

#

floral comet Oh cool! I thought there's only 1 gpt 5 model.. Thanks !

Yeah we added the other two a bit after we added 5

floral comet Aug 8, 2025, 5:54 AM

#

Yep i see that now, Thanks!

keen fulcrum Aug 8, 2025, 5:57 AM

#

@echo aurora Its hard to keep up with stealth model names. Can you show a stealth leaderboard?

sterile dust Aug 8, 2025, 6:05 AM

#

Screenshot_2025-08-08-14-04-59-332_com.microsoft.emmx.jpg

Screenshot_2025-08-08-14-05-04-498_com.microsoft.emmx.jpg

Screenshot_2025-08-08-14-05-07-082_com.microsoft.emmx.jpg

terse shuttle Aug 8, 2025, 6:06 AM

#

sterile dust

what is this?

sterile dust Aug 8, 2025, 6:06 AM

#

Hmmm...... Maybe they can't really search the web.

#

Is there other models which can search the web in lmarena?

echo aurora Aug 8, 2025, 6:06 AM

#

keen fulcrum @echo aurora Its hard to keep up with stealth model names. Can you show...

I can share but overall I think that'd be tough to do. Are you getting at a leaderboard of just stealth models or just a list of all active stealth models?

sterile dust Aug 8, 2025, 6:09 AM

#

terse shuttle what is this?

The two models can't search the web, is there other models which can search the web in lmarena?

terse shuttle Aug 8, 2025, 6:10 AM

#

sterile dust The two models can't search the web, is there other models which can search the ...

idk

keen fulcrum Aug 8, 2025, 6:10 AM

#

echo aurora I can share but overall I think that'd be tough to do. Are you getting at a lead...

I would prefer the first option.

#

The elo scores are never revealed to us until model release

echo aurora Aug 8, 2025, 6:18 AM

#

keen fulcrum I would prefer the first option.

Gotcha, yeah I'm sure there are some creative things that could be done with that.

mossy drum Aug 8, 2025, 6:27 AM

#

echo aurora Gotcha, yeah I'm sure there are some creative things that could be done with tha...

Maybe let companies, who provide those anon models, decide, whether they want to disclose scores / show model on leaderboard or not... Some anonymity or mystery is fun, keep it that way.

turbid phoenix Aug 8, 2025, 6:35 AM

#

GPT 5 is not Skynet..... 🤬

molten cipher Aug 8, 2025, 6:36 AM

#

gpt 5 is insane

prime mulch Aug 8, 2025, 6:37 AM

#

People say gpt 5 will have another update in a few days, is that real?

restive rampart Aug 8, 2025, 6:38 AM

#

GPT-5 only won 33% of the 49 battles against Gemini 2.5 pro if I am reading this correctly? https://lmarena.ai/leaderboard/text

lethal current Aug 8, 2025, 6:39 AM

#

Is it normal that gpt 5 is not formatting the code? Or is that on the lma side?

pulsar rain Aug 8, 2025, 6:47 AM

#

devout vault Aug 8, 2025, 6:48 AM

#

primal orbit Aug 8, 2025, 6:50 AM

#

restive rampart GPT-5 only won 33% of the 49 battles against Gemini 2.5 pro if I am reading this...

yep. next line gemini 2.5 wins 67% of times

blazing bison Aug 8, 2025, 6:52 AM

#

Lol openai still let you use old models on pro plan

#

https://x.com/stalkermustang/status/1953602642560922082?t=WKLArmODlontfjLFgfjEtw&s=19

Igor Kotenkov (@stalkermustang)

hint: you still can choose old models, including o3 pro and GPT-4.5, if you turn on this tumbler

cc @btibor91

#

Gpt 4.5 is not dead 😮

rapid merlin Aug 8, 2025, 6:57 AM

#

i really wonder what's taking them so long to roll it out to plus, especially considering i already have it on my phone

blazing bison Aug 8, 2025, 7:00 AM

#

rapid merlin i really wonder what's taking them so long to roll it out to plus, especially co...

Clear cookies 🍪

wicked root Aug 8, 2025, 7:00 AM

#

primal orbit yep. next line gemini 2.5 wins 67% of times

HELL YEAH

blazing bison Aug 8, 2025, 7:01 AM

#

And people are reporting that apparently gpt 5 is dumb on chatgpt

#

The gpt 5 chat model is dumb

hardy lion Aug 8, 2025, 7:02 AM

#

restive rampart GPT-5 only won 33% of the 49 battles against Gemini 2.5 pro if I am reading this...

one caveat is that that is only for decisive votes. ties are filtered out. 67/33 is pretty steep, but of their 49 battles, we don't actually know how many they tied. It's technically possible it was 1 gpt-5 win, 2 gemini wins and 46 ties. We should find a better way to report the ties.

A simple one could be to just count it as 0.5 for each. Any suggestions?
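A rough sketch of the "count a tie as 0.5 for each" idea, using the hypothetical 1 win / 2 losses / 46 ties split from the message above. The function names are made up for illustration and this is not how the leaderboard actually computes anything:

```python
def decisive_win_rate(wins: int, losses: int) -> float:
    """Win rate with ties dropped entirely (a "without ties" style table)."""
    total = wins + losses
    if total == 0:
        raise ValueError("no decisive battles to score")
    return wins / total


def half_tie_win_rate(wins: int, losses: int, ties: int) -> float:
    """Win rate where every tie gives half a win to each model."""
    total = wins + losses + ties
    if total == 0:
        raise ValueError("no battles to score")
    return (wins + 0.5 * ties) / total


# Hypothetical split from the chat: 1 gpt-5 win, 2 gemini wins, 46 ties.
print(round(decisive_win_rate(1, 2), 3))      # 0.333
print(round(half_tie_win_rate(1, 2, 46), 3))  # 0.49
```

With the ties counted, the same record looks close to even instead of 33/67, which is the reporting gap being pointed out.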

rapid merlin Aug 8, 2025, 7:02 AM

#

blazing bison Clear cookies 🍪

damn, now i got it on my pc too, thanks

restive rampart Aug 8, 2025, 7:04 AM

#

hardy lion one caveat is that that is only for decisive votes. ties are filtered out. 67/33...

It is "Battle Count for Each Combination of Models (without Ties)", so I am assuming ties removed from it.

hardy lion Aug 8, 2025, 7:05 AM

#

oh you're right! it ought to include tie counts in both places imo

wicked root Aug 8, 2025, 7:05 AM

#

Why is this so high for gpt5

#

You guys think gpt5 will beat 2.5 this month?

hardy lion Aug 8, 2025, 7:07 AM

#

That's for style control, which takes more into account than just the raw wins/losses. You can see gemini's head-to-head win-loss advantage reflected in the non-style leaderboards where it is actually above gpt-5

blazing bison Aug 8, 2025, 7:07 AM

#

https://x.com/scaling01/status/1953626236124573880?t=MCJKfIz3TFRso0pVuzeRdg&s=19

Lisan al Gaib (@scaling01)

ChatGPT literally got worse for every single Plus user today.

There's no way to reliably get thinking models anymore.
Before we had o4-mini, o4-mini-high and o3.

Now we have GPT-5 Thinking with 200 messages per week and a router that exclusively routes you to some small and

#

Plus users cooked

wicked root Aug 8, 2025, 7:08 AM

#

hardy lion That's for style control, which takes more into account than just the raw wins/l...

Im very glad to hear this. Thank you.

primal orbit Aug 8, 2025, 7:09 AM

#

could anyone explain what is "style control"?

pulsar rain Aug 8, 2025, 7:09 AM

#

is it system prompt?

wicked root Aug 8, 2025, 7:09 AM

#

primal orbit could anyone explain what is "style control"?

Apparently it removed emotes and other ‘human’ elements.

restive rampart Aug 8, 2025, 7:10 AM

#

primal orbit could anyone explain what is "style control"?

https://lmsys.org/blog/2024-08-28-style-control/

Does style matter? Disentangling style and substance in Chatbot Are...

Why is GPT-4o-mini so good? Why does Claude rank so low, when anecdotal experience suggests otherwise?
We have answers for you. We controlled for t...

rapid merlin Aug 8, 2025, 7:11 AM

#

first test of the day on pc, and i can already say it isn't that censored which is nice

#

it also isn't nearly as lazy as o3 was, wow

wicked root Aug 8, 2025, 7:13 AM

#

rapid merlin first test of the day on pc, and i can already say it isn't that censored which ...

What have you found? 👀

hardy lion Aug 8, 2025, 7:14 AM

#

It's explained in this post: https://news.lmarena.ai/style-control/
The general idea is that research has found that even if two responses contain the same information, people will vote for ones with more "stylistic features" such as markdown, lists, bold etc.

It's even been found that people will vote for more stylistic responses even if they are inaccurate or wrong. Some companies did RLHF too hard and their models were optimized just for responses that look good

So style control learns two sets of parameters: the model strengths, and the importances of the style features. And then the model strengths are actually interpreted as "the model strength if all style features were equal". Those are what is reported on the style controlled leaderboards, which are the defaults. It's similar to controlled trials in medicine where they correct studies for differences in age, or other factors.

LMArena Blog

Does Style Matter in AI Evaluations?

We controlled for the effect of length and markdown, and indeed, the ranking changed. This is just a first step towards our larger goal of disentangling substance and style in Chatbot Arena leaderboard.
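For anyone who wants to see the "two sets of parameters" idea in code, here is a toy sketch. It assumes a Bradley-Terry style logistic model with a single made-up style feature (`style_diff`, how much more styled answer A was than answer B). The model names, the data and the `fit_strengths` helper are all hypothetical; the real leaderboard uses several style features and its own fitting code.

```python
import math


def fit_strengths(battles, use_style=True, lr=1.0, steps=2000):
    """Fit P(a beats b) = sigmoid(s[a] - s[b] + beta * style_diff) by gradient ascent.

    battles: list of (model_a, model_b, style_diff, a_won) tuples.
    With use_style=False the style term is ignored (beta stays 0).
    """
    strengths = {m: 0.0 for a, b, _, _ in battles for m in (a, b)}
    beta = 0.0
    n = len(battles)
    for _ in range(steps):
        grad = {m: 0.0 for m in strengths}
        grad_beta = 0.0
        for a, b, style_diff, a_won in battles:
            x = style_diff if use_style else 0.0
            p = 1.0 / (1.0 + math.exp(-(strengths[a] - strengths[b] + beta * x)))
            err = (1.0 if a_won else 0.0) - p  # gradient of the log-likelihood
            grad[a] += err
            grad[b] -= err
            grad_beta += err * x
        for m in strengths:
            strengths[m] += lr * grad[m] / n
        beta += lr * grad_beta / n
    return strengths, beta


# Made-up data: whichever answer is more styled wins 75% of the time,
# and "fancy" is the more styled one in 80 of 100 battles.
battles = (
    [("fancy", "plain", 1.0, True)] * 60 + [("fancy", "plain", 1.0, False)] * 20
    + [("fancy", "plain", -1.0, True)] * 5 + [("fancy", "plain", -1.0, False)] * 15
)

raw, _ = fit_strengths(battles, use_style=False)
controlled, beta = fit_strengths(battles, use_style=True)
print("raw gap:", raw["fancy"] - raw["plain"])  # clearly positive
print("style-controlled gap:", controlled["fancy"] - controlled["plain"])  # near zero
print("style coefficient:", beta)  # positive
```

In this toy setup the raw fit credits "fancy" with a lead that comes entirely from formatting, while the style-controlled fit moves that effect into `beta` and leaves the two strengths roughly equal.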

cedar tide Aug 8, 2025, 7:19 AM

#

Go upvote
https://discord.com/channels/1340554757349179412/1403017053031628913

rapid merlin Aug 8, 2025, 7:22 AM

#

wicked root What have you found? 👀

it was able to 1-shot optimize a bot system for a game, and it did by a lot

#

with no resistance whatsoever

native venture Aug 8, 2025, 7:27 AM

#

Hello 👋

rapid merlin Aug 8, 2025, 7:29 AM

#

hi

neon idol Aug 8, 2025, 7:34 AM

#

Gpt 5 is also in copilot

#

But is it the same gpt 5 that is in the chatgpt app?

solid brook Aug 8, 2025, 7:43 AM

#

Wth

#

After i hit my limit on gpt 5

#

It switched to gpt 4o mini

#

Not gpt 5 mini

#

Are they high

neon idol Aug 8, 2025, 7:45 AM

#

@echo aurora Yo a question. I have found GPT 5 on copilot, but is it the same one that is in the chatgpt app?

keen beacon Aug 8, 2025, 7:46 AM

#

Does anyone know if in the future it will be possible to upload files to the LMArena project?

neon idol Aug 8, 2025, 7:46 AM

#

keen beacon Does anyone know if in the future it will be possible to upload files to the LMA...

Is possible upload files

pulsar rain Aug 8, 2025, 7:48 AM

#

Poll: What are your thoughts on lmarena using your data?

Winning answer (3 of 4 votes): I'm extremely careful not to reveal any sensitive infor

astral prawn Aug 8, 2025, 7:48 AM

#

Dang. I'm gonna miss 4.5 🙁

keen beacon Aug 8, 2025, 7:50 AM

#

neon idol Is possible upload files

I think it only allows images, not files as such. I wanted to try with a Word file but it doesn't let me

echo aurora Aug 8, 2025, 7:53 AM

#

neon idol @echo aurora Yo a question. I have found on copilot GPT 5 but i the sam...

yeah it's the standard model

keen beacon Aug 8, 2025, 7:53 AM

#

I am pissed off. People on r/Singularity are posting about their rage about GPT-5 and other models leaving... Then unsubscribing. Some reasons were: It has less personality, answers are too short, it wasn't as big of an improvement... Etc.

I really do not understand how people are like that. They are surely going to update the model like 4o to adhere to people's needs.

#

Sorry. A bit angered at the moment.

echo aurora Aug 8, 2025, 7:54 AM

#

keen beacon Does anyone know if in the future it will be possible to upload files to the LMA...

yeah it's something we're putting thought into, more file upload options would be nice

neon idol Aug 8, 2025, 7:54 AM

#

echo aurora yeah it's the standard model

So is GPT 5 medium?

torn mantle Aug 8, 2025, 7:58 AM

#

is it -chat or -thinking ?

fading moth Aug 8, 2025, 7:59 AM

#

neon idol So is GPT 5 medium?

I have same question too

neon idol Aug 8, 2025, 7:59 AM

#

fading moth I have same question too

Lol

fading moth Aug 8, 2025, 8:00 AM

#

Also, can anyone confirm the exact model version/variant of GPT-5 that is available via direct chat in lmarena?

echo aurora Aug 8, 2025, 8:02 AM

#

I'm going to double check and will followup

golden ocean Aug 8, 2025, 8:04 AM

#

wicked root HELL YEAH

SHUT UP

fading rover Aug 8, 2025, 8:05 AM

#

I am trying lmarena web for image generation. somehow most engines nowadays create only a square ratio even if the prompt mentions 16:9. can anyone help with how to specifically force 16:9 or 9:16?

torn mantle Aug 8, 2025, 8:06 AM

#

fading moth Also, can anyone confirm the exact model version/variant of GPT-5 that is availa...

seems to me like its the default one

#

not the -chat ver

wicked root Aug 8, 2025, 8:06 AM

#

golden ocean SHUT UP

Look at mr fat cat here. Did you go long on OpenAI?

#

I went long on google man. It’s imperative Altman’s products lose to Google

fading moth Aug 8, 2025, 8:07 AM

#

torn mantle seems to me like its the default one

With thinking?

#

Or not

#

So the most most basic gpt5 model nothing else, no thinking?

misty vault Aug 8, 2025, 8:12 AM

#

astral prawn Dang. I'm gonna miss 4.5 🙁

Dang. I'm gonna miss GPT 4 preview 0314 🙁

torn mantle Aug 8, 2025, 8:13 AM

#

fading moth With thinking?

its with thinking ye

#

im not a fan of this version

keen beacon Aug 8, 2025, 8:20 AM

#

echo aurora yeah it's something we're putting thought into, more file upload options would b...

In my case I would like to use it for school, it would be great

torn mantle Aug 8, 2025, 8:27 AM

#

keen beacon In my case I would like to use it for school, it would be great

it would be nice, but why arent you using gemini on aistudio for that?

white hatch Aug 8, 2025, 8:27 AM

#

misty vault Dang. I'm gonna miss GPT 4 preview 0314 🙁

best model ever 😭

torn mantle Aug 8, 2025, 8:28 AM

#

i still think the current gemini is better than gpt5

flint skiff Aug 8, 2025, 8:32 AM

#

im using gpt 5 high on cursor, through openai api

#

is it still nerfed if I do this?

#

it feels nerfed lol but maybe im just expecting too much from gpt 5

cedar tide Aug 8, 2025, 8:33 AM

#

The comparisons that open ai doesn't want to show 😶

Screenshot_2025-08-08-11-19-45-777_com.android.chrome-edit.jpg

#

Screenshot_2025-08-08-11-20-05-586_com.android.chrome-edit.jpg

lament radish Aug 8, 2025, 8:35 AM

#

someone will tell me how to generate videos here ?

blazing bison Aug 8, 2025, 8:35 AM

#

Guys the thinking model of chatgpt is not gpt 5 thinking

#

The output is worse 100% of the time

#

There is something wrong with chatgpt interface thinking

#

😡

blazing bison Aug 8, 2025, 8:37 AM

#

flint skiff is it still nerfed if I do this?

It is, cursor will always use rag with exception when using max mode and it don't work with your own api key

floral comet Aug 8, 2025, 8:43 AM

#

Wow i tried vibecoding with gpt, it refused and said the feature I'm requesting is impossible in the current environment. I think that's what sets gpt5 apart from other models.. Others would most probably just agree and waste a lot of tokens and my time

flint skiff Aug 8, 2025, 8:43 AM

#

blazing bison It is, cursor will always use rag with exception when using max mode and it don'...

yeah but I used it in vs code and it has no internet access

terse shuttle Aug 8, 2025, 8:47 AM

#

@echo aurora just out of interest, are there any plans for a 3d llm arena?

golden ocean Aug 8, 2025, 8:47 AM

#

mc bench

blazing bison Aug 8, 2025, 8:50 AM

#

lament radish someone will tell me how to generate videos here ?

#1397655624103493813

stoic ridge Aug 8, 2025, 8:59 AM

#

why my videos doesn't have background sound,

solar hollow Aug 8, 2025, 9:07 AM

#

cedar tide The comparisons that open ai doesn't want to show 😶

goes to show that hassabis is the only one not bsing

floral comet Aug 8, 2025, 9:07 AM

#

stoic ridge why my videos doesn't have background sound,

I'm assuming its copyrighted

stoic ridge Aug 8, 2025, 9:08 AM

#

Then why is the video only 4 and a half sec long, not 5sec🥲

sly estuary Aug 8, 2025, 9:18 AM

#

GPT5 model not working right?

#

i just ask any questions but only get "Something went wrong with this response, please try again." response

upbeat owl Aug 8, 2025, 9:22 AM

#

Hello

floral comet Aug 8, 2025, 9:22 AM

#

sly estuary i just ask any questions but only get "Something went wrong with this response, ...

Try sending a second message on the same chat don't retry it. Works fine for me

sly estuary Aug 8, 2025, 9:23 AM

#

no, not work for me...

floral comet Aug 8, 2025, 9:28 AM

#

Yeah and usually it happens to me when my input is long

prime mulch Aug 8, 2025, 9:28 AM

#

Try to reload the page

blazing bison Aug 8, 2025, 9:44 AM

#

"Its not gonna be a router" they said

#

https://x.com/tszzl/status/1953638161034400253?t=LTRaKm_P0KK1xyRF1eEA_A&s=19

roon (@tszzl)

@scaling01 btw model auto switcher is apparently broken which is why it’s not routing you correctly. will be fixed soon

#

😆

hollow imp Aug 8, 2025, 9:46 AM

#

What is google bard?

pliant cliff Aug 8, 2025, 9:55 AM

#

hollow imp What is google bard?

old gemini

keen beacon Aug 8, 2025, 9:58 AM

#

hollow imp What is google bard?

The OG google model

#

I tried it when it was new. It was only good for basic stuff

#

no coding or such were good

#

Would be nice to see it back for "nostalgia" vibes

rapid merlin Aug 8, 2025, 10:08 AM

#

google bard is an absolute joke

#

it told me it can't produce a script in a coding context since it can only generate text

#

kekw

bright kayak Aug 8, 2025, 10:15 AM

#

does anyone know how much worse gpt 5 mini is to gpt 5?

barren prairie Aug 8, 2025, 10:17 AM

#

torn mantle it would be nice, but why arent you using gemini on aistudio for that?

Gemini in the Gemini app is better for studying than ai studio because it has a lot of good features

#

Did open ai remove the old features like search and edu mode from chatgpt free tiers who knows ?

keen beacon Aug 8, 2025, 10:26 AM

#

bright kayak does anyone know how much worse gpt 5 mini is to gpt 5?

You can always put them side to side on LMArena and try the same prompts and see the difference. Benchmarks can be a hit or miss at times

bright kayak Aug 8, 2025, 10:26 AM

#

keen beacon You can always put them side to side on LMArena and try the same prompts and see...

oh i didnt notice they added the mini and nano versions

#

thanks

willow grail Aug 8, 2025, 10:34 AM

#

alright. so have u found out what is best way to use gpt5 for swe?

is it cline, cursor, poe, chatgpt?

strange elm Aug 8, 2025, 10:40 AM

#

hi i love lmarena :3

keen beacon Aug 8, 2025, 10:43 AM

#

Hello again, does anybody know what this pre-release model Velocilux could be? Came by when I was doing battle mode.

glass arch Aug 8, 2025, 10:43 AM

#

I got gpt5 to leak its system prompt to me

rapid merlin Aug 8, 2025, 10:46 AM

#

so far so bad with the gpt5 in chatgpt

#

it keeps trying to access variables before even initializing them

shadow jewel Aug 8, 2025, 10:46 AM

#

PLEASE make it so that I can share raw github with the ais 🙏

rapid merlin Aug 8, 2025, 10:46 AM

#

i noticed claude doing that aswell

glass arch Aug 8, 2025, 10:50 AM

#

chatgpt makes it easy

#

you can just make a zip file and upload it

#

then it will extract it and view the content

pulsar rain Aug 8, 2025, 10:53 AM

#

why chatgpt 5 talks like college students write note in their lectures?

terse shuttle Aug 8, 2025, 10:54 AM

#

pulsar rain why chatgpt 5 talks like college students write note in their lectures?

system prompt mb

rapid merlin Aug 8, 2025, 10:55 AM

#

the model on lmarena actually wrote the code without minifying it

#

i dont know why this model does it

#

or well to an extent

eternal niche Aug 8, 2025, 10:56 AM

#

gpt5 sucks

rapid merlin Aug 8, 2025, 10:57 AM

#

i have memories and all off so that shouldn't influence anything either

#

to be fair i didnt really use a three essay prompt, but i think it should pass this (nothing spawns, tried multiple prompts to fix it and nothing)

#

gpt-5-thinking in chatgpt.com ^

floral comet Aug 8, 2025, 11:00 AM

#

Does the gpt5 model in lmarena use high reasoning or the medium one? Thanks for any answers!

eternal niche Aug 8, 2025, 11:00 AM

#

floral comet Does the gpt5 model in lmarena use high reasoning or the medium one? Thanks for ...

yes

floral comet Aug 8, 2025, 11:01 AM

#

Alright thanks!

keen beacon Aug 8, 2025, 11:13 AM

#

What is considered to be good code? Noob noncoder here

rapid merlin Aug 8, 2025, 11:19 AM

#

keen beacon What is considered to be good code? Noob noncoder here

it's not really about the quality of the code itself, it's the fact it started minifying and making everything hard to read right away, claude NEVER does that

#

feels lazy to me personally

obtuse heart Aug 8, 2025, 11:22 AM

#

rapid merlin to be fair i didnt really use a three essay prompt, but i think it should pass t...

ive seen a video comparing gpt 5 against opus 4.1 with pretty similar prompts to this, and opus 4.1 beat gpt 5 by a mile

keen beacon Aug 8, 2025, 11:23 AM

#

rapid merlin it's not really about the quality of the code itself, it's the fact it started m...

Ah, okay. From what I understand, commenting the code is important

#

Holy jesus

tribal aspen Aug 8, 2025, 11:41 AM

#

anyone from the lmarena team online right now?

#

@echo aurora

white hatch Aug 8, 2025, 11:42 AM

#

I'm not sure, but i feel like gpt-5 was nerfed

tribal aspen Aug 8, 2025, 11:44 AM

#

so

#

does the gpt 5 model

#

in direct chat

#

use thinking only?

terse shuttle Aug 8, 2025, 11:44 AM

#

white hatch I'm not sure, but i feel like gpt-5 was nerfed

same feel

tribal aspen Aug 8, 2025, 11:44 AM

#

I mean the max reasoning only?

terse shuttle Aug 8, 2025, 11:44 AM

#

tribal aspen I mean the max reasoning only?

yep

tribal aspen Aug 8, 2025, 11:44 AM

#

terse shuttle yep

why not the fast version?

#

also why is it so slow

terse shuttle Aug 8, 2025, 11:44 AM

#

in #1372229840131985540 already requested to non-think model

tribal aspen Aug 8, 2025, 11:45 AM

#

when it reasons in Copilot/gpt website it doesnt take so long

#

as much it takes in lmarena

terse shuttle Aug 8, 2025, 11:45 AM

#

tribal aspen when it reasons in Copilot/gpt website it doesnt take so long

idk

tribal aspen Aug 8, 2025, 11:45 AM

#

I wonder which one is nerfed

terse shuttle Aug 8, 2025, 11:45 AM

#

maybe because lmarena isn't using the same provider as the official openai

#

i don't know

tribal aspen Aug 8, 2025, 11:46 AM

#

terse shuttle maybe because lmarena isn't using the same provider as the official openai

I wonder if Copilot uses a nerfed model or not

#

as everyone hates copilot

terse shuttle Aug 8, 2025, 11:46 AM

#

idk

ripe brook Aug 8, 2025, 11:49 AM

#

What are the models in battle mode?

white hatch Aug 8, 2025, 11:50 AM

#

random models

ripe brook Aug 8, 2025, 11:50 AM

#

logically

#

will there be a version for phones?

wheat onyx Aug 8, 2025, 11:52 AM

#

still no GPT5 on my plus account..

#

https://x.com/OpenAI/status/1953526590325068163

OpenAI (@OpenAI)

GPT‑5 is available to all Plus, Pro, Team, and Free users starting today with access for Enterprise and Edu coming in one week. It may take a few days to roll out to all Free users.

- Pro users get unlimited access to GPT-5 & access to GPT‑5 Pro, ideal for the most challenging,

ripe brook Aug 8, 2025, 11:53 AM

#

wheat onyx still no GPT5 on my plus account..

well, wait, it's not all rolled out yet. it was rolled out to me today for my Pro subscription

ornate ether Aug 8, 2025, 11:56 AM

#

lmarena is actually insane bruh

fleet lintel Aug 8, 2025, 11:58 AM

#

ornate ether lmarena is actually insane bruh

how come?

torn mantle Aug 8, 2025, 11:59 AM

#

ornate ether lmarena is actually insane bruh

elaborate

fleet lintel Aug 8, 2025, 11:59 AM

#

wheat onyx still no GPT5 on my plus account..

i have it on my mobile but not on desktop

ornate ether Aug 8, 2025, 11:59 AM

#

just the damn concept

#

so many models, many of them paid or limited, image gen, web dev and now video gen all at one place

#

for free

#

wthelly

wheat onyx Aug 8, 2025, 12:02 PM

#

fleet lintel i have it on my mobile but not on desktop

neither for me

keen beacon Aug 8, 2025, 12:05 PM

#

ornate ether so many models, many of them paid or limited, image gen, web dev and now video g...

I wish there was more promotion about the service to get people test it more

#

and get more data

ornate ether Aug 8, 2025, 12:06 PM

#

keen beacon I wish there was more promotion about the service to get people test it more

but then enshittification would occur

#

i like it as it is

keen beacon Aug 8, 2025, 12:06 PM

#

ornate ether but then enshittification would occur

wdym? All kinds of people test it already

#

not just coders or specialists

ornate ether Aug 8, 2025, 12:06 PM

#

isnt mainstream

keen beacon Aug 8, 2025, 12:06 PM

#

I am an ordinary day to day user myself

ornate ether Aug 8, 2025, 12:06 PM

#

keen beacon I am an ordinary day to day user myself

yea me too but it isnt really that popular

#

to be considered "popular"

keen beacon Aug 8, 2025, 12:08 PM

#

ornate ether yea me too but it isnt really that popular

I wouldn't call it enshittification. Dynamic testing is the future for AI models. Though I do understand the concern of models getting nerfed from intentional bad voting

#

trying to sabotage the list as LMArena gets more popular

#

I checked the traffic yesterday and it has been increasing steadily

ornate ether Aug 8, 2025, 12:13 PM

#

keen beacon I wouldn't call it enshittification. Dynamic testing is the future for AI models...

qwen 3 randomly popping off at 10000 points

brittle tiger Aug 8, 2025, 12:18 PM

#

Huh

https://x.com/AiBattle_/status/1953792181514170750?t=Q6VrBxG_2UYnSfJOyXAyqg&s=19

AiBattle (@AiBattle_)

OpenAI seems to have chosen the Summit model over the Zenith model as GPT-5 Thinking

Unfortunate, as I preferred the outputs of Zenith over Summit for nearly all my prompts

obtuse heart Aug 8, 2025, 12:23 PM

#

ornate ether so many models, many of them paid or limited, image gen, web dev and now video g...

when i asked the opus 4.1 model it said it was sonnet 3.5

#

😭

ornate ether Aug 8, 2025, 12:24 PM

#

obtuse heart when i asked the opus 4.1 model it said it was sonnet 3.5

many models do that

wheat onyx Aug 8, 2025, 12:24 PM

#

brittle tiger Huh https://x.com/AiBattle_/status/1953792181514170750?t=Q6VrBxG_2UYnSfJOyXAyqg...

probably 5pro

ornate ether Aug 8, 2025, 12:24 PM

#

even chatgpt on it's official page when asked says it's powered by gpt 4 with a knowledge cut

#

off at oct 2024

obtuse heart Aug 8, 2025, 12:25 PM

#

ornate ether many models do that

ah so it doesnt actually matter then?

eager crag Aug 8, 2025, 12:25 PM

#

I tested out GPT-5 to make chiptune music.

#

It’s actually pretty good for an AI

wheat onyx Aug 8, 2025, 12:25 PM

#

OAI really made users have the same limits as with o3, even though it significantly reduced its costs internally

#

disappointing

ornate ether Aug 8, 2025, 12:27 PM

#

obtuse heart ah so it doesnt actually matter then?

no no, it's the same, it's the model shown in the tab

wheat onyx Aug 8, 2025, 12:27 PM

#

#

so 80 messages of GPT5 vs unlimited for 4o, even though it costs less than 4o

brave orbit Aug 8, 2025, 12:31 PM

#

Check my website, it's just a driving game. It would be cool if you want to try it: https://id12b.github.io/

eager crag Aug 8, 2025, 12:31 PM

#

Does GPT-5 make images too?

brave orbit Aug 8, 2025, 12:31 PM

#

brave orbit Check My Website just a driveing game it would be cool if want to ...

and its maked by gpt 5 you can not think how good it is

eager crag Aug 8, 2025, 12:31 PM

#

Hello?

brave orbit Aug 8, 2025, 12:38 PM

#

pls try my website just try its really cool

#

its https://id12b.github.io/

calm sequoia Aug 8, 2025, 12:38 PM

#

keen beacon Aug 8, 2025, 12:40 PM

#

calm sequoia

Cannot say that it is phenomenal

#

Quite the same except for hallucination rate

#

And perhaps coding performance and different formatting when writing an output

#

Gemini 3.0 is where it is at when it comes out. Also Deepseek R2.

pseudo hemlock Aug 8, 2025, 12:44 PM

#

Is GPT5 MoE?

brave orbit Aug 8, 2025, 12:44 PM

#

ripe brook well, wait until it's not all rolled out yet, it was rolled out to me today for ...

just try to clear your browser cookies, trust me, it will do it

#

i had the same thing i just cleared my cookies and its done

languid crescent Aug 8, 2025, 12:50 PM

#

Is GPT-5 slow at answering? Sometimes I get errors and need to refresh the message and the website.

brave orbit Aug 8, 2025, 12:59 PM

#

it depends on the message, since it may think differently on different messages; for easy messages it just doesn't think

brave orbit Aug 8, 2025, 1:00 PM

#

languid crescent Is GPT-5 slow at answering? Sometimes I get errors and need to refresh the messa...

Try clearing your browser cookies, or it's just that OpenAI's servers are busy

stoic ridge Aug 8, 2025, 1:13 PM

#

When I generate videos, some are 8 seconds long but some are only 5 seconds. Why?

brave orbit Aug 8, 2025, 1:22 PM

#

however, the OpenAI status page shows ChatGPT: Degraded Performance, so just so you know

molten cipher Aug 8, 2025, 1:22 PM

#

brave orbit and its maked by gpt 5 you can not think how good it is

made*

delicate rapids Aug 8, 2025, 1:23 PM

#

hello

molten cipher Aug 8, 2025, 1:23 PM

#

delicate rapids hello

Hey

misty vault Aug 8, 2025, 1:23 PM

#

Large Language Model

wintry tinsel Aug 8, 2025, 1:24 PM

#

Is king fall definitively better than GPT5

wintry tinsel Aug 8, 2025, 1:27 PM

#

wheat onyx

So it costs the same as Gemini api when Gemini is free? Lol what a joke

unkempt oak Aug 8, 2025, 1:29 PM

#

guys, is there a method to generate video with veo3 audio?

wheat onyx Aug 8, 2025, 1:30 PM

#

wintry tinsel So it costs the same as Gemini api when Gemini is free? Lol what a joke

FYI

hollow imp Aug 8, 2025, 1:40 PM

#

wheat onyx FYI

Whats the difference in performance between free tier and paid tier? The only difference I found was deep research feature

#

Talking about the Gemini web btw

clever estuary Aug 8, 2025, 1:47 PM

#

which is better, you guys?

#

guess 5 pro failed eh

obtuse heart Aug 8, 2025, 1:52 PM

#

ornate ether no no the same it's the model shown in the tab

okay thanks lol, its crazy how this website is hosting so many paid models for free, its great

#

i just wish i can use it on cline tho

hollow imp Aug 8, 2025, 1:52 PM

#

clever estuary guess 5 pro failed eh

5 pro???

#

How pro?

clever estuary Aug 8, 2025, 1:53 PM

#

no difference from what I've seen...

eternal niche Aug 8, 2025, 1:55 PM

#

зет

maiden fulcrum Aug 8, 2025, 2:02 PM

#

good morning everyone

#

could you give me some feedback about GPT-5 please, I still don't have it on my end.

balmy mist Aug 8, 2025, 2:03 PM

#

maiden fulcrum could you give me a feedback about GPT-5 please, I still don't have it on my end...

u paid or free version?

maiden fulcrum Aug 8, 2025, 2:03 PM

#

balmy mist u paid or free version?

Paid; Plus tier

balmy mist Aug 8, 2025, 2:03 PM

#

maiden fulcrum Paid; Plus tier

do a hard refresh

#

command shift r

molten cipher Aug 8, 2025, 2:03 PM

#

i have plus tier but don't see it on my models list

maiden fulcrum Aug 8, 2025, 2:03 PM

#

balmy mist do a hard refresh

How can I do that on my iOS app?

balmy mist Aug 8, 2025, 2:03 PM

#

or control shift r

#

try updating your app

#

or connect to a diff wifi, sometimes that works

#

but gpt5 has been good imo

#

my new daily driver

molten cipher Aug 8, 2025, 2:05 PM

#

erm

maiden fulcrum Aug 8, 2025, 2:06 PM

#

balmy mist or connect to a diff wifi, sometimes that works

I have tried those methods but still don't have it

inland cedar Aug 8, 2025, 2:06 PM

#

maiden fulcrum I have tried those methods but still don't have it

maybe delete and reinstall?

#

i just updated my app

maiden fulcrum Aug 8, 2025, 2:09 PM

#

inland cedar i just updated my app

Is there another way to update the app than from the App Store? It says open rather than update

inland cedar Aug 8, 2025, 2:09 PM

#

ummmm

#

no

stray aspen Aug 8, 2025, 2:09 PM

#

molten cipher erm

just use microsoft copilot at this point lmao

inland cedar Aug 8, 2025, 2:09 PM

#

u shud delete and reinstall

maiden fulcrum Aug 8, 2025, 2:11 PM

#

i still don't have it @inland cedar

true condor Aug 8, 2025, 2:12 PM

#

So where is Qwen-Image in leaderboard?

fleet lintel Aug 8, 2025, 2:17 PM

#

google share is up a bit.. is it gpt-5 effect?

indigo hazel Aug 8, 2025, 2:18 PM

#

i don't know guys. i've been coding with gpt5 since last night and i feel really good about it. it doesn't make as many mistakes as 2.5 pro, for example because it doesn't hallucinate. i'm making a program in python, so it's not web

stray aspen Aug 8, 2025, 2:19 PM

#

yes its amazing

clever estuary Aug 8, 2025, 2:20 PM

#

hmmm 5 pro is really good at counting letters without reasoning

maiden fulcrum Aug 8, 2025, 2:24 PM

#

clever estuary hmmm 5 pro is really good at counting letters without reasoning

you used pro for this question?

clever estuary Aug 8, 2025, 2:24 PM

#

I mean why not, doesn't seem there's a limit rn

maiden fulcrum Aug 8, 2025, 2:26 PM

#

why not normal gpt-5 instead

clever estuary Aug 8, 2025, 2:26 PM

#

just testing it ig

wheat onyx Aug 8, 2025, 2:27 PM

#

balmy mist or control shift r

not helping. Plus - still dont have

maiden fulcrum Aug 8, 2025, 2:28 PM

#

wheat onyx not helping. Plus - still dont have

i am in the same boat

stray aspen Aug 8, 2025, 2:29 PM

#

they are slowly rolling it out

#

they are serving the whole world on their website

keen beacon Aug 8, 2025, 2:30 PM

#

clever estuary just testing it ig

Try a hard math equation to test it for real

balmy mist Aug 8, 2025, 2:30 PM

#

wheat onyx not helping. Plus - still dont have

I have plus and had it since yesterday around 3pm

#

i dont use the app version tho, only web

wheat onyx Aug 8, 2025, 2:32 PM

#

balmy mist i dont use the app version tho, only web

have both versions - no upgrade on either yet

maiden fulcrum Aug 8, 2025, 2:35 PM

#

do you guys think grok 4 heavy is better than gpt-5

solid brook Aug 8, 2025, 2:36 PM

#

Guys i have a problem with chatgpt. Once i hit the limit on gpt 5 it switches to gpt 4o mini

#

No gpt 5 mini

#

Gpt 4o mini is garbage

obtuse heart Aug 8, 2025, 2:38 PM

#

do the paid models in the website have any limits when youre using them?

stray aspen Aug 8, 2025, 2:38 PM

#

Poll: how long will gpt-5 be SoTA?

Winning answer: until gemini 3 release (5 of 9 votes)

eternal niche Aug 8, 2025, 2:40 PM

#

guys gpt5 sucks

molten cipher Aug 8, 2025, 2:41 PM

#

eternal niche guys gpt5 sucks

no thank you

keen beacon Aug 8, 2025, 2:41 PM

#

eternal niche guys gpt5 sucks

You still going at it?

solid brook Aug 8, 2025, 2:41 PM

#

eternal niche guys gpt5 sucks

It is good idk why people hate it

eternal niche Aug 8, 2025, 2:42 PM

#

keen beacon You still going at it?

solid brook Aug 8, 2025, 2:42 PM

#

eternal niche

Lie

eternal niche Aug 8, 2025, 2:42 PM

#

solid brook Lie

no

keen beacon Aug 8, 2025, 2:42 PM

#

eternal niche

I'll wait until deepseek drops R2.

#

It will look fun

molten cipher Aug 8, 2025, 2:42 PM

#

eternal niche

DEMMMMMMMMMMMMMMMMMMMMMM

solid brook Aug 8, 2025, 2:43 PM

#

eternal niche no

I uh don't think you actually worked with the models thats why

eternal niche Aug 8, 2025, 2:43 PM

#

solid brook I uh don't think you actually worked with the models thats why

ok

obtuse heart Aug 8, 2025, 2:43 PM

#

eternal niche

gpt-5 has gone to the top of the list, this is an old screenshot posted here bro

solid brook Aug 8, 2025, 2:43 PM

#

0/10 ragebait

eternal niche Aug 8, 2025, 2:43 PM

#

obtuse heart gpt-5 has gone to the top of the list, this is an old screenshot posted here bro

bro

#

https://lmarena.ai/leaderboard/text/overall-no-style-control

solid brook Aug 8, 2025, 2:44 PM

#

eternal niche bro

Ok man gemini 2.5 pro be happy

#

Whatever you say

eternal niche Aug 8, 2025, 2:44 PM

#

gpt5 sucks

obtuse heart Aug 8, 2025, 2:44 PM

#

eternal niche bro

theyre the same rank

solid brook Aug 8, 2025, 2:44 PM

#

Yeah whatever you say

eternal niche Aug 8, 2025, 2:44 PM

#

obtuse heart theyre the same rank

dude

keen beacon Aug 8, 2025, 2:45 PM

#

eternal niche gpt5 sucks

Say something new, dude.

eternal niche Aug 8, 2025, 2:45 PM

#

keen beacon Say something new, dude.

why

obtuse heart Aug 8, 2025, 2:45 PM

#

eternal niche dude

eternal niche Aug 8, 2025, 2:45 PM

#

obtuse heart

solid brook Aug 8, 2025, 2:45 PM

#

keen beacon Say something new, dude.

Ignore this guy

obtuse heart Aug 8, 2025, 2:45 PM

#

okay lol hes just tryna get people mad

eternal niche Aug 8, 2025, 2:45 PM

#

just accept that gpt5 sucks

solid brook Aug 8, 2025, 2:45 PM

#

Yeah man

brisk helm Aug 8, 2025, 2:45 PM

#

eternal niche gpt5 sucks

nuh uh

solid brook Aug 8, 2025, 2:45 PM

#

Okay we got you

#

Now go be happy

eternal niche Aug 8, 2025, 2:46 PM

#

i am happy

keen beacon Aug 8, 2025, 2:46 PM

#

eternal niche just accept that gpt5 sucks

Bro is tweaking

molten cipher Aug 8, 2025, 2:46 PM

#

eternal niche https://lmarena.ai/leaderboard/text/overall-no-style-control

no style control

obtuse heart Aug 8, 2025, 2:46 PM

#

i cant believe somebody has the free time to actively ragebait on discord for like 2 minutes worth of entertainment

eternal niche Aug 8, 2025, 2:46 PM

#

molten cipher no style control

so

molten cipher Aug 8, 2025, 2:46 PM

#

so overall gpt 5 is better

eternal niche Aug 8, 2025, 2:46 PM

#

why

molten cipher Aug 8, 2025, 2:46 PM

#

google might just make a AI that will just take over the internet

keen beacon Aug 8, 2025, 2:46 PM

#

eternal niche why

Why is GPT-5 so good at coding now then?

eternal niche Aug 8, 2025, 2:46 PM

#

skynet

obtuse heart Aug 8, 2025, 2:47 PM

#

cant wait for gemini 3 tho, theyre cooking with it

eternal niche Aug 8, 2025, 2:47 PM

#

because it sucks at text

solid brook Aug 8, 2025, 2:47 PM

#

I agree with gemini 3 beating gpt 5

keen beacon Aug 8, 2025, 2:47 PM

#

solid brook I agree with gemini 3 beating gpt 5

I am sure about that.

eternal niche Aug 8, 2025, 2:48 PM

#

yeah because gpt5 sucks

molten cipher Aug 8, 2025, 2:48 PM

#

solid brook I agree with gemini 3 beating gpt 5

i agree claude 54000 will be better than gpt 5 and all other models made yet

keen beacon Aug 8, 2025, 2:48 PM

#

eternal niche yeah because gpt5 sucks

https://tenor.com/view/yukleniyor-gif-18955750


solid brook Aug 8, 2025, 2:48 PM

#

keen beacon I am sure about that.

I mean google is just so ahead. Genie 3

obtuse heart Aug 8, 2025, 2:48 PM

#

claudes good but like the price is craaaazyyy

eternal niche Aug 8, 2025, 2:48 PM

#

keen beacon https://tenor.com/view/yukleniyor-gif-18955750

who is it

molten cipher Aug 8, 2025, 2:48 PM

#

obtuse heart claudes good but like the price is craaaazyyy

lmarena gonna gib us free access

keen beacon Aug 8, 2025, 2:49 PM

#

solid brook I mean google is just so ahead. Genie 3

Would be nice to see it used in game development and such

obtuse heart Aug 8, 2025, 2:49 PM

#

if gemini 3 releases with a lower price range than claude right now and pretty much guaranteed the best ai model overall, then anthropic is in trouble

stray aspen Aug 8, 2025, 2:49 PM

#

eternal niche guys gpt5 sucks

cmon dude

#

everyone who said that already changed their minds

solid brook Aug 8, 2025, 2:49 PM

#

obtuse heart if gemini 3 releases with a lower price range than claude right now and pretty m...

Bruh for sure will happen.

eternal niche Aug 8, 2025, 2:49 PM

#

stray aspen everyone who said that already changed their minds

they are brainwashed by 5g

molten cipher Aug 8, 2025, 2:49 PM

#

tbh wait till gpt 6 comes

stray aspen Aug 8, 2025, 2:50 PM

#

eternal niche they are brainwashed by 5g

We Belarusians are peaceful people, wholeheartedly devoted to our native land

maiden fulcrum Aug 8, 2025, 2:50 PM

#

do you guys think Grok 4 Heavy is better than ChatGPT-5?

solid brook Aug 8, 2025, 2:50 PM

#

maiden fulcrum do you guys think Grok 4 Heavy is better than ChatGPT-5?

No

stray aspen Aug 8, 2025, 2:50 PM

#

maiden fulcrum do you guys think Grok 4 Heavy is better than ChatGPT-5?

no

brisk helm Aug 8, 2025, 2:50 PM

#

maiden fulcrum do you guys think Grok 4 Heavy is better than ChatGPT-5?

hell naw

stray aspen Aug 8, 2025, 2:50 PM

#

and its horribly expensive

misty drum Aug 8, 2025, 2:50 PM

#

Noooo

brisk helm Aug 8, 2025, 2:50 PM

#

yh

obtuse heart Aug 8, 2025, 2:50 PM

#

maiden fulcrum do you guys think Grok 4 Heavy is better than ChatGPT-5?

absolutely not

zealous panther Aug 8, 2025, 2:50 PM

#

Its a 300$ or sth agent

#

Compare it to gpt 5 pro at least…

brisk helm Aug 8, 2025, 2:50 PM

#

zealous panther Its a 300$ or sth agent

just to post antisemitic tweets 😂

solid brook Aug 8, 2025, 2:51 PM

#

That garbage is overpriced af

obtuse heart Aug 8, 2025, 2:51 PM

#

what are yalls thoughts about gemini 2.5 deepthink

stray aspen Aug 8, 2025, 2:51 PM

#

grok 4 heavy is just a lot of groks talking to each other

maiden fulcrum Aug 8, 2025, 2:51 PM

#

zealous panther Compare it to gpt 5 pro at least…

Performance and intelligence wise?

zealous panther Aug 8, 2025, 2:51 PM

#

brisk helm just to post antisemitic tweets 😂

Idk why people glaze it so much like its so expensive

zealous panther Aug 8, 2025, 2:51 PM

#

maiden fulcrum Performance and intelligence wise?

All

solid brook Aug 8, 2025, 2:51 PM

#

obtuse heart what are yalls thoughts about gemini 2.5 deepthink

I think gpt 5 pro beats it

stray aspen Aug 8, 2025, 2:51 PM

#

yes

#

gpt-5 is great

maiden fulcrum Aug 8, 2025, 2:51 PM

#

zealous panther All

Which is better?

zealous panther Aug 8, 2025, 2:51 PM

#

Gemini 2.5 deepthink is catered to logic and math though

zealous panther Aug 8, 2025, 2:51 PM

#

maiden fulcrum Which is better?

Gpt 5 pro

#

The benchmarks say so

#

In Humanity's Last Exam or something, GPT-5 Pro still outscored Grok Heavy, I'm pretty sure

maiden fulcrum Aug 8, 2025, 2:52 PM

#

zealous panther Gpt 5 pro

ChatGPT-5 is better than Grok 4 Heavy?

zealous panther Aug 8, 2025, 2:52 PM

#

I know

zealous panther Aug 8, 2025, 2:52 PM

#

maiden fulcrum ChatGPT-5 is better than Grok 4 Heavy?

And cheaper

maiden fulcrum Aug 8, 2025, 2:52 PM

#

zealous panther And cheaper

Is it worth the 307 CAD for Pro Plan?

stray aspen Aug 8, 2025, 2:52 PM

#

no

#

dont waste your money

zealous panther Aug 8, 2025, 2:53 PM

#

maiden fulcrum Is it worth the 307 CAD for Pro Plan?

Yeah no

solid brook Aug 8, 2025, 2:53 PM

#

Man i cannot imagine google not beating gpt5. They just have to with gemini 3. If not they fall behind a lot

zealous panther Aug 8, 2025, 2:53 PM

#

80 gpt 5 prompts every 3 hours

maiden fulcrum Aug 8, 2025, 2:53 PM

#

zealous panther Yeah no

Then how to use ChatGPT-5 Pro

zealous panther Aug 8, 2025, 2:53 PM

#

If you use more than that

#

Then sure

zealous panther Aug 8, 2025, 2:53 PM

#

maiden fulcrum Then how to use ChatGPT-5 Pro

You cant without a pro sub

#

But it's not worth it generally…

#

It's a big plan for a minimal increase in performance

zealous panther Aug 8, 2025, 2:54 PM

#

solid brook Man i cannot imagine google not beating gpt5. They just have to with gemini 3. I...

There's only one possibility, and that's if the AI peak has been reached

maiden fulcrum Aug 8, 2025, 2:54 PM

#

zealous panther You cant without a pro sub

I've got ChatGPT-5 and ChatGPT-5 Thinking in Plus plan

zealous panther Aug 8, 2025, 2:54 PM

#

Like it will just slow down

zealous panther Aug 8, 2025, 2:54 PM

#

maiden fulcrum I've got ChatGPT-5 and ChatGPT-5 Thinking in Plus plan

That's good enough mate

solid brook Aug 8, 2025, 2:54 PM

#

zealous panther Theres only 1 possibility and thats if the ai peak has reached

I doubt that

zealous panther Aug 8, 2025, 2:54 PM

#

Pro has like a 4-5% increase

zealous panther Aug 8, 2025, 2:54 PM

#

solid brook I doubt that

Idk

#

Anything can happen really

maiden fulcrum Aug 8, 2025, 2:55 PM

#

zealous panther Thats good enouh mate

I wanna know where can I use ChatGPT-5 (high)?

zealous panther Aug 8, 2025, 2:55 PM

#

But still though i felt like gpt 5 was focusing on web dev a lot

zealous panther Aug 8, 2025, 2:55 PM

#

maiden fulcrum I wanna know where can I use ChatGPT-5 (high)?

Api

#

I think

#

Im not sure

stray aspen Aug 8, 2025, 2:55 PM

#

maiden fulcrum I wanna know where can I use ChatGPT-5 (high)?

in yuppp.ai

#

yupp.ai

obtuse heart Aug 8, 2025, 2:56 PM

#

stray aspen yupp.ai

freee?

maiden fulcrum Aug 8, 2025, 2:56 PM

#

zealous panther Api

Isn't it the same as ChatGPT-5 Thinking?

stray aspen Aug 8, 2025, 2:56 PM

#

anguilla must be getting rich from the AI domain

stray aspen Aug 8, 2025, 2:56 PM

#

obtuse heart freee?

yes

zealous panther Aug 8, 2025, 2:56 PM

#

maiden fulcrum Isn't it the same as ChatGPT-5 Thinking?

I think you can use high there ? Some dude said chatgpt thinking is locked at med

#

In app

leaden sun Aug 8, 2025, 2:56 PM

#

obtuse heart if gemini 3 releases with a lower price range than claude right now and pretty m...

If anthropic keeps being obsessed with alignment crap then they’re definitely doomed, instead of obsessing with alignment they should focus on their competitive edges, expand them and scale while keeping the costs fair and reasonable…

maybe they know what they’re doing? 🥺

patent aspen Aug 8, 2025, 2:56 PM

#

I think OAI was wise to deprecate all of the old models. If they didn't, their capacity crunch would be much worse

brisk helm Aug 8, 2025, 2:57 PM

#

is yupp better than lmarena

maiden fulcrum Aug 8, 2025, 2:57 PM

#

zealous panther I think you can use high there ? Some dude said chatgpt thinking is locked at me...

where "there"?

zealous panther Aug 8, 2025, 2:57 PM

#

maiden fulcrum where "there"?

On the api mate

stray aspen Aug 8, 2025, 2:57 PM

#

brisk helm is yupp better than lmarena

no its limited

#

but it has a lot of models

#

and you need google signup

zealous panther Aug 8, 2025, 2:58 PM

#

I think people don't take the response time of GPT into account

maiden fulcrum Aug 8, 2025, 2:58 PM

#

zealous panther On the api mate

Link?

zealous panther Aug 8, 2025, 2:58 PM

#

Or the fact that they improved the webdev score by like 200

zealous panther Aug 8, 2025, 2:59 PM

#

maiden fulcrum Link?

https://platform.openai.com/docs/models/gpt-5

#

I think openai definitely doomed itself by not focusing on the webdev enough…its so good

#

I would say for like a simple task it would be faster by 10 seconds or more?

#

O3 always thinks very hard

#

Depends on the prompt

#

If its a thinking prompt

#

Relatively the same time

#

If its just a general prompt

#

Gpt 5 can be faster by a lot

eternal niche Aug 8, 2025, 3:00 PM

#

gemini 2.5 pro better anyway

zealous panther Aug 8, 2025, 3:01 PM

#

I mean i can test rn if you want me to test a prompt. I have gpt 5 on my ipad and o3 on phone

zealous panther Aug 8, 2025, 3:01 PM

#

eternal niche gemini 2.5 pro better anyway

Its not

eternal niche Aug 8, 2025, 3:01 PM

#

it is

eternal niche Aug 8, 2025, 3:01 PM

#

zealous panther Its not

stray aspen Aug 8, 2025, 3:01 PM

#

gpt-5 is greater than gemini

zealous panther Aug 8, 2025, 3:01 PM

#

Yeah? If you don't use webdev, that is

eternal niche Aug 8, 2025, 3:01 PM

#

stray aspen gpt-5 is greater than gemini

nooo

#

brother

#

you betrayed me

zealous panther Aug 8, 2025, 3:01 PM

#

Webdev is a HUGEEEE JUMP

#

200+

zealous panther Aug 8, 2025, 3:01 PM

#

eternal niche

There's a reason style control exists

#

Read the papers