#general | Arena | Page 94

ocean vortex Aug 10, 2025, 9:38 PM

#

lol what

#

DId you just make this up for no reason?

#

GPT5 is significantly smaller than og GPT4 and their infra is much better now. They are doing just fine

#

They also deprecated gpt4.5 for good so probably doing better than ever on compute tbh

stray aspen Aug 10, 2025, 9:42 PM

#

api or aistudio

ocean vortex Aug 10, 2025, 9:44 PM

#

Define 'clearly'. If we look at all the tests and same prompt testing, gpt5 is the clear winner lol

ripe mountain Aug 10, 2025, 9:44 PM

#

ocean vortex They also deprecated gpt4.5 for good so probably doing better than ever on compu...

how is the discontinuation of gpt 4.5 related?

blazing bison Aug 10, 2025, 9:44 PM

#

ocean vortex They also deprecated gpt4.5 for good so probably doing better than ever on compu...

sama said that on twitter and on ama

#

what you're talking about?

#

even on the podcast he said that they gonna need to cut something

ocean vortex Aug 10, 2025, 9:44 PM

#

ripe mountain how is the discontinuation of gpt 4.5 related?

gpt4.5 was their biggest model ever. Since that is dicontinued they now have much more compute to spare

autumn cargo Aug 10, 2025, 9:45 PM

#

stray aspen api or aistudio

I use api with cline + gemini pro (app)

blazing bison Aug 10, 2025, 9:45 PM

#

ocean vortex gpt4.5 was their biggest model ever. Since that is dicontinued they now have muc...

???????????

#

source?

ocean vortex Aug 10, 2025, 9:45 PM

#

???????????????

ocean vortex Aug 10, 2025, 9:45 PM

#

blazing bison source?

For what?

#

that 4.5 is their biggest model?

blazing bison Aug 10, 2025, 9:45 PM

#

no

ocean vortex Aug 10, 2025, 9:46 PM

#

Do you seriously need a source for that?

autumn cargo Aug 10, 2025, 9:46 PM

#

ocean vortex Define 'clearly'. If we look at all the tests and same prompt testing, gpt5 is t...

Subjectively. But even objectively speaking, the win-rate against 2.5 pro is very low on lmarena.

blazing bison Aug 10, 2025, 9:46 PM

#

that they not gonna cut things bcs they now have compute to spare

#

lmao

#

do you know that 4.5 was using training resources right?

#

And now with 4.5 disabled they gonna use it on gpt-6 related work

#

not inference

ocean vortex Aug 10, 2025, 9:46 PM

#

blazing bison that they not gonna cut things bcs they now have compute to spare

Read again what I wrote lol

#

And then dispute what I wrote

blazing bison Aug 10, 2025, 9:47 PM

#

#

bro is blind

#

well, just wait 2 days and when he announce sora or other shi* being cut we discuss again

ocean vortex Aug 10, 2025, 9:47 PM

#

that is nothing new, they are saying the same things all the time

#

that's how they stay in business... smh

#

They are always looking for ways to cut costs

blazing bison Aug 10, 2025, 9:48 PM

#

it's not cut costs bro

ocean vortex Aug 10, 2025, 9:48 PM

#

and this is one of the means of justifying that

blazing bison Aug 10, 2025, 9:48 PM

#

they don't have enough compute

#

lmao

#

sam is crazying traveling the world to get 500 bi and bro is saying that he has compute but don't want to use to cut costs

#

are you insane or what

ocean vortex Aug 10, 2025, 9:49 PM

#

No one does if you read everything literally. You really shouldn't take this for face value catgrin

#

They have plenty of it, the only question is the price/cost

blazing bison Aug 10, 2025, 9:49 PM

#

lol

ocean vortex Aug 10, 2025, 9:50 PM

#

smh

whole wagon Aug 10, 2025, 9:50 PM

#

It has always been the case openAI are deploying compute far too slow. The stargate project is planning to have 64k GPUs by the end of the year

#

That is slow compared to other companies

wet sparrow Aug 10, 2025, 9:51 PM

#

ocean vortex smh

Either you don't follow the news, or you're a conspiracy theorist

whole wagon Aug 10, 2025, 9:51 PM

#

https://www.datacenterdynamics.com/en/news/openai-and-oracle-to-deploy-64000-gb200-gpus-at-stargate-abilene-data-center-by-2026-report/ nevermind end of 2026 is when they expected 64k

OpenAI and Oracle to deploy 64,000 GB200 GPUs at Stargate Abilene d...

Chips will be installed in phases, with 16,000 expected to be in place by end of summer 2024

ocean vortex Aug 10, 2025, 9:51 PM

#

They have their priorities on other things - like training new models. They also won't accept the higher cost than what they feel comfortable with - that's the extent of them struggling for compute

golden ocean Aug 10, 2025, 9:52 PM

#

Bro is gpt 4o -

whole wagon Aug 10, 2025, 9:52 PM

#

OpenAI and Oracle are expected to deploy 64,000 Nvidia GB200s at the Stargate data center in Abilene, Texas by the end of 2026.

whole wagon Aug 10, 2025, 9:52 PM

#

whole wagon ```OpenAI and Oracle are expected to deploy 64,000 Nvidia GB200s at the Stargate...

This is basically pathetic rate lol

ocean vortex Aug 10, 2025, 9:52 PM

#

yeah they are expanding all the time and scaling, obviously

ocean vortex Aug 10, 2025, 9:53 PM

#

whole wagon This is basically pathetic rate lol

They have azure though

#

If they wanted, their compute is 'unlimited', but cost...

#

that's a different question

whole wagon Aug 10, 2025, 9:53 PM

#

That's their main stargate computer. Meant for training

wet sparrow Aug 10, 2025, 9:53 PM

#

ocean vortex They have their priorities on other things - like training new models. They also...

It's not about priorities. It's about not having enough gpus for both

ocean vortex Aug 10, 2025, 9:54 PM

#

Most of these tweets are meant for marketing. And to justify any potential ratelimits. They are not to be taken for face value

wet sparrow Aug 10, 2025, 9:54 PM

#

Yes, you're a conspiracy theorist. There's no point in discussing it

blazing bison Aug 10, 2025, 9:55 PM

#

Just let him yap

ocean vortex Aug 10, 2025, 9:55 PM

#

I'm literally not. lmfao

#

I'm just not naive and know how this works

blazing bison Aug 10, 2025, 9:55 PM

#

yeah bro

#

obviusly

ocean vortex Aug 10, 2025, 9:55 PM

#

???

#

are you actually dumb?

wet sparrow Aug 10, 2025, 9:56 PM

#

I know just enough about ML to know that he's telling the truth

ocean vortex Aug 10, 2025, 9:56 PM

#

Or do you seriously think them making their models smaller and cutting models from model switcher + modernizing their infra... put them at bigger strain than before? 🤣

#

LOL

wet sparrow Aug 10, 2025, 9:57 PM

#

The strain remains the same; they can simply serve more people without removing resources from research

#

ChatGPT is growing every week

twilit valve Aug 10, 2025, 9:58 PM

#

hello

ocean vortex Aug 10, 2025, 9:58 PM

#

wet sparrow The strain remains the same; they can simply serve more people without removing ...

Incorrect, the strain goes down. But obviously you can still sell it as having not enough compute, there's never gonna be enough of it technically...

wet sparrow Aug 10, 2025, 9:59 PM

#

they use the free power for research

#

So, it's not going down

ocean vortex Aug 10, 2025, 9:59 PM

#

They used it for research too since the start

wet sparrow Aug 10, 2025, 10:00 PM

#

Bro, you can't have GPUs training models and doing inference at the same time

ocean vortex Aug 10, 2025, 10:00 PM

#

But they also had to host bigger models and deal with older gpus

ocean vortex Aug 10, 2025, 10:00 PM

#

wet sparrow Bro, you can't have GPUs training models and doing inference at the same time

obviously

echo aurora Aug 10, 2025, 10:00 PM

#

poll_question_text

What version do you use the most?

victor_answer_votes

10

total_votes

22

victor_answer_id

3

victor_answer_text

Direct

wet sparrow Aug 10, 2025, 10:01 PM

#

Even if they have optimized their infrastructure and now have free GPUs for training models, the strain remains the same. They are going to use them to train models

#

When they say they don't have enough GPUs, that includes the GPUs they need for research, the GPUs they need for current inference, and the GPUs they will need for future inference if ChatGPT continues to grow at its current pace

ocean vortex Aug 10, 2025, 10:02 PM

#

wet sparrow Even if they have optimized their infrastructure and now have free GPUs for trai...

Not the same. At the very least the cost of upkeeping chatgpt service is lower.

ocean vortex Aug 10, 2025, 10:03 PM

#

wet sparrow When they say they don't have enough GPUs, that includes the GPUs they need for ...

Yes. Like I said this was the case since the very start. It's all relative.

ocean vortex Aug 10, 2025, 10:03 PM

#

echo aurora

Predictable result... catgrin

whole wagon Aug 10, 2025, 10:03 PM

#

The plus limit went from 200 to 3000. GPT5 Thinking basically unlimited for plus users, they will never hit the limit

#

It doesn't add up with the capacity tweet ngl

blazing bison Aug 10, 2025, 10:04 PM

#

Yes my friend tryed it it's misleading a little

#

because it's like 2600 mini thinkings and not 3000 gpt-5 thinking

whole wagon Aug 10, 2025, 10:04 PM

#

Yeah it has the router

ocean vortex Aug 10, 2025, 10:04 PM

#

whole wagon It doesn't add up with the capacity tweet ngl

Yeah precisely. If they were struggling for compute 3000 would have been literally impossible

whole wagon Aug 10, 2025, 10:05 PM

#

How much do you think the router helps. What percents are going to mini or even nano

blazing bison Aug 10, 2025, 10:05 PM

#

Not true because the o3 mini, o4 mini, gpt 4.5 and others was using those resources that they are giving gpt 5 mini thinking now

ocean vortex Aug 10, 2025, 10:06 PM

#

whole wagon How much do you think the router helps. What percents are going to mini or even ...

afaik nothing is going to mini/nano until you hit your usage caps

blazing bison Aug 10, 2025, 10:06 PM

#

that's why it's possible 3000 per week

#

the gpt 5 thinking model that is basically new o3 or o4 still 200 per week

ocean vortex Aug 10, 2025, 10:07 PM

#

blazing bison that's why it's possible 3000 per week

GPT5 default option is 99% more costly than to run just gpt4.1 tbh

#

and gpt5-thinking is o3 equivalent, in terms of cost

wet sparrow Aug 10, 2025, 10:08 PM

#

ocean vortex and gpt5-thinking is o3 equivalent, in terms of cost

Not even my friends who work at OpenAI have that information. You're incredible!

ocean vortex Aug 10, 2025, 10:08 PM

#

I mean roughly, that's just common sense...

#

doesn't take a genius, but obviously this is not official confirmed info, if you need for me to spell it out just for you?

blazing bison Aug 10, 2025, 10:09 PM

#

It just you making assumptions so

#

without real data there is no point

ocean vortex Aug 10, 2025, 10:09 PM

#

well then there's no point in this chat...?

#

lmao

blazing bison Aug 10, 2025, 10:10 PM

#

When i discuss things here, i bring tweets from researchs, papers, news

#

always based on some data

ocean vortex Aug 10, 2025, 10:10 PM

#

Then look at API pricing

#

that's your data

blazing bison Aug 10, 2025, 10:10 PM

#

you're just talking about things in your head and treating it as the truth

whole wagon Aug 10, 2025, 10:10 PM

#

wet sparrow Not even my friends who work at OpenAI have that information. You're incredible!

From what I hear. GPT5 Thinking is quite a bit cheaper to serve than o3 lol

ocean vortex Aug 10, 2025, 10:10 PM

#

blazing bison you're just talking about things in your head and treating it as the truth

#general message

whole wagon Aug 10, 2025, 10:10 PM

#

I mean look at the speed of it. That should be a clue lol

wet sparrow Aug 10, 2025, 10:11 PM

#

"From what I hear..."

ocean vortex Aug 10, 2025, 10:11 PM

#

lmfao

whole wagon Aug 10, 2025, 10:11 PM

#

ocean vortex Then look at API pricing

Bro what

wet sparrow Aug 10, 2025, 10:11 PM

#

There is public information about them using specific silicons that explains the O3 cut price and speed

whole wagon Aug 10, 2025, 10:11 PM

#

How is API pricing data for the cost

#

That's literally so dumb

ocean vortex Aug 10, 2025, 10:12 PM

#

whole wagon How is API pricing data for the cost

It's a good indication of their cost. There were certain rumors suggesting same size as o3 as well. Nothing at all to suggest it is bigger model for now though

whole wagon Aug 10, 2025, 10:13 PM

#

wet sparrow "From what I hear..."

Do your friends work at openAI as the janitors or smth? They should know that info if they even touched GPT5 training pipeline

#

It is not that hidden

wet sparrow Aug 10, 2025, 10:14 PM

#

whole wagon Do your friends work at openAI as the janitors or smth? They should know that in...

They work with the front end and ChatGPT

#

Yes, it's hidden from them. It's hidden from everyone

ocean vortex Aug 10, 2025, 10:14 PM

#

wet sparrow They work with the front end and ChatGPT

web development?

whole wagon Aug 10, 2025, 10:14 PM

#

I know a GTM director and he knows it

#

It's not hidden

#

How would they even hide something that fundamental that easily

wet sparrow Aug 10, 2025, 10:15 PM

#

Well, I'd rather believe my friends than you

whole wagon Aug 10, 2025, 10:16 PM

#

Your friends work in web dev mate

#

How tf would they know

ocean vortex Aug 10, 2025, 10:17 PM

#

yeah they would have no clue tbh

wet sparrow Aug 10, 2025, 10:17 PM

#

I would rather trust information from insiders than from random people on Discord.

ocean vortex Aug 10, 2025, 10:18 PM

#

wet sparrow I would rather trust information from insiders than from random people on Discor...

trust what exactly?

wet sparrow Aug 10, 2025, 10:18 PM

#

I'm leaving now. Bye-bye, conspiracy theorists!

ocean vortex Aug 10, 2025, 10:18 PM

#

that they "don't know"?

#

that's not some secret information you can trust or not trust

#

LOL

blazing bison Aug 10, 2025, 10:19 PM

#

Funny how everyone here has insiders friends lmao

#

You guys are funny

whole wagon Aug 10, 2025, 10:22 PM

#

It is kinda impossible to prove. Because then you literally out them how do you want me to prove it without doing that

glass gulch Aug 10, 2025, 10:25 PM

#

I feel like I know the answer but llmarena.ru is NOT you guys right?

whole wagon Aug 10, 2025, 10:25 PM

#

no kek

#

i just visited the site

#

kekw

#

#

kekw

ripe mountain Aug 10, 2025, 10:27 PM

#

whole wagon

wtf

ocean vortex Aug 10, 2025, 10:32 PM

#

glass gulch I feel like I know the answer but llmarena.ru is NOT you guys right?

very clearly not lol. Seems like some illegal copycat, probably using stolen keys as well tbh

#

steel blaze Aug 10, 2025, 10:39 PM

#

Can someone point me to research on automating the leaderboard with third LLM prompt creation and response evaluation? How close can it get to human results these days?

keen beacon Aug 10, 2025, 10:42 PM

#

What

zinc ore Aug 10, 2025, 10:42 PM

#

https://x.com/scaling01/status/1954546677185970271

Bruh

Lisan al Gaib (@scaling01)

Gemini 2.5 Pro has a 67% winrate against GPT-5 Thinking

keen beacon Aug 10, 2025, 10:42 PM

#

So most people just abuse lmarena to get top of the line models for free?

keen beacon Aug 10, 2025, 10:47 PM

#

keen beacon So most people just abuse lmarena to get top of the line models for free?

Ofc all this data is used to train models but so is your Google search input to feed you with personalised ads right?

cedar tide Aug 10, 2025, 10:52 PM

#

@echo aurora GPT 5 needs to be retested from the beginning on both arenas with the public API. Many people who had preview access say that GPT 5 was much better on that access than now via the official API.

tidal ginkgo Aug 10, 2025, 11:01 PM

#

keen beacon So most people just abuse lmarena to get top of the line models for free?

yes

keen beacon Aug 10, 2025, 11:01 PM

#

tidal ginkgo yes

dude

#

https://tenor.com/view/sayaka-madoka-magica-gif-10367661

Tenor

brisk helm Aug 10, 2025, 11:02 PM

#

keen beacon https://tenor.com/view/sayaka-madoka-magica-gif-10367661

bro lmarena is the sauce

keen beacon Aug 10, 2025, 11:02 PM

#

wdym?

#

for the training data?

brisk helm Aug 10, 2025, 11:02 PM

#

keen beacon for the training data?

no for ppl who cant afford the top of the line ai models

ocean vortex Aug 10, 2025, 11:02 PM

#

Just did some back to back testing gpt5-chat vs gpt5-reasoning (API) vs gpt5-router (chatgpt)

tidal ginkgo Aug 10, 2025, 11:03 PM

#

brisk helm no for ppl who cant afford the top of the line ai models

agree

keen beacon Aug 10, 2025, 11:03 PM

#

brisk helm no for ppl who cant afford the top of the line ai models

Actually not bad if you don't mind if your data will be used to train their models

tidal ginkgo Aug 10, 2025, 11:03 PM

#

70% of people found out about it when a major ai came out and wanted to use if 4 free

ocean vortex Aug 10, 2025, 11:03 PM

#

I would say that default model on chatgpt is halfway in between reasoning and non-reasoning variant on overall performance, but closer to the non-reasoning one

#

there are obvious gains to be had of having reasoning always on

keen beacon Aug 10, 2025, 11:03 PM

#

tidal ginkgo 70% of people found out about it when a major ai came out and wanted to use if 4...

Which one major AI?

tidal ginkgo Aug 10, 2025, 11:04 PM

#

gpt-5, o4

#

grok 3

#

grok 4

ocean vortex Aug 10, 2025, 11:04 PM

#

And that router does not nearly always work as it is supposed to

tidal ginkgo Aug 10, 2025, 11:04 PM

#

claude opus 4

#

most of them

keen beacon Aug 10, 2025, 11:04 PM

#

Wow

#

Someone's gambling the vc money away

stray aspen Aug 10, 2025, 11:07 PM

#

which is the greatest on lmarena: gpt-5 nano,chat, gpt-5, or mini

tidal ginkgo Aug 10, 2025, 11:08 PM

#

lol

#

not here man

tidal ginkgo Aug 10, 2025, 11:09 PM

#

stray aspen which is the greatest on lmarena: gpt-5 nano,chat, gpt-5, or mini

gpt-5

stray aspen Aug 10, 2025, 11:09 PM

#

go to video arena bro

tardy crown Aug 10, 2025, 11:10 PM

#

yo
anyone know how to make the viral videos with baby speaking?
when I try, veo3 doesnt let me because it contains minors

tidal ginkgo Aug 10, 2025, 11:11 PM

#

lol

stray aspen Aug 10, 2025, 11:11 PM

#

you dont lmao

tidal ginkgo Aug 10, 2025, 11:11 PM

#

use another ai

stray aspen Aug 10, 2025, 11:11 PM

#

thats google policy

tardy crown Aug 10, 2025, 11:11 PM

#

yea but people still bypass it

tidal ginkgo Aug 10, 2025, 11:11 PM

#

well idk

tardy crown Aug 10, 2025, 11:11 PM

#

there are tons of videos

tidal ginkgo Aug 10, 2025, 11:11 PM

#

i´m no jailbreaker

tardy crown Aug 10, 2025, 11:11 PM

#

on my fyp

tidal ginkgo Aug 10, 2025, 11:11 PM

#

use another ai

jade egret Aug 10, 2025, 11:14 PM

#

zinc ore https://x.com/scaling01/status/1954546677185970271 Bruh

huh

echo aurora Aug 10, 2025, 11:20 PM

#

cedar tide <@283397944160550928> GPT 5 needs to be retested from the beginning on both aren...

Hey sorry I’ve got cold so haven’t been following chat much. Are we saying that summit is different than what’s available now?

Many people
Where are you seeing that?

Sorry again for not following, I’m a bit out of it atm

kindred solar Aug 10, 2025, 11:51 PM

#

I may be wrong, but I think the GPT-5 model needs to be handled a little differently than others, which may be why users are dissatisfied

#

So here, the model is always reasoning, unlike on the OpenAI platform, where it automatically selects the working style depending on the prompt?

#

so we test the gpt-5 model as on openai api but chatgpt model works diffrently right?

ocean vortex Aug 11, 2025, 12:41 AM

#

cedar tide <@283397944160550928> GPT 5 needs to be retested from the beginning on both aren...

I’m calling bs on that. Try testing gpt5-high against o3-high on some decent question set and you should see for yourself that it performs better. The issue is more of people using the router (default option on chatgpt) and expecting for that to perform like SOTA reasoning model. But this wouldn’t be applicable to API.

#

And it’s not like the metrics are showing night and day difference o3 to gpt5. So those improvements you CAN see kinda align with them… If it was really performing worse I don’t think you would be able to tell it’s better (than o3)

brisk helm Aug 11, 2025, 12:47 AM

#

cedar tide <@283397944160550928> GPT 5 needs to be retested from the beginning on both aren...

agreed

low python Aug 11, 2025, 12:50 AM

#

Selling ChatGPT Plus for $10 for 3 months.

misty star Aug 11, 2025, 12:59 AM

#

verbal nimbus Aug 11, 2025, 1:00 AM

#

cedar tide <@283397944160550928> GPT 5 needs to be retested from the beginning on both aren...

The leaderboard should a slider that only considers votes within the selected time period in order to generate the rankings.

vapid zinc Aug 11, 2025, 3:10 AM

#

When will the leaderboards be updated?

echo aurora Aug 11, 2025, 3:17 AM

#

vapid zinc When will the leaderboards be updated?

Normally take about a week to get the votes/validate before posting

stray aspen Aug 11, 2025, 3:19 AM

#

dude gpt-5 is incredible for lua

#

it keeps on impressing me

#

gemini couldnt one shot this

#

#

this is great

#

i love gpt-5

#

lmao

#

i mean the fact that it works on the first try

#

this is crazy

wicked root Aug 11, 2025, 3:55 AM

#

echo aurora Normally take about a week to get the votes/validate before posting

Could you tell us if there’d be a change in the rankings?

thorn vault Aug 11, 2025, 3:58 AM

#

Hy

echo aurora Aug 11, 2025, 4:34 AM

#

wicked root Could you tell us if there’d be a change in the rankings?

Yeah I’ll make an announcement when we update

wicked root Aug 11, 2025, 4:35 AM

#

echo aurora Yeah I’ll make an announcement when we update

Thank you 👍

solid brook Aug 11, 2025, 5:01 AM

#

Yeah gemini benchmark is with unnerfed version of the model. If they benchmark with the nerfed model it would go A LOT down

#

I swear to god if they do the same with gemini 3 ......

languid crescent Aug 11, 2025, 5:05 AM

#

Hey guys, I have a question

#

I've always wanted to customize the chat (the way I prompt). How do I boldened a text (like h1-h6)? Is it possible to do a strikethrough? Other methods like #, ###, **, ``, ?? How can I do that in LMarena? It makes my prompt much cleaner and clearly instructed.\

#

Is it called "Markdown Syntax"?

wicked root Aug 11, 2025, 5:33 AM

#

man I'm getting nervous about google vs openai for August

dire cosmos Aug 11, 2025, 5:35 AM

#

same here :<

dusky aurora Aug 11, 2025, 5:44 AM

#

languid crescent Is it called "Markdown Syntax"?

seems so

grand coral Aug 11, 2025, 5:53 AM

#

📢 GLM-4.5 Technical Report is here!
We’re pulling back the curtain on how GLM-4.5 was built to excel at reasoning, coding, and agentic tasks — powered by a unique multi-stage training paradigm.

🔍 Highlights:
• Expert model iteration + self-distillation ➡️ unify reasoning, agentic, and general chat into one model
• Hybrid reasoning mode ➡️ knows when to think deeply, when to respond instantly
• Difficulty-based RL curriculum ➡️ break through performance plateaus
• Efficient function calling ➡️ more reliable tool use for code-heavy tasks

📄 Read the report: https://arxiv.org/abs/2508.06471

💬 Let us know your thoughts in the thread!

arXiv.org

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

We present GLM-4.5, an open-source Mixture-of-Experts (MoE) large language model with 355B total parameters and 32B activated parameters, featuring a hybrid reasoning method that supports both thinking and direct response modes. Through multi-stage training on 23T tokens and comprehensive post-training with expert model iteration and reinforceme...

worldly inlet Aug 11, 2025, 6:19 AM

#

S

cedar tide Aug 11, 2025, 6:37 AM

#

Yes with all the models.

hallow ridge Aug 11, 2025, 6:43 AM

#

I need help

cedar tide Aug 11, 2025, 6:44 AM

#

echo aurora Hey sorry I’ve got cold so haven’t been following chat much. Are we saying that ...

"where": on Twitter, influencers who had early access to GPT 5 They tried the same prompts after the release and the result was very far from the preview.

I'm not sure if he's right or not., but I don't think we can trust any company, we don't know if summit has not been quantized or anything for production

hallow ridge Aug 11, 2025, 6:44 AM

#

I need to get my crypto back

delicate rapids Aug 11, 2025, 6:45 AM

#

emoji_138

harsh flume Aug 11, 2025, 6:53 AM

#

Did openAI also discontinue the DeepResearch toggle button?

delicate rapids Aug 11, 2025, 6:54 AM

#

harsh flume Did openAI also discontinue the DeepResearch toggle button?

🐼

hallow ridge Aug 11, 2025, 6:55 AM

#

Hey anyone in the dark net or dark web

#

???

ocean laurel Aug 11, 2025, 7:00 AM

#

Hello

harsh flume Aug 11, 2025, 7:21 AM

#

Yea, I tried a Deep Research prompt at chatgpt just to check if it'd maybe understand that it's a deepresearch query and work accordingly, but it doesnt work

#

Bummer, they could have kept it as a function to use on their new gpt

#

If anyone here uses their API and can confirm weather or not its still avaliable as a API tool call or something id appreciate

solid brook Aug 11, 2025, 7:26 AM

#

wicked root man I'm getting nervous about google vs openai for August

Why do you think gemini 3 releases in august

wicked root Aug 11, 2025, 7:26 AM

#

solid brook Why do you think gemini 3 releases in august

I don't. I'm betting big on Gemini 2.5 retaining its title against GPT5 this month.

solid brook Aug 11, 2025, 7:27 AM

#

wicked root I don't. I'm betting big on Gemini 2.5 retaining its title against GPT5 this mon...

Bruh. Dude gpt 5 high is better than gemini 2.5 pro in every way

wicked root Aug 11, 2025, 7:28 AM

#

Without style control?

solid brook Aug 11, 2025, 7:28 AM

#

Uhm

#

Idk what is stylencontrol?

wicked root Aug 11, 2025, 7:30 AM

#

less emotes and stuff

#

you can turn it off on the lmarena website

solid brook Aug 11, 2025, 7:31 AM

#

Oh

#

Btw the gemini 2.5 in benchmarks is the unnerfed one

delicate rapids Aug 11, 2025, 7:31 AM

#

ST_dissappearwaithowdoyouspell

solid brook Aug 11, 2025, 7:31 AM

#

Not the model we have access to

#

And

#

Gemini 2.5 benchmaxed

#

But gpt 5 benchmarks are real thing

wicked root Aug 11, 2025, 7:33 AM

#

solid brook Btw the gemini 2.5 in benchmarks is the unnerfed one

you mean my 2.5 is nerfed?

solid brook Aug 11, 2025, 7:34 AM

#

wicked root you mean my 2.5 is nerfed?

Yes pretty obvious if you used gemini 2.5 on its first month

#

Go reddit

#

They talk a lot about it

wicked root Aug 11, 2025, 7:34 AM

#

So do you think GPT will beat Gemini this month?

solid brook Aug 11, 2025, 7:35 AM

#

Yes if gemini 3 does not come out

wicked root Aug 11, 2025, 7:35 AM

#

solid brook Aug 11, 2025, 7:35 AM

#

wicked root

Uhm so?

#

This is one benchmark

cedar tide Aug 11, 2025, 7:35 AM

#

@echo aurora
We want to know where the chatgpt router would arrive in the ranking. And if you add it with open ai, be sure that it is exactly the same version as the one available on chatgpt.
(that of the plus users and not pro )

wicked root Aug 11, 2025, 7:35 AM

#

but it's the most important benchmark for me lol

solid brook Aug 11, 2025, 7:36 AM

#

wicked root but it's the most important benchmark for me lol

Dude on the model ranking gpt 5 is number 1. Idc if you want to belive the lie okay then

neon idol Aug 11, 2025, 7:37 AM

#

Is better gpt 5 or gpt 5 pro?

solid brook Aug 11, 2025, 7:37 AM

#

Gpt pro lol

wicked root Aug 11, 2025, 7:37 AM

#

solid brook Dude on the model ranking gpt 5 is number 1. Idc if you want to belive the lie o...

it's not a lie. I'm talking about polymarket.

neon idol Aug 11, 2025, 7:37 AM

#

solid brook Gpt pro lol

Thx

solid brook Aug 11, 2025, 7:37 AM

#

wicked root it's not a lie. I'm talking about polymarket.

Polymarket expects gemini 3 to release

#

Dude the benchmarks are out

wicked root Aug 11, 2025, 7:38 AM

#

Not in august

solid brook Aug 11, 2025, 7:38 AM

#

Gpt 5 already beat gemini 2.5

neon idol Aug 11, 2025, 7:38 AM

#

wicked root Not in august

In septeber

wicked root Aug 11, 2025, 7:38 AM

#

What happens in sept is a non-issue for me.

solid brook Aug 11, 2025, 7:39 AM

#

wicked root Not in august

Not any point in talking to you. Go to youtube watch a good comparison vid of the 2 models

cedar tide Aug 11, 2025, 7:39 AM

#

If you interested you can Upvote my request
https://discord.com/channels/1340554757349179412/1404368537690308739

verbal nimbus Aug 11, 2025, 7:51 AM

#

#

It didn't even beat gpt-4o here.

solid brook Aug 11, 2025, 7:52 AM

#

verbal nimbus It didn't even beat gpt-4o here.

Yeah man you're right.....

vague bloom Aug 11, 2025, 7:55 AM

#

Yo guys is there any way to connect LMArena With R Studio

solid brook Aug 11, 2025, 7:55 AM

#

vague bloom Yo guys is there any way to connect LMArena With R Studio

What is r studio

terse river Aug 11, 2025, 7:55 AM

#

R language related?

#

i.. guess?

vague bloom Aug 11, 2025, 7:57 AM

#

solid brook What is r studio

Boblox

#

The R word is blocked for some reson

#

Idk why

terse river Aug 11, 2025, 7:57 AM

#

oh

solid brook Aug 11, 2025, 7:58 AM

#

vague bloom The R word is blocked for some reson

No it doesn't provide any api

vague bloom Aug 11, 2025, 8:00 AM

#

solid brook No it doesn't provide any api

Thx

lime coral Aug 11, 2025, 8:11 AM

#

solid brook Gpt 5 already beat gemini 2.5

No

solid brook Aug 11, 2025, 8:24 AM

#

lime coral No

Ok

tame granite Aug 11, 2025, 8:24 AM

#

just make point system and see what is happening 😄

solid brook Aug 11, 2025, 8:24 AM

#

tame granite just make point system and see what is happening 😄

?

tame granite Aug 11, 2025, 8:25 AM

#

solid brook ?

viscid timber Aug 11, 2025, 8:37 AM

#

where do i generate images? in video arena?

potent snow Aug 11, 2025, 8:57 AM

#

Is it somehow possible to implment midjourny?

hollow imp Aug 11, 2025, 9:12 AM

#

Are y'all really sure gpt5 is better than o3?

#

Even if it's better it would be the higher tier version
Us free tier users gpt5 isn't better than lmarena o3

#

@deep adder answer me gpt5 pfp

hardy pecan Aug 11, 2025, 9:21 AM

#

verbal nimbus

literally no one reads how this is scored, no due diligence, just copy paste misinformed charts

#

its a bit wild

mortal coyote Aug 11, 2025, 9:33 AM

#

how can i retrieve the seed code of the image i generated using Lmarena

inner gate Aug 11, 2025, 10:17 AM

#

verbal nimbus

What’s offline IQ? What does it mean

tribal aspen Aug 11, 2025, 10:19 AM

#

Y'all agree?

#

#1403773627010388161

keen beacon Aug 11, 2025, 10:21 AM

#

tribal aspen Y'all agree?

Reminds me of Perplexity for some reason or ChatGPT Ui in general

#

Doesn't mean it's a bad thing though

tribal aspen Aug 11, 2025, 10:22 AM

#

keen beacon Reminds me of Perplexity for some reason or ChatGPT Ui in general

Yeahhh

tribal aspen Aug 11, 2025, 10:22 AM

#

keen beacon Doesn't mean it's a bad thing though

But feels much better than LMArena's UI

#

If we have such a UI, I would happily use noting other than LMArena lol

keen beacon Aug 11, 2025, 10:23 AM

#

tribal aspen But feels much better than LMArena's UI

I have a problem with deleted convos coming back for some reason even if I delete them

#

on LMArena

#

I need to put it on #1343291835845578853

tribal aspen Aug 11, 2025, 10:23 AM

#

That can be dangerous sometimes

#

LoL

keen beacon Aug 11, 2025, 10:23 AM

#

tribal aspen That can be dangerous sometimes

Even if my convos leaked to others, they would be about general tests and brainstorming

#

and translation

#

basic stuff

#

no sensitive data lol

verbal nimbus Aug 11, 2025, 10:27 AM

#

inner gate What’s offline IQ? What does it mean

It means that the dataset is private

lusty narwhal Aug 11, 2025, 10:28 AM

#

hi

inner gate Aug 11, 2025, 10:28 AM

#

Ah yes datasets

inner gate Aug 11, 2025, 10:28 AM

#

lusty narwhal hi

Hey joel 👋

verbal nimbus Aug 11, 2025, 10:29 AM

#

hardy pecan literally no one reads how this is scored, no due diligence, just copy paste mis...

It's a very well known benchmark. Maybe this graphic is more familiar:

keen beacon Aug 11, 2025, 10:29 AM

#

verbal nimbus It's a very well known benchmark. Maybe this graphic is more familiar:

Is there a reason for the IQ being so low?

hardy pecan Aug 11, 2025, 10:29 AM

#

verbal nimbus It's a very well known benchmark. Maybe this graphic is more familiar:

Yes, I'm well aware. Do you know how they grade the results? The methodology?

keen beacon Aug 11, 2025, 10:29 AM

#

Concrete reason?

inner gate Aug 11, 2025, 10:31 AM

#

Does Gemini 2.5 pro have a higher version of it?

verbal nimbus Aug 11, 2025, 10:31 AM

#

hardy pecan Yes, I'm well aware. Do you know how they grade the results? The methodology?

I presume they just grade it similarly to how they'd grade a human. These IQ tests are typically pretty straightforward to mark (multiple choice, unscrambling a word, etc.). The test set is private though, that's why it's considered "offline".

inner gate Aug 11, 2025, 10:31 AM

#

Someone said they purposely dumbed down gpt5

#

And will

#

Increase later on

verbal nimbus Aug 11, 2025, 10:32 AM

#

keen beacon Is there a reason for the IQ being so low?

Sam Altman's excuse was a "router issue" (unrelated tweet). But official API status page didn't report it.

keen beacon Aug 11, 2025, 10:33 AM

#

verbal nimbus Sam Altman's excuse was a "router issue" (unrelated tweet). But official API sta...

I see.

hardy pecan Aug 11, 2025, 10:33 AM

#

verbal nimbus I presume they just grade it similarly to how they'd grade a human. These IQ tes...

Yes sure, but it's an average of the last 7 sets as stated, as you can see there is only a few results so far. Look at GPT5 , 116 and 70 iq, with an average of 93 hence that data, but there is 5 more data points they'll need to give a statistically meaningful result

verbal nimbus Aug 11, 2025, 10:33 AM

#

But I do think it's partly caused by unreliable routing. It might have underestimated the difficulty and routed it to a weaker model.

hardy pecan Aug 11, 2025, 10:33 AM

#

If it hits 116, I can assume it'll hit around there on average once all the testing is done

#

Have to wait for all the testing to be completed first,

verbal nimbus Aug 11, 2025, 10:33 AM

#

I don't think the router is very reliable.

verbal nimbus Aug 11, 2025, 10:35 AM

#

hardy pecan Yes sure, but it's an average of the last 7 sets as stated, as you can see there...

The 115 is on the MENSA (public) test, if that's what you mean by 116.

hardy pecan Aug 11, 2025, 10:36 AM

#

verbal nimbus The 115 is on the MENSA (public) test, if that's what you mean by 116.

#

This is what I mean.

#

More data is needed to smooth out the variance

#

But yes the offline is better

#

I mean mensa.no

verbal nimbus Aug 11, 2025, 10:38 AM

#

hardy pecan

Interesting, it increased for GPT-5 but the Thinking model was still worse on the second test.

hardy pecan Aug 11, 2025, 10:38 AM

#

Yes I'm confused with the thinking result too , I assume it to be much better

ocean vortex Aug 11, 2025, 10:38 AM

#

hollow imp Are y'all really sure gpt5 is better than o3?

we know for sure that gpt5-high is better than o3-high

hardy pecan Aug 11, 2025, 10:38 AM

#

I think it'll need to be tested more to smooth out variance

verbal nimbus Aug 11, 2025, 10:38 AM

#

Actually, look at o3 Pro's score too. Did it get nerfed?

hardy pecan Aug 11, 2025, 10:39 AM

#

I expect thinking to do far better that regular gpt5

verbal nimbus Aug 11, 2025, 10:39 AM

#

IQ coincidentally drops by 45% right after GPT-5's release?

ocean vortex Aug 11, 2025, 10:40 AM

#

hardy pecan I expect thinking to do far better that regular gpt5

that offline iq test is very weird. Thinking version did even worse iirc

hardy pecan Aug 11, 2025, 10:40 AM

#

ocean vortex that offline iq test is very weird. Thinking version did even worse iirc

Certainly surprising

ocean vortex Aug 11, 2025, 10:44 AM

#

and why is o3-pro lower than o3...

keen beacon Aug 11, 2025, 10:47 AM

#

verbal nimbus IQ coincidentally drops by 45% right after GPT-5's release?

Oooooof

willow grail Aug 11, 2025, 10:52 AM

#

verbal nimbus

that is gpt5 low

#

gpt5 high is the best ai ever

fleet lintel Aug 11, 2025, 10:53 AM

#

When I use chatgpt app, I get GPT low/mid or high? how do I figure this out?

willow grail Aug 11, 2025, 10:53 AM

#

fleet lintel When I use chatgpt app, I get GPT low/mid or high? how do I figure this out?

wont tell ya.

#

its my computation. only for me.

#

and the other higher ups

#

we need to start gatekeep this more often MY KINGS AND QUEENS

#

THE COMPUTATION IS OURS ONLY

fleet lintel Aug 11, 2025, 10:55 AM

#

willow grail wont tell ya.

umm... I have feeling that most of my queries are going to low or mid. I am getting better responses from gemini model 🙁
How do I force high ??

willow grail Aug 11, 2025, 10:55 AM

#

fleet lintel umm... I have feeling that most of my queries are going to low or mid. I am ge...

ok try lmarena

#

just to see if the reply there is diff.

#

gentle breach Aug 11, 2025, 10:57 AM

#

hey

willow grail Aug 11, 2025, 10:59 AM

#

fleet lintel umm... I have feeling that most of my queries are going to low or mid. I am ge...

and?

ocean vortex Aug 11, 2025, 11:01 AM

#

fleet lintel umm... I have feeling that most of my queries are going to low or mid. I am ge...

if you use gpt5-thinking rather than "gpt5" and get routed to thinking, the reasoning effort is gonna be higher

inner gate Aug 11, 2025, 11:02 AM

#

hollow imp Are y'all really sure gpt5 is better than o3?

Apparently they dumbed gpt5 down for a while idk why tho or if it’s evej true

#

Even

ocean vortex Aug 11, 2025, 11:04 AM

#

it's like..

gpt5 = gpt5-minimal/low
gpt5-thinking = gpt5-medium

neon idol Aug 11, 2025, 11:05 AM

#

ocean vortex it's like.. gpt5 = gpt5-minimal/low gpt5-thinking = gpt5-medium

Gpt 5 pro= gpt 5 high

ocean vortex Aug 11, 2025, 11:06 AM

#

neon idol Gpt 5 pro= gpt 5 high

Pro is parallel requests though

#

so it would still be more capable even with matched reasoning effort

neon idol Aug 11, 2025, 11:07 AM

#

ocean vortex so it would still be more capable even with matched reasoning effort

So he is the best gpt 5 model?

ocean vortex Aug 11, 2025, 11:08 AM

#

but yeah it could be high... For Pro sub maybe even normal gpt5-thinking is 'high', unsure

ocean vortex Aug 11, 2025, 11:08 AM

#

neon idol So he is the best gpt 5 model?

Yes pro is. But it's very expensive

neon idol Aug 11, 2025, 11:09 AM

#

ocean vortex but yeah it could be high... For Pro sub maybe even normal gpt5-thinking is 'hig...

Bro do you have a prompt for test ai?

neon idol Aug 11, 2025, 11:10 AM

#

ocean vortex Yes pro is. But it's very expensive

Bro just use it for free and unlimited on Genspark AI

verbal nimbus Aug 11, 2025, 11:20 AM

#

willow grail that is gpt5 low

It said "deeper reasoning" if you click on the thinking model bubble. No idea what mode that is

solid brook Aug 11, 2025, 11:20 AM

#

neon idol Gpt 5 pro= gpt 5 high

No gpt 5 pro reasons almost double the time

willow grail Aug 11, 2025, 11:22 AM

#

verbal nimbus It said "deeper reasoning" if you click on the thinking model bubble. No idea wh...

no it does not.

solid brook Aug 11, 2025, 11:22 AM

#

verbal nimbus

#

I mean

#

.....

verbal nimbus Aug 11, 2025, 11:23 AM

#

willow grail no it does not.

willow grail Aug 11, 2025, 11:23 AM

#

verbal nimbus

thats fake.

#

gpt5 is best model if you have done ur research

solid brook Aug 11, 2025, 11:23 AM

#

verbal nimbus

Can you stop and actually use the model?

willow grail Aug 11, 2025, 11:24 AM

#

verbal nimbus

you are being pranked.

verbal nimbus Aug 11, 2025, 11:24 AM

#

solid brook Can you stop and actually use the model?

I have on ChatGPT, it's very odd there

#

I asked it to summarize the chat but it added in a bunch of random stuff that was never mentioned

solid brook Aug 11, 2025, 11:25 AM

#

verbal nimbus I have on ChatGPT, it's very odd there

Gpt 5 pro is best if you dont have pro plan use gpt 5 on lm arena and say to it think very hard

#

Because that is their top model

ocean vortex Aug 11, 2025, 11:25 AM

#

verbal nimbus

#

can we put this to bed now?

autumn cargo Aug 11, 2025, 11:25 AM

#

Do we know whether the GPT-5 in the leaderboard is medium or high?

#

I think it should be made clear.

willow grail Aug 11, 2025, 11:26 AM

#

solid brook Gpt 5 pro is best if you dont have pro plan use gpt 5 on lm arena and say to it ...

juhm when i say think hard its worse than if i dont say how to think

verbal nimbus Aug 11, 2025, 11:26 AM

#

ocean vortex

That's closer to it, but that's not the model on ChatGPT Plus

solid brook Aug 11, 2025, 11:26 AM

#

willow grail juhm when i say think hard its worse than if i dont say how to think

How so? It has been always better for md

#

Me

willow grail Aug 11, 2025, 11:26 AM

#

solid brook How so? It has been always better for md

thats why i prefer using poe.com. u can literally pick gpt5 high there

verbal nimbus Aug 11, 2025, 11:26 AM

#

The model on ChatGPT is very odd

solid brook Aug 11, 2025, 11:27 AM

#

verbal nimbus That's closer to it, but that's not the model on ChatGPT Plus

We are comparing top model from each company. That is the top model from openai

ocean vortex Aug 11, 2025, 11:27 AM

#

verbal nimbus That's closer to it, but that's not the model on ChatGPT Plus

No this just proves that it's a sh'it test. o3-pro also scores less than o3.

autumn cargo Aug 11, 2025, 11:27 AM

#

solid brook Gpt 5 pro is best if you dont have pro plan use gpt 5 on lm arena and say to it ...

It doesn't matter if it is not gpt-5 high. The "think hard" thing is for the router to enforce gpt-5 high.

ocean vortex Aug 11, 2025, 11:28 AM

#

gpt5-pro is not even significantly better than gpt5-thinking

solid brook Aug 11, 2025, 11:28 AM

#

autumn cargo It doesn't matter if it is not gpt-5 high. The "think hard" thing is for the rou...

Bro no router in api. Also gpt 5 high is just gpt 5 on reason effort hard

verbal nimbus Aug 11, 2025, 11:28 AM

#

ocean vortex No this just proves that it's a sh'it test. o3-pro also scores less than o3.

Why would it be? All the other models are fine

solid brook Aug 11, 2025, 11:28 AM

#

Wich lmarena gpt 5 is set to high

verbal nimbus Aug 11, 2025, 11:29 AM

#

Not to mention that it tests each model multiple times, which shows changes over time.

autumn cargo Aug 11, 2025, 11:29 AM

#

solid brook Bro no router in api. Also gpt 5 high is just gpt 5 on reason effort hard

My point exactly. There is no pro, reasoning, etc in API. It's just 'minimal', 'low', 'medium', 'high'

verbal nimbus Aug 11, 2025, 11:31 AM

#

ocean vortex gpt5-pro is not even significantly better than gpt5-thinking

Is it on the API?

autumn cargo Aug 11, 2025, 11:31 AM

#

ocean vortex gpt5-pro is not even significantly better than gpt5-thinking

There is no such a thing as gpt5-pro. It's not technically correct.

ChatGPT 5 Pro: gpt-5 high
ChatGPT 5 Reasoning: gpt-5 medium
ChatGPT 5 No-Reasoning: gpt-5 chat
ChatGPT 5 Default -> Router stuff

solid brook Aug 11, 2025, 11:31 AM

#

It is sad what a world we live in. People are trashing gpt 5 just because it does not satisfy their delulu

solid brook Aug 11, 2025, 11:32 AM

#

autumn cargo There is no such a thing as gpt5-pro. It's not technically correct. ChatGPT 5 P...

I saw gpt pro reason for 11 minutes. Gpt 5 high 5 min

#

Sos they are diffrent

#

So

ocean vortex Aug 11, 2025, 11:32 AM

#

autumn cargo There is no such a thing as gpt5-pro. It's not technically correct. ChatGPT 5 P...

LOL. That's not correct at all.

#

gpt5-high is just gpt5 with high reasoning effort

verbal nimbus Aug 11, 2025, 11:33 AM

#

Ok wth is actually GPT-5 Pro

ocean vortex Aug 11, 2025, 11:33 AM

#

Pro is prompting the same model several times in parallel

#

you can also have any reasoning effort with Pro

willow grail Aug 11, 2025, 11:33 AM

#

ocean vortex Pro is prompting the same model several times in parallel

wait. source?

#

parallel compute?

verbal nimbus Aug 11, 2025, 11:33 AM

#

There's no endpoint for it

solid brook Aug 11, 2025, 11:34 AM

#

verbal nimbus Ok wth is actually GPT-5 Pro

The model in the pro plan

ocean vortex Aug 11, 2025, 11:34 AM

#

verbal nimbus There's no endpoint for it

There's no API for it yet I think but it is avail for Pro subs

verbal nimbus Aug 11, 2025, 11:34 AM

#

solid brook The model in the pro plan

Hmm, you mean they benchmarked the model manually?

solid brook Aug 11, 2025, 11:35 AM

#

verbal nimbus Hmm, you mean they benchmarked the model manually?

Yeah i mean they gad to

#

Had to

verbal nimbus Aug 11, 2025, 11:35 AM

#

It took them so long to do that for Grok

ocean vortex Aug 11, 2025, 11:35 AM

#

willow grail wait. source?

Look up grok4-heavy I think it was documented, the concept is the same here. It's nothing new.

willow grail Aug 11, 2025, 11:36 AM

#

i am biased.

#

i dont wanna read nazi stuff. not even for 100 euro

solid brook Aug 11, 2025, 11:36 AM

#

ocean vortex Look up grok4-heavy I think it was documented, the concept is the same here. It'...

Oh so you saying that gpt 5 pro is just two gpt 5 high?

ocean vortex Aug 11, 2025, 11:36 AM

#

Also gemini deep think uses similar system

ocean vortex Aug 11, 2025, 11:37 AM

#

solid brook Oh so you saying that gpt 5 pro is just two gpt 5 high?

We don't know how many instances exactly, but it's more like 10 of them tbh

verbal nimbus Aug 11, 2025, 11:37 AM

#

ocean vortex

I mean, it's nothing remarkable. That's exactly the same score as o3 without Pro.

solid brook Aug 11, 2025, 11:38 AM

#

ocean vortex We don't know how many instances exactly, but it's more like 10 of them tbh

No no way that would be expensive as hell would surpass the 200 doller pro plan price

autumn cargo Aug 11, 2025, 11:38 AM

#

ocean vortex gpt5-high is just gpt5 with high reasoning effort

Based on this pro is high

ocean vortex Aug 11, 2025, 11:38 AM

#

like cons@10 prompting except here rating of each individual response works differently. So it may choose a unique response even if it was very different from all the others

jolly kite Aug 11, 2025, 11:39 AM

#

you hello guys, is on LMArena chatgpt 5 thinking? (i searched it but coudnt find it)

solid brook Aug 11, 2025, 11:39 AM

#

jolly kite you hello guys, is on LMArena chatgpt 5 thinking? (i searched it but coudnt find...

No it is better

#

Gpt 5 thinking is gpt 5 medium

#

Lmarena gpt 5 high

verbal nimbus Aug 11, 2025, 11:39 AM

#

autumn cargo Based on this pro is high

The name has been changed

jolly kite Aug 11, 2025, 11:40 AM

#

solid brook Lmarena gpt 5 high

aaaa okey thanks ❤️

verbal nimbus Aug 11, 2025, 11:40 AM

#

verbal nimbus The name has been changed

Maybe it was a typo?

#

It's just High now

willow grail Aug 11, 2025, 11:40 AM

#

REMINDER!!
POE gives you 1000 to 2000 GPT5 HIGH PROMPTS

ocean vortex Aug 11, 2025, 11:40 AM

#

autumn cargo Based on this pro is high

This is written like that because you can have "gpt5 Pro (Medium)" or Low. They explicitly selected pro model and then selected high reasoning effort.

willow grail Aug 11, 2025, 11:40 AM

#

for only 22 EURO

solid brook Aug 11, 2025, 11:40 AM

#

Guys can you go and test models and see the reason time and not spread false info?

#

For gpt 5 pro go on youtube

autumn cargo Aug 11, 2025, 11:41 AM

#

verbal nimbus The name has been changed

Yes because there is no api at the moment called gpt5-pro

solid brook Aug 11, 2025, 11:41 AM

#

Gpt 5 high go on lmarena

autumn cargo Aug 11, 2025, 11:41 AM

#

That's what I'm saying.

ocean vortex Aug 11, 2025, 11:42 AM

#

if they just wrote gpt5-pro that would be incomplete. You can't run a request through API (they probably have early access...) with pro without selecting specific reasoning effort

verbal nimbus Aug 11, 2025, 11:42 AM

#

Coding scores are a bit confusing

solid brook Aug 11, 2025, 11:42 AM

#

verbal nimbus Coding scores are a bit confusing

Agentic coding matter more i think

ocean vortex Aug 11, 2025, 11:43 AM

#

verbal nimbus Coding scores are a bit confusing

livebench category scores are far from reliable

verbal nimbus Aug 11, 2025, 11:43 AM

#

solid brook Agentic coding matter more i think

Yeah

verbal nimbus Aug 11, 2025, 11:43 AM

#

solid brook Agentic coding matter more i think

They're closer to the expected performance in Copilot

ocean vortex Aug 11, 2025, 11:44 AM

#

@autumn cargo same applies for gpt5-pro

#

you can't run it without some specific reasoning effort

verbal nimbus Aug 11, 2025, 11:44 AM

#

Anyone know the reasoning effort of GPT-5 in GitHub Copilot?

autumn cargo Aug 11, 2025, 11:44 AM

#

ocean vortex if they just wrote gpt5-pro that would be incomplete. You can't run a request th...

I don't think so. Do you have a source on that? I believe they are using gpt-5 api with "high" reasoning effort.

solid brook Aug 11, 2025, 11:44 AM

#

verbal nimbus Anyone know the reasoning effort of GPT-5 in GitHub Copilot?

Huh?

#

5.1

verbal nimbus Aug 11, 2025, 11:45 AM

#

solid brook Huh?

Typo

#

Because it's the successor of 4.1 😆

solid brook Aug 11, 2025, 11:45 AM

#

Idk they must have wrote it somewhere

ocean vortex Aug 11, 2025, 11:45 AM

#

autumn cargo I don't think so. Do you have a source on that? I believe they are using gpt-5 a...

I just showed you screenshot of o3-pro. gpt5-high is not pro, just like o3-high isn't pro

verbal nimbus Aug 11, 2025, 11:45 AM

#

verbal nimbus Because it's the successor of 4.1 😆

It's cheaper in the API

#

At least in terms of input tokens, which is more significant I think

ocean vortex Aug 11, 2025, 11:47 AM

#

You would call this o3-pro (high):

#

and this is o3-high:

verbal nimbus Aug 11, 2025, 11:48 AM

#

verbal nimbus Coding scores are a bit confusing

The score discrepancy for low is even more confusing

#

Maybe the router just decided to send those class of problems to a dumber model

autumn cargo Aug 11, 2025, 11:51 AM

#

ocean vortex I just showed you screenshot of o3-pro. gpt5-high is not pro, just like o3-high ...

Yes of course o3-pro and o3-high are two different models. I was aware of that. But based on the (now changed) livebench leaderboard and the fact that no api was officially released as gpt5-pro, I though gpt5-pro was the same gpt5 with high reasoning effort (ie gpt-5 high). Thanks for correcting me.

ocean vortex Aug 11, 2025, 11:54 AM

#

verbal nimbus The score discrepancy for low is even more confusing

gpt5-low actually looks like a very decent model to be fair...

fervent jolt Aug 11, 2025, 11:54 AM

#

Does anyone know when the leader board is going to be updated next?

ocean vortex Aug 11, 2025, 11:54 AM

#

I noticed this when I was testing chatgpt router

#

when it routed to reasoning the responses were nearly as good as you can realistically expect from a thinking model

#

even though most definitely this is low reasoning effort

bright junco Aug 11, 2025, 11:56 AM

#

Why does my gemini 2.5 pro print incompletely? Is there a way to fix it?

verbal nimbus Aug 11, 2025, 11:57 AM

#

ocean vortex gpt5-low actually looks like a very decent model to be fair...

I see it as the successor to the GPT-4.1 coding model, but the routing seems a bit unreliable to me. The main model is probably good, but sometimes it just drops the ball.

#

Like how can there be such a discrepancy in results between the two coding categories. I think the router is messing things up

ocean vortex Aug 11, 2025, 11:58 AM

#

verbal nimbus I see it as the successor to the GPT-4.1 coding model, but the routing seems a b...

gpt5-minimal is underperforming though. I think that's the main reason they don't give you choice for no reasoning at all

#

#

the gap minimal to low is insane

#

and they can't afford to lose out to gpt4.1 lol

verbal nimbus Aug 11, 2025, 11:59 AM

#

ocean vortex and they can't afford to lose out to gpt4.1 lol

I think it's affecting the routing strategy

#

I don't think the gap is just because of "less thinking"

keen beacon Aug 11, 2025, 12:01 PM

#

gpt 5 base is just bad

ocean vortex Aug 11, 2025, 12:01 PM

#

it is not less, it is quite literally no thinking at all with "minimal"

#

it outputs 2 times less than gpt4.1

verbal nimbus Aug 11, 2025, 12:02 PM

#

ocean vortex it outputs 2 times less than gpt4.1

Yikes

#

Very weird

keen beacon Aug 11, 2025, 12:02 PM

#

horizon alpha version of gpt 5 base (juice 0) has a worse gpqa diamond score compared to gpt 4.1 nano

verbal nimbus Aug 11, 2025, 12:02 PM

#

I hope that's not the version in Copilot. It'll be a downgrade.

harsh flume Aug 11, 2025, 12:03 PM

#

I havent played the arena in a week. Any cool anonym model right now?

keen beacon Aug 11, 2025, 12:03 PM

#

horizon beta (juice 5, likely an early version of minimal) did worse than gpt 4.1 mini on gpqa diamond

verbal nimbus Aug 11, 2025, 12:04 PM

#

keen beacon horizon beta (juice 5, likely an early version of minimal) did worse than gpt 4....

Or did gpt-4.1-mini have data contamination

keen beacon Aug 11, 2025, 12:05 PM

#

openai didnt mention the horizon models in their announcement because everyone thought they were nano or mini models 😭

ocean vortex Aug 11, 2025, 12:05 PM

#

keen beacon gpt 5 base is just bad

I don't think it's bad. It's actually impressive the gains they were able to make with spatial reasoning. It's just that it's hard to make a hybrid model which would be SOTA both when maxed out and with reasoning off.

#

gpt5-minimal score is low because it's too concise when it doesn't get to use reasoning tokens

keen beacon Aug 11, 2025, 12:07 PM

#

ocean vortex I don't think it's bad. It's actually impressive the gains they were able to mak...

they svg maxxed this model among other things. cpt after cpt 🤷 the brain damage was apparent with the horizon checkpoints

#

simpleqa was 33%...

verbal nimbus Aug 11, 2025, 12:07 PM

#

ocean vortex I don't think it's bad. It's actually impressive the gains they were able to mak...

Spatial reasoning?

keen beacon Aug 11, 2025, 12:08 PM

#

but i agree about the hybrid thinking thing lol

ocean vortex Aug 11, 2025, 12:08 PM

#

keen beacon they svg maxxed this model among other things. cpt after cpt 🤷 the brain damage...

I mean if that was true gpt5-high would suck. But it beats o3-high convincingly

keen beacon Aug 11, 2025, 12:09 PM

#

ocean vortex I mean if that was true gpt5-high would suck. But it beats o3-high convincingly

no, i agree its probably the hybrid thinking thing causing the poor performance in general for the minimal/not thinking variants

#

but they definitely focused on svg specifically among other things

#

because horizon models had poor benchmarks whilst people liked the svg from those

#

the svg thing is a tangent of mine sorry 🤣

tribal aspen Aug 11, 2025, 12:10 PM

#

what is the limit of gpt 5 on lmarena direct chat?

ocean vortex Aug 11, 2025, 12:10 PM

#

but if you look at webdev arena...

tribal aspen Aug 11, 2025, 12:10 PM

#

someone please answer

ocean vortex Aug 11, 2025, 12:10 PM

#

gpt5 is now SOTA there as well

#

o3 used to do poorly

tribal aspen Aug 11, 2025, 12:10 PM

#

tribal aspen what is the limit of gpt 5 on lmarena direct chat?

???

tribal aspen Aug 11, 2025, 12:10 PM

#

ocean vortex gpt5 is now SOTA there as well

but is very fast ruins thinking

#

1415926535 8979323846 2643383279 5028841971 6939937510
5820974944 5923078164 0628620899 8628034825 3421170679
8214808651 3282306647 0938446095 5058223172 5359408128
4811174502 8410270193 8521105559 6446229489 5493038196
4428810975 6659334461 2847564823 3786783165 2712019091
4564856692 3460348610 4543266482 1339360726 0249141273
7245870066 0631558817 4881520920 9628292540 9171536436
7892590360 0113305305 4882046652 1384146951 9415116094
3305727036 5759591953 0921861173 8193261179 3105118548
0744623799 6274956735 1885752724 8912279381 8301194912
9833673362 4406566430 8602139494 6395224737 1907021798
6094370277 0539217176 2931767523 8467481846 7669405132
0005681271 4526356082 7785771342 7577896091 7363717872
1468440901 2249534301 4654958537 1050792279 6892589235
4201995611 2129021960 8640344181 5981362977 4771309960
5187072113 4999999837 2978049951 0597317328 1609631859
5024459455 3469083026 4252230825 3344685035 2619311881
7101000313 7838752886 5875332083 8142061717 7669147303
5982534904 2875546873 1159562863 8823537875 9375195778
1857780532 1712268066 1300192787 6611195909 2164201989

keen beacon Aug 11, 2025, 12:13 PM

#

ocean vortex gpt5 is now SOTA there as well

they focused on 'big model' things with the cpt/etc it seemed, svg, web dev, etc. just found it funny they focused on svg. the benefits are definitely real though in those areas. at least with the horizon checkpoints, those models were fried except in those regards

#

my points are about different things and i clustered them together for some reason and it's confusing sorry 🤣

verbal nimbus Aug 11, 2025, 12:25 PM

#

ocean vortex gpt5 is now SOTA there as well

The minimal reasoning version is #1 on Design Arena too.

autumn cargo Aug 11, 2025, 12:26 PM

#

autumn cargo Yes of course o3-pro and o3-high are two different models. I was aware of that. ...

But again there is a discussion here https://community.openai.com/t/the-least-important-question-right-now-why-is-gpt-5-pro-not-available-in-api-at-exuberant-pricing/1339471/2 that hints that gpt-5 pro is just a maxed out gpt-5. So really unless OpenAI introduce a new model named gpt-5 pro, I'm inclined to think that gpt-5 pro doesn't exist! List of models here: https://platform.openai.com/docs/models

OpenAI Developer Community

The least important question right now: Why is GPT-5 pro not availa...

Let’s read carefully: Pro and Team tier users have access to GPT-5 Thinking Pro, which takes a bit longer to think but delivers the accuracy you need for complex tasks. Then also consider: That gives us a picture of what Pro delivers: more reasoning, additionally, more context window than Plus subscribers. gpt-5-chat-latest , the non-...

verbal nimbus Aug 11, 2025, 12:28 PM

#

Gemini and GLM 4.5 seems benchmaxxed for React + Tailwind

#

On Design Arena they drop to #9 and #10, since they can't use React

hollow imp Aug 11, 2025, 12:34 PM

#

ocean vortex Pro is parallel requests though

Give me pro or high give me somethingggggg 😭

unborn sleet Aug 11, 2025, 12:36 PM

#

can someone tell me why chat gpt doesnt host previews of their code?

hollow imp Aug 11, 2025, 12:37 PM

#

verbal nimbus On Design Arena they drop to #9 and #10, since they can't use React

Whats design arena

verbal nimbus Aug 11, 2025, 12:38 PM

#

hollow imp Whats design arena

designarena.ai

#

It's like Web Dev Arena

unborn sleet Aug 11, 2025, 12:41 PM

#

verbal nimbus It's like Web Dev Arena

is it as good as aistudio builder

hollow imp Aug 11, 2025, 12:49 PM

#

verbal nimbus designarena.ai

Is it part of lmarena?

unborn sleet Aug 11, 2025, 12:55 PM

#

hollow imp Is it part of lmarena?

yes

ocean vortex Aug 11, 2025, 12:58 PM

#

autumn cargo But again there is a discussion here https://community.openai.com/t/the-least-im...

API is just not publicly released yet.
https://cdn.openai.com/pdf/8124a3ce-ab78-4f06-96eb-49ea29ffb52f/gpt5-system-card-aug7.pdf

#

brittle tiger Aug 11, 2025, 1:27 PM

#

https://x.com/patloeber/status/1954887103105380361?t=5iJUGwzPogdcwyHCdAgGIw&s=19

Patrick Loeber (@patloeber)

#

Native image gen finally coming to better model than 2.0 flash. Crazy how long that one has been out without improvement

#

Logan also showed it editing palmer luckeys tweet

willow grail Aug 11, 2025, 1:38 PM

#

https://x.com/i/status/1954577085520806067

Erik Kirkegaard (@erik_theviking_)

@apralky How can i short every one of his positions?

solid brook Aug 11, 2025, 1:47 PM

#

tribal aspen what is the limit of gpt 5 on lmarena direct chat?

What limit?

#

There is no limit on chatting with the model i think

willow grail Aug 11, 2025, 1:48 PM

#

box before eating: 340g

#

pls weigh again after eating. thank you

neon idol Aug 11, 2025, 1:49 PM

#

brittle tiger Native image gen finally coming to better model than 2.0 flash. Crazy how long t...

What is the name?

#

And how can try it

solid brook Aug 11, 2025, 1:49 PM

#

brittle tiger https://x.com/patloeber/status/1954887103105380361?t=5iJUGwzPogdcwyHCdAgGIw&s=19

Lol he is teasing gemini 3 like sama did with gpt 5

#

The death star

whole sundial Aug 11, 2025, 1:56 PM

#

https://fixupx.com/Zai_org/status/1954898011181789431

Z.ai (@Zai_org)

Introducing GLM-4.5V: a breakthrough in open-source visual reasoning
︀︀
︀︀GLM-4.5V delivers state-of-the-art performance among open-source models in its size class, dominating across 41 benchmarks.
︀︀
︀︀Built on the GLM-4.5-Air base model, GLM-4.5V inherits proven techniques from GLM-4.1V-Thinking while achieving effective scaling through a powerful 106B-parameter MoE architecture.
︀︀
︀︀Hugging Face: huggingface.co/zai-org/GLM-4.5V
︀︀GitHub: github.com/zai-org/GLM-V
︀︀Z.ai API: docs.z.ai/guides/vlm/glm-4.5v
︀︀Try it now: chat.z.ai

**💬 1 🔁 3 ❤️ 19 👁️ 957 **

stray aspen Aug 11, 2025, 1:58 PM

#

why is gpt-5 api so forgetful

#

i cant keep track of older messages in a conversation

tribal aspen Aug 11, 2025, 2:33 PM

#

solid brook There is no limit on chatting with the model i think

Ohh

inner gate Aug 11, 2025, 2:35 PM

#

What’s ur thoights on deep seek r1

#

Thoughts

eternal niche Aug 11, 2025, 2:37 PM

#

btw guys gpt5 sucks

#

gemini 2.5 pro even better

solid brook Aug 11, 2025, 2:39 PM

#

brittle tiger https://x.com/patloeber/status/1954887103105380361?t=5iJUGwzPogdcwyHCdAgGIw&s=19

Guys with this i think gemini 3 will drop this week

ornate ether Aug 11, 2025, 2:51 PM

#

inner gate What’s ur thoights on deep seek r1

deepseek is pretty nice but it explodes whenver i ask it about chinese political or geographical scenario

stray aspen Aug 11, 2025, 2:53 PM

#

inner gate What’s ur thoights on deep seek r1

great when it released

#

now its decent

#

but the reasoning time is just nasty

hearty pulsar Aug 11, 2025, 3:00 PM

#

AI going great guys

#

#

10 million input tokens to do nothing but waste compute

stray aspen Aug 11, 2025, 3:01 PM

#

what

keen beacon Aug 11, 2025, 3:12 PM

#

New imagen model will come out today

#

https://www.reddit.com/r/Bard/s/RRBTl8UlBk

From the Bard community on Reddit: Image gen today !!

Explore this post and more from the Bard community

whole sundial Aug 11, 2025, 3:13 PM

#

at least something to make up for the lack of glm image gen

#

I will try it as soon as it comes out

stray aspen Aug 11, 2025, 3:15 PM

#

why more image model

#

we want a llm

keen beacon Aug 11, 2025, 3:16 PM

#

stray aspen we want a llm

Perhaps scheduled for later

#

They'll give a taste of what's upcoming

#

https://www.reddit.com/r/Bard/s/J9hac6b0wj

From the Bard community on Reddit: Imagen GemPix - Seems like a new...

Explore this post and more from the Bard community

stray aspen Aug 11, 2025, 3:23 PM

#

holy

#

its the new image model

#

i hope it hass image editing

#

gemini 2.0 image editing was great

keen beacon Aug 11, 2025, 3:24 PM

#

stray aspen i hope it hass image editing

https://www.reddit.com/r/Bard/s/lR1Uv0MVdO

From the Bard community on Reddit: Shots fired 🗣️🗣️

Explore this post and more from the Bard community

#

Quite sure it will. This looks like the gpt 5 teaser image with image editing

stray aspen Aug 11, 2025, 3:25 PM

#

yeah

keen beacon Aug 11, 2025, 3:25 PM

#

stray aspen yeah

I already made the request on #1372229840131985540

#

Just in case

#

Lol

barren prairie Aug 11, 2025, 3:28 PM

#

keen beacon New imagen model will come out today

Today ? 😍

keen beacon Aug 11, 2025, 3:29 PM

#

barren prairie Today ? 😍

Yes

stray aspen Aug 11, 2025, 3:32 PM

#

barren prairie Today ? 😍

yo wassup

#

i saw you in the z.ai server

inner gate Aug 11, 2025, 3:32 PM

#

I feel like Gemini 2.5 pro has boosted in intelligence these past few days idk if it’s just me

stray aspen Aug 11, 2025, 3:33 PM

#

inner gate I feel like Gemini 2.5 pro has boosted in intelligence these past few days idk i...

its been the opposite for me lol

inner gate Aug 11, 2025, 3:33 PM

#

I must be lucky 😭

barren prairie Aug 11, 2025, 3:33 PM

#

stray aspen i saw you in the z.ai server

I know you are there , just clicking on your name and I saw LM arena and Z .ai

inner gate Aug 11, 2025, 3:36 PM

#

stray aspen its been the opposite for me lol

Try again I feel like they’ve made some sort of update

echo aurora Aug 11, 2025, 3:40 PM

#

poll_question_text

What version do you use the most?

victor_answer_votes

8

total_votes

14

victor_answer_id

3

victor_answer_text

Direct

willow grail Aug 11, 2025, 3:57 PM

#

which ai product has the best lip sync?

thorny cove Aug 11, 2025, 4:32 PM

#

i keep getting "Something went wrong with this response, please try again." with GPT 5 Chat

echo aurora Aug 11, 2025, 4:33 PM

#

thorny cove i keep getting "Something went wrong with this response, please try again." with...

Okay let me look into.

#

I assume it's just that model you're running into issues with?

thorny cove Aug 11, 2025, 4:34 PM

#

it also happened with main gpt 5

#

but the thing is the gpt 5 chat error is only for 1 chat

#

could it be a rate limit of some sort

stray aspen Aug 11, 2025, 4:38 PM

#

thorny cove i keep getting "Something went wrong with this response, please try again." with...

yeah its kinda broken right now

#

i dont know if its because im giving extremely long prompts

echo aurora Aug 11, 2025, 4:40 PM

#

thorny cove it also happened with main gpt 5

I'm not seeing errors with either of these models, so it may be a rate limit.

echo aurora Aug 11, 2025, 4:40 PM

#

stray aspen yeah its kinda broken right now

You're seeing the same? Has this been an ongoing issue or something you've noticed recently?

stray aspen Aug 11, 2025, 4:41 PM

#

echo aurora You're seeing the same? Has this been an ongoing issue or something you've notic...

im pretty sure im hitting a limit or something

#

it generates stuff and then it shows that error

echo aurora Aug 11, 2025, 4:41 PM

#

although I am now noticing for gpt-5 the responses are coming in pretty slow and lag.

#

were both of you seeing the same or was it just the error message?

stray aspen Aug 11, 2025, 4:43 PM

#

#

it must be a limit lol

#

its like 1500 lines of code

#

this is new

#

on the leaderboard

#

it used to be gpt-5

keen beacon Aug 11, 2025, 4:46 PM

#

stray aspen this is new

daaamn

stray aspen Aug 11, 2025, 4:46 PM

#

yeah but its just on the leaderboard

#

i guess gpt-5 is using high effort on arena too

keen beacon Aug 11, 2025, 4:47 PM

#

stray aspen i guess gpt-5 is using high effort on arena too

Yeah, that's probably the case

thorny cove Aug 11, 2025, 4:57 PM

#

echo aurora I'm not seeing errors with either of these models, so it may be a rate limit.

are rate limits only for single chats?

echo aurora Aug 11, 2025, 4:58 PM

#

thorny cove are rate limits only for single chats?

Nope

solemn plank Aug 11, 2025, 5:10 PM

#

HOW TO CREATE VEDIOS ANYONE?

cedar tide Aug 11, 2025, 5:11 PM

#

Bad webdeb

Screenshot_2025-08-11-20-10-54-324_com.android.chrome-edit.jpg

primal orbit Aug 11, 2025, 5:12 PM

#

I have an error with opus 4.1 thinking when it repeates the same response it gave to the previous message. And if I try to put a new message, it gives you have to wait for 50 minutes. But the clock doesn't go down. It's 50 minutes each time.

cedar tide Aug 11, 2025, 5:12 PM

#

2.5 flash lite still hasn't come back

rapid merlin Aug 11, 2025, 5:13 PM

#

gpt-5 is really hit or miss with styling, either it comes with something actually good or something like this

#

thorny cove Aug 11, 2025, 5:14 PM

#

echo aurora Nope

strange

cedar tide Aug 11, 2025, 5:15 PM

#

And very bad overall

Screenshot_2025-08-11-20-14-29-937_com.android.chrome-edit.jpg

#

Screenshot_2025-08-11-20-15-02-884_com.android.chrome-edit.jpg

primal orbit Aug 11, 2025, 5:15 PM

#

it was released to make gpt 5 look good.

#

and for cheap marketing points

echo aurora Aug 11, 2025, 5:16 PM

#

cedar tide 2.5 flash lite still hasn't come back

I'll flag again, thank you.

bright junco Aug 11, 2025, 5:17 PM

#

Why does my gemini 2.5 pro print incompletely? Is there a way to fix it?

thorny cove Aug 11, 2025, 5:17 PM

#

two different chats at the same time btw

#

whats the diff between gpt 5 chat and gpt 5 high

pure falcon Aug 11, 2025, 5:21 PM

#

How and Why did GPT-5 lose two votes?

cedar tide Aug 11, 2025, 5:22 PM

#

Just good in math

Screenshot_2025-08-11-20-21-22-599_com.android.chrome-edit.jpg

pure falcon Aug 11, 2025, 5:22 PM

#

https://x.com/lmarena_ai/status/1953504964275470442?s=46

lmarena.ai (@lmarena_ai)

GPT-5 soars in the Arena with the highest score to date.

#

3183 votes

#

But now the leaderboard says 3181 votes

#

pure falcon Aug 11, 2025, 5:23 PM

#

pure falcon But now the leaderboard says 3181 votes

@echo aurora Any ideas?

#

Shouldn’t we have a lot more votes since 5 whole days passed? Why did it LOSE two when all the others gained?

keen beacon Aug 11, 2025, 5:25 PM

#

pure falcon But now the leaderboard says 3181 votes

oof

echo aurora Aug 11, 2025, 5:26 PM

#

pure falcon <@283397944160550928> Any ideas?

No it's not clear to me, but will flag to the team.

pure falcon Aug 11, 2025, 5:31 PM

#

Check this tweet. Google had 26.7K, now 28K

https://x.com/conorcraig_/status/1954409951226466412?s=46

Conor (@ConorCraig_)

Numbers > Viral Tweets
https://t.co/mOyqaoffLL

#

Gemini added 1.3K votes. And GPT-5 loses 2? Lol something is very wrong about that

vernal meadow Aug 11, 2025, 5:38 PM

#

Wow Opus 4.1 improved more than I thought on the none Agentic coding task.

Impressive. Should retest it more. Will Opus 4.1 thinking be on #1? 😮

exotic nebula Aug 11, 2025, 5:39 PM

#

vernal meadow Wow Opus 4.1 improved more than I thought on the none Agentic coding task. Imp...

I dont think so. Gemini 2.5 still beats it.

#

But it is very impressive to be honest.

#

I mean, the votes are too low to decide. Lets just wait around for a bit.

thorny cove Aug 11, 2025, 5:40 PM

#

any way to fix "Something went wrong with this response, please try again." for one singular chat?

exotic nebula Aug 11, 2025, 5:40 PM

#

thorny cove any way to fix "Something went wrong with this response, please try again." for ...

Many experience it, all you can do is refresh chat or open a new one.

thorny cove Aug 11, 2025, 5:40 PM

#

thorny cove any way to fix "Something went wrong with this response, please try again." for ...

i remember after enough refreshes it did generating for like 10 minutes then went back

thorny cove Aug 11, 2025, 5:40 PM

#

exotic nebula Many experience it, all you can do is refresh chat or open a new one.

damn alright

exotic nebula Aug 11, 2025, 5:41 PM

#

True. Have to agree with you there.

#

Btw, if you dont mind me asking, which model do you like the best?

wicked root Aug 11, 2025, 5:42 PM

#

why was gemini updated but not gpt?

#

yeah seems like so

#

confidence interval points that way

#

?

#

you mean this isn't reliable?

#

bugged in what way?

#

sorry, I'm new to this

#

ah

echo aurora Aug 11, 2025, 5:50 PM

#

pure falcon <@283397944160550928> Any ideas?

Following up on this: we are looking into and will provide an update when we can.

pure falcon Aug 11, 2025, 5:51 PM

#

echo aurora Following up on this: we are looking into and will provide an update when we can...

Thank you!

stray aspen Aug 11, 2025, 5:55 PM

#

pineapple do you know anastasios

wheat onyx Aug 11, 2025, 6:02 PM

#

https://x.com/patloeber/status/1954887103105380361?s=19

Patrick Loeber (@patloeber)

zinc ore Aug 11, 2025, 6:05 PM

#

Vxtwitter where x is and it'll show the vid

floral comet Aug 11, 2025, 6:09 PM

#

May i ask, how many is the token count of gpt 5 on (LM arena)

stray aspen Aug 11, 2025, 6:10 PM

#

idk but it stops generating for me after 2000 lines of code

echo aurora Aug 11, 2025, 6:11 PM

#

pure falcon <@283397944160550928> Any ideas?

The vote is based on pre-release GPT-5 testing. After GPT-5's public launch, we created a new model entry that points to its public endpoint and collecting more votes. These additional votes we've been collecting aren't yet added to the current leaderboard. We will be merging the votes in the next leaderboard release. cc @deep adder

willow grail Aug 11, 2025, 6:11 PM

#

wheat onyx https://x.com/patloeber/status/1954887103105380361?s=19

gemini was never good at anything

floral comet Aug 11, 2025, 6:14 PM

#

stray aspen idk but it stops generating for me after 2000 lines of code

Oh okay thanks

#

It's been 15 mins, I requested a code on gpt-5 and yes I refreshed the website still same.. Hope it didn't bugged or the code is just too long🤣

echo aurora Aug 11, 2025, 6:17 PM

#

I believe yes, but will double check and update if that's not the case.

slow grotto Aug 11, 2025, 6:17 PM

#

imagen 4.0 what are you doing man

zinc ore Aug 11, 2025, 6:21 PM

#

Nooooo

stray aspen Aug 11, 2025, 6:21 PM

#

lol disappointing

echo aurora Aug 11, 2025, 6:22 PM

#

yes, confirmed.

red tangle Aug 11, 2025, 6:27 PM

#

#

legit?

#

so many grifters these days that it's hard to tell what's real

#

someone claims they found it in "source code"

willow grail Aug 11, 2025, 6:28 PM

#

red tangle

who posted it

neon idol Aug 11, 2025, 6:31 PM

#

red tangle

GEMINI 3.0?

nimble trail Aug 11, 2025, 6:34 PM

#

red tangle

This is 100% cap tbh

zinc ore Aug 11, 2025, 6:34 PM

#

Definitely made up

nimble trail Aug 11, 2025, 6:35 PM

#

red tangle someone claims they found it in "source code"

Why tf would they put this in source code anyway 😭😭

mossy drum Aug 11, 2025, 6:40 PM

#

slow grotto imagen 4.0 what are you doing man

What was the prompt?

small haven Aug 11, 2025, 6:42 PM

#

is gemini 3 coming in august

wicked root Aug 11, 2025, 6:44 PM

#

small haven is gemini 3 coming in august

Nein

small haven Aug 11, 2025, 6:45 PM

#

what are they hyping about then?

#

imagen 5?

#

lol

keen beacon Aug 11, 2025, 6:46 PM

#

small haven imagen 5?

image editing version probably

#

of the 2.0 flash image preview

#

Donno exactly

small haven Aug 11, 2025, 6:46 PM

#

makes sense

keen beacon Aug 11, 2025, 6:47 PM

#

small haven makes sense

I'm always excited for image model stuff because voting is easier on visual stuff rather than text

small haven Aug 11, 2025, 6:55 PM

#

ah there we go

#

imagen 5

sacred quail Aug 11, 2025, 6:59 PM

#

imagen series are fine but really bad at understanding prompt

#

Needs native model like gpt 1 image

sacred quail Aug 11, 2025, 7:04 PM

#

willow grail gemini was never good at anything

Wrong

#

it was great at writing,long context, analyzing videos

#

Also it was fine with reasoning and coding

velvet patrol Aug 11, 2025, 7:12 PM

#

Why did the amount of votes for gpt-5(-high) change from 3182 to 3181 after the update and the votes for 2.5pro increased by 2k?

#

is gpt-5 already gone from the arena?

whole wagon Aug 11, 2025, 7:13 PM

#

https://x.com/OpenAI/status/1954969035713687975

OpenAI (@OpenAI)

We’ve scored highly enough to achieve gold at this year’s IOI online competition with a reasoning system — placing #6 when ranked with humans and #1 when ranked with other AIs.

In just a few weeks:
• 2nd at AtCoder
• Gold medal-level at IMO
• Gold medal-level at IOI

velvet patrol Aug 11, 2025, 7:14 PM

#

hello viren 😛

whole wagon Aug 11, 2025, 7:14 PM

#

Hi

ornate agate Aug 11, 2025, 7:18 PM

#

What is eb45-turbo?

hollow pebble Aug 11, 2025, 7:20 PM

#

look at my video Today!

willow grail Aug 11, 2025, 7:21 PM

#

hollow pebble look at my video Today!

ok

hollow pebble Aug 11, 2025, 7:21 PM

#

i only have one vote left! So vote for my video please!

willow grail Aug 11, 2025, 7:21 PM

#

hollow pebble i only have one vote left! So vote for my video please!

are you confused?

#

what video?

eternal niche Aug 11, 2025, 7:23 PM

#

btw guys gpt5 sucks

willow grail Aug 11, 2025, 7:23 PM

#

eternal niche btw guys gpt5 sucks

gpt5 is the only one who can do video games

hollow pebble Aug 11, 2025, 7:23 PM

#

Here's a hint! A image to video prompt is in the video-arena-1: A Female woman in her 1950's cartoon was smiling, giggling & talking someone with beauty. Models: Veo-3-audio-fast vs Hailuo 02 Pro.

eternal niche Aug 11, 2025, 7:23 PM

#

willow grail gpt5 is the only one who can do video games

i can

hollow pebble Aug 11, 2025, 7:26 PM

#

look at video-arena-1 and vote for my video!

sacred quail Aug 11, 2025, 7:27 PM

#

eternal niche btw guys gpt5 sucks

Gpt 5 high is sota right now. Not sure if it can beat Opus 4.1 for coding but its pretty good

#

Claude must be doing something magical on codes because even benchmarks not looks great, everyone still using them for coding soo

stray aspen Aug 11, 2025, 7:29 PM

#

red tangle

inspect elemnt lmao

stray aspen Aug 11, 2025, 7:29 PM

#

small haven imagen 5

imagen gempix

hollow pebble Aug 11, 2025, 7:34 PM

#

hey guys! Look at my video on video-arena-1.

sacred quail Aug 11, 2025, 7:35 PM

#

Not in long context

dim pine Aug 11, 2025, 7:36 PM

#

Gpt5 😂

eternal niche Aug 11, 2025, 7:36 PM

#

willow grail Aug 11, 2025, 7:36 PM

#

gpt 5 very high before gta 6 last day 2025

jade egret Aug 11, 2025, 7:36 PM

#

willow grail gemini was never good at anything

nah

wheat onyx Aug 11, 2025, 7:46 PM

#

https://x.com/CyberBoyAyush/status/1954985969255628978?s=19

Ayush Sharma (@CyberBoyAyush)

Hehe got you Gemini.

Something is cooking.

Gemini 3.0 Pro🫣

#

Pretty sure that was for flash

zinc ore Aug 11, 2025, 7:48 PM

#

wheat onyx https://x.com/CyberBoyAyush/status/1954985969255628978?s=19

Fake

#

Or hallucinated AI PR I should say

#

If that's the cli one from a month back

stray aspen Aug 11, 2025, 8:05 PM

#

gemini has a github emoji on lmarena

echo aurora Aug 11, 2025, 8:21 PM

#

stray aspen gemini has a github emoji on lmarena

Where are you seeing this? Can't say I'm seeing the same.

harsh flume Aug 11, 2025, 8:25 PM

#

sacred quail Claude must be doing something magical on codes because even benchmarks not look...

I wonder how much of that is from some sunk cost thing

#

like once you get used to your current stack and workflow youd need a much higher threshold of improvement than mildly better to just send it all to air and get used to another tool, even if in practical terms it'd be just a slight annoyance to do

stray aspen Aug 11, 2025, 8:29 PM

#

echo aurora Where are you seeing this? Can't say I'm seeing the same.

its on lmarena direct mode

#

it just put that idk why

neon idol Aug 11, 2025, 8:42 PM

#

@echo aurora Sorry for this stupid question but in your opinion when Gemini 3.0 will be released?

lime coral Aug 11, 2025, 8:43 PM

#

In 3 sec

keen beacon Aug 11, 2025, 8:45 PM

#

neon idol <@283397944160550928> Sorry for this stupid question but in your opinion when Ge...

Tomorrow

hardy pecan Aug 11, 2025, 8:46 PM

#

GUIZ GPT 5 IS BAD, BRING BACK 4o!!!!

stray aspen Aug 11, 2025, 8:47 PM

#

so o3 is smarter than o3 pro

#

thats crazy

hardy pecan Aug 11, 2025, 8:48 PM

#

It's a flawed benchmark, 1 data point for o3 pro underperformed alot which brought down its average

echo aurora Aug 11, 2025, 8:51 PM

#

neon idol <@283397944160550928> Sorry for this stupid question but in your opinion when Ge...

Soon (I have no idea)

candid storm Aug 11, 2025, 9:05 PM

#

How do you know?

#

The data is not published yet right?

#

35% chance is not the same as 'gpt5 will beat gemini in remove style control'

neon idol Aug 11, 2025, 9:09 PM

#

I am the only that don't care about the message taken by lm arena for testing ai?

misty vault Aug 11, 2025, 9:12 PM

#

no

#

I even put my home address and credit card numbers in it

#

It is probably on that huggingface link

gentle plinth Aug 11, 2025, 9:13 PM

#

hardy pecan GUIZ GPT 5 IS BAD, BRING BACK 4o!!!!

5-pro

#

Normal 5 scores worse then others

hardy pecan Aug 11, 2025, 9:15 PM

#

its very scuffed, but no one actaully reads how the data is collected and presented, its bizzare. ME SEE CHART ME BELIEVE

wicked root Aug 11, 2025, 9:19 PM

#

what's happening?

keen beacon Aug 11, 2025, 9:21 PM

#

hardy pecan GUIZ GPT 5 IS BAD, BRING BACK 4o!!!!

I dont want that sycophantic thing back at all

candid storm Aug 11, 2025, 9:21 PM

#

Ok tnx, you seem like you know your stuff. I've redistributed my portfolio to account for the uncertainty

sacred quail Aug 11, 2025, 9:30 PM

#

this is NOT long context

#

i'd say after 500k token long context starts

#

Gemini has no competitor

#

Only Minimax M1 tried to be close

#

Also yes, long context is important

#

Espicially when you try learn something, if you are a student, if you need summarize or analyze of some long text, you have no option besides gemini

wicked root Aug 11, 2025, 9:34 PM

#

how so? I'm hearing mixed opinions across the board on gpt5, it seems highly polarized.

sacred quail Aug 11, 2025, 9:35 PM

#

Could be true

#

As a gemini fan im just admitting gpt 5 is best model right now

#

Better than O3

#

Also i do showing big respect towards to their "less praise" model choice

#

After 06/05 goldmane update gemini turned to 4o like praising you for everything

#

im really not liking this

#

Yeah

#

I understand people had very high expectations but

#

There is so many unnecesary critizing towards to gpt 5

#

Also im not sure but probably gpt 5 is very efficent model too which is also important

#

Better than O3 but also cheaper than O3

#

Oh really

#

interesting. But they lowered O3 price soo maybe we should compare with first api price

#

If i not remember wrong gemini 2.0 flash think was before than deepseek R1 but weirdly people didnt care

#

it was quite good

#

2.0 was trash but

#

No

#

Listen

#

2.0 was a trash model but 2.0 flash think was 3x 4x better than 2.0 flash so they did really good on that reasoning thing

#

Even if base model is trash

eternal niche Aug 11, 2025, 9:50 PM

#

btw

ocean vortex Aug 11, 2025, 9:52 PM

#

stray aspen so o3 is smarter than o3 pro

Exactly. It's a sh'it benchmark

wicked root Aug 11, 2025, 9:59 PM

#

@deep adder market’s pricing in your hypothesis

#

What? Ai market is a lot more liquid than the one I trade in

#

¯_(ツ)_/¯

wicked root Aug 11, 2025, 10:27 PM

#

you seem very confident in gpt, can I ask why? I'm new to this so I don't know a lot.

devout vault Aug 11, 2025, 10:29 PM

#

yes

eternal niche Aug 11, 2025, 10:30 PM

#

wicked root you seem very confident in gpt, can I ask why? I'm new to this so I don't know a...

devout vault Aug 11, 2025, 10:30 PM

#

eternal niche

He's trolling

heavy knoll Aug 11, 2025, 10:42 PM

#

Can someone Tell me Which one is Chat gpt 5 high

wicked root Aug 11, 2025, 10:43 PM

#

@eternal nicheman your pfp creeps me out lmao

#

I just realized... I'm SO sorry if that's your selfie

eternal niche Aug 11, 2025, 10:46 PM

#

wicked root <@1391027884889608333>man your pfp creeps me out lmao

wicked root Aug 11, 2025, 10:46 PM

#

love the eccentricity, but you've now intensified my fear

#

this is the weirdest thing I've seen in my life and i mean this in the best way imaginable LOL. What's he saying?

eternal niche Aug 11, 2025, 10:47 PM

#

wicked root this is the weirdest thing I've seen in my life and i mean this in the best way ...

sings 'Matushka Zemlya'

wicked root Aug 11, 2025, 10:48 PM

#

https://www.youtube.com/watch?v=8mtxEbvzkHs

YouTube

Татьяна Куртукова

Татьяна Куртукова - Матушка

Слушать на всех площадках: https://band.link/matushka_
"Матушка" (слова и музыка - Пётр Андреев)

Подписывайтесь на соц. сети:
Сообщество ВК: https://vk.com/tatiana.kurtukova
Личная страница ВК: https://vk.com/ts_makeeva
Instagram: https://ins...

▶ Play video

#

?

eternal niche Aug 11, 2025, 10:48 PM

#

yes

sacred quail Aug 11, 2025, 10:57 PM

#

heavy knoll Can someone Tell me Which one is Chat gpt 5 high

"gpt 5" is high effort reasoning model

#

"gpt 5 chat" is not

ocean vortex Aug 11, 2025, 11:09 PM

#

gpt5-chat is 4o-latest successor, no reasoning

#

Bluntly speaking it's probably the same as gpt5-minimal

#

just like 4o-latest kinda sorta was the same as gpt4.1

sacred quail Aug 11, 2025, 11:22 PM

#

@ocean vortex do you what is gpt 5 thinking's base model ?

#

Gpt 5 chat ?

#

Or gpt 4.5 ?

marsh stratus Aug 11, 2025, 11:23 PM

#

GPT 4.5 is too slow and expensive

ocean vortex Aug 11, 2025, 11:27 PM

#

sacred quail <@514836230802898954> do you what is gpt 5 thinking's base model ?

GPT5-minimal. It’s a hybrid reasoning model and there’s no routing on API. Routing only happens on chatgpt website

#

So like, that same model is also technically the base model

torn mantle Aug 11, 2025, 11:28 PM

#

Best model is gemini

ocean vortex Aug 11, 2025, 11:30 PM

#

They probably would have used gpt5-chat with no routing if it performed better… lol

#

But now by routing it occasionally to gpt5-low, it can comfortably beat gpt4.1

rugged brook Aug 11, 2025, 11:40 PM

#

heavy knoll Can someone Tell me Which one is Chat gpt 5 high

CHAT IS WIOTHOUT THINKING

barren ermine Aug 11, 2025, 11:41 PM

#

what model is best for development and code?

rapid merlin Aug 11, 2025, 11:42 PM

#

I'm confused about the division between gpt-5-chat and gpt-5 minimal, low and medium, can somebody enlighten me

ocean vortex Aug 11, 2025, 11:54 PM

#

rapid merlin I'm confused about the division between gpt-5-chat and gpt-5 minimal, low and me...

They all are derived from the same base model. Chat is like minimal except different fine-tuning and probably less RL training

junior sonnet Aug 11, 2025, 11:55 PM

#

can i choose the model in the vid gen?

verbal nimbus Aug 11, 2025, 11:56 PM

#

unborn sleet is it as good as aistudio builder

Different, AIStudio allows you to edit and save projects, and it has a React environment.

#

I think Design Arena uses pure HTML/JS instead for websites

brisk helm Aug 12, 2025, 12:12 AM

#

junior sonnet can i choose the model in the vid gen?

no

misty vault Aug 12, 2025, 12:16 AM

#

Large Language Model

rapid merlin Aug 12, 2025, 12:26 AM

#

ocean vortex They all are derived from the same base model. Chat is like minimal except diffe...

i see, and what about the gpt-5-thinking in chatgpt? Is that just gpt-5 chat with more think time?, something like gpt-5-chat-high if we will?

wicked root Aug 12, 2025, 1:10 AM

#

someone here said grok4 is the best at coding

#

Is this for ALL coding?

#

@deep adder opinion on grok4?

wintry tinsel Aug 12, 2025, 1:12 AM

#

Soooo my bros

#

When is Gemini 3

cloud zinc Aug 12, 2025, 1:12 AM

#

october

wintry tinsel Aug 12, 2025, 1:13 AM

#

Account was made today

#

Ban this freak

wicked root Aug 12, 2025, 1:13 AM

#

what's hardcoding?

mellow frigate Aug 12, 2025, 1:39 AM

#

What do you mean? Generating music?

inner gate Aug 12, 2025, 1:55 AM

#

wintry tinsel Account was made today

Is that bannable

#

!!?!?!

misty vault Aug 12, 2025, 1:55 AM

#

Yes, you have a chance to be banned when your account is made today

inner gate Aug 12, 2025, 1:55 AM

#

Oh wow how come

misty vault Aug 12, 2025, 2:00 AM

#

because of @hollow ivy starting a gemini 2.5 pro gooning cult

#

They will eat you alive if you say anything negative about gemini 2.5 pro

timber tulip Aug 12, 2025, 2:10 AM

#

hallo

sick spire Aug 12, 2025, 2:59 AM

#

How to Delete a Generation
lf you'd like to delete the initial prompt and generation from the bot, right-click the bot'smessage and select Apps > Delete Generation . Note that deleting the originalprompt will also delete its corresponding generation, but deleting just the generationwill leave the original prompt intact

golden ocean Aug 12, 2025, 3:02 AM

#

ts so wholesome

sand bay Aug 12, 2025, 3:03 AM

#

you know that this site is a bit deceptive

hardy lion Aug 12, 2025, 3:07 AM

#

sand bay you know that this site is a bit deceptive

GPT-5 via API without a system prompt telling it that is it GPT-5 is a bit deceptive

golden ocean Aug 12, 2025, 3:22 AM

#

stray aspen Aug 12, 2025, 3:50 AM

#

sand bay you know that this site is a bit deceptive

It's gpt 5 anyways

#

So who cares

sharp yew Aug 12, 2025, 3:55 AM

#

sand bay you know that this site is a bit deceptive

Is it LMArena?

echo aurora Aug 12, 2025, 3:56 AM

#

sharp yew Is it LMArena?

Yes, & it has been shared with the team.

hardy lion Aug 12, 2025, 3:57 AM

#

The GPT-5 model without a system prompt does not know that it is GPT-5. This can be reproduced on OpenAI's API playground

slim mesa Aug 12, 2025, 4:57 AM

#

WTF THE GPT 5 HIGH ARE ON THE CHAT?

agile bloom Aug 12, 2025, 5:04 AM

#

woah way too many gpt5 which one is for what?

solid brook Aug 12, 2025, 5:06 AM

#

agile bloom woah way too many gpt5 which one is for what?

Well the best is gpt 5 high

agile bloom Aug 12, 2025, 5:24 AM

#

solid brook Well the best is gpt 5 high

thanks my guy

tidal ginkgo Aug 12, 2025, 5:31 AM

#

hey uhhhh

#

i was off of gpt-5 for a bit

#

why is there a gpt-5-high?

#

i thought gpt-5 was high already?

#

oh that sounds bad

rare python Aug 12, 2025, 5:32 AM

#

they made it clearer

#

because new people can be confused

echo aurora Aug 12, 2025, 5:34 AM

#

It's the same model, but added the high to make it more clear

tidal ginkgo Aug 12, 2025, 5:34 AM

#

oh ok ty

#

lol

patent bane Aug 12, 2025, 5:56 AM

#

#

hmm

#

gusty helm Aug 12, 2025, 6:12 AM

#

Hey! Not sure if this was asked but i see a leaderboard update in text arena yesterday, but did the gpt5 votes remain unchanged? Score think changed a bit, but not votes? Some bug?

echo aurora Aug 12, 2025, 6:24 AM

#

gusty helm Hey! Not sure if this was asked but i see a leaderboard update in text arena yes...

Hello ablobwave - we did chat about this earlier but will share the response

The vote is based on pre-release GPT-5 testing. After GPT-5's public launch, we created a new model entry that points to its public endpoint and collecting more votes. These additional votes we've been collecting aren't yet added to the current leaderboard. We will be merging the votes in the next leaderboard release.

gusty helm Aug 12, 2025, 6:25 AM

#

echo aurora Hello <a:ablobwave:552927506957729802> - we did chat about this earlier but will...

Thank you, that makes sense; so I guess the score only updated then

lime oak Aug 12, 2025, 6:30 AM

#

guys what differint with vedio arena 1 and 2 and 3 and 4

echo aurora Aug 12, 2025, 6:37 AM

#

lime oak guys what differint with vedio arena 1 and 2 and 3 and 4

No difference, but we want to spread out generations a bit or else the channel will get too spammy.

lime oak Aug 12, 2025, 6:37 AM

#

ok nice idea

quiet dust Aug 12, 2025, 7:23 AM

#

Hi guys, is the gpt-5-high on LMArena the same as the regular basic gpt-5 model in ChatGPT?

hardy lion Aug 12, 2025, 7:24 AM

#

not quite, gpt-5-chat should be the closest to the experience in chatgpt

quiet dust Aug 12, 2025, 7:26 AM

#

So, gpt-5-high approximately at the same level as GPT-5 Thinking (medium mode)?

hardy lion Aug 12, 2025, 7:27 AM

#

gpt-5-high should be gpt-5-thinking (high mode)

quiet dust Aug 12, 2025, 7:28 AM

#

hardy lion gpt-5-high should be gpt-5-thinking (high mode)

Really? Thanks for the answer

hardy lion Aug 12, 2025, 7:29 AM

#

I'm taking the word of this openai employee on twitter:
https://x.com/ericmitchellai/status/1954680194733863200

Eric (@ericmitchellai)

@benhylak OH. Guh… I did not know we used different nomenclature in API and chatgpt

What’s GPT-5 in api is GPT-5 thinking in chatgpt

What’s GPT-5 in chatgpt is GPT-5-chat in API

🫣

elder lintel Aug 12, 2025, 7:39 AM

#

hey

hardy lion Aug 12, 2025, 7:46 AM

#

sup

keen beacon Aug 12, 2025, 7:50 AM

#

can someone help me, in lmarena its always stuck generating and idk what to do anymore, refreshing the page doesnt work for me

wicked root Aug 12, 2025, 8:08 AM

#

How many of you think Gpt5 will beat Gemini with style control OFF this month?

#

CI band is insanely wide on gpt5

dawn grove Aug 12, 2025, 8:20 AM

#

How can u get Lmarena unlimited free is there a method?

#

I*

drifting thorn Aug 12, 2025, 8:21 AM

#

dawn grove How can u get Lmarena unlimited free is there a method?

It's unlimited free because LMArena is paying for u

dawn grove Aug 12, 2025, 8:25 AM

#

But it has limitation i mean this likenit has a amount of messages

acoustic cliff Aug 12, 2025, 8:33 AM

#

Of course

keen beacon Aug 12, 2025, 8:38 AM

#

dawn grove How can u get Lmarena unlimited free is there a method?

Enjoy your data on Hugging Face later

ornate agate Aug 12, 2025, 8:47 AM

#

A fair trade.

drifting thorn Aug 12, 2025, 8:49 AM

#

A fair trade.

keen beacon Aug 12, 2025, 9:05 AM

#

No imagen model yet...

#

I thought it would get released

#

Sigh.

keen beacon Aug 12, 2025, 9:06 AM

#

drifting thorn A fair trade.

No like, really, there must be A TON of data that nobody was intended to see in these datasets right now

#

Because people don't read and think they're gaming the system by getting gpt-5 for free

sacred quail Aug 12, 2025, 9:08 AM

#

Well, they can read my conspiracy theory tests which is im testing them on every new model

#

Only problem is if they starts to believe

ocean vortex Aug 12, 2025, 9:10 AM

#

hardy lion The GPT-5 model without a system prompt does not know that it is GPT-5. This can...

wait... how come you don't have "reasoning effort" for this model?

#

oh you probably haven't verified your org since you don't have summaries either. But it's weird they aren't letting you adjust reasoning. You may be stuck on "minimal" lol

sacred quail Aug 12, 2025, 9:14 AM

#

ocean vortex wait... how come you don't have "reasoning effort" for this model?

Which site is this

ocean vortex Aug 12, 2025, 9:15 AM

#

hardy lion I'm taking the word of this openai employee on twitter: https://x.com/ericmitche...

Seems like he is confused himself lmaoo

#

This is roughly accurate, except when chatgpt decides to route your request to reasoning when you are using "GPT5". Then it is no longer gpt5-chat.

ocean vortex Aug 12, 2025, 9:17 AM

#

sacred quail Which site is this

https://platform.openai.com/chat

hardy lion Aug 12, 2025, 9:17 AM

#

ocean vortex oh you probably haven't verified your org since you don't have summaries either....

the model on the leaderboard is named gpt-5-high, it has reasoning effort high

ocean vortex Aug 12, 2025, 9:18 AM

#

hardy lion the model on the leaderboard is named gpt-5-high, it has reasoning effort high

Yes but I'm referring to your screenshot. It's odd that you don't have that option but have access to gpt5.

hardy lion Aug 12, 2025, 9:19 AM

#

ocean vortex Yes but I'm referring to your screenshot. It's odd that you don't have that opti...

oh, I'm not sure I didn't actually take that screenshot, someone else reprod and sent to me

#

The api calls are using reasoning_effort="high"

gentle plinth Aug 12, 2025, 9:54 AM

#

💀
https://x.com/sama/status/1954705761805938734

Sam Altman (@sama)

we are considering giving a (very) small number of GPT-5 pro queries each month to plus subscribers so they can try it out! i like it too.

but yeah if you wanna pay us $1k a month for 2x the input tokens feels like we should find a way to make that happen...

gentle plinth Aug 12, 2025, 9:57 AM

#

ocean vortex wait... how come you don't have "reasoning effort" for this model?

@hardy lion its a bug, you have to reload the page for the option to appear

tulip cipher Aug 12, 2025, 10:01 AM

#

what i need to do now..

torn mantle Aug 12, 2025, 10:04 AM

#

gentle plinth 💀 https://x.com/sama/status/1954705761805938734

lmao

sage heath Aug 12, 2025, 10:04 AM

#

Hello

hardy lion Aug 12, 2025, 10:04 AM

#

gentle plinth <@1394374846741221458> its a bug, you have to reload the page for the option to ...

thanks!

ocean vortex Aug 12, 2025, 10:05 AM

#

gentle plinth 💀 https://x.com/sama/status/1954705761805938734

Oh for sure they are gonna find a way to let people set their money on fire. Let them have what they want!

tulip cipher Aug 12, 2025, 10:05 AM

#

tulip cipher what i need to do now..

?

sleek crow Aug 12, 2025, 10:11 AM

#

tulip cipher ?

you can reset cookies only for lmarena on your browser to reset the limit

tulip cipher Aug 12, 2025, 10:12 AM

#

sleek crow you can reset cookies only for lmarena on your browser to reset the limit

how

sleek crow Aug 12, 2025, 10:13 AM

#

what browser do you use ?

tulip cipher Aug 12, 2025, 10:13 AM

#

btw, what's the difference between beta.lmarena.ai and lmarena.ai

tulip cipher Aug 12, 2025, 10:13 AM

#

sleek crow what browser do you use ?

firefox

sleek crow Aug 12, 2025, 10:15 AM

#

Just find on internet how to reset cookies for a website for fiefox beacuse right now im on microsoft edge

tulip cipher Aug 12, 2025, 10:15 AM

#

tulip cipher btw, what's the difference between beta.lmarena.ai and lmarena.ai

?

sleek crow Aug 12, 2025, 10:16 AM

#

i have no clue it think its just the same

spare rune Aug 12, 2025, 10:17 AM

#

whats the rate limit on gpt5 high?

indigo hazel Aug 12, 2025, 10:19 AM

#

tulip cipher btw, what's the difference between beta.lmarena.ai and lmarena.ai

i think right now they're the same. but beta would be used in case to try something, new features or to see if it all works

eternal niche Aug 12, 2025, 10:30 AM

#

btw guys gpt5 sucks

ocean vortex Aug 12, 2025, 10:33 AM

#

eternal niche btw guys gpt5 sucks

you seem obsessed with it

eternal niche Aug 12, 2025, 10:34 AM

#

ocean vortex you seem obsessed with it

gemini 2.5 pro the best

leaden sun Aug 12, 2025, 10:34 AM

#

gentle plinth 💀 https://x.com/sama/status/1954705761805938734

sam is being practical here, am sure there are enough people from a niche circle paying that

keen beacon Aug 12, 2025, 10:35 AM

#

eternal niche gemini 2.5 pro the best

I like kimi k2

eternal niche Aug 12, 2025, 10:35 AM

#

keen beacon I like kimi k2

even kimi k2 better than gpt5

gentle plinth Aug 12, 2025, 10:35 AM

#

leaden sun sam is being practical here, am sure there are enough people from a niche circle...

if you pay that for 2 years you could probably get a used h200

#

its not worth it

leaden sun Aug 12, 2025, 10:36 AM

#

gentle plinth if you pay that for 2 years you could probably get a used h200

gpt 6 coming out next soon, so this is just intermittent luxury foreplay for privileged few

gentle plinth Aug 12, 2025, 10:37 AM

#

its not that expensive to do inference at scale to justify such costs, even for larger models

leaden sun Aug 12, 2025, 10:39 AM

#

not for certain niche circle of people who dont understand the details i guess

ocean vortex Aug 12, 2025, 10:41 AM

#

leaden sun sam is being practical here, am sure there are enough people from a niche circle...

I think they already played that card with $200 sub tbh. More than that it's becoming insanity and then where do you stop...

#

Not to mention that with 1k per month you kinda already could comfortably rent hw to host any model all to yourself

unborn lantern Aug 12, 2025, 10:43 AM

#

eternal niche even kimi k2 better than gpt5

Any proven data?

eternal niche Aug 12, 2025, 10:43 AM

#

unborn lantern Any proven data?

unborn lantern Aug 12, 2025, 10:44 AM

#

eternal niche

Lol

#

Lmarena just change the name of gpt 5 to gpt 5 high or they change the api?

eternal niche Aug 12, 2025, 10:48 AM

#

yes

ocean vortex Aug 12, 2025, 10:50 AM

#

unborn lantern Lmarena just change the name of gpt 5 to gpt 5 high or they change the api?

just the name, it was high as a kite since the start on lmarena

tulip cipher Aug 12, 2025, 10:55 AM

#

eternal niche btw guys gpt5 sucks

yes

#

nice nick btw

eternal niche Aug 12, 2025, 10:57 AM

#

Что вершит судьбу человечества в этом мире? Некое незримое существо или закон, подобно Длани Господней парящей над миром? По крайне мере истинно то, что человек не властен даже над своей волей.

ocean vortex Aug 12, 2025, 11:04 AM

#

eternal niche Что вершит судьбу человечества в этом мире? Некое незримое существо или закон, п...

Co ty tu pleciesz?

eternal niche Aug 12, 2025, 11:04 AM

#

ocean vortex Co ty tu pleciesz?

🍆

woven scarab Aug 12, 2025, 11:09 AM

#

gm AI fam! 😁

cedar tide Aug 12, 2025, 11:50 AM