#general
1 messages · Page 94 of 1
DId you just make this up for no reason?
GPT5 is significantly smaller than og GPT4 and their infra is much better now. They are doing just fine
They also deprecated gpt4.5 for good so probably doing better than ever on compute tbh
api or aistudio
Define 'clearly'. If we look at all the tests and same prompt testing, gpt5 is the clear winner lol
how is the discontinuation of gpt 4.5 related?
sama said that on twitter and on ama
what you're talking about?
even on the podcast he said that they gonna need to cut something
gpt4.5 was their biggest model ever. Since that is dicontinued they now have much more compute to spare
I use api with cline + gemini pro (app)
???????????
source?
???????????????
no
Do you seriously need a source for that?
Subjectively. But even objectively speaking, the win-rate against 2.5 pro is very low on lmarena.
that they not gonna cut things bcs they now have compute to spare
lmao
do you know that 4.5 was using training resources right?
And now with 4.5 disabled they gonna use it on gpt-6 related work
not inference
Read again what I wrote lol
And then dispute what I wrote
bro is blind
well, just wait 2 days and when he announce sora or other shi* being cut we discuss again
that is nothing new, they are saying the same things all the time
that's how they stay in business... smh
They are always looking for ways to cut costs
it's not cut costs bro
and this is one of the means of justifying that
they don't have enough compute
lmao
sam is crazying traveling the world to get 500 bi and bro is saying that he has compute but don't want to use to cut costs
are you insane or what
No one does if you read everything literally. You really shouldn't take this for face value 
They have plenty of it, the only question is the price/cost
lol
smh
It has always been the case openAI are deploying compute far too slow. The stargate project is planning to have 64k GPUs by the end of the year
That is slow compared to other companies
Either you don't follow the news, or you're a conspiracy theorist
https://www.datacenterdynamics.com/en/news/openai-and-oracle-to-deploy-64000-gb200-gpus-at-stargate-abilene-data-center-by-2026-report/ nevermind end of 2026 is when they expected 64k
They have their priorities on other things - like training new models. They also won't accept the higher cost than what they feel comfortable with - that's the extent of them struggling for compute
Bro is gpt 4o -
OpenAI and Oracle are expected to deploy 64,000 Nvidia GB200s at the Stargate data center in Abilene, Texas by the end of 2026.
This is basically pathetic rate lol
yeah they are expanding all the time and scaling, obviously
They have azure though
If they wanted, their compute is 'unlimited', but cost...
that's a different question
That's their main stargate computer. Meant for training
It's not about priorities. It's about not having enough gpus for both
Most of these tweets are meant for marketing. And to justify any potential ratelimits. They are not to be taken for face value
Yes, you're a conspiracy theorist. There's no point in discussing it
Just let him yap
I know just enough about ML to know that he's telling the truth
Or do you seriously think them making their models smaller and cutting models from model switcher + modernizing their infra... put them at bigger strain than before? 🤣
LOL
The strain remains the same; they can simply serve more people without removing resources from research
ChatGPT is growing every week
hello
Incorrect, the strain goes down. But obviously you can still sell it as having not enough compute, there's never gonna be enough of it technically...
They used it for research too since the start
Bro, you can't have GPUs training models and doing inference at the same time
But they also had to host bigger models and deal with older gpus
obviously
What version do you use the most?
10
22
3
Direct
Even if they have optimized their infrastructure and now have free GPUs for training models, the strain remains the same. They are going to use them to train models
When they say they don't have enough GPUs, that includes the GPUs they need for research, the GPUs they need for current inference, and the GPUs they will need for future inference if ChatGPT continues to grow at its current pace
Not the same. At the very least the cost of upkeeping chatgpt service is lower.
Yes. Like I said this was the case since the very start. It's all relative.
Predictable result... 
The plus limit went from 200 to 3000. GPT5 Thinking basically unlimited for plus users, they will never hit the limit
It doesn't add up with the capacity tweet ngl
Yes my friend tryed it it's misleading a little
because it's like 2600 mini thinkings and not 3000 gpt-5 thinking
Yeah it has the router
Yeah precisely. If they were struggling for compute 3000 would have been literally impossible
How much do you think the router helps. What percents are going to mini or even nano
Not true because the o3 mini, o4 mini, gpt 4.5 and others was using those resources that they are giving gpt 5 mini thinking now
afaik nothing is going to mini/nano until you hit your usage caps
that's why it's possible 3000 per week
the gpt 5 thinking model that is basically new o3 or o4 still 200 per week
GPT5 default option is 99% more costly than to run just gpt4.1 tbh
and gpt5-thinking is o3 equivalent, in terms of cost
Not even my friends who work at OpenAI have that information. You're incredible!
I mean roughly, that's just common sense...
doesn't take a genius, but obviously this is not official confirmed info, if you need for me to spell it out just for you?
When i discuss things here, i bring tweets from researchs, papers, news
always based on some data
you're just talking about things in your head and treating it as the truth
From what I hear. GPT5 Thinking is quite a bit cheaper to serve than o3 lol
I mean look at the speed of it. That should be a clue lol
"From what I hear..."
lmfao
Bro what
There is public information about them using specific silicons that explains the O3 cut price and speed
It's a good indication of their cost. There were certain rumors suggesting same size as o3 as well. Nothing at all to suggest it is bigger model for now though
Do your friends work at openAI as the janitors or smth? They should know that info if they even touched GPT5 training pipeline
It is not that hidden
They work with the front end and ChatGPT
Yes, it's hidden from them. It's hidden from everyone
web development?
I know a GTM director and he knows it
It's not hidden
How would they even hide something that fundamental that easily
Well, I'd rather believe my friends than you
yeah they would have no clue tbh
I would rather trust information from insiders than from random people on Discord.
trust what exactly?
I'm leaving now. Bye-bye, conspiracy theorists!
that they "don't know"?
that's not some secret information you can trust or not trust
LOL
It is kinda impossible to prove. Because then you literally out them how do you want me to prove it without doing that
I feel like I know the answer but llmarena.ru is NOT you guys right?
wtf
very clearly not lol. Seems like some illegal copycat, probably using stolen keys as well tbh
Can someone point me to research on automating the leaderboard with third LLM prompt creation and response evaluation? How close can it get to human results these days?
What
So most people just abuse lmarena to get top of the line models for free?
Ofc all this data is used to train models but so is your Google search input to feed you with personalised ads right?
@echo aurora GPT 5 needs to be retested from the beginning on both arenas with the public API. Many people who had preview access say that GPT 5 was much better on that access than now via the official API.
yes
bro lmarena is the sauce
no for ppl who cant afford the top of the line ai models
Just did some back to back testing gpt5-chat vs gpt5-reasoning (API) vs gpt5-router (chatgpt)
agree
Actually not bad if you don't mind if your data will be used to train their models
70% of people found out about it when a major ai came out and wanted to use if 4 free
I would say that default model on chatgpt is halfway in between reasoning and non-reasoning variant on overall performance, but closer to the non-reasoning one
there are obvious gains to be had of having reasoning always on
Which one major AI?
And that router does not nearly always work as it is supposed to
which is the greatest on lmarena: gpt-5 nano,chat, gpt-5, or mini
gpt-5
go to video arena bro
yo
anyone know how to make the viral videos with baby speaking?
when I try, veo3 doesnt let me because it contains minors
lol
you dont lmao
use another ai
thats google policy
yea but people still bypass it
well idk
there are tons of videos
i´m no jailbreaker
on my fyp
use another ai
Hey sorry I’ve got cold so haven’t been following chat much. Are we saying that summit is different than what’s available now?
Many people
Where are you seeing that?
Sorry again for not following, I’m a bit out of it atm
I may be wrong, but I think the GPT-5 model needs to be handled a little differently than others, which may be why users are dissatisfied
So here, the model is always reasoning, unlike on the OpenAI platform, where it automatically selects the working style depending on the prompt?
so we test the gpt-5 model as on openai api but chatgpt model works diffrently right?
I’m calling bs on that. Try testing gpt5-high against o3-high on some decent question set and you should see for yourself that it performs better. The issue is more of people using the router (default option on chatgpt) and expecting for that to perform like SOTA reasoning model. But this wouldn’t be applicable to API.
And it’s not like the metrics are showing night and day difference o3 to gpt5. So those improvements you CAN see kinda align with them… If it was really performing worse I don’t think you would be able to tell it’s better (than o3)
agreed
Selling ChatGPT Plus for $10 for 3 months.
The leaderboard should a slider that only considers votes within the selected time period in order to generate the rankings.
When will the leaderboards be updated?
Normally take about a week to get the votes/validate before posting
dude gpt-5 is incredible for lua
it keeps on impressing me
gemini couldnt one shot this
this is great
i love gpt-5
lmao
i mean the fact that it works on the first try
this is crazy
Could you tell us if there’d be a change in the rankings?
Hy
Yeah I’ll make an announcement when we update
Thank you 👍
Yeah gemini benchmark is with unnerfed version of the model. If they benchmark with the nerfed model it would go A LOT down
I swear to god if they do the same with gemini 3 ......
Hey guys, I have a question
I've always wanted to customize the chat (the way I prompt). How do I boldened a text (like h1-h6)? Is it possible to do a strikethrough? Other methods like #, ###, **, ``, ?? How can I do that in LMarena? It makes my prompt much cleaner and clearly instructed.\
Is it called "Markdown Syntax"?
man I'm getting nervous about google vs openai for August
same here :<
seems so
📢 GLM-4.5 Technical Report is here!
We’re pulling back the curtain on how GLM-4.5 was built to excel at reasoning, coding, and agentic tasks — powered by a unique multi-stage training paradigm.
🔍 Highlights:
• Expert model iteration + self-distillation ➡️ unify reasoning, agentic, and general chat into one model
• Hybrid reasoning mode ➡️ knows when to think deeply, when to respond instantly
• Difficulty-based RL curriculum ➡️ break through performance plateaus
• Efficient function calling ➡️ more reliable tool use for code-heavy tasks
📄 Read the report: https://arxiv.org/abs/2508.06471
💬 Let us know your thoughts in the thread!
We present GLM-4.5, an open-source Mixture-of-Experts (MoE) large language model with 355B total parameters and 32B activated parameters, featuring a hybrid reasoning method that supports both thinking and direct response modes. Through multi-stage training on 23T tokens and comprehensive post-training with expert model iteration and reinforceme...
S
Yes with all the models.
I need help
"where": on Twitter, influencers who had early access to GPT 5 They tried the same prompts after the release and the result was very far from the preview.
I'm not sure if he's right or not., but I don't think we can trust any company, we don't know if summit has not been quantized or anything for production
I need to get my crypto back

Did openAI also discontinue the DeepResearch toggle button?
🐼
Hello
Yea, I tried a Deep Research prompt at chatgpt just to check if it'd maybe understand that it's a deepresearch query and work accordingly, but it doesnt work
Bummer, they could have kept it as a function to use on their new gpt
If anyone here uses their API and can confirm weather or not its still avaliable as a API tool call or something id appreciate
Why do you think gemini 3 releases in august
I don't. I'm betting big on Gemini 2.5 retaining its title against GPT5 this month.
Bruh. Dude gpt 5 high is better than gemini 2.5 pro in every way
Without style control?

Not the model we have access to
And
Gemini 2.5 benchmaxed
But gpt 5 benchmarks are real thing
you mean my 2.5 is nerfed?
Yes pretty obvious if you used gemini 2.5 on its first month
Go reddit
They talk a lot about it
So do you think GPT will beat Gemini this month?
Yes if gemini 3 does not come out
@echo aurora
We want to know where the chatgpt router would arrive in the ranking. And if you add it with open ai, be sure that it is exactly the same version as the one available on chatgpt.
(that of the plus users and not pro )
but it's the most important benchmark for me lol
Dude on the model ranking gpt 5 is number 1. Idc if you want to belive the lie okay then
Is better gpt 5 or gpt 5 pro?
Gpt pro lol
it's not a lie. I'm talking about polymarket.
Thx
Polymarket expects gemini 3 to release
Dude the benchmarks are out
Not in august
Gpt 5 already beat gemini 2.5
In septeber
What happens in sept is a non-issue for me.
Not any point in talking to you. Go to youtube watch a good comparison vid of the 2 models
If you interested you can Upvote my request
https://discord.com/channels/1340554757349179412/1404368537690308739
Yeah man you're right.....
Yo guys is there any way to connect LMArena With R Studio
What is r studio
oh
No it doesn't provide any api
Thx
No
Ok
just make point system and see what is happening 😄
where do i generate images? in video arena?
Is it somehow possible to implment midjourny?
Are y'all really sure gpt5 is better than o3?
Even if it's better it would be the higher tier version
Us free tier users gpt5 isn't better than lmarena o3
@deep adder answer me gpt5 pfp
literally no one reads how this is scored, no due diligence, just copy paste misinformed charts
its a bit wild
how can i retrieve the seed code of the image i generated using Lmarena
What’s offline IQ? What does it mean
Reminds me of Perplexity for some reason or ChatGPT Ui in general
Doesn't mean it's a bad thing though
Yeahhh
But feels much better than LMArena's UI
If we have such a UI, I would happily use noting other than LMArena lol
I have a problem with deleted convos coming back for some reason even if I delete them
on LMArena
I need to put it on #1343291835845578853
Even if my convos leaked to others, they would be about general tests and brainstorming
and translation
basic stuff
no sensitive data lol
It means that the dataset is private
hi
Ah yes datasets
Hey joel 👋
It's a very well known benchmark. Maybe this graphic is more familiar:
Is there a reason for the IQ being so low?
Yes, I'm well aware. Do you know how they grade the results? The methodology?
Concrete reason?
Does Gemini 2.5 pro have a higher version of it?
I presume they just grade it similarly to how they'd grade a human. These IQ tests are typically pretty straightforward to mark (multiple choice, unscrambling a word, etc.). The test set is private though, that's why it's considered "offline".
Sam Altman's excuse was a "router issue" (unrelated tweet). But official API status page didn't report it.
I see.
Yes sure, but it's an average of the last 7 sets as stated, as you can see there is only a few results so far. Look at GPT5 , 116 and 70 iq, with an average of 93 hence that data, but there is 5 more data points they'll need to give a statistically meaningful result
But I do think it's partly caused by unreliable routing. It might have underestimated the difficulty and routed it to a weaker model.
If it hits 116, I can assume it'll hit around there on average once all the testing is done
Have to wait for all the testing to be completed first,
I don't think the router is very reliable.
The 115 is on the MENSA (public) test, if that's what you mean by 116.
This is what I mean.
More data is needed to smooth out the variance
But yes the offline is better
I mean mensa.no
Interesting, it increased for GPT-5 but the Thinking model was still worse on the second test.
Yes I'm confused with the thinking result too , I assume it to be much better
we know for sure that gpt5-high is better than o3-high
I think it'll need to be tested more to smooth out variance
Actually, look at o3 Pro's score too. Did it get nerfed?
I expect thinking to do far better that regular gpt5
IQ coincidentally drops by 45% right after GPT-5's release?
that offline iq test is very weird. Thinking version did even worse iirc
Certainly surprising
and why is o3-pro lower than o3...
Oooooof
When I use chatgpt app, I get GPT low/mid or high? how do I figure this out?
wont tell ya.
its my computation. only for me.
and the other higher ups
we need to start gatekeep this more often MY KINGS AND QUEENS
THE COMPUTATION IS OURS ONLY
umm... I have feeling that most of my queries are going to low or mid. I am getting better responses from gemini model 🙁
How do I force high ??
ok try lmarena
just to see if the reply there is diff.
hey
and?
if you use gpt5-thinking rather than "gpt5" and get routed to thinking, the reasoning effort is gonna be higher
Apparently they dumbed gpt5 down for a while idk why tho or if it’s evej true
Even
it's like..
gpt5 = gpt5-minimal/low
gpt5-thinking = gpt5-medium
Gpt 5 pro= gpt 5 high
Pro is parallel requests though
so it would still be more capable even with matched reasoning effort
So he is the best gpt 5 model?
but yeah it could be high... For Pro sub maybe even normal gpt5-thinking is 'high', unsure
Yes pro is. But it's very expensive
Bro do you have a prompt for test ai?
Bro just use it for free and unlimited on Genspark AI
It said "deeper reasoning" if you click on the thinking model bubble. No idea what mode that is
No gpt 5 pro reasons almost double the time
no it does not.
Can you stop and actually use the model?
you are being pranked.
I have on ChatGPT, it's very odd there
I asked it to summarize the chat but it added in a bunch of random stuff that was never mentioned
Gpt 5 pro is best if you dont have pro plan use gpt 5 on lm arena and say to it think very hard
Because that is their top model
Do we know whether the GPT-5 in the leaderboard is medium or high?
I think it should be made clear.
juhm when i say think hard its worse than if i dont say how to think
That's closer to it, but that's not the model on ChatGPT Plus
How so? It has been always better for md
Me
thats why i prefer using poe.com. u can literally pick gpt5 high there
The model on ChatGPT is very odd
We are comparing top model from each company. That is the top model from openai
No this just proves that it's a sh'it test. o3-pro also scores less than o3.
It doesn't matter if it is not gpt-5 high. The "think hard" thing is for the router to enforce gpt-5 high.
gpt5-pro is not even significantly better than gpt5-thinking
Bro no router in api. Also gpt 5 high is just gpt 5 on reason effort hard
Why would it be? All the other models are fine
Wich lmarena gpt 5 is set to high
Not to mention that it tests each model multiple times, which shows changes over time.
My point exactly. There is no pro, reasoning, etc in API. It's just 'minimal', 'low', 'medium', 'high'
Is it on the API?
There is no such a thing as gpt5-pro. It's not technically correct.
ChatGPT 5 Pro: gpt-5 high
ChatGPT 5 Reasoning: gpt-5 medium
ChatGPT 5 No-Reasoning: gpt-5 chat
ChatGPT 5 Default -> Router stuff
It is sad what a world we live in. People are trashing gpt 5 just because it does not satisfy their delulu
I saw gpt pro reason for 11 minutes. Gpt 5 high 5 min
Sos they are diffrent
So
LOL. That's not correct at all.
gpt5-high is just gpt5 with high reasoning effort
Ok wth is actually GPT-5 Pro
Pro is prompting the same model several times in parallel
you can also have any reasoning effort with Pro
wait. source?
parallel compute?
There's no endpoint for it
The model in the pro plan
There's no API for it yet I think but it is avail for Pro subs
Hmm, you mean they benchmarked the model manually?
Yeah i mean they gad to
Had to
It took them so long to do that for Grok
Look up grok4-heavy I think it was documented, the concept is the same here. It's nothing new.
Oh so you saying that gpt 5 pro is just two gpt 5 high?
Also gemini deep think uses similar system
We don't know how many instances exactly, but it's more like 10 of them tbh
I mean, it's nothing remarkable. That's exactly the same score as o3 without Pro.
No no way that would be expensive as hell would surpass the 200 doller pro plan price
Based on this pro is high
like cons@10 prompting except here rating of each individual response works differently. So it may choose a unique response even if it was very different from all the others
you hello guys, is on LMArena chatgpt 5 thinking? (i searched it but coudnt find it)
No it is better
Gpt 5 thinking is gpt 5 medium
Lmarena gpt 5 high
The name has been changed
aaaa okey thanks ❤️
REMINDER!!
POE gives you 1000 to 2000 GPT5 HIGH PROMPTS
This is written like that because you can have "gpt5 Pro (Medium)" or Low. They explicitly selected pro model and then selected high reasoning effort.
for only 22 EURO
Guys can you go and test models and see the reason time and not spread false info?
For gpt 5 pro go on youtube
Yes because there is no api at the moment called gpt5-pro
Gpt 5 high go on lmarena
That's what I'm saying.
if they just wrote gpt5-pro that would be incomplete. You can't run a request through API (they probably have early access...) with pro without selecting specific reasoning effort
Coding scores are a bit confusing
Agentic coding matter more i think
livebench category scores are far from reliable
Yeah
They're closer to the expected performance in Copilot
@autumn cargo same applies for gpt5-pro
you can't run it without some specific reasoning effort
Anyone know the reasoning effort of GPT-5 in GitHub Copilot?
I don't think so. Do you have a source on that? I believe they are using gpt-5 api with "high" reasoning effort.
Idk they must have wrote it somewhere
I just showed you screenshot of o3-pro. gpt5-high is not pro, just like o3-high isn't pro
It's cheaper in the API
At least in terms of input tokens, which is more significant I think
The score discrepancy for low is even more confusing
Maybe the router just decided to send those class of problems to a dumber model
Yes of course o3-pro and o3-high are two different models. I was aware of that. But based on the (now changed) livebench leaderboard and the fact that no api was officially released as gpt5-pro, I though gpt5-pro was the same gpt5 with high reasoning effort (ie gpt-5 high). Thanks for correcting me.
gpt5-low actually looks like a very decent model to be fair...
Does anyone know when the leader board is going to be updated next?
I noticed this when I was testing chatgpt router
when it routed to reasoning the responses were nearly as good as you can realistically expect from a thinking model
even though most definitely this is low reasoning effort
Why does my gemini 2.5 pro print incompletely? Is there a way to fix it?
I see it as the successor to the GPT-4.1 coding model, but the routing seems a bit unreliable to me. The main model is probably good, but sometimes it just drops the ball.
Like how can there be such a discrepancy in results between the two coding categories. I think the router is messing things up
gpt5-minimal is underperforming though. I think that's the main reason they don't give you choice for no reasoning at all
the gap minimal to low is insane
and they can't afford to lose out to gpt4.1 lol
I think it's affecting the routing strategy
I don't think the gap is just because of "less thinking"
gpt 5 base is just bad
it is not less, it is quite literally no thinking at all with "minimal"
it outputs 2 times less than gpt4.1
horizon alpha version of gpt 5 base (juice 0) has a worse gpqa diamond score compared to gpt 4.1 nano
I hope that's not the version in Copilot. It'll be a downgrade.
I havent played the arena in a week. Any cool anonym model right now?
horizon beta (juice 5, likely an early version of minimal) did worse than gpt 4.1 mini on gpqa diamond
Or did gpt-4.1-mini have data contamination
openai didnt mention the horizon models in their announcement because everyone thought they were nano or mini models 😭
I don't think it's bad. It's actually impressive the gains they were able to make with spatial reasoning. It's just that it's hard to make a hybrid model which would be SOTA both when maxed out and with reasoning off.
gpt5-minimal score is low because it's too concise when it doesn't get to use reasoning tokens
they svg maxxed this model among other things. cpt after cpt 🤷 the brain damage was apparent with the horizon checkpoints
simpleqa was 33%...
Spatial reasoning?
but i agree about the hybrid thinking thing lol
I mean if that was true gpt5-high would suck. But it beats o3-high convincingly
no, i agree its probably the hybrid thinking thing causing the poor performance in general for the minimal/not thinking variants
but they definitely focused on svg specifically among other things
because horizon models had poor benchmarks whilst people liked the svg from those
the svg thing is a tangent of mine sorry 🤣
what is the limit of gpt 5 on lmarena direct chat?
but if you look at webdev arena...
someone please answer
???
but is very fast ruins thinking
1415926535 8979323846 2643383279 5028841971 6939937510
5820974944 5923078164 0628620899 8628034825 3421170679
8214808651 3282306647 0938446095 5058223172 5359408128
4811174502 8410270193 8521105559 6446229489 5493038196
4428810975 6659334461 2847564823 3786783165 2712019091
4564856692 3460348610 4543266482 1339360726 0249141273
7245870066 0631558817 4881520920 9628292540 9171536436
7892590360 0113305305 4882046652 1384146951 9415116094
3305727036 5759591953 0921861173 8193261179 3105118548
0744623799 6274956735 1885752724 8912279381 8301194912
9833673362 4406566430 8602139494 6395224737 1907021798
6094370277 0539217176 2931767523 8467481846 7669405132
0005681271 4526356082 7785771342 7577896091 7363717872
1468440901 2249534301 4654958537 1050792279 6892589235
4201995611 2129021960 8640344181 5981362977 4771309960
5187072113 4999999837 2978049951 0597317328 1609631859
5024459455 3469083026 4252230825 3344685035 2619311881
7101000313 7838752886 5875332083 8142061717 7669147303
5982534904 2875546873 1159562863 8823537875 9375195778
1857780532 1712268066 1300192787 6611195909 2164201989
they focused on 'big model' things with the cpt/etc it seemed, svg, web dev, etc. just found it funny they focused on svg. the benefits are definitely real though in those areas. at least with the horizon checkpoints, those models were fried except in those regards
my points are about different things and i clustered them together for some reason and it's confusing sorry 🤣
The minimal reasoning version is #1 on Design Arena too.
But again there is a discussion here https://community.openai.com/t/the-least-important-question-right-now-why-is-gpt-5-pro-not-available-in-api-at-exuberant-pricing/1339471/2 that hints that gpt-5 pro is just a maxed out gpt-5. So really unless OpenAI introduce a new model named gpt-5 pro, I'm inclined to think that gpt-5 pro doesn't exist! List of models here: https://platform.openai.com/docs/models
Let’s read carefully: Pro and Team tier users have access to GPT-5 Thinking Pro, which takes a bit longer to think but delivers the accuracy you need for complex tasks. Then also consider: That gives us a picture of what Pro delivers: more reasoning, additionally, more context window than Plus subscribers. gpt-5-chat-latest , the non-...
Gemini and GLM 4.5 seems benchmaxxed for React + Tailwind
On Design Arena they drop to #9 and #10, since they can't use React
Give me pro or high give me somethingggggg 😭
can someone tell me why chat gpt doesnt host previews of their code?
Whats design arena
is it as good as aistudio builder
Is it part of lmarena?
yes
API is just not publicly released yet.
https://cdn.openai.com/pdf/8124a3ce-ab78-4f06-96eb-49ea29ffb52f/gpt5-system-card-aug7.pdf
Native image gen finally coming to better model than 2.0 flash. Crazy how long that one has been out without improvement
Logan also showed it editing palmer luckeys tweet
What limit?
There is no limit on chatting with the model i think
What is the name?
And how can try it
Lol he is teasing gemini 3 like sama did with gpt 5
The death star
Introducing GLM-4.5V: a breakthrough in open-source visual reasoning
︀︀
︀︀GLM-4.5V delivers state-of-the-art performance among open-source models in its size class, dominating across 41 benchmarks.
︀︀
︀︀Built on the GLM-4.5-Air base model, GLM-4.5V inherits proven techniques from GLM-4.1V-Thinking while achieving effective scaling through a powerful 106B-parameter MoE architecture.
︀︀
︀︀Hugging Face: huggingface.co/zai-org/GLM-4.5V
︀︀GitHub: github.com/zai-org/GLM-V
︀︀Z.ai API: docs.z.ai/guides/vlm/glm-4.5v
︀︀Try it now: chat.z.ai
Ohh
Guys with this i think gemini 3 will drop this week
deepseek is pretty nice but it explodes whenver i ask it about chinese political or geographical scenario
great when it released
now its decent
but the reasoning time is just nasty
what
New imagen model will come out today
at least something to make up for the lack of glm image gen
I will try it as soon as it comes out
Perhaps scheduled for later
They'll give a taste of what's upcoming
holy
its the new image model
i hope it hass image editing
gemini 2.0 image editing was great
Quite sure it will. This looks like the gpt 5 teaser image with image editing
yeah
I already made the request on #1372229840131985540
Just in case
Lol
Today ? 😍
Yes
I feel like Gemini 2.5 pro has boosted in intelligence these past few days idk if it’s just me
its been the opposite for me lol
I must be lucky 😭
I know you are there , just clicking on your name and I saw LM arena and Z .ai
Try again I feel like they’ve made some sort of update
What version do you use the most?
8
14
3
Direct
which ai product has the best lip sync?
i keep getting "Something went wrong with this response, please try again." with GPT 5 Chat
Okay let me look into.
I assume it's just that model you're running into issues with?
it also happened with main gpt 5
but the thing is the gpt 5 chat error is only for 1 chat
could it be a rate limit of some sort
yeah its kinda broken right now
i dont know if its because im giving extremely long prompts
I'm not seeing errors with either of these models, so it may be a rate limit.
You're seeing the same? Has this been an ongoing issue or something you've noticed recently?
im pretty sure im hitting a limit or something
it generates stuff and then it shows that error
although I am now noticing for gpt-5 the responses are coming in pretty slow and lag.
were both of you seeing the same or was it just the error message?
it must be a limit lol
its like 1500 lines of code
this is new
on the leaderboard
it used to be gpt-5
daaamn
yeah but its just on the leaderboard
i guess gpt-5 is using high effort on arena too
Yeah, that's probably the case
are rate limits only for single chats?
Nope
HOW TO CREATE VEDIOS ANYONE?
Bad webdeb
I have an error with opus 4.1 thinking when it repeates the same response it gave to the previous message. And if I try to put a new message, it gives you have to wait for 50 minutes. But the clock doesn't go down. It's 50 minutes each time.
2.5 flash lite still hasn't come back
gpt-5 is really hit or miss with styling, either it comes with something actually good or something like this
strange
I'll flag again, thank you.
Why does my gemini 2.5 pro print incompletely? Is there a way to fix it?
two different chats at the same time btw
whats the diff between gpt 5 chat and gpt 5 high
How and Why did GPT-5 lose two votes?
Just good in math
@echo aurora Any ideas?
Shouldn’t we have a lot more votes since 5 whole days passed? Why did it LOSE two when all the others gained?
oof
No it's not clear to me, but will flag to the team.
Check this tweet. Google had 26.7K, now 28K
Gemini added 1.3K votes. And GPT-5 loses 2? Lol something is very wrong about that
Wow Opus 4.1 improved more than I thought on the none Agentic coding task.
Impressive. Should retest it more. Will Opus 4.1 thinking be on #1? 😮
I dont think so. Gemini 2.5 still beats it.
But it is very impressive to be honest.
I mean, the votes are too low to decide. Lets just wait around for a bit.
any way to fix "Something went wrong with this response, please try again." for one singular chat?
Many experience it, all you can do is refresh chat or open a new one.
i remember after enough refreshes it did generating for like 10 minutes then went back
damn alright
True. Have to agree with you there.
Btw, if you dont mind me asking, which model do you like the best?
why was gemini updated but not gpt?
yeah seems like so
confidence interval points that way
?
you mean this isn't reliable?
bugged in what way?
sorry, I'm new to this
ah
Following up on this: we are looking into and will provide an update when we can.
Thank you!
pineapple do you know anastasios
Vxtwitter where x is and it'll show the vid
May i ask, how many is the token count of gpt 5 on (LM arena)
idk but it stops generating for me after 2000 lines of code
The vote is based on pre-release GPT-5 testing. After GPT-5's public launch, we created a new model entry that points to its public endpoint and collecting more votes. These additional votes we've been collecting aren't yet added to the current leaderboard. We will be merging the votes in the next leaderboard release. cc @deep adder
gemini was never good at anything
Oh okay thanks
It's been 15 mins, I requested a code on gpt-5 and yes I refreshed the website still same.. Hope it didn't bugged or the code is just too long🤣
I believe yes, but will double check and update if that's not the case.
imagen 4.0 what are you doing man
Nooooo
lol disappointing
yes, confirmed.
legit?
so many grifters these days that it's hard to tell what's real
someone claims they found it in "source code"
who posted it
GEMINI 3.0?
This is 100% cap tbh
Definitely made up
Why tf would they put this in source code anyway 😭😭
What was the prompt?
is gemini 3 coming in august
Nein
image editing version probably
of the 2.0 flash image preview
Donno exactly
makes sense
I'm always excited for image model stuff because voting is easier on visual stuff rather than text
imagen series are fine but really bad at understanding prompt
Needs native model like gpt 1 image
Wrong
it was great at writing,long context, analyzing videos
Also it was fine with reasoning and coding
Why did the amount of votes for gpt-5(-high) change from 3182 to 3181 after the update and the votes for 2.5pro increased by 2k?
is gpt-5 already gone from the arena?
We’ve scored highly enough to achieve gold at this year’s IOI online competition with a reasoning system — placing #6 when ranked with humans and #1 when ranked with other AIs.
In just a few weeks:
• 2nd at AtCoder
• Gold medal-level at IMO
• Gold medal-level at IOI
hello viren 😛
Hi
What is eb45-turbo?
look at my video Today!
ok
i only have one vote left! So vote for my video please!
are you confused?
what video?
btw guys gpt5 sucks
gpt5 is the only one who can do video games
Here's a hint! A image to video prompt is in the video-arena-1: A Female woman in her 1950's cartoon was smiling, giggling & talking someone with beauty. Models: Veo-3-audio-fast vs Hailuo 02 Pro.
i can
look at video-arena-1 and vote for my video!
Gpt 5 high is sota right now. Not sure if it can beat Opus 4.1 for coding but its pretty good
Claude must be doing something magical on codes because even benchmarks not looks great, everyone still using them for coding soo
inspect elemnt lmao
imagen gempix
hey guys! Look at my video on video-arena-1.
Not in long context
Gpt5 😂
gpt 5 very high before gta 6 last day 2025
nah
Fake
Or hallucinated AI PR I should say
If that's the cli one from a month back
gemini has a github emoji on lmarena
Where are you seeing this? Can't say I'm seeing the same.
I wonder how much of that is from some sunk cost thing
like once you get used to your current stack and workflow youd need a much higher threshold of improvement than mildly better to just send it all to air and get used to another tool, even if in practical terms it'd be just a slight annoyance to do
its on lmarena direct mode
it just put that idk why
@echo aurora Sorry for this stupid question but in your opinion when Gemini 3.0 will be released?
In 3 sec
Tomorrow
GUIZ GPT 5 IS BAD, BRING BACK 4o!!!!
It's a flawed benchmark, 1 data point for o3 pro underperformed alot which brought down its average
Soon (I have no idea)
How do you know?
The data is not published yet right?
35% chance is not the same as 'gpt5 will beat gemini in remove style control'
I am the only that don't care about the message taken by lm arena for testing ai?
no
I even put my home address and credit card numbers in it
It is probably on that huggingface link
5-pro
Normal 5 scores worse then others
its very scuffed, but no one actaully reads how the data is collected and presented, its bizzare. ME SEE CHART ME BELIEVE
what's happening?
I dont want that sycophantic thing back at all
Ok tnx, you seem like you know your stuff. I've redistributed my portfolio to account for the uncertainty
this is NOT long context
i'd say after 500k token long context starts
Gemini has no competitor
Only Minimax M1 tried to be close
Also yes, long context is important
Espicially when you try learn something, if you are a student, if you need summarize or analyze of some long text, you have no option besides gemini
how so? I'm hearing mixed opinions across the board on gpt5, it seems highly polarized.
Could be true
As a gemini fan im just admitting gpt 5 is best model right now
Better than O3
Also i do showing big respect towards to their "less praise" model choice
After 06/05 goldmane update gemini turned to 4o like praising you for everything
im really not liking this
Yeah
I understand people had very high expectations but
There is so many unnecesary critizing towards to gpt 5
Also im not sure but probably gpt 5 is very efficent model too which is also important
Better than O3 but also cheaper than O3
Oh really
interesting. But they lowered O3 price soo maybe we should compare with first api price
If i not remember wrong gemini 2.0 flash think was before than deepseek R1 but weirdly people didnt care
it was quite good
2.0 was trash but
No
Listen
2.0 was a trash model but 2.0 flash think was 3x 4x better than 2.0 flash so they did really good on that reasoning thing
Even if base model is trash
btw
Exactly. It's a sh'it benchmark
@deep adder market’s pricing in your hypothesis
What? Ai market is a lot more liquid than the one I trade in
¯_(ツ)_/¯
you seem very confident in gpt, can I ask why? I'm new to this so I don't know a lot.
yes
He's trolling
Can someone Tell me Which one is Chat gpt 5 high
@eternal nicheman your pfp creeps me out lmao
I just realized... I'm SO sorry if that's your selfie
love the eccentricity, but you've now intensified my fear
this is the weirdest thing I've seen in my life and i mean this in the best way imaginable LOL. What's he saying?
sings 'Matushka Zemlya'
Слушать на всех площадках: https://band.link/matushka_
"Матушка" (слова и музыка - Пётр Андреев)
Подписывайтесь на соц. сети:
Сообщество ВК: https://vk.com/tatiana.kurtukova
Личная страница ВК: https://vk.com/ts_makeeva
Instagram: https://ins...
?
yes
"gpt 5" is high effort reasoning model
"gpt 5 chat" is not
gpt5-chat is 4o-latest successor, no reasoning
Bluntly speaking it's probably the same as gpt5-minimal
just like 4o-latest kinda sorta was the same as gpt4.1
@ocean vortex do you what is gpt 5 thinking's base model ?
Gpt 5 chat ?
Or gpt 4.5 ?
GPT 4.5 is too slow and expensive
GPT5-minimal. It’s a hybrid reasoning model and there’s no routing on API. Routing only happens on chatgpt website
So like, that same model is also technically the base model
Best model is gemini
They probably would have used gpt5-chat with no routing if it performed better… lol
But now by routing it occasionally to gpt5-low, it can comfortably beat gpt4.1
CHAT IS WIOTHOUT THINKING
what model is best for development and code?
I'm confused about the division between gpt-5-chat and gpt-5 minimal, low and medium, can somebody enlighten me
They all are derived from the same base model. Chat is like minimal except different fine-tuning and probably less RL training
can i choose the model in the vid gen?
Different, AIStudio allows you to edit and save projects, and it has a React environment.
I think Design Arena uses pure HTML/JS instead for websites
no
Large Language Model
i see, and what about the gpt-5-thinking in chatgpt? Is that just gpt-5 chat with more think time?, something like gpt-5-chat-high if we will?
someone here said grok4 is the best at coding
Is this for ALL coding?
@deep adder opinion on grok4?
october
what's hardcoding?
What do you mean? Generating music?
Yes, you have a chance to be banned when your account is made today
Oh wow how come
because of @hollow ivy starting a gemini 2.5 pro gooning cult
They will eat you alive if you say anything negative about gemini 2.5 pro
hallo
How to Delete a Generation
lf you'd like to delete the initial prompt and generation from the bot, right-click the bot'smessage and select Apps > Delete Generation . Note that deleting the originalprompt will also delete its corresponding generation, but deleting just the generationwill leave the original prompt intact
ts so wholesome
you know that this site is a bit deceptive
GPT-5 via API without a system prompt telling it that is it GPT-5 is a bit deceptive
Is it LMArena?
Yes, & it has been shared with the team.
The GPT-5 model without a system prompt does not know that it is GPT-5. This can be reproduced on OpenAI's API playground
WTF THE GPT 5 HIGH ARE ON THE CHAT?
woah way too many gpt5 which one is for what?
Well the best is gpt 5 high
thanks my guy
hey uhhhh
i was off of gpt-5 for a bit
why is there a gpt-5-high?
i thought gpt-5 was high already?
oh that sounds bad
It's the same model, but added the high to make it more clear
Hey! Not sure if this was asked but i see a leaderboard update in text arena yesterday, but did the gpt5 votes remain unchanged? Score think changed a bit, but not votes? Some bug?
Hello
- we did chat about this earlier but will share the response
The vote is based on pre-release GPT-5 testing. After GPT-5's public launch, we created a new model entry that points to its public endpoint and collecting more votes. These additional votes we've been collecting aren't yet added to the current leaderboard. We will be merging the votes in the next leaderboard release.
Thank you, that makes sense; so I guess the score only updated then
guys what differint with vedio arena 1 and 2 and 3 and 4
No difference, but we want to spread out generations a bit or else the channel will get too spammy.
ok nice idea
Hi guys, is the gpt-5-high on LMArena the same as the regular basic gpt-5 model in ChatGPT?
not quite, gpt-5-chat should be the closest to the experience in chatgpt
So, gpt-5-high approximately at the same level as GPT-5 Thinking (medium mode)?
gpt-5-high should be gpt-5-thinking (high mode)
Really? Thanks for the answer
I'm taking the word of this openai employee on twitter:
https://x.com/ericmitchellai/status/1954680194733863200
@benhylak OH. Guh… I did not know we used different nomenclature in API and chatgpt
What’s GPT-5 in api is GPT-5 thinking in chatgpt
What’s GPT-5 in chatgpt is GPT-5-chat in API
🫣
hey
sup
can someone help me, in lmarena its always stuck generating and idk what to do anymore, refreshing the page doesnt work for me
How many of you think Gpt5 will beat Gemini with style control OFF this month?
CI band is insanely wide on gpt5
It's unlimited free because LMArena is paying for u
But it has limitation i mean this likenit has a amount of messages
Of course
Enjoy your data on Hugging Face later
A fair trade.
A fair trade.
No like, really, there must be A TON of data that nobody was intended to see in these datasets right now
Because people don't read and think they're gaming the system by getting gpt-5 for free
Well, they can read my conspiracy theory tests which is im testing them on every new model
Only problem is if they starts to believe
wait... how come you don't have "reasoning effort" for this model?
oh you probably haven't verified your org since you don't have summaries either. But it's weird they aren't letting you adjust reasoning. You may be stuck on "minimal" lol
Which site is this
Seems like he is confused himself lmaoo
This is roughly accurate, except when chatgpt decides to route your request to reasoning when you are using "GPT5". Then it is no longer gpt5-chat.
the model on the leaderboard is named gpt-5-high, it has reasoning effort high
Yes but I'm referring to your screenshot. It's odd that you don't have that option but have access to gpt5.
oh, I'm not sure I didn't actually take that screenshot, someone else reprod and sent to me
The api calls are using reasoning_effort="high"
we are considering giving a (very) small number of GPT-5 pro queries each month to plus subscribers so they can try it out! i like it too.
but yeah if you wanna pay us $1k a month for 2x the input tokens feels like we should find a way to make that happen...
@hardy lion its a bug, you have to reload the page for the option to appear
what i need to do now..
lmao
Hello
thanks!
Oh for sure they are gonna find a way to let people set their money on fire. Let them have what they want!
?
you can reset cookies only for lmarena on your browser to reset the limit
how
what browser do you use ?
btw, what's the difference between beta.lmarena.ai and lmarena.ai
firefox
Just find on internet how to reset cookies for a website for fiefox beacuse right now im on microsoft edge
i have no clue it think its just the same
whats the rate limit on gpt5 high?
i think right now they're the same. but beta would be used in case to try something, new features or to see if it all works
btw guys gpt5 sucks
you seem obsessed with it
gemini 2.5 pro the best
sam is being practical here, am sure there are enough people from a niche circle paying that
I like kimi k2
even kimi k2 better than gpt5
if you pay that for 2 years you could probably get a used h200
its not worth it
gpt 6 coming out next soon, so this is just intermittent luxury foreplay for privileged few
its not that expensive to do inference at scale to justify such costs, even for larger models
not for certain niche circle of people who dont understand the details i guess
I think they already played that card with $200 sub tbh. More than that it's becoming insanity and then where do you stop...
Not to mention that with 1k per month you kinda already could comfortably rent hw to host any model all to yourself
Any proven data?
Lol
Lmarena just change the name of gpt 5 to gpt 5 high or they change the api?
yes
just the name, it was high as a kite since the start on lmarena
Что вершит судьбу человечества в этом мире? Некое незримое существо или закон, подобно Длани Господней парящей над миром? По крайне мере истинно то, что человек не властен даже над своей волей.
Co ty tu pleciesz?
🍆
gm AI fam! 😁
it was gemini 3
The Discord community in RP trolls pissed me off
is gpt5's context limit on the website still 8k/32k/128k?
ee ti moe opisanie so stima spizdil
yes
tvar'
For vibes yes , id say its the best
But gpt-5-thinking or high is the new standard for complex stuff
Ok
is gpt 5 the best ai model now?
No
then what is the best
Depends on what you use it for
Id say gpt thinking. For vibes kimi 2 and claude are better
what about creativity?
