#general
Ohh
LMAO
If GPT5 didn't live up to expectations, why is it doing so well on the leaderboard? And why is polymarket acting like Google won the race this month?
Who said it didn't live up to expectations? I think it did live up to them
It's a great model
the best we have
3.5 gonna destroy 3.0
why is polymarket giving the odds at 80% in favor of Gemini?
Oh, because of Gemini 3.0?
gem 3.0 isnt coming this month apparently
gap seems to be narrowing though
but new 2.5 update
because it's a market for lmarena leaderboard with style control disabled. 2.5Pro still leads in this specific one leaderboard
yes, but if gpt5 is doing so well in everything else, wouldn't it take over soon?
Fortunately i could switch teams just in time
it can but it's close, 2.5Pro is benchmaxxed on lmarena
@candid storm I don't see your name.
and since 2.5Pro is technically leading at this moment on leaderboard with their settings, the odds are that
hm I see
Im RIC25
ah I see you. Grinch127 here
Smth that really annoyed me about openai powerpoint is a lot is just totally incorrect. Like here, 50 is the number of tests they ran not the percent. Imagine getting paid 8 figures and messing up smth so trivial
Nice profit/loss!
And 50 is lower than 47.5
🫡 I'm a bit worried about whether Google will be able to keep the lead until the end of this month
it makes me wonder what other bugs and stuff they have. entire team stayed up all night prepping and cant even spot errors on a bar chart
"Afk money grind" ahh
It's large
It loves language
and it's a model
Ok
are all other models gone
i like im always the very first for literally everything for some reason lol
yup
My website still hasn't updated, same with the app
What are you gonna ask the pro model?
no idea
I really like that "introducing GPT-5" screen
It could be way better but it's a step in the right direction
Hope we can reach a level of aesthetics similar to Web 2 Gloss again
Damn I have Pro and only see thinking
damn nerds
im a plus user and i still only have it on my phone
:((
ye they give me stuff early. sometimes it gives a popup to ask if i want to be early tester for features or whatever
.
people called it crap without trying it? thats kinda weird
bro it's one shotting things that opus was sad trying to accomplish
it's bcs i'm trying with real world shi* now instead of some random create a game or site
The negativity is due to it not matching the gains seen in previous gpt 2 -> gpt 3 and gpt 3 -> gpt 4 and hype posting leading up to release. It's clearly v good. Sentiment might get better
There’s treatment centers for gambling addiction
what about the lmarena rating, though
I haven't tested it thoroughly yet as i dont have it on my computer yet
its still SoTA wdym
it was below gemini for a bit which is a bit concerning
bro i think it's matching, atleast for code
anyways gpt 5 is a great model and its better than 2.5 pro
it was some benchmark i believe
but yeah, i will see y'all tomorrow as i'm heading off to sleep
yeah it's better
it's sota
for me is underwhelming codex cli using gpt 5 on a project with 600k + tokens and identifying right files and doing modifications without creating any bug
and my prompt was not even good, it just did
obviously its sota lol
even in the benchmarks it was usually
lets see if it is SoTA in livebench
true
o3 was lazy
it's not
well, i didnt try it on chatgpt interface yet
for now i'm burning some dollars on the api
i think it saved me 5 hours of work for $2
🤓
When we adding Genie 3 ? 😼
that must be awfully expensive
Upvote it! #1403076345138774066
Is genie two at least public?
still not able to solve simple chess puzzles with human words unfortunately
is GPT-5 Thinking and GPT-5 Pro coming to lmarena?
Same bro, I'm having trouble too. Wait, it will take a while to get to your place
It's just rolling out slowly
Since those people live in the US and OpenAI is there, they got it fast
gpt-5-pro is not even available on the api
so no
and thinking i think already is?
Oh I see
So the pro version aren't on the api?
Like o3-pro
Nope
Dang
it's not necessary btw
they said EoD for Plus accounts
this model takes more than 10 minutes to answer
damn is it just me or is gpt5 pretty meh at UX design?
claude does it a lot better for me
yeah it's meh
so better for backend stuff?
but if you give it an example it's a really good copycat
idk
i can use an example of a design that i like and gpt-5 can create something using that
claude is not so good with this
trying it now
but if you don't have idea of what you want, then claude is better
yeah I mean gpt 5 is useful its just not as versatile as I expected
where are you using it?
its only medium reasoning on cursor rn
why do people even like cursor
I feel like they nerf models hard
everytime
it nerfs
you are not seeing the true potential
bcs they use like 3k tokens and rag
whats ur pipeline
rgn i'm using codex cli
they updated it
the reason i'm testing
claude code still better i think, but i'm not paying for claude anymore so
i think yes
i'm using api credits rgn bcs it was not working
the login thing
but they added it
I stopped my max 20x sub on claude cuz I thought gpt 5 was gonna be way better lol
might sub again
I don't see it on the leaderboard?
it felt so garbage on cursor
well idk
holy fkkkk
even sonnet feels garbage
im sure they have it super nerfed
on cursor
cuz they made it free
bro they made it free, what you think?
for a week or smth
yeah
but like thats dumb no?
people that use it gets a bad taste
like this makes me wanna stay away from cursor more lol
When do you think GPT 5 pro will come out as an api
All right
i'm not saying never ,but idk

probably a few months
but it kinda seems like openai is cooked now if this is all they got
I mean ill try it yeah
setting it up right now
are u impressed?
like actually
how does it fare to opus
i'm
opus failed 5 tries and gpt-5 one shotted a task on 600k tokens project
i didnt send 600k tokens btw, it was using cc and now codex cli
hmm
ur using vs code?
bro it was like 8k lines modifications without creating any bug
yes
yeah that sounds crazy
i've never seen an AI do that many modifications without breaking anything before
its way cheaper than opus too
it's C# btw
ok thats impressive
i ran the test just for fun, i didn't believe in it bcs my benchmark was: if opus and gemini 2.5 can't, then no model can
i liked too
Still sad that O3 will disappear
Btw in free version on android, you can select reasoning or non reasoning
So which reasoning mode running in mobile app for free version ?
Medium reasoning ?
guys
i figured out the true release date of gpt 5
```
gpt-5: 2025-08-05T20:29:37 UTC
gpt-5-mini-2025-08-07: 2025-08-05T20:31:07 UTC
gpt-5-mini: 2025-08-05T20:32:08 UTC
gpt-5-nano-2025-08-07: 2025-08-05T20:38:23 UTC
gpt-5-chat-latest: 2025-08-01T18:35:06 UTC
gpt-5-2025-08-07: 2025-08-01T19:09:20 UTC
```
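For anyone who wants to check the ordering themselves, here is a minimal sketch that sorts the timestamps pasted above and prints how long after the earliest entry each one was created. The values are copied from the message; what the timestamps actually represent (presumably when each model entry was created) is an assumption, and the names `CREATED` and `parse_utc` are just illustrative.

```python
from datetime import datetime, timezone

# Creation timestamps exactly as pasted in the chat above.
CREATED = {
    "gpt-5": "2025-08-05T20:29:37 UTC",
    "gpt-5-mini-2025-08-07": "2025-08-05T20:31:07 UTC",
    "gpt-5-mini": "2025-08-05T20:32:08 UTC",
    "gpt-5-nano-2025-08-07": "2025-08-05T20:38:23 UTC",
    "gpt-5-chat-latest": "2025-08-01T18:35:06 UTC",
    "gpt-5-2025-08-07": "2025-08-01T19:09:20 UTC",
}


def parse_utc(stamp: str) -> datetime:
    """Parse a 'YYYY-MM-DDTHH:MM:SS UTC' string into an aware datetime."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S UTC")
    return parsed.replace(tzinfo=timezone.utc)


# Sort the entries from oldest to newest.
ordered = sorted(
    ((name, parse_utc(stamp)) for name, stamp in CREATED.items()),
    key=lambda item: item[1],
)
earliest_name, earliest_time = ordered[0]

for name, created in ordered:
    # The offset is a timedelta relative to the earliest entry.
    print(f"{name:24} {created.isoformat()}  +{created - earliest_time}")
```

Sorted this way, the two August 1st entries come first and the rest cluster within about ten minutes of each other on August 5th.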
ofc but how long it would take gemini to catch up? gpt 5 is newer after all.
nah..
gpt-5-chat-latest was made on august 1st lol
so
yall
so on lmarena we are using a version thats 2 days older
connect the dots of the models which released like horizon beta n stuff
What about gemini?
can someone connect the dots and using this info figure out which of these models zenith, horizon beta etc could be
its way below
wasn't livebench laughable
is livebench legit now
really?
💀
Not in all aspects
Clearly
The benchmark is private
There are 10 public questions and 400 private ones. The public ones are not used in the testing
grok 4 is actually an amazing model
Guys, on mobile, which reasoning mode is gpt 5 thinking using? Medium reasoning or high? Or do we need to buy plus to be able to select high reasoning
idk how the reasoning levels work
the api just says gpt-5
what about the gpt-5 in lmarena
oh, I thought thinking isnt available on free plan at all since they removed model selection
hey guys, so did anyone figure out here what zenith was?
still waitin
fine ill try it
No one can
it was deepseek r2
no, I posted this:
a
Oh ok
this way we can find out what zenith could be
gotchui
holy
^
i know xd
gpt-5-chat-latest
yeah pretty sure
created 1st of august
likely just an edit of zenith
dingdingding
so zenith was not a thinking model?
How do people fall for the simple bench 90% GPT5 "leak". They must be incredibly dumb
thinking models are not separate
^
Is zenith gonna be on the leaderboard?
Actually makes me lose faith in humanity to see the sheer number of people blindly believing it
Without even a second of critical thoughts
why
wow so when normal gpt 5 thinks in chatgpt it's zenith?
zenith was so good
so, zenith was probably gpt 5, summit could be mini and lobster could be nano
there are 2 separate gpt 5 models though
they said summit is gpt 5
how do we select how much we want gpt 5 to think in lmarena
lmarena mods
gpt-5-nano was made 4 days later
you dont
o
is anyone using gpt 5 codex cli rn
fr yes cap
A heroic white police dog with shiny blue eyes, wearing a full police uniform, is bravely rescuing a brown rabbit from drowning in a fast-flowing river. The dog is standing in the water, strong and determined, holding the frightened rabbit gently in his mouth while people watch from the riverbank with admiration and awe. The scene is realistic and emotional, with splashing water, dramatic lighting, and a clear sky. The dog is the hero of the town, and everyone loves and respects him
well yikes
grok 5?
i thought it would be grok 4.1 or 4.5
since its just a coding version
i just realised the placement of the lines might actually represent the exact date of release
@deep adder question 10 is solved
thats what musk said
Brilliant
got this result with gpt 5 thinking twice! i guess using that logic, gpt is really good at figuring out the date but seems like my theory is either wrong or the model is delayed
its probably just gpt 5 high
hey thats cool
no signup too
deep research pops up on signin
cool
gpt 5 in microsoft copilot sucks
really?
yes
which site?
woah
you basically get points from rating the AIs
and you spend them points to use the ais
gpt-5 on microsoft copilot is great
but you have to ask it to think deeply first
so, https://obl.dev
also in poe app you can select high reasoning mode
so apparently zenith was gpt-5 too, but another version, and summit won for some reason and they killed zenith
😢
whats that
RIP zenith
plz gemini 3
how does the reasoning effort of gpt-5 work?
the gpt-5 of microsoft copilot is smarter than the one in lmarena
Interesting
yeah that website sucks
yupp.ai is miles better
if we talk about using paid models for free of course
if i use gpt-5 to prompt engineer a prompt for gpt-5 🤔
man, thanks for this. I just used it for o3 pro. It's nice to be able to test this
btw on poe app, you can use gpt 5 high reasoning with free for multiple prompts
dont use it too much tho
why
I’m scared of what the future might bring for ai
its too expensive
also one thing
if you go here and make your chat public it reduces the cost by half
Interesting
yes very interesting
another nice information, ty !
microsoft launched a 3d model website
LM arena is still my beloved but, its interesting to see someone trying to be competitor
lm arena still wins
its free
no sign up
what i like about yupp is that it has a lot of models
DUUUDE WHAT THE ####
https://123.nekoweb.org/ai/GPT5/COPILOTMARIO.html
THE ASSETS LOOK SO GOOD
even though jump doesn't work
where did you make it
copilot!!!
yep
e
@stray aspen did you sign in?
yes
it hasnt rolled out for my account yet
probably because i created it outside of canada
ohhh interesting
im in ireland
yes
does it cut off
It's faster than LMArena for me
that fancy UI makes it laggy
thats crazy
I'm on mobile, Web Dev Arena throws an error like 50% of the time
Sandbox fails to appear, voting button disappears, lol
ChatGPT app can't copy
Claude input bar still buggy
Why do all these AI apps have so buggy frontends
Gemini is so funny
It manages to recreate an entire component from minified and obfuscated React code
But then gets stuck trying to figure out how to make an inner div fill its parent
All it had to do was remove max-width
yupp has the o3-pro model to use. i found it interesting, I'll try it later.
no lol
its old
@stray aspen @thorn valley @verbal nimbus check this out
opus 4.1
Whoa cool, it runs on mobile haha
But I can't interact with it
really?
Yeah, no lag
i'll try gpt 5 nano to implement nipplejs and a circle button for jump for mobile controls
just yesterday i could upload images in direct chat. i used gemini 2.5 pro but now it doesn't let me and only shows the error. anyone know why?
IDK what's up with Web tech, nowadays stuff loads faster on mobile than my gaming PC
no lag here too
:D
well optimized, to tell the truth
its a whole game with sprites, sounds and levels in a single html file!
but I can't interact either
incredible
i'm trying to get nano to implement mobile support
it might be janky but
@thorn valley @verbal nimbus can yall check if there are mobile controls now? https://123.nekoweb.org/ai/GPT5/yuppmario2.html
How can I use picture to picture.
Context Arena Update: Added GPT-5 (Thinking, 08-07) to 2needle (#1 @ 128k AUC), 4needle (#1 @ 128k AUC), and 8needle (#1 @ 128k AUC) leaderboards! Also added GPT-5-Mini and GPT-5-Nano. (https://x.com/DillonUzar/status/1953660295559192919)
More model results at: http://contextarena.ai
Overall GPT-5 is great for <=128k! Only exception is 8needle, Grok 4 still performs much better at <=32k compared to GPT-5, but GPT-5's performance at higher context wins out.
2needle: Top results (AUC @ 128k):
- GPT-5 (Thinking, 08-07): 96.7% (#1)
- GPT-5-Mini (Thinking, 08-07): 92.6% (#2)
- Gemini 2.5 Flash (Thinking, 06-17): 91.5% (#3)
- Gemini 2.5 Pro (Thinking, 06-05): 89.6% (#4)
- Gemini 2.5 Flash (Non-thinking, 06-17): 81.7% (#5)
- Grok 4 (Thinking, 07-09): 79.5% (#6)
- o4-mini (Thinking, 04-16): 76.0% (#7)
... - GPT-5-Nano (Thinking, 08-07): 44.2% (#34)
8needle: Top results (AUC @ 128k):
- GPT-5 (Thinking, 08-07): 50.3% (#1)
- Grok 4 (Thinking, 07-09): 48.4% (#2)
- GPT-5-Mini (Thinking, 08-07): 44.7% (#3)
- Gemini 2.5 Pro (Thinking, 06-05): 43.9% (#4)
- Gemini 2.5 Flash (Thinking, 06-17): 33.5% (#5)
- o4-mini (Thinking, 04-16): 30.8% (#6)
- o3 (Thinking, 04-16): 27.9% (#7)
... - GPT-5-Nano (Thinking, 08-07): 11.9% (#22)
niiice
thats great
anyone know when the next set of votes will be added to the overall rankings?
gpt 5 yaps too much
Anyone know what was up with Zenith? It seemed even better than Summit, which was GPT-5
When do you guys think Gemini 3.0 is coming?
yeah, I dunno... my guess is 3.0 is pretty much dead in the water unless someone else launches a more powerful model
how many prompts did it take?
or did u make it ur self?
one
craazy with opus 4.1 or gpt 5?
4.1
idk why i uploaded it to the gpt 5 folder lmao
yeah thats why i was asking
is opus better for coding than gpt 5
trying this
5 minutes ago https://www.youtube.com/watch?v=boJG84Jcf-4
Our smartest, fastest, most useful model yet, with built-in thinking that puts expert-level intelligence in everyone’s hands.
LOL
wait wtf thats literally my generation what?
LOL
DUDE
nvm i think..
In my test GPT5 code is next level
how often does lm update?
Yup, but it's more buggy than the previous version. If you're on PC, you can enable mobile preview on Dev Tools.
DeepSeek R2 is supposed to come out this month
What is the difference between chat gpt 5 nano and gpt 5 mini?
Is the gpt5 on lmarena a thinking model or not?
Source?
thinking
Cool thanks! I'm trying it right now I'm very impressed, anyways sorry for many question but do you know what variant of gpt5 is this?
lmarena have basic gpt-5, gpt-5-mini and gpt-5-nano
Oh cool! I thought there's only 1 gpt 5 model.. Thanks !
I see them described as
- gpt-5-mini: cost-optimized reasoning and chat; balances speed, cost, and capability
- gpt-5-nano: high-throughput tasks, especially simple instruction-following or classification
Yeah we added the other two a bit after we added 5
Yep i see that now, Thanks!
@echo aurora Its hard to keep up with stealth model names. Can you show a stealth leaderboard?
what is this?
Hmmm...... Maybe they can't really search the web.
Is there other models which can search the web in lmarena?
I can share but overall I think that'd be tough to do. Are you getting at a leaderboard of just stealth models or just a list of all active stealth models?
The two models can't search the web, is there other models which can search the web in lmarena?
idk
I would prefer the first option.
The elo scores are never revealed to us until model release
Gotcha, yeah I'm sure there are some creative things that could be done with that.
Maybe let companies, who provide those anon models, decide, whether they want to disclose scores / show model on leaderboard or not... Some anonymity or mystery is fun, keep it that way.
GPT 5 is not Skynet..... 🤬
gpt 5 is insane
People say gpt 5 will have another update in a few days, is that real
GPT-5 only won 33% of the 49 battles against Gemini 2.5 pro if I am reading this correctly? https://lmarena.ai/leaderboard/text
Is it normal that gpt 5 is not formatting the code? Or is that on the lma side?
yep. next line gemini 2.5 wins 67% of times
Lol openai still let you use old models on pro plan
Gpt 4.5 is not dead 😮
i really wonder what's taking them so long to roll it out to plus, especially considering i already have it on my phone
Clear cookies 🍪
HELL YEAH
And people are reporting that apparently gpt 5 is dumb on chatgpt
The gpt 5 chat model is dumb
one caveat is that that is only for decisive votes. ties are filtered out. 67/33 is pretty steep, but of their 49 battles, we don't actually know how many they tied. It's technically possible it was 1 gpt-5 win, 2 gemini wins and 46 ties. We should find a better way to report the ties.
A simple one could be to just count it as 0.5 for each. Any suggestions?
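To make the difference concrete, here is a rough sketch comparing the two ways of reporting, using the hypothetical extreme case from the message above (1 win, 2 losses, 46 ties). The function names are made up for illustration; this is not how the leaderboard code is actually written.

```python
def decisive_win_rate(wins: int, losses: int):
    """Win rate with ties filtered out (decisive votes only)."""
    decisive = wins + losses
    return wins / decisive if decisive else None


def tie_adjusted_win_rate(wins: int, losses: int, ties: int):
    """Win rate where each tie counts as half a win for each side."""
    total = wins + losses + ties
    return (wins + 0.5 * ties) / total if total else None


# Hypothetical extreme case: 49 battles, almost all of them ties.
wins, losses, ties = 1, 2, 46

print(f"decisive only: {decisive_win_rate(wins, losses):.1%}")            # 33.3%
print(f"ties as 0.5:   {tie_adjusted_win_rate(wins, losses, ties):.1%}")  # 49.0%
```

With ties counted as 0.5, the same record reads as close to even instead of a 67/33 blowout, which is the distortion being pointed out.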
damn, now i got it on my pc too, thanks
It is "Battle Count for Each Combination of Models (without Ties)", so I am assuming ties removed from it.
oh you're right! it ought to include tie counts in both places imo
That's for style control, which takes more into account than just the raw wins/losses. You can see gemini's head-to-head win-loss advantage reflected in the non-style leaderboards where it is actually above gpt-5
ChatGPT literally got worse for every single Plus user today.
There's no way to reliably get thinking models anymore.
Before we had o4-mini, o4-mini-high and o3.
Now we have GPT-5 Thinking with 200 messages per week and a router that exclusively routes you to some small and
Plus users cooked
Im very glad to hear this. Thank you.
could anyone explain what is "style control"?
is it system prompt?
Apparently it removed emotes and other ‘human’ elements.
first test of the day on pc, and i can already say it isn't that censored which is nice
it also isn't nearly as lazy as o3 was, wow
What have you found? 👀
It's explained in this post: https://news.lmarena.ai/style-control/
The general idea is that research has found that even if two responses contain the same information, people will vote for ones with more "stylistic features" such as markdown, lists, bold etc.
It's even been found that people will vote for more stylistic responses even if they are inaccurate or wrong. Some companies did RLHF too hard and their models were optimized just for responses that look good
So style control learns two sets of parameters: the model strengths, and the importances of the style features. And then the model strengths are actually interpreted as "the model strength if all style features were equal". Those are what is reported on the style controlled leaderboards, which are the defaults. It's similar to controlled trials in medicine where they correct studies for differences in age, or other factors.
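As a rough illustration of the idea (not the actual leaderboard code), the two sets of parameters can be thought of as a Bradley-Terry style win probability with an extra term for style differences. All numbers and names below are hypothetical, and the real pipeline fits these parameters from votes rather than taking them as given.

```python
import math


def win_probability(strength_a, strength_b, style_a, style_b, style_weights):
    """P(A beats B) from model strengths plus weighted style differences."""
    style_term = sum(
        w * (a - b) for w, a, b in zip(style_weights, style_a, style_b)
    )
    logit = (strength_a - strength_b) + style_term
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical parameters: model A is slightly stronger, but model B
# writes longer, more heavily formatted answers.
style_weights = [0.5, 0.3]  # assumed importance of length and markdown
style_a = [0.0, 0.0]
style_b = [1.0, 1.0]

# Raw matchup: B's style advantage outweighs A's strength advantage.
raw = win_probability(0.2, 0.0, style_a, style_b, style_weights)

# "If all style features were equal": only the strengths remain.
controlled = win_probability(0.2, 0.0, style_a, style_a, style_weights)

print(f"raw: {raw:.3f}, style-controlled: {controlled:.3f}")
```

In this toy setup the raw probability favors B while the style-controlled one favors A, which is the kind of gap between the two leaderboards being discussed.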
it was able to 1-shot optimize a bot system for a game, and it did by a lot
with no resistance whatsoever
Hello 👋
hi
Wth
After I hit the limit on gpt 5
It switched to gpt 4o mini
Not gpt 5 mini
Are they high
@echo aurora Yo a question. I found GPT 5 on Copilot, but is it the same as the one in the ChatGPT app?
Does anyone know if in the future it will be possible to upload files to the LMArena project?
Is possible upload files
What are your thoughts on lmarena using your data?
3
4
3
I'm extremely careful not to reveal any sensitive infor
Dang. I'm gonna miss 4.5 🙁
I think it only allows images, not files as such. I wanted to try with a Word file but it doesn't let me
yeah it's the standard model
I am pissed off. People on r/Singularity are posting about their rage about GPT-5 and other models leaving... Then unsubscribing. Some reasons were: It has less personality, answers are too short, it wasn't as big of an improvement... Etc.
I really do not understand how people are like that. They are surely going to update the model like 4o to adhere to people's needs.
Sorry. A bit angered at the moment.
yeah it's something we're putting thought into, more file upload options would be nice
So is GPT 5 medium?
is it -chat or -thinking ?
I have same question too
Lol
Also, can anyone confirm the exact model version/variant of GPT-5 that is available via direct chat in lmarena?
I'm going to double check and will followup
SHUT UP
I am trying lmarena web for image generation with a prompt. Somehow most engines nowadays create only a square ratio even if 16:9 is mentioned. Can anyone help with how to specifically force 16:9 or 9:16?
seems to me like its the default one
not the -chat ver
Look at mr fat cat here. Did you go long on OpenAI?
I went long on google man. It’s imperative Altman’s products lose to Google
With thinking?
Or not
So the most most basic gpt5 model nothing else, no thinking?
Dang. I'm gonna miss GPT 4 preview 0314 🙁
In my case I would like to use it for school, it would be great
it would be nice, but why arent you using gemini on aistudio for that?
best model ever 😭
i still think the current gemini is better than gpt5
im using gpt 5 high on cursor, through openai api
is it still nerfed if I do this?
it feels nerfed lol but maybe im just expecting too much from gpt 5
someone will tell me how to generate videos here ?
Guys the thinking model of chatgpt is not gpt 5 thinking
The output is worse 100% of the time
There is something wrong with chatgpt interface thinking
😡
It is, cursor will always use rag, except when using max mode, and that doesn't work with your own api key
Wow i tried vibecoding with gpt, it refused and says the feature I'm requesting is impossible for the current environment, I think that's what set apart between gpt5 and other models.. Other would just most probably agree and waste alot of tokens and my time
yeah but I used it in vs code and it has no internet access
@echo aurora just out of interest, are there any plans for a 3d llm arena?
mc bench
why my videos doesn't have background sound,
goes to show that hassabis is the only one not bsing
I'm assuming its copyrighted
Then why is the video only 4 and a half sec long, not 5 sec🥲
GPT5 model not working right?
i just ask any questions but only get "Something went wrong with this response, please try again." response
Hello
Try sending a second message on the same chat don't retry it. Works fine for me
no, not work for me...
Yeah and usually it happens to me when my input is long
Try to reload the page
"Its not gonna be a router" they said
@scaling01 btw model auto switcher is apparently broken which is why it’s not routing you correctly. will be fixed soon
😆
What is google bard?
old gemini
The OG google model
I tried it when it was new. It was only good for basic stuff
no coding or such were good
Would be nice to see it back for "nostalgia" vibes
google bard is an absolute joke
it told me it can't produce a script in a coding context since it can only generate text

does anyone know how much worse gpt 5 mini is to gpt 5?
Gemini in the Gemini app is better for studying than ai studio because it has a lot of good features
Did open ai remove the old features like search and edu mode from chatgpt free tiers who knows ?
You can always put them side to side on LMArena and try the same prompts and see the difference. Benchmarks can be a hit or miss at times
oh i didnt notice they added the mini and nano versions
thanks
alright. so have u found out what is best way to use gpt5 for swe?
is it cline, cursor, poe, chatgpt?
hi i love lmarena :3
Hello again, does anybody know what this pre-release model Velocilux could be? Came by when I was doing battle mode.
I got gpt5 to leak its system prompt to me
so far so bad with the gpt5 in chatgpt
it keeps trying to access variables before even initializing them
PLEASE make it so that I can share raw github with the ais 🙏
i noticed claude doing that aswell
chatgpt makes it easy
you can just make a zip file and upload it
then it will extract it and view the content
why chatgpt 5 talks like college students write note in their lectures?
system prompt mb
the model on lmarena actually wrote the code without minifying it
i dont know why this model does it
or well to an extent
gpt5 sucks
i have memories and all off so that shouldn't influence anything either
to be fair i didnt really use a three essay prompt, but i think it should pass this (nothing spawns, tried multiple prompts to fix it and nothing)
gpt-5-thinking in chatgpt.com ^
Does the gpt5 model in lmarena use high reasoning or the medium one? Thanks for any answers!
yes
Alright thanks!
What is considered to be good code? Noob noncoder here
it's not really about the quality of the code itself, it's the fact it started minifying and making everything hard to read right away, claude NEVER does that
feels lazy to me personally
ive seen a video comparing gpt 5 against opus 4.1 with pretty similar prompts to this, and opus 4.1 beat gpt 5 by a mile
Ah, okay. From what I understand, commenting the code is important
Holy jesus
I'm not sure, but i feel like gpt-5 was nerfed
same feel
I mean the max reasoning only?
yep
in #1372229840131985540 already requested to non-think model
when it reasons in Copilot/gpt website it doesnt take so long
as much it takes in lmarena
idk
I wonder which one is nerfed
maybe because lmarena is not using the same provider as official openai
i don't know
I wonder if Copilot uses a nerfed model or not
as everyone hates copilot
idk
What are the models in battle mode?
random models
still no GPT5 on my plus account..
GPT‑5 is available to all Plus, Pro, Team, and Free users starting today with access for Enterprise and Edu coming in one week. It may take a few days to roll out to all Free users.
- Pro users get unlimited access to GPT-5 & access to GPT‑5 Pro, ideal for the most challenging,
well, wait, it's not all rolled out yet; it was rolled out to me today for my Pro subscription
lmarena is actually insane bruh
how come?
elaborate
i have it on my mobile but not on desktop
just the damn concept
so many models, many of them paid or limited, image gen, web dev and now video gen all at one place
for free
wthelly
neither for me
I wish there was more promotion about the service to get people test it more
and get more data
but then enshittification would occur
i like it as it is
wdym? All kinds of people test it already
not just coders or specialists
isnt mainstream
I am an ordinary day to day user myself
yea me too but it isnt really that popular
to be considered "popular"
I wouldn't call it enshittification. Dynamic testing is the future for AI models. Though I do understand the concern of models getting nerfed from intentional bad voting
trying to sabotage the list as LMArena gets more popular
I checked the traffic yesterday and it has been increasing steadily
qwen 3 randomly popping off at 10000 points
when i asked the opus 4.1 model it said it was sonnet 3.5
😭
many models do that
probably 5pro
even chatgpt on it's official page when asked says it's powered by gpt 4 with a knowledge cut
off at oct 2024
ah so it doesnt actually matter then?
OAI really made users have the same limits as with o3, even though it significantly reduced its costs internally
disappointing
no no the same it's the model shown in the tab
Check my website, it's just a driving game, it would be cool if you want to try it https://id12b.github.io/
Does GPT-5 make images too?
and its maked by gpt 5 you can not think how good it is
Hello?
Cannot say that it is phenomenal
Quite the same except for hallucination rate
And perhaps coding performance and different formatting when writing an output
Gemini 3.0 is where it is at when it comes out. Also Deepseek R2.
Is GPT5 MoE?
just try to clear your browser cookies, trust me it will do it
i had the same thing i just cleared my cookies and its done
Is GPT-5 slow at answering? Sometimes I get errors and need to refresh the message and the website.
it depends on the message, since it may think on some messages; for easy messages it just doesn't think
Try clearing your browser cookies, or it's just that openai servers are busy
When I generate videos some vids are long 8sec,but some have 5sec , why?
however the openai status page says ChatGPT:
Degraded Performance, so just so you know
made*
hello
Hey
Large Language Model
Is king fall definitively better than GPT5
So it costs the same as Gemini api when Gemini is free? Lol what a joke
guys, is there a method to generate video with veo3 audio?
FYI
Whats the difference in performance between free tier and paid tier? The only difference I found was deep research feature
Talking about the Gemini web btw
okay thanks lol, its crazy how this website is hosting so many paid models for free, its great
i just wish i can use it on cline tho
no difference from what I've seen...
зет
good morning everyone
could you give me a feedback about GPT-5 please, I still don't have it on my end.
u paid or free version?
Paid; Plus tier
i have plus tier but don't see it on my models list
How can I do that on my iOS app?
or control shift r
try updating your app
or connect to a diff wifi, sometimes that works
but gpt5 has been good imo
my new daily driver
erm
I have tried those methods but still don't have it
maybe delete and reinstall?
i just updated my app
Is there another way to update the app than from the app store, because it says open rather than update
just use microsoft copilot at this point lmao
u shud delete and reinstall
i still don't have it @inland cedar
So where is Qwen-Image in leaderboard?
google share is up a bit.. is it gpt-5 effect?
i dont know guys. im coding since this night with gpt5 and i feel really good with it. it doesnt make many mistakes like 2.5 pro just because for example doesnt hallucinate. im making a program in python, so it's not web
yes its amazing
hmmm 5 pro is really good at counting letters without reasoning
you used pro for this question?
I mean why not, doesn't seem there's a limit rn
why not normal gpt-5 instead
just testing it ig
not helping. Plus - still dont have
i am in the same boat
Try a hard math equation to test it for real
I have plus and had it since yesterday around 3pm
i dont use the app version tho, only web
have both versions - no upgrade on either yet
do you guys think grok 4 heavy is better than gpt-5
Guys i have a problem with chatgpt. Once i hit the limit on gpt 5 it switches to gpt 4o mini
No gpt 5 mini
Gpt 4o mini is garbage
do the paid models in the website have any limits when youre using them?
how long will gpt-5 be SoTA
5
9
1
until gemini 3 release
guys gpt5 sucks
no thank you
You still going at it?
It is good idk why people hate it
Lie
no
DEMMMMMMMMMMMMMMMMMMMMMM
I uh don't think you actually worked with the models thats why
gpt-5 has gone to the top of the list, this is an old screenshot posted here bro
0/10 ragebait
bro
gpt5 sucks
theyre the same rank
Yeah whatever you say
dude
Say something new, dude.
why
Ignore this guy
okay lol hes just tryna get people mad
just accept that gpt5 sucks
Yeah man
nuh uh
i am happy
Bro is tweaking
no style control
i cant believe somebody has the free time to actively ragebait on discord for like 2 minutes worth of entertainment
so
so overall gpt 5 is better
why
google might just make a AI that will just take over the internet
Why is GPT-5 so good at coding now then?
skynet
cant wait for gemini 3 tho, theyre cooking with it
because it sucks at text
I agree with gemini 3 beating gpt 5
I am sure about that.
yeah because gpt5 sucks
i agree claude 54000 will be better than gpt 5 and all other models made yet
I mean google is just so ahead. Genie 3
claudes good but like the price is craaaazyyy
who is it
lmarena gonna gib us free access
Would be nice to see it used in game development and such
if gemini 3 releases with a lower price range than claude right now and pretty much guaranteed the best ai model overall, then anthropic is in trouble
cmon dude
everyone who said that already changed their minds
Bruh for sure will happen.
they are brainwashed by 5g
tbh wait till gpt 6 comes
we Belarusians are peaceful people, wholeheartedly devoted to our native land
do you guys think Grok 4 Heavy is better than ChatGPT-5?
hell naw
and its horribly expensive
Noooo
yh
absolutely not
just to post antisemitic tweets 😂
That garbage is overpriced af
what are yalls thoughts about gemini 2.5 deepthink
grok 4 heavy is just a lot of groks talking to each other
Performance and intelligence wise?
Idk why people glaze it so much like its so expensive
All
I think gpt 5 pro beats it
Which is better?
Gemini 2.5 deepthink is catered to logic and math though
Gpt 5 pro
The benchmarks says it
In the last humanity test or sth gpt pro still outscored grok heavy im pretty sure
ChatGPT-5 is better than Grok 4 Heavy?
I know
And cheaper
Is it worth the 307 CAD for Pro Plan?
Yeah no
Man i cannot imagine google not beating gpt5. They just have to with gemini 3. If not they fall behind a lot
80 gpt 5 prompts every 3 hours
Then how to use ChatGPT-5 Pro
You cant without a pro sub
But its not worth it generally…
Its a big plan for minimal increase in performance
Theres only 1 possibility and thats if the ai peak has been reached
I've got ChatGPT-5 and ChatGPT-5 Thinking in Plus plan
Like it will just slow down
Thats good enough mate
I doubt that
Pro has like a 4-5% increase
I wanna know where can I use ChatGPT-5 (high)?
But still though i felt like gpt 5 was focusing on web dev a lot
Api
I think
Im not sure
freee?
Isn't it the same as ChatGPT-5 Thinking?
anguilla must be getting rich from the AI domain
yes
I think you can use high there ? Some dude said chatgpt thinking is locked at med
In app
If anthropic keeps being obsessed with alignment crap then they’re definitely doomed, instead of obsessing with alignment they should focus on their competitive edges, expand them and scale while keeping the costs fair and reasonable…
maybe they know what they’re doing? 🥺
I think OAI was wise to deprecate all of the old models. If they didn't, their capacity crunch would be much worse
is yupp better than lmarena
where "there"?
On the api mate
no its limited
but it has a lot of models
and you need google signup
I think that people dont include like the response time of gpt into account
Link?
Or the fact that they improved the webdev score by like 200
I think openai definitely doomed itself by not focusing on the webdev enough…its so good
I would say for like a simple task it would be faster from 10 seconds or more ?
O3 always thinks very hard
Depends on the prompt
If its a thinking prompt
Relatively the same time
If its just a general prompt
Gpt 5 can be faster by a lot
gemini 2.5 pro better anyway
I mean i can test rn if you want me to test a prompt. I have gpt 5 on my ipad and o3 on phone
Its not
it is
gpt-5 is greater than gemini
Yeah ? If you dont use webdev that is