#general
guys competition is good.
US were created by Europe lol
yeah "americans" are just bunch of europeans
only came to the land few hundred years ago
why is this thing not working
What is currently unfolding in US, Europe (or EU) has already lived through and hopefully put behind...
lets not go down this road pls
I get it here is not the place for political stuff
google needs to lock in
lol gemini 2.5 deepthink is probably better than gpt5
how do you know that
i have
it's miles better than o3 pro
i have completely switched from chatgpt to gemini usage because o3 has been such a poor showing
gemini is completely free with almost no usage limits at all
o3 pro thinks for 20 minutes and is like at max 5% better than o1 pro
the deepthink is very limited
they conveniently added a nice feature to google just in time for the gpt5 launch. If you google "time 10am pt" it will convert to your local time. How nice of them 😇
i love google
gpt5 in 20 minutes?
only 2 hours and 20 minutes left
no
guys why does it feel like openai slowed time
gpt5 isn't even out lol
i did but they nerfed the hell out of their pro model
yes i noticed that
gemini 2.5 pro was so powerful before when it first released which shows google true strength

gpt 5 - pricy
gemini 3 - free
good competition is always good for us
gpt 5 50 euro per token lol
openai has already lost on API and enterprise
lol wtf
anthropic
claude is like a human when it comes to coding
why would you use a 10x more expensive model
claude is so expensive. someone needs to beat them at coding to challenge them
the markets are idiots
how is it?
show us
lmarena no style control google is dominating
it's not even close
lol no they arent
Source: Trust me bro
theres somehow a 75% chance openai has best model by end of august on polymarket 😂 😂 😂
based on lmarena no style control
i did
$1k
Well, no one need to believe me 🤷♂️
it's my first time betting on polymarket
thats what a liar would say bro
no proof at all
LMFAO
you cant do noone of those
??
wall street is better than polymarket traders
HF analysts actually know how to do math
a lot of smart money has googl rn
like a loooot
When gpt 5 was announced?
It's a small enough market that I could move the market
lol
imagine if it does not
impossible
what is st
wall street
yes impossible. that is the reason i used the term "imagine"
I wouldn't read that as "the market thinks it's a 94% chance"
all of the good HFs are subscribed to semianalysis
I think it's gonna be made avail immediately. Too big of a release not to. At the very least to their Pro subs same day. But likely more
and have really smart analysts constantly looking at compute & energy that google vs openai are building
whereas polymarket is people who invest based on twitter vibes
with this much hype they have to release a version to free users
Semianalysis is good if you want to invest in the supply chain of the relevant companies
i believe u
@blazing bison is saying a lot of baloney
hoW?
Proof?
he would never do such thing
this guy is such a troll
Im not the only one with access btw
theres no gpt 5 on copilot yet
ye
do u have the smart mode thing on copilot?
i dont
You need to change things in frontend to access
Yes
ye
i saw some other people get it
was it just random or did u have to do something?
Random
guys i have gpt6
i have gpt-6 on copilot rn
Idk too much limited
bro you're far behind I have gpt 7
guys ive just been granted access to gpt 8.5 pro high max reasoning 1 billion context
and why not LOOL
Idk
u can send photos here
he sent too many naughty ones
ewwww
finetuned?
i like these ngl
price wont be likeable tho
Guys what would you think will be the free tier model with no limits?
a lot of insiders have bought openai
given that summit and zenith have already been tested in the arena, they have enough confidence to do so
lol
why didnt lm arena give us the sota model for free smh
2 hours left

exactly
so greedy
no one is deciding to buy or sell openai secondaries based on summit and zenith
guarantee 99% of polymarket traders can't even buy secondaries
it's very few individuals
it's mostly funds that are buying up secondaries
i think horizon beta has given me the best plug and play main menu for roblocks out of all the LLMs ive tried
they having a livestream in about 2 or so hours right
yes
how 2 get
your best bet is using horizon beta
whats zenith
an ai model that was in battle mode (best gpt 5 version)
they said it was really good
but i never got the chance to try it
Actual chatgpt plus and pro sora?
Hey @echo aurora , I was wondering if I could talk with you through DMS or a ticket in this server?
I think they know but also don't have models that deal well with it. No AI company is profitable and wall street models generally are focused on 6-18 months in the future at best
Yeah, my DMs are open or you can DM @oak python
bruh
China is in actual flames now, just because they open source good models doesn’t mean the country is holding itself together lol
ModMail isn't properly setup
why'd they snatch it
why would pineapple take away zenith from us??!?
Hmm okay thanks I’ll look into it, very odd. You can DM me
Sent a DM.
where did you grab this from?
But might be fake
take it with a grain of salt
most likely yeah
Okay I mean on the 13th*
wasnt strawberry o1
Look at the strawberries in his username
bruh
fake
But again; take it with a grain of salt
he could have at least attempted to make it look realistic
yeah no
What makes it look unrealistic? The insane scores?
you obviously do not get a base model performing like that on arc agi 2
Very curious to see if the creative writing is less repetitive than even something like Gemini
Yea, i think that will be getting an upgrade
(Hopefully there'll be less/no em-dashes)
horizon beta is so good at lua
I can kinda believe it, but I don’t think the gap will be that big between Gemini 2.5 and base gpt 5
no
agi is when spinny hexagon+snake
we are close to AGI
agi is fake
not this guy again
stop reposting that idiot
Ok, sure. Sorry.
No No,
ok good
I'm just saying i'll stop posting it here.
thanks
No problem!
it is
i dont think there will be a huge difference
probably 5-9%
they are just capitalizing on the hype they have been building
Ahhhhhhhhhh
better be much better at coding than 4.1 opus (:
Guys quick question how do we do text to video on LMArena.ai
More info can be found in #1397655624103493813 , but the TLDR is use /video in #video-arena-1 #video-arena-2 #video-arena-3
So I can’t do it on the website
Video Arena is currently only available through our Discord
some cool demos
oops
Gpt 5 on copilot feels like 4o v2
30 minutes
show us
I hope it's not the same
Copilot is also kind of mid to begin with ngl
it is
Reminder we have our Staff AMA tomorrow with the dev behind our Video Arena bot, if you have any specific questions be sure to add them here
please vote :)
👀
12 minutes!
vibe coded, lol
https://websim.com/@rat/gpt-5-countdown-party-2
256k sounds right, I don’t want to get my hopes too high
From my first impressions, it's not good, but I had good results with zenith
I have this
how do I use the bot in direct messages
You're unable to, it only works in #video-arena-1 #video-arena-2 #video-arena-3
oh ok. It would be cool if you could
Hey pineapple when gpt-5 added in lmarena?
Be sure to share feedback about the bot in #bot-feedback
Is gpt 5 released?
Ok
Is the demo in 5 mins or is the full release in 5 mins
Ooo
@echo aurora add gpt 5 to the arena
What is that compared to Gemini
Give him time
1 minute yall
the stream is in 10 minutes but when will it be released actually
Yea it's cheaper than gpt 4.5
400k context
200
4o
2.5 dollar input
10 output
I think it was 200k for 4o
Most likely
o3 deep research is still the smartest one, tho with huge latency
even a free tier? goddamn
What is the name of this site?
still don't have audio or video input which is annoying
yes it does, bottom is no think top is with think
5 increase is so disappointing
Very happy to hear about it
@clever estuary What is the name of this site?
openAI api site
graph made by chatgpt
byebye anthropic
Thank you
Let's test and see if it is great or, as always, just hype
The disparity between GPT-5 Thinking score (incredible) and no-think (awful) is pretty crazy
wait thats a crow?
how many thinking tokens is it gonna use kek
omg we both like crows
yeah idk what to think about that
barely better than 4o
is it not a crow?
not even a 1% increase over 4o is disgusting
Thinking score is amazing and I don't know why you would use it without but definitely interesting
thats pathetic
playground? what site
HOLY
that was quick
damn
I mean I’m pretty sure thinking is automatic now
is the gpt 5 in arena think
Lmao
GPT-5 seems to be weaker or comparable to Deep Think on all benchmarks without tools
GPT 5 live direct arena!
gpt 5 crushed the webdev by far
yeehaw
summit was GPT-5? Then what was zenith?
beast
Ig it's time google will release something 😂
FINALLY YAY!!!
@echo aurora I love you
Only a 21-point ELO lead over 2.5 Pro. We're good
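For a sense of what a 21-point lead means in practice, here is a rough sketch. It assumes arena scores sit on the standard Elo scale, where a 400-point gap corresponds to 10:1 odds; the function name is made up for illustration and is not anything LMArena publishes.

```python
def expected_win_rate(elo_lead: float) -> float:
    """Expected head-to-head win probability under the standard Elo logistic model.

    Assumption: scores are on the usual Elo scale (400 points = 10:1 odds).
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_lead / 400.0))


# A 21-point lead, as mentioned in the chat
print(f"{expected_win_rate(21):.3f}")  # prints 0.530
```

So under that assumption, a 21-point lead works out to winning roughly 53% of head-to-head votes, which is why it reads as a narrow gap.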
its time for everyone to release something
Love you guys
crazy
Since they want #1
WOW
Hell yeah
lmarena first with gpt5 on the web, gratz

they are deprecating everything???
what a joke, AI pioneer not even able to make a proper graph, embarrassing
@echo aurora thanks a lot
cooked
I'm feeling pretty good ngl
the hype was a marketing stunt
Hey brian! Looks like gpt 5 is indeed becoming underrated right?
dawg they're deprecating all the previous models
crazy
that's actually crazy
gpt 5 worse than Gemini 2.5 pro?
Yes
I have a question, as soon as they started the live stream, they already published GPT 5 on the official website?
No
guys, when? 🔥
Well animated you know gpt 5 is way better than gemini 2.5 pro right?
Now Google can have a good sleep 😆
no ones interested about non style control
I think this is the "google won, it's over" moment.
Im gonna be sure of it after testing
what video model this AI is using and how many videos i can generate in a day?
i'd like to see the hallucination rate of gemini before saying this, i dont know
I predicted a 50 elo jump max above o3, I was right, AI is hitting a plateau
8 videos
daily?
I’m gonna wait until Gemini 3 to fully say that
cant wait for gemini 3 to destroy OAI
Yes
and which video model is it?
Let's wait for gpt5 thinking on arena
You can't select a model. It will independently select.
horizon beta chat was disabled
ok
ohh ,thanks for the info bro
No problem
Isn’t thinking incorporated onto the main model itself
ok
With a flash 😆
Uh?
@echo aurora can you shed some light on what zenith was? there wasn't just summit
They can't. It's against their policy.
gpt 5 pro
so summit was gpt 5
does anyone know what makes gpt 5 better than opus 4 at coding?
what is the limit for video arena?
Dunno, there have to be different versions still.
eight
i cant wait to see gpt 5 failing at your tests lmao, send images pls
yes
what was zenith
is the gpt 5 in lmarena reasoning?
what do you think zenith is?
im guessing it is because it took forever to answer my prompt
Gpt 5 has got 50 percent in hfe
big question atm
That's crazy
Am I reading the evals incorrectly? GPT-5 looks underwhelming to me.. What am I missing?
is every model in lmarena even real
is it actually the real model
yes not sure for grok 4 tho
it's an improvement, just not that significant.
it looks like it, it took like 2 minutes to answer an image examination question
Things are slowing down but GPT-5 was supposed to be like multiple levels better ..
Traffic
Many ppl are using it
yes
ok
is it gpt 5 think or no think
Why didn't lmarena add gpt 5 thinking on leaderboard yet? If it is zenith, it should have had enough votes like summit/gpt5main ?
that's possible too, and i think the image you sent is fake
so gpt5 is crap
gemini 2.5 thinks less than it
tho idk
maybe Gemini IS better
gemini 3 will destroy it
Barely better than Gemini
It's too positive/sycophantic like 4o
Don't like it
I don't think all of GPT-5 is thinking. It might do some to determine which to route to but they wouldn't differentiate between main and thinking if both were thinking.
i think next version of gemini 2.5 is going to be better than gpt-5
I can't wait
I thought OAI was 3-4 months ahead of Google. I think it's behind now
i wonder how strong the thinking one is
Gemini 2.5 was released in March. Google has to have much better version internally now.
But it was improved at coding over time
They have benchmarks for main and thinking. If they are same, why different benchmarks?
polymarket suddenly made a huge move in favor of gemini models for Aug
it may have improved in coding, but it became sycophantic like 4o
the gemini gemini guy kilpatrick said this week was an exciting week so maybe gemini 3 experimental tomorrow ?
oh no it's more censored
gemini 3 might drop in Aug-end but most likely in Sept or Oct. I initially thought it would be Dec but they are accelerating the pace as per some sources
Google is now focusing on releasing some features on gemini app
LMAO
Gemini is much less censored than the opposition
Didn't we already get genie 3?
he said exciting week, not exciting day
is gpt-5 better than gemini
slightly
But the story book too
Wait thats a steal
I think wolfstride model is better than gpt-5 ... this is very disappointing 🙁
Gemini 3 aint coming in aug tho
gemini 3 100%
what is this creature talking
shut up
gpt 5
nah
best model is https://www.codecademy.com/learn/learn-lua
Yes. They are surely updating the gpt image model
it literally wasn't though
fr
I use them to help me, not just copy.
it thought for less time than summit on 90% of tasks
lmao
it also had less juice
that was NOT pro
@deep adder hi
they just picked the checkpoint that won in elo even though it was less performant 
openai never help themselves do they
Oh dear....
is the gpt in lmarena the pro thinking and not normal one
is gpt-5 good? (better than claude 4.1 opus?)
in logan we trust
so gpt 5 is 200 elo better than gpt 4 0314 (the og one)
who is the only voter in favor of GPT-5 ? :). who is the crazy person? Reveal yourself
Damn so GPT-5 was on lmarena since the 27th of July 🤯
LMAOOOOOOOOOO
well...
what does this mean, people dont like gpt-5?
craig eating his words again
underperformed expectations
...
craig stop licking openai's boots
lol
Hii, this gpt-5 has the thinking mode activated by default? Like the o3??
fr
no
it's google mate
it's @deep adder 🙂 not surprised
Absolutely huge
it has thinking bruv
but not by default
yes
2 and a half years to get a 200 elo boost. we'll see where we're at in another 2 and a half years, but I predict max 100-120 elo if we keep the same pace. sometimes people dont realise when a model hallucinates, especially on hard tasks, so an elo plateau is to be expected
damn the context windows 😭 https://x.com/ahmetbuilds/status/1953511311175737370
Can we people stop complaining already? Take a deep breath
OMG!
jesus
grok 4 honestly kinda sucks
for any type of question that isn't just benchmark maxxing types
If I remove style control... 2.5 > Gpt-5.. WTF
8k are you serious
It's crazy and kinda scary how much Gemini 2.5 Pro hallucinates.
Yeah, although if you look at the system prompt, I think the model doesn't enforce a limit most of the time on auto.
This release is breaking my heart :(. I had so much hope
What if Google also fails 🙁
Yeah, I didn’t like grok 4 that much when I tried it
For example, I told it to use Google Search. It told me it did. But you can see that there are no citations returned, which mean it didn't.
idk ask yourself why ai progress is slowing down all of a sudden
so much hype but so little progress
I really hope it doesn't get dementia past the said limit, would be pretty damn hard to use for anything that isnt a simple task
We’ve probably hit a plateau
Google has incredible stuff upcoming there is no worries trust me. I am a bit shocked by this gpt5 release
i thought they would have cooked for sure
Tbh i tried gpt5 and the vibes are great
i hope google doesnt mess up
Then I told it to explicitly include the link from each website it found, and a quote from each. And it just started making non-existent links and quotes up. Like what the heck.
Hi how good is gpt 5
craig still coping
michelle please stop i can't see you through the tears in my eyes
its mid
like i tried actual help for real world coding and it was pretty good
maybe dont judge pure off benchmarks just yet
It's completely lying to the user about having used the search tool during thinking.
4.5 type of thing?
A lot less hallucinations which is a big thing
Needs more votes
Well then how can I make the GPT-5 on lmarena think?? Like the reasoning models?
You know, the human brain can reach further than an ai
To be more down
?
Is it not thinking?
You guys are so negative
guys i honestly think that the thing of hallucinations is still really good even if it's not the best model in every task
AI is NOT replacing jobs 💀
GPT-5 Context Window Predictions
7
14
2
256k
😕
1 year wait for this trash btw
So
more yea
that's one good thing about this release 😄 😄
2 years for a mid model
How does gpt 5 compare to opus 4
saved to my gifs
Does it have more sophistication?
its not trash cmon lol. it is still SOTA, it's just not the jump expected
This is Gemini 2.5 Pro btw, not GPT-5.
gpt 5 seems... better at translation than any other from openai
censorship update
people will stick with claude, believe me
this jump is the same amount of improvement from claude 4.0 to 4.1
what is my timeline 😭
What was zenith
I was told it has not the thinking mode by default, Soo I don't know how to activate it :'v cuz there are no buttons
we dont know
LMAOOOOOOO
agreed.. still very disappointing .
not true
What is airline
It said it was trained on openai data so
I fell for OAI hype again... this is like 5th time I fell for it
Maybe it was gpt 4o update but they wanted to call it gpt5
lawsuit incoming?
gpt 5 is really a 4o v2
Literally any AI hype ever…except google though
Is GPT-5 a router model?
sheesh
gpt-4oo
that aint great ngl
Is there a gpt 5 heavy or a gpt 5 thinking yet?
This is the worst "upgrade" ever!
50 ELO improvement over 2.5 gemini.. that was my mid-level expectation.. i was hoping for 65+
so gpt-5 suck?
Why are people thinking this is bad? This looks fantastic to me
Maybe it's a hybrid model like claude 3.7 and qwen3
It doesn't suck, it's just overly underwhelming and isn't the revolution everyone wished for, far from it
they hyped it as agi and beyond
Simple-bench needed right now! can't wait for those results
oh
so they hype it up way too much lol
"a team of phd experts on demand" yeah no it is not that lol
Solely expectations. o3 was bigger jump. people rightly or wrongly expected a big jump with 5 series
looks like it kills on all tests, and 1/5 the hallucination
Thats what sam hype man does!
guys please tell me is it even better than opus 4.1?
Hello, how do I make videos in 9:16 size?
it is not fantastic.. remove style control and 2.5 pro is better than gpt-5. that's how disappointing the model is
Wow finally Sam hit the wall
this long context performance is really good
actually game changing tbh
i really like gpt5 ngl. i actually used it
Use it bruh
it's the same improvement from o3 to gpt 5 as from gemini 2.5 experimental 03-25 to gemini experimental 05-06
Google DID "out accelerate" Sam
Hello, how do I make videos in 9:16 size?
lol
the benchmarks arent capturing smth it feels
Any coding benchmark guys
gemini 3.0 deep thinking is gonna leave gpt 5 in the dust
looks like GPT5 is much better at writing, coding, and overall knowledge, while having 1/5 the hallucinations. Pretty huge imo
Its topping webdev so idk why pissed
Sama is saying the same on twitter...
You're unable to set the size of the output
GPT 5 is confirmation this crap has hit a wall and we’ve been grifted into thinking it hasn’t for over a year now
Disappointed again
that's why i think is really good
Ran a writing task (albeit meme one) and it fails to Gemini 2.5
its good at lua
Hi
I don't think ARC-AGI is reliable for models released after the benchmark. Just look at how o3 (High) scores 68.8% on ARC-AGI-1 vs. Opus 4's 35.7%, but for ARC-AGI-2 they score about the same (Opus scores higher now). The current models will probably do badly on ARC-AGI-3.
AI has hit a wall guys
They don’t know how to scale or improve it anymore it’s all investor hype
This is what Sam commented on the Chart crime "wow a mega chart screwup from us earlier--wen GPT-6?! correct on the blog though."
@deep adder is again the only voter in favor of gpt-5... are you OAI employee?? reveal to us
Ai has limits, the brain does not
"GPT-5 is here - and it’s #1 across the board.
🥇#1 in Text, WebDev, and Vision Arena
🥇#1 in Hard Prompts, Coding, Math, Creativity, Long Queries, and more
Tested under the codename “summit”, GPT-5 now holds the highest Arena score to date." THEN WHAT WAS ZENITH???
context: zenith was better
its gpt 5.5
GPT5Pro?
reveal zenith
Does anyone know anything about monitors?
summit > zenith
no
Yup if zenith is google damn son we're gonna have fun
GUYS! GPT-6 CONFIRMED! (gone wrong!)
we'll see how gpt 5 will fix the bug in my project
This crushes Opus on SWE Bench. 75% vs 67.6%
it's OAI model for sure
oh
openai is not the same without Ilya.
We need ilya back tbh
Would be nice if there was a graph with Claude, and the price on the x-axis.
His balding head made the company bold
we'll see these from 3rd party
at this point deepseek r2 will cook them
Opus 4.1 scores 74.1% without thinking btw
nah.. OAI will remain top company for at least a couple of years... but direction is not great
Opus 4.2 will cook this
lmarena doesn't work
how so?
good to know, it's not on the leaderboard I saw
Lmarena is dxomark 2.0
guys, how can i use gpt 4.5 in lmarena?
Admittedly I am extremely confused how they are expecting to reach AGI
is lmarena down
This does not seem the path to do so
google will be the first to reach agi
Opus 4.1 was just released yesterday I think. Only a small improvement on SWE-bench though.
SCAM HYPEMAN
gpt-5 tends to go straight to the point compared to gemini 2.5 pro
Yes. It's down
They think they can declare AGI by the end of the year (reports mention this in relation to the Microsoft deal) HOW???
ah so it's basically on par for this test
how long?
uh oh, I'm seeing the same, escalating now.
I have same problem
Maybe 3 minutes
Yeah model selector be having some problems
bruh :/
However the battle mode works
Anthropic should have no problem staying ahead in coding then, they said big improvements in coming weeks
Max tokens for gpt 5?
i was just making my homework with it
guys, how can i use gpt 4.5 in lmarena?
Guys! have we hit a wall??? first claude opus 4.1 scores 3 percent more and now this?
lmarena model selection is not working
Lmao
finally a model that can generate a Minecraft clone
The subset of problems and the framework is a bit different, I think. And one has thinking whereas the other doesn't. SWE-bench bash-only mode would be more fair: https://www.swebench.com/. They'll probably show up soon.
100%
Max tokens for gpt 5?
looool
a chinese model also did it
i missed one thing
What model is that
i think OAI is definitely hitting the wall... let's see if google is in the same boat.. gemini 3 will tell us
"Grok 5 will be out before the end of this year and it will be crushingly good
" ELON ON TWITTER!!!
gpt-5
Which model?
GPT-5?
that's quite impressive. try asking it about all the problems with chunk generation, glass, water, etc. to see how it would solve them
something with gpt
is available on lmarena
So the entire hypetrain depends on Gemini 3 since both Anthropic and OpenAI hit a wall
Max tokens for gpt 5?
Or try asking it to solve old-fashioned JS script-order loading issues. Claude had a seizure, lol.
API said 128K
WE WAITED 2 YEARS FOR THIS???
Even the december odds are shifting...
How much is it? Idk
Yeah, we're looking into it
yeah that's crap, gemini is 1 million
waited so long for this garbage
thanks
2 years!
Model selector back on again
why is it down?
The AIDS chart
guys, how can i use gpt 4.5 in lmarena?
I'm seeing the same
I hope that DeepSeek won't do the same thing, waiting ages for garbage
1 million token point is barely enough if you want it to read all the text in a book 🤣
JUST IN: GPT-4.5 got removed from chatgpt's website
lets see what they give us. those people never say a word
bro is not a twitter bot
anthropic is still great but only for coding.. i still have hopes on them in this area
gpt 4.5's dead, dead from api and dead from chatgpt web
great
Gemini hallucinates so terribly though. It sometimes doesn't even tell me I forgot to attach a document; it just makes one up.
its not even removed
Backfired? Lmaoooo
How much better is gpt5?
it's just a cheap ass model for general uses anyway
bros had to make the most misleading graph ever
This is deeply saddening.
@echo aurora which gpt-5 version is it that's displayed on lmarena?
thats crazy work ngl
Thanks for your info
first impression: concise, straight to the point
its great for lua
Try pasting Claude's system prompt into other LLMs. Might help.
Is gemini 2.5 screwed?
who uses lua noob
Lmfao
I overslept and I woke up realizing that GPT five is out
roblocks
no
Theres two models called "gpt oss 120b" on lmarena rn
Why did they release this? If they did it for the normies, why did they hype it so much!?
yeah thats trash
whats wrong with it
GPT-OSS 120B is an open source model made by OpenAI. ChatGPT says it’s just as powerful as 4o
no its bad
go back to sleep again then 🤣
its the worst open source model ever
Same or bugs?
Why
They didn't release any benchmarks for the Mini and Nano versions of GPT-5.
hmm will take a look
GUYS! GPT-5 supposed to be the best model for cost to performance though.
probably means they are about at the level of 4.1 Mini and Nano
Is it better than Grok though?
it barely beat gemini 2.5 pro in benchmarks.
its working now @echo aurora
Ok
no
ITS NOT A REVOLUTIONARY MODEL! its an efficient model
good old graphs made by chatgpt
it didnt beat grok in arc agi 2
Polymarket’s saying gemini’s over
so far gpt5 is good
Google has the efficiency advantage because of their custom TPUs. It's kinda crazy that you get free unlimited use of Gemini 2.5 Pro on AIStudio with 1M context.
I'm still seeing a few things off, but yeah overall should be back 
Wait so is it over for gemini?
"GPT-5 results on ARC-AGI 1 & 2!
Top line:
65.7% on ARC-AGI-1
9.9% on ARC-AGI-2
" IT DOESN'T EVEN MATCH o3 FROM DECEMBER AT ARC AGI 1!!!
Nothing will replace 1M context. just paste the whole book and it know everything
No Gemini still better than GPT five
the standard version
ok thx
LOGAN! say it! say the damn words! "Gemini Gemini Gemini"!!!
Does that automatically include thinking

They Didn't increase their knowledge cut off parameters
GPT-5 kinda falls short on the context window size and knowledge cutoff date.
this one?
Gemini may be the best now but google will neuter it once they have a market monopoly. we need heavy competition to keep Gemini good
Damn
Where is this picture from
Is there another way to get GPT five to search
why that old?
Because Joe Biden is not the president
nah its actually so good
Not anymore
i have many things to say
bruh
this is literally stale model wth
from openai models
Dem
Why is the cut off day 2024 oct
Bad data? Idk. I thought Sam Altman said it was a router model though. (few months ago)
gpt-5 on chatgpt is dumber than the one in API????
thats how long it took them to get this release out 💀
crazy
Should be fixed now 
Openai logins are broken lol
"gpt-5 fast facts:
- hits sota on pretty much every eval
- way better than claude 4.1 opus at swe
- 5× cheaper than opus
- 40% cheaper than sonnet
- best writing quality of any model
- way less sycophantic" - OpenAI employee
Roon failed us.
Trash writing still
The mini version has even an older knowledge cutoff date
barely any better
“Black and white vector-style silhouette of a confident bearded man wearing sunglasses, modern hairstyle
They ought to make a good 1m parameter model
openai has to retire and give its compute to google
google deepmind has to
the nano
Secret Gemini models have better writing!!!
Those are garbage
WTF
how the hell
Google already have good ones, OpenAI dont at all
its a mini model bruh it should not take a year to train
The one in ChatGPT is a different model with a different cutoff date. https://platform.openai.com/docs/models/gpt-5-chat-latest vs. https://platform.openai.com/docs/models/gpt-5
Gemini 3 and grok 5 will win 100%
the chatgpt 5 is dumber than the API
Not even nano has 1m
is chatGPT 5 on the website yet?
that's what I said...
didnt someone from google say this was gonna be an exciting week
I am not a google employee
Of course they will, they have to release a "competitor" to this new trashai model
What model has had the biggest positive reception here?
gemini
Claude is the most self-aware and agentic still. Gemini doesn't even tell me I passed in the exact same attachment twice, whereas Claude is like "HoLdUp"
Can I send a video here real quick?
i was the first to say that LOL
true
Gemini 2.5 Pro already has many things better than gpt 5
Gemini 3 Pro will be groundbreaking
yeah its pretty full model
gpt 5 is garbage
Gemini three pro might be better than Grok four I mean it has a chance to be better
Since 2.5 pro is better than GPT five
kimi k2 are you serious ?
HOW IS GPT 5 WINNING???
I wonder how Gemini will be
Where's GLM ? 🙁
default gemini 2.5 pro praises all your questions no matter how stupid they are. It gets very annoying
glm aint sota
its such a minor upgrade
Stop with the A = A arguments. its a trash model sir.
Now make a simple prompt against that behaviour and put it into system prompt
Try pasting the Claude system prompt into it
Gemini is definitely sycophantic, lol
What do you mean?
Well, at least gpt 5 nano accepts pictures
Also does anyone have access to gpt 5 right now lol
gemini is the most sycophantic model
this guy on the livestream is just bsing
Lmarena
literally saying nothing
yes
It literally is not! arc agi is only one of the benchmarks where it falters!
on lmarena
filler words
now the livestream is just pure yapping
selling
He tried his best
we will get back to selling kek
They're actually tied, as the confidence interval overlaps.
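The "tied" claim above can be checked mechanically. A minimal sketch, assuming each leaderboard entry is a score with a symmetric plus/minus margin; the function name and the numbers below are made-up placeholders, not actual arena values.

```python
def intervals_overlap(score_a: float, margin_a: float,
                      score_b: float, margin_b: float) -> bool:
    """Return True if the two score +/- margin intervals share any point."""
    low_a, high_a = score_a - margin_a, score_a + margin_a
    low_b, high_b = score_b - margin_b, score_b + margin_b
    return low_a <= high_b and low_b <= high_a


# Placeholder numbers for illustration only (not real leaderboard scores)
print(intervals_overlap(1460, 8, 1450, 6))  # prints True: [1452, 1468] vs [1444, 1456]
print(intervals_overlap(1460, 3, 1450, 3))  # prints False: [1457, 1463] vs [1447, 1453]
```

Overlapping intervals only mean the ranking is not clearly separated; it is a quick heuristic, not a formal significance test.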
is it this one? https://www.reddit.com/r/ClaudeAI/comments/1ixapi4/here_is_claude_sonnet_37_full_system_prompt/
thats not how the betting works
Will they do the pokemon Bench with GPT-5?
it is the actual value only
I find it funny they copied gemini 2.5 pro pricing exactly lol
like $1.25/$10
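To put the $1.25/$10 figure in concrete terms, here is a small sketch of a per-request cost estimate. It assumes the two numbers are USD per million input tokens and per million output tokens respectively (the usual way API prices are quoted, but an assumption here); the function name and token counts are hypothetical.

```python
# Assumption: prices quoted in the chat are USD per 1M tokens (input / output)
INPUT_USD_PER_MILLION = 1.25
OUTPUT_USD_PER_MILLION = 10.00


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: 100k tokens in, 5k tokens out
print(f"${request_cost_usd(100_000, 5_000):.3f}")  # prints $0.175
```

Note that output tokens dominate quickly at these rates: the 5k output tokens cost $0.05 while 100k input tokens cost $0.125.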
Wait is gpt5 a reasoning model?
Could we get gemini 3 this month?
yes
Roon did not fail us
The official one is here: https://docs.anthropic.com/en/release-notes/system-prompts. I removed unnecessary lines (Anthropic product info).
did you even watch the stream?
why some people hating on GPT-5?
Bro can I dm you
I want to talk a lot about gemini 2.5 and stuff. Not the news or about the model but about the uses of the model and my personal experience and cases
I didn't
they spit in our faces again
first with this garbage open source model
and now with gpt 5
What happened