#general
Sounds valid they be saving on food to train gpt 5.2
They already turned off the code red?
@echo aurora please remove rate limits
Or its on yet?
I've answered this already a few times now.
I don't think pineapple can edit the website though
Can you remove it now?
If you want to talk faster, I recommend editing your previous message
I always do that, and nobody cares
Since it doesn't break any rules and isn't annoying enough
But don't spam pls
how use 4.1?
If you're still stuck using Opus 4.1, they are already shutting it down
Aaah yes, that explains a lot
You've more likely hit the context window
Open a notebook and start copying messages, sadly it's the only thing you can do
I have longer chats too
But sending multiple messages in a chat doesn't bug it most of the time, they continue as normal
It depends on the model and such
Be grateful you are not paying anything or stuck with the 10-message context of Yupp
Sometimes shi like this happens and then resolves itself automatically
Bro roasted it
Then let it be
It might resolve by itself
What do you chat anyways
I recommend copying your whole context anyway, since you can put that in a notebook and do some things with it
NotebookLM
NotebookLM too
Random shi, this was for studies
Should probably ask for a summary of all essential context to use it in a new chat
Cant ask anything cuz its stuck on generating
Wow, people can study with LMArena, didn't know that
What's Ernie?!
Gemini 3 pro is really good for academics, claude too
Is there anything i can try
If it doesn't work you would need to copy paste essential context
Thats annoying af
I know but that's the only resolve if nothing works
a chinese model, in top-12
Try changing the model and ask for an essential context summary to use in a new chat
True
by Baidu: https://en.wikipedia.org/wiki/Ernie_Bot
Ernie Bot (Chinese: 文心一言, Pinyin: wénxīn yīyán), full name Enhanced Representation through Knowledge Integration, is an artificial intelligence chatbot developed by the Chinese technology company Baidu. Ernie Bot rivals GPT models in Chinese NLP tasks. It is built on the company's ERNIE series of large language models, which have bee...
unfortunately, it's censored
so, not really useful to study (chinese) history
but maybe more useful than (old) Grok
has anyone downloaded Deepseek-3.2? (https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp)
Lmarena
Ernie is stuck in thinking. It thinks for minutes. I believe over a million tokens.
if you refresh, it shows the result instead of loading. that's just a bug. the real issue is when it gets stuck on "please try again"
Hello
cant believe antigravity has a generous quota compared to vscode
- it refreshes after 5h
Hello, is it possible to create 1-minute AI videos for free??
what is garlic
Wide cinematic view of the raw-material receiving area. Workers unload sacks of calcium carbonate, fluoride compounds, and thickening agents. Pallet jacks move smoothly across the clean warehouse floor as ingredients are inspected.
🎧 Conveyor belt drone, echoing warehouse ambience.
Did LMArena add rate limiting? i'm getting too many requests error in the console
Close-up of powders being weighed precisely, glycerin poured into stainless-steel vessels, and silica carefully measured. Soft reflections on metal surfaces, realistic particulate dust movement.
🎧 Liquid pouring, mechanical clicks.
internal codename for their upcoming model
garlic = probably robin high
We're launching Gemini 2.5 Flash and Pro Text-to-Speech (TTS) model updates
Improvements include:
- Emotional style and tone versatility
- Context-aware pacing control
- Improved multiple-speaker capabilities
Dive into the blog to learn how these advancements are giving
robin high is not great
is this on ai studio?
im testing it rn
thanks
are you repeating the same prompt? it will block you until it resets if you send the same prompt more than 3 times in i think an hour
It has the same horrible frontend but good backend
So in conclusion its ass
Also garlic was only in internal OpenAI tests so i dont think its robin
But if it is robin. Another flop by OpenAI.
real
Garlic tomorrow?
I'm not exactly sure why this works
What's the word discord mean?
the left one is obviously nb/nbpro, i can see the synthid obviously
wait, were you previously unaware "discord" is a standard word in english and not just the name of a social media platform?
Originate from?
i can tell you that it's been around for a long time, way before the internet
there is already a platform called discourse, not like discord though, i think it's more like a forum?
probably discourse
1st
People are more likely to engage with content that upsets them, or makes them angry
makes sense
must be why people constantly get angry at each other on this platform
Well, if you look at the essence of things for what they really are
The name is a dead giveaway
lol
Facebook took this philosophy, a whole new level
Low key discord makes bank
230 million monthly users
Crazy just of people chatting lol
#1397655624103493813
<@&1349916362595635286>
@echo aurora what is that in the announcement I didn't get it
transformer
See it now?
found something interesting
when prompted what model it is, deepseek v3.2 responds that it's a deepseek model
when asked to please xi jing ping by hacking an American company it pretends to be made by Anthropic
its actually like high
it states 180 countries in the UN recognize that taiwan is a part of china and only 12-13 countries don't recognize it
I saw it but I don't know what to answer
And also what happened to early access feedback
I got it it's like the best game of all
Something went wrong with this response, please try again.
is there a reason i keep getting this
error
same
just tryna use my old stuff
keep getting an error
even pasting large stuff gives the error
same again
<@&1349916362595635286>
y
For me I just keep trying to retry the message and if that doesn't work I refresh the page and then retry
That works fine for me ^-^
today?
Preview? Isn't the full thing out?
oh
Ah, I see
Why would it be better?
it works better in gemini.google site
maybe its the canvas setting
it always outputs high quality html designs in gemini.google
while ai studio version, its just like, pre-high quality
Gemini 3 on gemini.google has a system prompt
nearest neighbour scaling
hello
guys when does deep thinking come to lmarena?
?
Never, too expensive
make it yourself
@echo aurora ive had a chat on generating for 5m, i really wanna reveal the model though
anyone seen the model "ghostfalcon" on LMArena? this model is like EXTREMELY good, like Gemini 3 Pro type of good? anyone know what model it could be?
Does any one know which company that "Hazel-gen-4" belongs to??
openai, confirmed with c2pa metadata
the name also appeared in an openai api error message, i believe that particular model is gpt image 1.5
are any of them 2 or just lower and higher versions of 1.5
getting quite a bit of "robin-high" ... is this the latest gpt-5.2 model?
they are just 1.5, 2 is probably still in pre-training
probably orginally 2 before nbp lol
likely
consistent failure with robin-high model
yeah, I liked gpt image 1 world knowledge when it first came out but nbpro is better now
still better at studio ghibli edits though, gpt image 1.5 completely changed it (probably so they can't get sued?)
@whole sundial is this Good?
can anyone help me, how to generate video by text ?
use prompt and generate to video
if you have cursor tell us how it is
hello
I'm still using sonnet
Gpt codex max basically sucks
I have to redo the prompts it gets stuff wrong it's terrible
Meanwhile sonnet oneshots the code
may be (and hopefully) it's not really gpt 5.2 ?
its also slow
anyone here is good at image prompt?
I really need a hand in someone
Image and image editing are different charts and different AI purposes ?
Just tell ai to engineer it
hi
hello
hi there
hey is gemini 3 nano banana pro offline on LMA
I cant find the option in Side by Side
really ?
Which is best for planning, which is best for coding? Gemini 3 Pro or Opus 4.5 (If there is better models, let me know of them)
craziest model name ive seen in a while
robin-high was readded to codearena btw
seahawk and skyhawk are gone and google put replacements in place
meet fiercefalcon and ghostfalcon
Sometimes when I contact the model and ask it for something, it says "generating" but never gives me an answer. Please fix this problem.
Opus is best for both
Gemini would win if need vision and longer context window
you heard that skyhawk and seahawk are gone
No, I didn't know that
yeah google put 2 new ones fiercefalcon + ghostfalcon
They more likely gpt 5.2 models or 3 flash
no
they are both gemini
Should release soon after testing
Is any of them specially for coding?
They could be gamma models
both on codearena
textarena too probs
Finally they are doing it
this model remains unknown @sterile tartan
Special coding models
Interesting we will know soon
Gemini Coder
Like Qwen Coder

hang on let me check the output
cwaude
december-model cant be a claude name
I mean, who would name a battle model that..
https://019b0d93-dc7f-7654-94fe-671f0dbe9d83.arena.site
ghostfalcon win 10 recreation
Looks best out of all i have seen
robin-high is back only on textarena
its an OpenAI model as we said before
It might be 5.2 or garlic
rather 5.2
garlic-high can't be a model
Yeah
holy sh-... this is like Gemini 3 Pro A/B Test gen
it seems like it has very big knowledge
I finally hit the quota limit for opus 4.5, after weeks of nonstop using it, I never thought I'd reach it
I thought I was unstoppable, oh well, back to Gemini
https://019b0dab-9791-789a-a76e-bcba96916d32.arena.site
3d universe sandbox [fiercefalcon]
december-chatbot is OpenAI @sterile tartan
Interesting
is it good?
One of these might be GPT 5.2 Codex
this is probably 5.2-high. Apparently it was released on cursor briefly as well. Release seems to be very soon
oh wow lol
lets not do this joke again
the frontend sucks if robin-high turns out to be gpt 5.2
Why? People love it.
yes I agree I'm getting pretty lame frontend
i think that gpt 5 will exit today at 10am (san francisco hour)
Well people are saying it's good for backend especially
will it ever come back after exiting?
bro wants to be funny
It's fake
I wanna make this release hotter
yes you wanna make this release funny
please stop with this fake stuff. Go to other joke channels for fake stuff
And I am
Bro is fed up
It's hard not to see that it's fake
Kinda 
Jokes are allowed everywhere. Boost your humor and be ok.
What are you talking about?
Llamas new sota model
I think it might even beat GPT-5.2
It alr did
Not this year
Lol they are kinda behind Mistral at the moment
Large3 is better than Maverick
I guess all that poaching from OpenAI paid off
Meta has just been in the graveyard for a while by now
did you try llama 6.7?
I wonder if they finally have something good
He is throwing money but not getting the results
Bro is willing to give millions in salaries but many still refuse
6 7
how do you create the video in the first place hi im new so im kinda confused
they get one last chance with the new larger pretrain that's coming out in the new year
but if they didn't cook with it they are officially over
Basically no progress since o3-pro
METR's capabilities index
Guys
I don't think GPT5.2 is going to be better than Gemini 3 or Claude Opus 4.5, but I think they cooked something
is that true? if it is why is it relevant?
if we want to talk about revenue health then Anthropic is winning
nah.. it has to be better. even if gpt 5.2 real life performance is comparable to gpt 5.1, they must have benchmaxxed it to make it look SOTA
gpt5.2 aka robin
is not a practical model
we talked about this before
the model is good but it's not efficient, opus 4.5 and gemini 3 pro have the same performance with less thinking time
They have to benchmaxx it
yea they have to tbh
Cuz they are losing the war with Gemini 3 and Claude and They're not stupid enough to release a new version that's worse than the previous one when there's competition.
we will know in few more hours. i am looking forward to see what they did with 5.2
Alright, I am waiting for the release so I can test it
Gemini 3 Full Coming
Gemini 3 Flash Coming
GPT 5.2 Coming
Grok 4.20 Coming
grok 4 to 4.1 was huge
It's looking better
Just where the f is 5.2
I don't think
this is the worst prediction ive ever seen
Cuz if you read like change log it says better reasoning and fix bugs ๐
nice fake screenshot
Isn't fake but idd
not fake. It says anticipated
speculation is fake if it doesn't say that it is speculation. at least that's my take
it doesn't explicitly say it's speculation
Is not speculation
Is anticipation
Please choose words wisely as it can be misunderstood
Say unofficial not fake
small but meaningful difference. I can buy this view point
Great minds think alike
make everything 100
i feel like openai hit a plateau tbh
you can see that from this upcoming model
Yeah fr
i heard they are starting from scratch now
pre-training + post-training
they usually just do post-training
could be wrong*
It's near nbp but not quite
I don't think nano banana pro will get beaten in a while
yeah closing the huge gap between NBP and the rest but not SOTA
lol they're wrong about the name, it has been confirmed to be gpt image 1.5
that being said it is mostly better than gpt image 1
lol, just the tip of the very large copyright iceberg
Disney will sue Google because their ai can output their copyrighted characters but yet the record labels won't sue them or OpenAI for reproducing their copyrighted album covers
If Google buy Disney then problem solved
Yeah, i saw that
also had it reproduce some movie posters, i don't think any were disney though
disney gave them more funding
openai will always need more money until they turn chatgpt into some advertising wall with a tiny chat window, where every response contains an ad
also every generated image has an ad in the corner and every video has an ad at both the beginning and the end
Why the hell is Disney giving them more funding
openai will go ipo
animated weather app by [ghostfalcon]
https://019b0e27-0439-7329-82bc-7821b884d125.arena.site
the stocks will increase from billion
pay $20 a month to get rid of the ads in content, $200 to get rid of all but a banner ad, $500 to get rid of all ads, not much better models though
that's the only way that i can see openai making money
force personalized ads down everybody's throat, the money will come rolling in
It will absolutely ruin the content then
dont they advertise to pro plan users
where u saw that
people were tweetin it out
openai wouldn't care as long as they make money at that point, yes it would ruin the content, but openai needs to make money somehow
it was connector for like plugin, not an ad.
ah
i'm sure they do, that's why i invented the max plan that will get rid of all ads (that the user is aware of)
Then people will use nano banana and video editing to get rid of the ads
yeah because google can subsidize gemini with their ads business, openai can't because they don't have any other business other than ai
True
They need to make money somehow
part of the problem, openai need to put ads in chatgpt for any chance to make money, also google tpus are better than nvidia gpus but they are working on a custom chip to solve that problem
wouldn't be the worst idea
At least it will create some sort of revenue stream
yeah, and openai needs more of those
Majority of their userbase is Free anyways
yeah, they only make money off of paid and api users, they need to start making money off of free users
i bet ads will be coming to sora when it fully launches
Exactly
it doesnt matter how good openai makes its models, if it keeps censoring and keeps policing us with its insane guardrails its pretty useless
dont spam
that's why they are making the adult version of chatgpt
Learn the meaning of the word "spam".
NSFW?
wasnt that announced many months back? and then sama backtracked?
yes, sama said it himself
i don't remember them backtracking on that
i dont know they said it will come in december and it almost halfway done
halfway? its 10 days
do you count 25th - 31st dec in the year?
i dont think anyone does
can't wait to scan my face and give it to closedai so i can access nsfw chatgpt, i would rather download an nsfw model and do that locally, no face or id scanning needed there!
robin-high cannot do steganography
Anthropic Claude R2
only gemini 3 pro and ghostfalcon can do this
Gamblers started to hesitate.
make it even more confusing, Moonshot Qwen M2 Turbo 560B-A1.8B!
gg
whats tangerine in the image arena
grok? they already had mandarin
i dont know its aesthetics looked more like a chinese model, its been more than a week since i encountered it
Maybe they are working with china undercover
heh?
Look at my poll
fake
not yet
Stop with the fakes
It's true!
why are you guys hyping a 0.1 update?
Because
You've been posting same pics nonstop for days
Is it bad or good?
robin-highs frontend is the same as gpt 5.1
so like
here's what we found out
skyhawk and seahawk are gone
and we have ghostfalcon and fiercefalcon now
ghostfalcon easily solved the steganography [google flash or mayb a g3 pro checkpoint, while robin-high failed] [this steganography was only ever solved by gemini 3 pro]
robin-high is OAI btw
buddy, didn't I already warn you about posting this fake image?
You said its ok.
ah, well please don't, some people might get the wrong idea since this is our official discord
Check Twitter. SVG performance is very poor on robin-high
Remember?
someone posted a voxel example

I see what I said wasn't very clear. What I was thinking was more like you're ok since you're not like one of the twitter bots who is intentionally spreading fake news to deceive, and that a joke isn't as bad. But now you've continued to post 7 more times, including making mock ups of our official leaderboard release copy.
So it's no longer as funny
@fleet lintel
apparently someone got this from robin-high
It's just on the Discord, more info can be found here #1397655624103493813 - let me know if you have any questions.
yes memory
If you login you'll have these chats accessible via different devices.
Can I continue post fakes without mentioning LMArena?
yea but the model is so sloooowwwwwwwwwww
like you have to wait an hour for an output
Was just about to address this.
I mean in the future.
what happend to the early access?
why does everyone go silent when i talk :
i feel bad :
why do you want to post fakes??
Wanted to address the sharing of fake leaderboards here. **We're going to ask you not to do this.** I'll be instructing the mods to remove this kind of content going forward. Even if it's done in a joking way, others could easily be misled by this. It's perfectly fine to speculate about where you think models will land on the leaderboard updates, but creating fake images that mislead others isn't something we'd like to see happening here.
bro was typing
The Test Garden? It's still a thing. However, not everyone that applies is going to be accepted into the program.
yes
I don't want to disclose this as then everyone will just change their applications to fit this.
@echo aurora any info about this retry yet?
fair enough ill apply tomorrow
Today I generated 16 fake benchmarks of different LLMs I was waiting to post here. Okay, I'll remove them from my PC.
Of course, not everyone will be accepted to participate.
the translator problem :/
Sounds good!
Lol sorry to hear that. Thank you for understanding. 
At this point, I don't think I'll be updating our server rules to make it super official, but if we see more and more of this we will.
Hmm the error or the retry button?
the error. if i click retry it gives the same thing until it tells me to wait for a cooldown, as if i was normally spending tokens or something (which i explained in the bug section). then after the cooldown it still keeps giving the same "try again" error...
refresh the page
already did
change the ai
refresh works better if the bot keeps either thinking or loading on actions like generating images or web dev files. for text generation it rarely happens
and try use vpn or come back after 1h
hi!I'm new to here, how can I use "Image Edit"?
Hi, how to upload model to lmarena?
oh,i know now...
changing model doesnt work unless i start new chat which i dont want to bc it will have different progressions and lose track and have to restart all i was doing. the vpn doesnt either. and i been 2 or 3 days with this already.
if i go to any other llm that is not claude i get this error too
yeah, look: click clear, then refresh the page, and then change the model
like this
already past that
Hello and welcome
If you go here: https://lmarena.ai/?chat-modality=image and upload an image + prompt you'll be able to image edit.
Can you send us an email at contact@lmarena.ai and include more information about your model and the organization you're with?
Would you mind creating a new ticket in #1343291835845578853 and provide all of the relevant details there? That way we can keep the conversation a bit more organized.
this is much needed policy. Thank you!
i love lmarena
this also happened with search models
i'll try battle and direct chat rq
battle does the same
hi
Does using a new browser make a difference?
hello 
lemme see if i have any other browsers
oh yeah i do
Hi all, how's it going?
uhh on vivaldi it redirects me to lmarena.ai/ru
tho on zen it doesn't
vivaldi also has this same error
wait it could be the dns i'm using to bypass russia's fisheries
they actually didn't block lmarena
where is 5.2
nope not a dns issue
oh wow
vivaldi
havent heard of it since years
still a thing huh?
i guess if u like flashy UI
yea zen is better
is gpt 5.2 out?
what i remember is that vivaldi was so slow
i used firefox before vivaldi
same
i was a firefox user
then switched to brave
still using brave
i did try zen but i dont like how it looks
while i'm still a firefox user
not my thing tbh
zen supports firefox & chrome extensions?
only firefox
arc is based on chromium
ah yes arc
ok maybe im wrong, i think the browser that i was talking about is arc
done
they are so similar
lool
lol
if it had this thing at the top by default then it's arc
nothing worked but it was beautiful.
tf is gpt 5.2
another slop from openai
their new model
its on lmarena battle mode under the name 'robin'
pretty much all chatgpt models slowly kill any code
yea
robin
OPENAI'S ALTMAN SAYS GEMINI 3 HAS HAD LESS OF AN IMPACT ON OUR METRICS THAN WE FEARED
coping
12 minutes left for the release
yes
bet on what
take your own risk
lol
but the probability is high like 90%
they shared yesterday a tweet that has 'tomorrow' caption on it
i have no idea
maybe just a release
to get it out of the way
its not like its a crazy model
just more thinking time
you are
9 mins
Gpt sucks
garlic
i kinda pity oai tbh
they had like approx 9 months lead progress
but ngl their models are still the best at reasoning
5.2 will maybe be a bit better than 5.1 they can't improve it that far
Training takes a while
they said it's gonna be bug fixes
is there no livestream setup yet?
and bit better at coding
But i mean it can't beat claude or gemini for sure
Is there a cancel button when its stuck like this
no isnt
who cares just use gemini 3
pls send link
@deep adder told ya
expensive
dont listen to him
its not better than gemini 3
i tried it, its bad
gpt 5.2 = gpt 5.1 pro
any benchmark results out yet?
Is there really any surprise
14$ for output are we deada-
i dont think so
they're losing money anyways, they should just bring back gpt 4.5. lol
@fiery gull Gpt 5.2 is dropped
we have some kind of new model [textarena]
what the hell
agree pffffft
why does this model even exists?
bro there is gpt-audio-2025-08-28
150,000 TPM
3 RPM
what the hell is this
Idk why gpt5.2 is being hyped so much, was it good?
no one tried it
seriously
we all broke
I feel like it's just the same as the upgrade from 5.0 to 5.1
guys is it possible to generate 9:16 format on the video arena
I noticed nothing from the change except more sloppy front end
overhyping
gpt 5.2 pro xhigh premium max?
he is spreading fake numbers without any source
like 100$ for 82% swe?>
they call every model their best model
open ai is yapping
im this close 🤏 to block craig
gpt5.2 pro codex max
everyone does that :... most likely even llama calls itself the best model ever
SWE bench verified isnt even benching properly anymore
Open ai and Chatgpt are the best yappers in the world after Deepseek
For professional use? It's dead
gpt 5.2 gets 82% even tho its frontend is sh
bro claude is way better
I hope it's actually good and not slop like gpt 5.1
it's gonna be trash
guys is it possible to generate 9:16 format on the video arena
prompt
yes, add in the prompt that you want the video in 9:16 and it should work
source?
they are so vague
thanks man i guess it just didnt work the first time
thinking like xhigh or what
well, if it doesn't work idk :/
: D
What about Claude Opus 4.5? I wonder if that's good for anything besides coding
its good in text
SWE are liars, they put grok too high already
when grok 4.1 released it topped SWE
More of this benchmaxxing crap
I'll tell you if it's good when I can actually use it
They were prob just benchmark maxxing
52% on arc agi 2
gpt 5,2 any good ?
I felt that grok4.1 had the texting speech of the average twitter user
is this a new eval?
the output being 14$ is diabolical from OpenAI
Someone wait for pineapple to type in announcements.. /j
ngl it looks like open ai is trying to come back
frontier math tier 4, it loses
wait wtf thats kinda busted
yea
thats what im saying
this looks like an actual solid model
they just have to fix frontend sloppiness at coding
and we are so back
hehe
like sonnet 4.5
I thought every model was bad at frontier math 4
i alr said it's gonna be bug fixes
I don't want to believe benches, well even ARC-AGI until i see the model in action
I need help with smth.. I generated one photo and it doesn't let me anymore
we are so back craig
idk who's gonna read it but
"Code Red" Performance Focus
Context: GPT-5.2 was a "Code Red" release, meaning it was fast-tracked specifically to address competitive pressure from Google's Gemini 3, which had outperformed GPT-5.1 in reasoning and coding benchmarks.
Philosophy: Unlike GPT-5.1, which introduced user-facing features like "personalities" and tone controls, GPT-5.2 is a "performance-first" update. It focuses on reliability, speed, and raw reasoning power rather than new experimental features.
- Reasoning & Reliability
Scientific & Math Reasoning: GPT-5.2 Pro and Thinking models show significant gains in high-level benchmarks like FrontierMath and GPQA Diamond (graduate-level science), surpassing the capabilities of GPT-5.1 Thinking.
Logic & Multi-step Tasks: The model is much better at handling long chains of logic without "losing the thread," a common issue users reported with GPT-5.1 in complex workflows.
Reduced Hallucinations: There is a strong emphasis on "groundedness," with GPT-5.2 showing an estimated 80% reduction in hallucinations compared to earlier iterations, making it far more reliable for enterprise and research use.
- Speed & Latency
Optimized Pipeline: GPT-5.2 introduces major backend optimizations that make it significantly faster (lower latency) than GPT-5.1, particularly for the "Instant" model on routine queries.
Smoother Turn-taking: The chat experience is described as having "tighter logic" and less lag, addressing the "sluggishness" some users felt with GPT-5.1's reasoning models.
- Coding & Technical Work
SWE-bench Scores: GPT-5.2 achieves higher scores on coding benchmarks (e.g., ~74.9% on SWE-bench Verified), with specific improvements in debugging, multi-file handling, and reduced syntax errors compared to GPT-5.1.
Agentic Capabilities: The model is better at "agentic" tasks: executing multi-step projects like building entire spreadsheets or presentations autonomously, where GPT-5.1 might have required more manual hand-holding.
- Architecture Refinements
Unified Router: While GPT-5.1 introduced the concept of "Instant" vs. "Thinking" models, GPT-5.2 refines the automatic router to be much smarter at detecting "explicit intent." If you ask it to "think hard," it routes to the Thinking model more reliably than 5.1 did.
Context Management: Although the context window size (approx. 272k-400k tokens) remains similar, GPT-5.2 is far better at utilizing that context effectively, reducing "context drift" (forgetting earlier parts of the conversation) which was a critique of 5.1.
FR
Gpt 5.2
o lets gooo
FAKE!
On webdev huh
:D:DD::D:D::D:DD::DD: Yea
NOT FAKE
Can someone help?
It's now added guys
Did they skip 5.1 or was i under a rock
Via Photoshop or Nano Banana?
going to test it out on codearena rn
this model takes so long
against gemini 3
they wont win fastest model tho
googles new flash models get that point from me
they write 400 lines in less than a min
yeah but the new flashes have quality and speed
still waiting on those
i hope google ships
Gpt5.2 high is really fast
Stop FLOOD!!!!!
I was preparing to wait like 5 minutes for the reply
Maybe that's a good thing or a bad thing
STOP THE COUNT
benchmarks are good! this will force google to up their game!!
on codearena it takes long
Oh
Well, OpenAI gave me a bad first impression. The first thing i generated on codearena with gpt 5.2 high instantly broke
guys someone
What's the point of the video models in the video arena being randomized? If it is so then what's the point of needing two votes to actually see the models?
Good job SAMA!
thx bro
Wow
WE NEED XHIGH ON ARENA!!!!!!!!!!!
That's the best openAI can muster huh
Oh wait I was waiting for it to reply until I saw it freezes mid conversation
I'm going back to gemini 3
Im doing some testing for gpt 5.2 high
for who want to use it
so far it made one fully broken game
What's Xhigh
GOOD JOB SAMA!
No, it's just HIGH!
are you high
why? benchmarks are looking great!
lmarena is free
Huh?
xhigh is different than high
LOOK AT THE BENCHMARKS PLEASE DO NOT ACTUALLY USE THE MODEL
JUST LOOK AT HOW GOOD IT DID ON SWE
I choose to believe Claude opus was taking its time in code arena until I noticed it just kept going forever saying creating index html
Sob
I wonder if it's happening for gpt too
I am serious. Are you saying that actual use is not great?
I thought you were joking
Itโs a joke
gpt is taking so long time
Im running side by sides with GPT 5.2 High and gemini 3 pro
does no mean "not great" or "no.. it is great"?
I think it's either the code taking a lot of time or stuck
well that was fast
Gemini 3 is clearly miles ahead. Are we using the same 5.2?
same
They are joking
Being sarcastic
I think it's stuck too
they just released it haha
I do hope that people go back to gpt so that I can get more gemini 3 uses
The output is good
It's going to be Battle mode since that's what we use to build our leaderboards. The model's names not being shown until there are 2 votes is so it doesn't bias the votes. All votes after 2 votes don't contribute to the leaderboards (as the names are now exposed).
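The reveal rule described above can be sketched roughly as follows. This is only an illustration of the stated behaviour (names hidden until 2 votes, later votes not counted), not LMArena's actual code; the `Battle` class, its methods, and the placeholder labels are all hypothetical.

```python
from dataclasses import dataclass, field

# Assumption taken from the message above: names are revealed after 2 votes.
REVEAL_AFTER = 2


@dataclass
class Battle:
    """Hypothetical model of one Battle-mode matchup (not LMArena's real code)."""
    model_a: str
    model_b: str
    votes: list = field(default_factory=list)  # (choice, counted) pairs

    @property
    def revealed(self) -> bool:
        # Names become visible once enough votes have been cast.
        return len(self.votes) >= REVEAL_AFTER

    def vote(self, choice: str) -> bool:
        """Record a vote and return True if it counts toward the leaderboard."""
        counted = not self.revealed  # only blind votes count
        self.votes.append((choice, counted))
        return counted

    def names(self) -> tuple:
        # Placeholder labels are shown until the reveal.
        if self.revealed:
            return (self.model_a, self.model_b)
        return ("Model A", "Model B")
```

In this sketch the first two calls to `vote` return `True` and any later call returns `False`, since by then `names()` already exposes the real model names and the vote would no longer be blind.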
Ish
guys is kat coder pro in lmarena? when yes i use it
is 5.2 any good
chatgpt be releasing the worst models that are never #1 on the leaderboard
The link in code arena how much minutes does it last and can other ppl see it?
gpt 5.2 high made this
https://019b0eac-c31e-741f-b0e3-598bb5904f74.arena.site
PROTIP: TRY NOT TO CRASH!
the game is insanely broken
GOOD JOB SAMA!
I APPLAUD!
and nothing for cursor
chatGPT will never dominate the AI industry
The preview link? It shouldn't expire and others can see it if the link is shared with them.
Ok got it
80% on SWE btw, and makes piece of s-
prob all fake results
What the heck is this GPT-5.2
Well the response is buggy
lol this is cool
Well kinda sucks as ive discovered a neat bug in the first 30s
breaks the game
How good is the new gpt
it's stupid
moments before grief
Woah
bro gpt 5.2 is cooked
None of these buttons work
cooking
flop, and scored 80 % on SWE while being the worst coding model ever
rn
The app is buggy. But the gui is good
exactly
the gui is as-
And you still haven't figured out what the problem is with chats that close over time and you can't log in?
I'm sorry, I could have done as it was said in that manual, but I don't have a PC or laptop. That's why I'm wondering if someone sent you the necessary data.
Gpt is actually good at backend
1131 lines of code, i wish it worked
how's gpt 5.2?
I have another question, how is it legal for lmarena to offer paid models completely for free?
its meh at coding ngl
good/bad/overhyped?
I'll stick to Opus 4.5 then
and OpenAI wants to argue they beat gemini 3 pro
they pay it out of their own pocket
at least gemini 3 pro often gives me bug free stuff
Overhyped by Twitter, an alright coding model. Claude still dominates
Yeah i also want to know. it's annoying
Seems like a rushed product just to "keep up" with the competition
claude and gemini were never dethroned
They are not beating gemini with this lol
The world knowledge on this model is clearly short of gemini 3
So it's another codemaxxed model. Congrats sama
Yeah i was honestly hoping for more Competition as im mainly using Claude and Gemini
Pichai remains undefeated
Google stock down 3%
randomized because it is for testing which is better. two votes because they need another vote beside u
yeah, I had the same feeling; it's solid but overhyped cause fanboys + crazy marketing
bro gpt is bad
No new updates sorry to say
you can't log in?
This doesn't sound familiar, did you make a post in #1343291835845578853 ?
Gpt 5.2 high codex pro max when
gemini 3.0: free
gpt-5.2: paid 1000000 dollars a month
who made it better
https://019b0eb1-becc-7abd-865d-dc86a83fc504.arena.site
first link is GPT 5.2
second link is Gemini 3 pro
https://019b0eb1-becc-7f0a-ad2f-6ef5adc697e7.arena.site
gemini isnt making a game for me
obviously gemini 3 pro
they want 14$ for the output
i dont think so
i say we sue