#general
Got the auto again
Yeah this is an experiment, meaning a small percentage of people are going to be seeing it currently. cc @cloud zinc
Also same with auto modality. 
Oh okay
Now I got some type of new ui
Instead of icons it's like a dropdown
what is it?
benchmark unequal to general intelligence
people wont use gpt 5.2 because its too pricy
if gemini 3 pro is basically still the king why pay more for gpt
nobody is king
Gpt useless 🤣
👋

👋
hello everyone what's up? I want to know how to face swap a pic with her face and her into a video on Instagram from another lady, thanks
Speaking of removed models, will speciale ever make a comeback? It's been a long while since it got deleted.
deepseek 3.2 speciale?
yea
has it ever been on lmarena?
yes but it was deleted the same day it came to lmarena
oh
grok removed?
I also heard some complaints about reve-v1 and reve-fast-edit being removed, it was replaced with the stealth models epsilon and epsilon-fast, some people would like to be able to select these models again
are they still reve v1 tho
they might be updated
it's an upgraded model, but I know its removal and replacement with an unselectable model upset some people
It was prob the company decision, not LMArenas
is there a limit for gemini-3-pro?
Yes, all models will have some kind of rate limit associated with them.
peak new model name @whole sundial
its a textarena model
i hope its not amazon
so best overall ai model is gemini 3 pro for December 2025?
gpt 5.2
frr?
yes
lmarena doesnt allow pdf upload. you have to upload txt manually
my txt file has 40,000 words in it, i wanted to use it to analyse using gpt5.2 as a txt file instead of the entire thing as text input
yes and we're definitely improving everything you're talking about. each month a new model with better capabilities comes out
and it wont stop in 2026 for sure
less hallucination, follow task much better, better at coding
like everything we want its what they're working on
anyone know any site that have nanobanana 4K model?
yes aistudio
no it is a scam
scam models
sota in what
I agreed. They are not look too big.
Nobody likes it. Even Scam Altman.
5.2 is #1 model
in benchmarks
5.2 benchmaxxing
It is designed for benchmarks only, not for real tasks.
yes for real task also
It's a benchmark.
that openai made...
GDPval is a benchmark.
for work tasks
of course a model by openai will perform best on an openai made benchmark
show where its bad
But still a benchmark.
Agreed. They design benchmarks so they can beat other models in them.
I will laugh when it will get lower place in Text Arena than GPT 5.1
🤣
so what are u basing ur opinion on
Truth.
A benchmark is a benchmark. Period.
ur truth is a lie
Is a benchmark a benchmark or not?
I won!
This clown is GPT 5.2, not me. It lied on my demand, is that a smart output?
Hello! Let's keep disagreements respectful and friendly.
Highly agreed👍
yes, gpt 5.2 is best
OpenAI got 500 billion dollars from the US government, Anthropic got nothing, Google got nothing and Gemini 3 Pro and Opus 4.5 both outperform GPT 5.2. How is that possible? Scam Altman's mismanaging of the company is the reason.
us didnt give openai money, u are wrong
us only approved the stargate project
By the way, right answer is yes, GPT 5.2 said wrong answer, like always.
This Stargate Project is a gift of the most powerful AI farm to OpenAI.
not a "gift"
Gift
its called investment
The same
Contract isn't a gift. Investment is.
Even if it's a gift (but it's not), 500B vs 200M is a 2500x difference.
500 billion over several years
google can't get that much gift cuz they are bad
Google to Anthropic and US government (taxpayers money) to OpenAI. Are you feeling the difference?
is it just me or do you guys think that most people use LMArena to get all the paid ai's for free
And taxpayers are good 👍
Yes
not us government, its softbank
But some like me, don't
Anyway Scam Altman will get that money. Anthropic and Google will not.
cuz scam google are bad
Why?
cuz gpt 5.2 is #1
its bad because of the safety
where is it? u are making a rumor, not actual reality
so far with my testing, it looks bad
Gemini hallucinates the fact it has dall e 3? Interesting....
Is it just me, or are others experiencing a bug when trying to login? Essentially, after doing an email login, after entering email/password, clicking the login button doesn't do anything. Are others seeing the same?
yes
Thank you. Will share with the team.
Maybe because this "test" is paid by Scam Altman?
60% say bad, 40% say good. GPT 5.2 is feeling so bad.
how many votes?
u voted with ur alt ok
@cloud zinc If GPT 5.2 is #1 as you said multiple times today, why members of LMArena hate it so much?
no one is hating
Proofs!
Give proofs instead of repeating false accusations!
vote not over 23 hours left
See you after 23 hours! Bye! 👋
Look my previous poll
#general message
ur poll is rigged
Give proofs, false accuser!
Hey going to ask we move on from this conversation.
This doesn't seem to be very productive and is just escalating a bit here and there.
He said I vote from my alts in my polls with zero evidence.
Just accuse, accuse, accuse, again and again! Without proofs.
give proof u are not lying
can anyone help me a bit?
yes
Troll! 😡
so u have no proof
I added you to block list. Bye! 👋
thought so
im working on something with python, been working on it for a few days, then the chat decides to just stop, not being able to send messages (i mean like i can send but it gives me that annoying red prompt "something went wrong with this response, please try again") i refreshed, restarted, tried typing again, same thing
using ai ofc
also gemini 3.0
u need to start new chat
i have almost the whole stuff on that specific chat, any other way i can restore it?
how do i even back a chat up anws
uhh
Why can't gpt 5.2 extra high think of another way of designing stuff
It all looks the same
gimini
Wheres creativity
flash tomorrow
how do u know
u will see
Yeah it cooked for me but I just don't like that its not that creative
bet
watch for logan tweet tonight
Thats weird since even Qwen3 high manages that
Likely because the gpt is sooo formal
So it will rarely try to outshine, besides it writes well when you ask for something exact
they copied the "high" thing from OpenAI? 🗿
😂
yes, the word "high" is copyrighted by openai
gpt 6 latent space reasoning
I think its general knowledge
That's not what I meant. Don't pretend you aren't very smart. Although maybe you...
I hope
I mean it's reasonable to assume they have bigger release in the works
there was less than 1 month between 5.1 and 5.2
this was like a small incremental update
Thats not what he meant. Dont pretend you arent very smart
Although maybe you...
No clue why they tried to oversell it this hard lol
Like pushing for benchmarks this hard with 5.2 wasn't necessary I feel like
true
5.2 you mean?
I don't think it is tbh
have you seen SimpleBench and other stuff?
5.2 gets beaten by both 5.1 and even more so by 5.0 there lol
That's the thing, my experience was the same. 5.0 > 5.1 and 5.2
And then Gemini3 just somehow manages to score great everywhere they test it at
Not definitively the best in select things perhaps, but not really underperforming anywhere either
5.1 is way better than 5.0
Almost the same smartness, but it actually makes an effort to give better responses
Hi everyone, a quick question: “Which AI model would you recommend to aid in accounting work?”
Hello
would encourage you to check out our Text Arena leaderboard using the Business, Management, and Financial Ops Occupational filter -> https://lmarena.ai/leaderboard/text/industry-business-and-management-and-financial-operations
Thank you
@torn mantle nice
Wait the text leaderboard updated and 5.2 isn't even in the top 10 lmfao wth
gpt-5.2 isn't yet on the Text leaderboard
That makes much more sense, between you and us where do you (personally) feel it's going to fall 👀 (if you're allowed to say, no worries if you're legally bound not to)
anyone notice gemini 3 is hallucinating thinking it has experiences?
"I personally have a database with over 800 entries."
when will it be?
if you can share
I'm not legally bound against sharing my personal opinion, but I'd prefer not to as some may interpret that the wrong way.
TBD. Sorry to say I won't be able to give an estimated time.
i think it will score highly, though it does terrible on some edge cases
fails seahorse test
Where do you think it'll land?
Really flipping it around
LIVE BENCHMARK UPDATE
Model: Hawk (Launch 25th)
We're currently halfway through the official ARC-AGI-2 benchmark - one of the hardest AI reasoning tests in existence!
Current Stats (48% Complete)
Correct: 14
Incorrect: 44
Accuracy: 24.1%
Progress: 58/120 tasks
Key Takeaways:
Outperforming Claude Opus 4.5 by 10%
Currently ranked #9 globally
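A quick sanity check of the numbers in that update, as a minimal sketch. It only uses the figures quoted in the message (14 correct, 44 incorrect, 120 tasks); the variable names are made up for illustration, and the best/worst case bounds simply assume the remaining tasks are either all solved or all failed.

```python
# Figures copied from the LIVE BENCHMARK UPDATE message.
correct = 14
incorrect = 44
total_tasks = 120

attempted = correct + incorrect        # tasks finished so far
remaining = total_tasks - attempted    # tasks still to run

accuracy = correct / attempted         # accuracy on attempted tasks only
progress = attempted / total_tasks     # share of the benchmark completed

# Bounds on the final score: all remaining tasks failed vs. all solved.
worst_case = correct / total_tasks
best_case = (correct + remaining) / total_tasks

print(f"attempted: {attempted}/{total_tasks}")
print(f"accuracy so far: {accuracy:.1%}")
print(f"progress: {progress:.0%}")
print(f"final score range: {worst_case:.1%} to {best_case:.1%}")
```

The quoted 24.1% and 48% are consistent with the raw counts, but with 62 tasks still open the final score could land anywhere in that printed range.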
“legally bound” bro it’s an AI benchmark site not classified information 😂
Can we make LIVE BENCHMARK UPDATE a meme?
3rd
realistically 2nd, but it could really go either way depending on how varied opinions go imo
i think only o3 and opus 4.5 thinking gets seahorse question right?
gemini does too i think?
guys, what are all the site's keyboard shortcuts? is there a shortcut for new chat?
@echo aurora when will the video arena be accessible, I've seen people here claim they had access to it.
when will we get an ai with like 1% hallucination
that's all im waiting for
ofc its good to have genius ai but what if they always make errors
give me actual genius ai if you want but one that is capable of actually doing things without mistakes
2027
you think ?
Cause like in 2026 we're supposed to see alot of new ai like grok 5 and all
and they're supposed to be like much better
overhyped products
Hi everyone
hello
?
@astral bloom video arena is right here in discord, is a discord exclusive feature, go to the channel how-to-video-bot for more info
there is video and auto modality being experimented in the site
hmm is happening
Something went wrong with this response, please try again.
Anyone wanna search with images?
I have a way.
Literally just go to lmarena.ai
Add images
Switch to search modality
Send Prompt.
Done.
I already reported it in #1449482775823515688
So act fast , :p
Also, Lmarena is working on 2 new possible updates;
A new modality picker
And
A new Video Modality in web
which will be available at
https://lmarena.ai/c/new?chat-modality=video
And
https://lmarena.ai/?chat-modality=video
Enjoy the information 💁
Currently these links are redirected to lmarena.ai
And video generation requires login.
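For anyone curious how those two links differ, here is a small sketch that pulls the `chat-modality` query parameter out of them with Python's standard library. The URLs are the ones posted above; the helper name `modality_of` is hypothetical, and nothing here contacts the site or says anything about how lmarena handles the parameter on its side.

```python
from urllib.parse import urlparse, parse_qs

# The two links from the messages above.
urls = [
    "https://lmarena.ai/c/new?chat-modality=video",
    "https://lmarena.ai/?chat-modality=video",
]

def modality_of(url):
    """Return the value of the chat-modality query parameter, or None."""
    query = parse_qs(urlparse(url).query)
    values = query.get("chat-modality", [])
    return values[0] if values else None

for url in urls:
    parsed = urlparse(url)
    print(parsed.path, "->", modality_of(url))
```

Both links carry the same `chat-modality=video` parameter and differ only in the path (`/c/new` vs `/`).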
how do i make video on site?
Hallucination is baked into Grok by design...
hi, how can I use gpt 5.2 xhigh on lmarena? I only see 5.2 high. thanks
not available
mother of God
I got access randomly, but it is not currently available.
I tried but it required login and upon login it went away.
Does anyone know why it no longer displays images in 1920x1080?
is just 768 x1360
Yesterday, if you asked for them in 1920x1080, it would do that.
what do you mean, it's not only grok, it's every llm
theres actually no llm without hallucination
oh i see
is gemini error ?
I had thought it would be 5 but 4.7 works too
I too
Expect quicker releases now from all companies
The competition is peak cutthroat now

AI winter is coming... My expectations are Anthropic and Google will release lots of great models under this competition.
P.S. Maybe xAI also, I am not sure.
Also, AMD and Nvidia will release new AI chips in January, which also accelerate this rush dramatically.
Googles and Amazons chips are also great 👍
My expectations are after Sonnet 4.7 release, it will get 🥇 first place in coding, Opus 4.5 🥈 second.
I doubt google will release any models soon
They gonna be either focused on other stuff or just training the new models for long time
I'm sure no one tech giant is focused on any other things.
Grok 4.20 is also coming before Christmas ⛄. Not sure how good it will be.
Do you guys think ai will really replace like almost every job ?
many people are saying this is stupid and will never happen
but
why tho, it can definitely happen
Well it's based
Animators? Could be, after I've seen sora
then why majority of people still think it wont happen isn't it like crazy
They just can't accept that ai is getting too good
Well some believe the big ai advancement will slowdown someday
Its kinda based how well they train ai
there's always a solution
Like how do you think when ai will just reach its limit and stop advancing?
It will just be a problem to solve not something impossible to achieve, people be like its impossible yet every time human do new discoveries and what felt impossible is just a reality
there's always someone that find a new way
there will be some problem but saying we can't ever solve it feel stupid
in my opinion
Opus 4.5 Thinking 32K + prompt engineer can replace ANY programmer easily.
Not really due to hallucinations
we just want less hallucinations and yes
Prompt engineer is a human, they will give prompts to Opus until it would give an output without hallucinations.
you're right but are we already at the point where it will be as efficient as a team of programmers? i don't really know at all
i think definitely in 2026 what you're saying will be doable
we're close for sure
I'm a programmer, and that's why I think LLMs are bad for us. Finding a job is already much harder for me now compared to a few years ago.
its bad cause we only see what's happening right now, but in the future we won't just let people having no jobs and like die
it will just be very very different
it can definitely help us all if we think about medicine, new discoveries, and the future
it's just that there are these questions about jobs and what will happen, but no one can answer them yet, we're not there yet
but for sure, we won't let people struggle forever
Only so groups of people, all people with intelligence work, people with physical work will feeling great.
Sorry, english is not my native language and im not really sure about what you try to say
Physical work might also get replaced
But not by LLMs.
No, but by robotics maybe ? which is also growing much
In china its already happening
hi
Current LLMs are very good, current robots are not good. They will be able to do all physical works, but not in near future.
Already done in some factory in china, with this ai race its not only about LLM
there is also robotics going with it
and yes they are doing alot of progress
I would say we're really into automation these days
everythings is about it
we want more efficient, cheaper worker
and its doing alot of progress
You can look deeper into it
its happening.
nanobanana pro DIRECT keeps giving“something wrong”, anyone as well?
yeah gemini looks error
Robot waiters are good. What about robot plumbers, electricians, carpenters, builders, etc.? It's impossible.
impossible ? no its not, its just not done yet. like i said we're making a lot of progress on it, you really can look into it and you'll see its not a dream or something impossible
im not saying its already done, but it definitely seems that it will happen
and not like in 50 years
that's definitly something else than LLM
but its also doing progress
I think even more. If a robot (even a future one) repaired your electrical panel, your house would burn down.
there is a problem, and no problem is impossible to solve
like we definitly won't make something that kill you
or is dangerous
We need 50 years or more to solve this problem.
wow that's a big numbers, we're progressing faster than that
you know how the world were 50 years ago ?
its changing much faster than that
You're not an admin. How did you get insider information?
But we shouldn't see only the bad side of it honestly
nothing is all good or all bad
but from what we saw in history, when we progress its mostly good
yes and errors count toward the limit for me
hmm
and Direct keeps erroring and says try again in 50 min
hmm
not 'forever' but don't be so sure the rich will just help people out
the question is whether you will end up living the majority of your life in hell, or abundance
it won't depend on one person but on the government, i can't say hey everythings will be good, but i can definitly say if its bad anyway it will also be bad for them in that case
the great depression was a pretty long period of time
its a win win
it will take a long time
at first they'll be like "oh, but, AI is creating new jobs. we'll just create more gov jobs."
they'll delay it as long as possible
we might even get a revolution before we get change
historically, that's how it plays out
yes it will take time for it to get good, they will delay it, you're right, but the worst case definitely will not happen
yeah i agree. to be honest, worst case i'm worried about is, the rich people are in charge of AI right now
Im not sure of what your trying to say cause ai are made by rich people already
and the rich people have access to the best AI models. not worried about an AI 'killing us all' as much as 'rich people having great influence over AI' and using it to manipulate people
imagine a 100x better claude opus 4.5, but only the rich have access to it, and they use it to exploit you
that's the worst case, but it won't happen if it do you can already tell it will be a war
i don't believe we all just gonna accept something like that happening and they know it
sure, but, like all wars, it's usually won by the people with the most power
can you win against a robot army?
i think we see great famine before we see great abundance
why will they use a robot army, why will they want to be hated by everyone, and that mean like every country have to do it
historically, that's always happened. not sure this time is 'different'
if everyone, like everyone, hates them, they are losing
they are human too
they don't want human to die
and nobody will let someone doing it
have you not seen how many humans were enslaved historically?
Yes, but like what your saying is something different actually its not the same weight
we're living in less than 0.01% of human history. we forget how bad things have been
if something like a robot army controlling everyone happen humanity is dead
would mean even them at some point will die
i don't think it will go on forever, i'm just saying, historically, we go through incredibly rough times before we get good times
and we forget, as a species, how good it is now, compared to how it was
and that cycle has repeated itself for as long as history was written
Anyone else having issues with nano banana pro
so either this time is different, or it's not
Why do you think its the same thing ? when we created pcs and phone and everythings no such thing happened
not that bad
atleast
yeah is seem error
the industrial revolution, the great depression, the last 2 wars..
But also people getting smarter with more right
you can't compare something that happened when like everythings was different from now to now
this is a very small % of human history. i'd argue we're still in the age of abundance, but we're on a downward curve into the age of famine
so you're saying. "this time is different"
it could be. but it rarely is
it's always different, cause something like that happening in the world of today doesn't seem achievable, and if it is, like everyone will lose
even them
yes, i don't think it will go on forever
i think history never repeats itself exactly but it often rhymes
i think we will get an age where a lot of people are in famine because of AI, and eventually a revolution of some sort, and eventually we will have abundance
they need to be popular too
the only question is whether we end up living in the majority of the famine and barely see the abundance
to be popular it also mean they have to do good things somehow
like many people who survived through WW2
we might end up telling our grand kids, "Back in my day, AIs made us all poor.."
WW2 didn't start cause suddenly we got new technologies
no, it started because of famine in Germany
we'll see. you're banking on the government being actually controlled by the poor majority instead of manipulated by the rich minority
yes cause like you said majority and minority, and if the majority start to defend themselves the minority wont win
and they don't want that anyway
it would in every case make them lose
i think what you're saying is right, i just think it'll take a lot longer than we'd like
you're right too, people losing jobs is not something avoidable now
but i dont know what will happen
i just know we can't go back anyway
Result are in for our upcoming model.
if anything, it's better the country you live in, develops AI before the country that may not share your same values
hello
hi
its like, when gun were created who will still use sword
its more efficient
and that's all we want
when people will realize they can't beat someone that use ai in the future
they will use ai
and you can't really blame them
imagine trying to protest the development of guns 😄
you just can't tell to everyone to stop
imagine if your country had no guns but every other country does 😄
and if someone still do it he will be more efficient
not even in the army
exactly that's why its not possible to go back when its proven its better
i do agree with you 💯 on one thing, protesting AI is stupid
it won't lead anywhere
except your own misfortune
we can't protest progress even if the progress is dangerous or will lead to big problem like mass unemployment
cause its somehow still a progress
all we can do is find solution but it doesn't even depend on us
but on few people :
all i can say is if its too bad the majority will defend themselves
so it probably won't be "too" bad
hmm
but things are 'too bad' in many countries
and the majority isn't able to defend themselves
its not equality everywhere, for the big countries it work like i said
i think that's why AI development is actually super important
it will determine whether your country is a 'big' country or not, and how long the famine will last
if ever a "bad" countries get the lead of the most powerfull progress then we're all at their mercy
that's why its a race
we're saying to know who will be the leader
of the next decades
but no one understand it atleast not yet
e.g. may not be US or China
might not even be smarter, could be luck
yes your right
we just can't tell
but its a race and not for nothing
its much more important than people think
one very interesting video I find is the time lapse videos of the world borders: https://www.youtube.com/watch?v=-6Wu0Q7x5D0
Since 200,000 BCE, humanity has spread around globe and enacted huge change upon the planet. This video shows every year of that story, right from the beginning.
Abbreviations can be found in this document: https://docs.google.com/document/d/1_oJx72M75tuai2mo6yD13qqQB1g_auQ...
when china and us are racing how can people still say hey its nothing
i don't understand sometime
put it on 2x or whatever. the borders historically change like crazy, yet somehow, we think war is over
how delusional we can be
Yeah i know already too many things happen but we see it as small and not important when it actually matter the most
I do hope you're right, that this time really is different
if its not big enough to see it through eyes people will still think its nothing
otherwise in the next 7-30 years we're likely going to go through hell 😄
yes and it can even be faster than in 7 years
by seeing at the progress of it this year for example
its crazy
all of it is being speed up cause its a race
well i mean, people predicted self-driving cars everywhere 5 years ago 😄
yes but theres a difference, cause self driving car doesn't mean power
when its about power you'll be surprised
how fast it happen
seeing Trillion dollars of investment lol
i don't even know who have this much money but yeah
its still happening
It doesn't look real. In my experience really good models don't need hype, they get hype automatically from their existence. You post these pictures every single day, maybe because after the release the hype will be over.
Yes gpt 5.2 what a scam in my opinion
they did this so they don't lose fame but its definitely not as good as expected
benchmark are impressive yet we dont see real result
like student getting 100 / 100 on a test but when it come to apply it in real life nothing happen
that is true
president themselves are involved in this
its not just a new fancy technologies
its power but people be underestimating this
while Trillion dollars being invested
Tell me what the problem is. I've been nerfing everything for a long time, but now I keep getting the error "Something went wrong with this response, please try again." (Gemini 3 pro image preview nano banana)
Help please
It's something with nano banana itself
It's not available on yupp either
someone told me error results don't count toward the limit, but I tried over and over when the limit reset and found they really do count, cuz this hour I got only error msgs and the limit still ran out
yeah me too, the same is happening again
how I test is: generate something random, and when a Flux model appears (I recognize the drawing style cuz I see it a lot) alongside another model that errors, I keep clicking retry on the errored model. it counts for about 11s then errors again. I do this until it doesn't count anymore, which means the limit has already run out
This is an update, the server just doesn't work.
Nano Banana Pro is not working again
hmm, well me too
any invite codes for sora?
Everyone likes 👍???
You can now use lmarena to generate videos (open Chrome, or whatever you like) if u are lucky enough!
i can only generate image and web search and code. How to generate video
Ask pineapple, probably releasing in a few weeks/months
but why u can generate video and i cant?
"almost same"... For me the style feels kinda forced. And it isn't as reliable on some tasks.
But you can try your luck and use #video-arena-1, #video-arena-2, #video-arena-3 or #video-arena-4
Maybe u can use Sora👍🏻Good luck
Is image generation not working at all in lm arena since yesterday? At least for Nanobanana
I don't see a video option.
Yeah I do it several hours ago and it maybe a leak
I was amazed
And I saw many models like kling😆
Damn 
with GPT5 they went all in for genuine performance. With 5.1 they went for response style, and then for 5.2 they had this "oh sh'it" moment frantically responding to Gemini3
I didn't have time to try it out actually. BTW, it is a good try to add video arena
oh well is work again
that turnaround of less than 1 month between model versions is not normal tbh
We had similar with 2.5Pro 05-06 vs 06-05, but that was marginal differences and was advertised as such
The problem with video arena here is the randomness. I almost never get Sora or Veo 3. It's always the bad models
Yeah, the mods tell me that they may add a function for a model at random and a model you pick option
They scammed Microsoft when Microsoft invested $500B into Stargate and now Microsoft wanna get their money back because models of OAI are terrible, but they can't.
I hope so, but yeah, I'd prefer it on the site itself. Way more convenient and less traffic than in a public chat where it may get drowned by other gens.
they wanted to release a new model fast due to them losing people cause of gemini and claude
but it wasn't the best idea
There is a feature "Direct Chat" for text, visual, and image models. It would be a solid idea to add it for videos too.
And all people see your inputs/outputs.
It was the worst idea. Ever. GPT 5.1 outperforms GPT 5.2 in real tasks. GPT 5.2 is just a benchmark scam.
i guess they just lost due to their idea can they get back on the race ? i don't know honestly it depends on so many things
people will always use the best
fake benchmark doesn't get them anything lol
if it can't do it in real life
Nano banana is not working today
Trying not to lose the short-term race while not caring at all about the long-term race proves Scam Altman has negative IQ level.
yea, about only 1/5 generation was successful
1/50
Yep. All we can do is wait.
And hope they fix it

Probably because they're trying to rush out flash 3
Flash 3? How's it different from Pro 3?
Faster and cheaper for those paying
(High use people and api people)
And some claim it's better in some ways. We'll see today
Quick question, which one is better for research,
Kimi, MiniMax Agent, or Deepseek?
Why these specifically though? There are multiple different ones, but probably deepseek
I haven't used deepseek or minimax but K2 is actually pretty good for research
surprisingly good
How to use Sor 3?
Thinking or Instruct?
thinking, on their site (not api)
go with k2, much better than those
Is there any news when nano banana will work?
hmm still not respond
Might as well Try Tongyi Deep Research in Qwen
still like this "Something went wrong with this response, please try again."
thanks for the suggestion, i've not tried that
Yeah try all best and judge by personal experience
Is the research very important?
what is 'low price'
Cheaper than original
so not very cheap then
Probably well cheaper
The black internet market of ai yeah
I have a a seat in ChatGPT business myself
Ping mods while replying to the message ig
Right
The Stage is Yours
Ig I'll wait can't test if we can ping mods or not though
U can atleast ping pineapple afaik
gemini flash 3 is defo today
google put out another model onto battle LMArena
Updated Ghostfalcon and Fiercefalcon
what the
I'm so curious to see how Flash 3 will perform.
preliminary reports are saying 'the sonnet to the opus'
we'll see soon enough if it is
don't think this is google. doesn't make sense for them to release a codename trial model on the same day as release
It felt like a Google model to me, could be wrong though
If Ghostfalcon is Flash 3, then OpenAI is in even deeper s*** than they already are. I just got it in battle mode, and it did a fantastic job
google got the 3/3 on my questions after i changed them up a bit
btw
no AI could 3/3 this
i want to see if Xhigh can after i changed em up a bit
Yeah guess sundar wasn't lying it's their best model yet on that specific interview.
ima test xhigh rq
also deepseek r1 turbo got it right
so if xhigh doesnt
OpenAi is cooked
I hope it's good, I'll have to manipulate 300gb of documents in anti gravity, I need flash 3.0 to be good in agentic mode
gpt 5.2 xhigh officialy lost to deepseek r1 turbo and the new gemini model
2/3 because of the Macy question
I cant use gpt 5.2 xhigh, for my work it needs more than 500 sec and crashes
if you argue fell behind means she fell off the hill it aint, it means that she is in the back of the hill
im usin on yupp
Guys, is the server working on lm?
Gpt 5.2 tries hard not to hallucinate and ends up not reasoning through many things
Too
btw r1 turbo and ghostfalcon-20251215 got it right
How so?
Gemini 3 pro
When R1 was released it was the best model sometimes by far in everything, yes?
the discord designs it made had me impressed
it was then NERFED
a f- ton
It was able to recreate websites
Yeah, i am not tweaking
It was a buffed 4o and near the level of sonnet 3.5 but with reasonable reasoning
It made deepseek moment happened, if that is a thing lol
5.2 on text leaderboard wen
eventually
He takes too much time answering, it will take weeks lmao
I wonder how "code red" will hold up against Gemini
Does lmarena have a precedent of having a model available in the direct select chat while that model is not on the leaderboards yet?
ghostfalcon is 100% flash
At this point, I wouldn't be at all surprised if Flash 3 would beat 5.2 High.
The fact open ai makes a benchmaxxed model, to then make it actually smart is very true, it is very noticeable in 4o
I was almost certain that Flash 3 is tomorrow .. umm
I feel like it can be used as boosting purposes, if the side by side and battle votes are not separated.
Year Summary VS
OpenAI Vs Gemini
OpenAI
- 2 new models that they said were good
- Both flopped
- Benchmaxxing king
Gemini - 2 new models
- Both good
Gemini no diff
Veo 3 please
tuesday
Since it does many tasks perfectly, and loses performance gradually the more different the task is
They always release on a tuesday
It's sad that gpt 5.1 had a relative flop
It's gpt 5, but it actually makes an effort to give a better answer, it is great
Google also gave us the best OCR this year
yeah, Logan's tweet also suggests that it will happen today. Good thing that I don't gamble, otherwise I would have placed a bet on Wednesday 🙂
It will take a LOOOOOOOOONG time till someone beats gemini pro 3 OCR
Yeah its teacher model is Gemini 3 Pro after all
Since OCR existed, Google always had the best one, have the best one and likely will always have the best one
I think will be gemini 3.5 lol
This is a no-brainer, they've been doing this since Google Lens.
someone except google
And probably no one can beat them.
yeah google lens was good
I also used gemini Live from the app and told it to identify things in my room
it didnt get one wrong
Well, Qwen3 Omni had a level close to Gemini 2.5 and runs on a phone
Was there a company whose prime was not its 3rd model?
GPT 5.2 is very bad.
True. Like Geralt and Ciri. 😉
well yeah need fix
Primes are in the 3rd model?:
Gpt 3/3.5 ✅ (Gpt 4 was not way better against Claude 2)
Grok 3 ✅ (Grok 4 fast was great, but it surpassed the 1400 points)
Gemini 3 ✅ (By far)
Claude 3 ✅
Qwen 3 ✅ (By far)
Mistral 3 ❌ (The first Mixtral 22bx8 were a beast, yet nowadays they barely manage to be competitive)
we need claude to have image import
i really tried to like it, i tried various things on it, but, it feels like a model from 1.5 years ago
I know some people can be frustrated about gpt 5.2, but you are insane
What are you doing with that?
Are you using Plus tier or Pro tier?
Plus tier is terrible.. you get medium gpt 5.2, which is like gemini flash models. They are treating their paid customers very badly 🙁
You need to use Pro or API to get the best out of gpt 5.2
Ouch.
Ah yes, gpt 5.2 medium
i'm using it from api, even xhigh. i haven't tried 'pro' but i'm guessing it's similar to xhigh
It's only gpt 5.1 medium, but hallucinates slightly less (to an unnoticeable degree)
xhigh should perform .. what is your usecase?
maybe 5.2 medium. but i find 5.1 way better
Why non Gpt 5.2 xtra high even exists? Lmao
i tried it with coding, i tried it with logic, tried it with medical text, tried it with long context
If you look at 5.2 in isolation, it is a really good model. But it's not excellent. Plus it's behind the competition, and above all, it's excruciatingly slow. So all in all, it's neither fun to use, nor impressive in any way.
If you use xtra high, it does very very good analyses of documents
Sometimes better than opus 4.5 or Gemini 3
give an example please
because every time i've tried it, it's worse than all the other models
My brother creates documents using Gemini 3, and then goes fixing them with Opus 4.5 and Gpt 5.2 xtra high
i might be using 5.2 medium, if xhigh is 5.2 medium
what kind of documents? what did it fix that opus 4.5 couldn't?
Most times 5.2 notices way more issues than opus 4.5
proofreading?
I dont have it now xd
ok but what kind of documents and what kind of errors?
But yes, you are right, besides xtra high, it looks to be worse than gpt 5.1
i had xtra high try to create a game for me. it was terrible
not functional
or barely functional, rather
Documents in general, he finds subtle errors and mistakes, and even gives good fix plans
please be more specific than in general because every document and information i've fed it, it acts like it has low context
and it's all confused
and the few times it suggests things, it's wrong
my specifics are html/js code, transcripts, and medical text
it's failed at all three compared to other models and even gpt 5.1
i'm very confused what it excels at
You have to use standardized benchmark tests, it's only optimized for that. 😉
🤣
ok that is fair, i've not thrown any standardized texts at it, just actual use cases
omg why do all claude models not have image import
can we all agree that LMArena's banana pro is broken? just can't get any result other than "Something went wrong with this response, please try again."
Hi everyone, I'm using Arena, but it always gives me this error, how can I fix it? Thanks
Yep. It's been happening all day.
They're probably busy implementing Gemini 3 Flash Image. 😉
Sorry to say we've been experiencing a higher than usual error rate with this model. It's also possible that you're hitting a rate limit.
If you haven't already would recommend to try: hard refresh of site, clear your cookies/cache, and if no luck starting a new chat. This may help.
Thanks for the reply, no.... I use the service 3 times a day so I don't think I exceeded the limit, also because the model tells me to wait 50 minutes, but it didn't tell me anything.
It's a bit confusing, but both error messages can be caused by a rate limit.
Can you: open Developer Tools -> open Network tab -> run a new prompt throwing an error -> in the search bar in dev tools search for the word "Stream" -> open the file that has the Eval ID (random set of numbers/letters you see in the URL) -> and then look for a Status Code.
Does it say 🟢 Status Code 200, or do you see 🔴 Status Code 429, 400, etc?
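For anyone unsure how to read the number they find in the Network tab, here is a minimal sketch of how the status code classes are usually interpreted. The function name `describe_status` and the label strings are made up for illustration; the only hard fact used is that 429 is the standard HTTP "Too Many Requests" code, i.e. a rate limit.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Map an HTTP status code to a rough, human-readable category."""
    if 200 <= code < 300:
        return "ok"  # request succeeded (the green case)
    if code == HTTPStatus.TOO_MANY_REQUESTS:  # 429
        return "rate limited"  # wait before trying again
    if 400 <= code < 500:
        return "client error"  # e.g. 400 Bad Request
    if 500 <= code < 600:
        return "server error"  # problem on the service side
    return "other"


for code in (200, 429, 400, 503):
    print(code, describe_status(code))
```

Note that a 200 here only means the HTTP request itself went through; the response body can still carry an error message.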
is status red
Hi, I checked DevTools in detail.
Network → Fetch/XHR is working and requests return 200, but no stream request is ever created.
Searching for stream or eval shows nothing.
This means the stream never starts at all, so there is no Status Code to inspect.
The error “something went wrong with this response” happens before the streaming endpoint is created.
Looks like a backend / model issue on LMArena, not browser or client-side.
Okay good to know, thank you for sharing. And you're sure you have the network tab open when running another prompt to throw the error message again, yeah?
I had DevTools open on the Network tab the whole time, with Fetch/XHR enabled, before and during sending the prompt.
I retried multiple times and the error appears, but no stream (or eval) request is ever created.
Only regular fetch requests return 200, then the UI shows “something went wrong with this response”.
So the request seems to fail before the streaming endpoint is initialized.
Understood, thank you for the details. Yeah this sounds like it's on our end then for sure. Like I mentioned this model has been erroring out at a higher rate than usual. Our team has been looking to lower this as much as possible, but I'll be sure to bump this again. Thanks for the info.
Thank you, if I can help I am at your disposal
No models 😢
Hmm
What browser are you on? Seeing the same for Side by Side?
Can't say I'm seeing the same on my end.
i tried google ios and firefox ios, same results for Direct. lemme try sidebyside
Sidebyside worked well, and Direct turned normal. may just be a minor error
Okay glad to hear it's working again. Keep me updated if things change.
hazel revealed?
Last time i checked lmarena are adding 3 new things
Video models
Auto modality
New model selector
just go on gemini's web itself..
it free
Why pro banana doesn't work anymore
@quaint raptor you'll want to review the information in #1397655624103493813 for a better understanding on how to use the bot.
Sorry to say this model has had a pretty high error rate. Our team is looking into a fix asap.
Hope this will work soon🙏🙏🙏
Same. Fingers crossed.
Love your service, thanks
i wonder if ghostfalcon is flash too, Maybe gem 3 pro?
Thank you! Very glad to hear it!
nah its flash
gemini died
Open-Source SOTA for Agentic Coding
guess the leak only covers fiercefalcon for rn
It's flash.
Is lmarena a permanent video generator
yeah prob so
Only difference between ghost and fierce is search on or off
My guess is ghost is with search
Hmm what do you mean by this?
What is the link to the original source?
@echo aurora free lifetime?? Or any chance of premium
chance of premium
Can't say I'm aware of these plans.
free lifetime
This is our intention.
twitter post
works in AI studio
@echo aurora we respect your hard work bro??
hello 
on the AI
at lmarena
um
and i dont want to create a new chat
pineapple, is your endpoint down?
It seems to be working for me.
Can you: open Developer Tools -> open Network tab -> run a new prompt throwing an error -> in the search bar in dev tools search for the word "Stream" -> open the file that has the Eval ID (random set of numbers/letters you see in the URL) -> and then look for a Status Code.
But how can u access those tool bro lmarena
nvm fixed rn
It's most common that you're hitting a rate limit, can you try these steps @muted timber and let me know what that status code is?
What do you mean?
Can you: open Developer Tools -> open Network tab -> run a new prompt throwing an error -> in the search bar in dev tools search for the word "Stream" -> open the file that has the Eval ID (random set of numbers/letters you see in the URL) -> and then look for a Status Code.
F12
It's a browser setting/option
Yeah so with it open, run a new prompt (in LMArena) that'll then result in an error
This is how we get better information to understand what's going wrong.
so i need to chat another time ?
Oh yeah, same thing, but also with Nano Banana Pro sometimes. It seems like the fetch request returns nothing after a while, but still goes through with 200 code
i need for scripting
Follow the rest of the steps: in the search bar in dev tools search for the word "Stream" -> open the file that has the Eval ID (random set of numbers/letters you see in the URL) -> and then look for a Status Code.
theres some 429 status showing in his network tab
Okay yeah that's rate limit.
English, please
Need to wait to use again
I was directing that at pro pro, because they were able to provide a Status Code
I'm not sure what your Status Code is
In my case i get this as the response:
b3:"Error during image generation with google-genai for model endpoint gemini-3-pro-image-preview: Failed to fetch image: Too Many Requests"
for https://lmarena.ai/nextjs-api/stream/retry-evaluation-session-message/ PUT request
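The response quoted above shows why the status code alone can mislead: the request can come back as 200 while the body text still reports "Too Many Requests". A rough sketch of checking both is below. The helper name `is_rate_limited` is hypothetical, and matching on the phrase "too many requests" is an assumption based only on the message pasted here, not a documented LMArena format.

```python
def is_rate_limited(status_code: int, body: str) -> bool:
    """Return True if the status code or the body text signals a rate limit."""
    if status_code == 429:  # standard HTTP "Too Many Requests"
        return True
    # Assumption: an upstream rate limit may be reported only in the body text.
    return "too many requests" in body.lower()


# Body copied from the message above; the request itself returned 200.
sample_body = (
    'b3:"Error during image generation with google-genai for model endpoint '
    'gemini-3-pro-image-preview: Failed to fetch image: Too Many Requests"'
)

print(is_rate_limited(200, sample_body))  # prints True
```

So when DevTools shows a green 200 but the UI still says "Something went wrong", it is worth opening the response body as well as the status line.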
Well, It was actually at him. I read it off the image he scribbled on.
3pig broken smh
Ah, thank you.
Okay yeah it's rate limit for why you're hitting this error
Yes, but when I chat with him, he lets me use it, and I don't want to because I have all my information on that chat, and he already has all the information from all the files, because I have over 20+.
so i need only to wait? another one day?
can ppl stop calling chatbots he smh
Same. I've noticed that, when you come back to your chat with Opus after a while, it just stops working for some reason
There could be something else that's causing an error @cosmic salmon . However, if you're seeing a 429 Status Code that means it's rate limit. And yes @muted timber will need to wait.
where can i smooch off 3pig access whilst it's not working on lmarena
That's why yesterday he told me to wait 49 minutes (last night), and now I have to work continuously, it will be much harder for me this way... but what can you do...
Nope, that's a 200 Status Code, so i assume that this is either related to the Gemini API or LMarena is losing my image somehow on the way to me
I'll try to log out and change my browser to see if this issue still persists or nah
The gemini API has to work, I just used on AIStudio
You're getting a 🟢 Status Code 200 when you get an error?
im getting a generic error
Yeah, for the PUT fetch request to https://lmarena.ai/nextjs-api/stream/retry-evaluation-session-message/ when retrying my Something went wrong response
It's always generic, dev tools shows this as the response to the request
Is there any way to upload files other than images for the AIs in LMArena?
no
ChatGPT’s new image model just released!
images v2?
or well
gpt image 2
It allows for copyrighted characters, and political figures which were originally rejected in previous versions.
nano banana pro is better
right
It did, just not with a huge announcement for some reason just a post on the OpenAI Discord
what degree of copyright
where to use it
Yeah, that was a preview, a sneak-peek for the announcement that's going to be soon.
holy sh t it's releasing
but it's not out right now is it
openai discord just says this though
Well, way less strict with copyrighted characters than Sora 2 for sure
It is
It’s out
I tested it myself
there's nothing on openai.com
It just released, so it’s probably not there yet
bruh broke again
sure the whole disney uh scale
but if you can generate goku and stuff yeah OpenAI just wants a lawsuit
Anyways, here is an image made with gpt image 2
im pretty sure they dont have the rights to this
Would say the quality degraded a bit in my opinion
I still think nano banana looks better though
aren't they afraid to be sued by uh
who has spongebob rights
it's one image you need to check out more
paramount
They will probably patch this soon and not allow copyrighted characters in the future
at least now we know to rush before it's locked up
we didn't get to rush sora 2 because we thought it'd be normal
just use nano banana pro
paramount is dead livid on copyright, they are cooked if they dont act
Can't. Keeps giving me the error
Here is an image of Donald Trump with Sam Altman using GPT Image 2
defo
paramount is already planning to sue if they see this
Okay this one looks kinda good
now try gemini 3.0 pro image nanobanana
Holy fuuuuuuuuuuuuuuuuuuuuuuuuuuuuck we got a gold rush inbound
u can already do that in nano banana pro
this will be blocked quickly
trust
Alright
let me try and get it to expand an image like nbp can do
nbp lets me seamlessly expand images to 17:6 for use as a profile banner
i think i'm finally gonna update my banner after like 2-3 years
Image 2 isn't on LMArena
ik but they say that it's on chatgpt
Ah, makes sense
It is. Go test it out yourself
i dont use twitter so i have no reason to generate 500 political images but enjoy the gold rush people
have you tested it enough to tell whether it's better than 2.5fig
I would say GPT Image 2 better than Nano Banana, but not better than Nano Banana Pro
that's why i said 2.5fig
you cant judge yet
you know why
google is dropping nano banana 3 flash today
lol
ik
that's why i asked if he's tested it enough
too late
it is a tuesday
Wait, Gemini 3 Flash? I thought it was still 2.5 Flash
tuesday is new day
not yet
it hasnt dropped yet
google always drops stuff on a tuesday
image 2 model is already out for some people
also gemini 3 flash prob tomorrow?
google releases stuff on a tuesday usually
and there is already an entry on vertex for 3 flash/fiercefalcon
Will Gemini 3 Flash be better at coding than 2.5 Flash?
2.5 pro even
the Flash 2.5 09 is near 2.5 Pro level (as always when they release a new flash model)
and it looks like nothing changed, 3.0 Flash must be very near 3.0 Pro level, but without the overreasoning issue now
Oh, image 2 has just released?
yea
How is it? Still nerfed with fictional characters?
nope, they fixed that issue
barely any 3rd party guardrails rn
Nice, and it's available on the free tier?
for a few gens i guess
this is neat too
flash wen 😖
@echo aurora
they dont have insider info
do u?
Thank you for sharing. I'm not surprised the infinite generation bug wouldn't have an error status code associated with it
sure not but
they are already prepping
its this week sure
but when
i am hoping they launch Gemma 4 (which is very likely soon) and someone finetunes it to be responsive like the new generation
it would have same level of Opus 4.5 lmao
them asking us to stalk their huggingface page most likely says its this week
Gemini 3 flash is what I am more excited for since I don’t really care about the flash image generation
how to use prompt here
8% chance it releases today
nah strong 50
i would say 50%
logan tweeted 3 thunders today
they only delayed Gemini 3 pro a lot to make it 100% consistent
i tried to make new message and it went instantly to something went wrong
