#general | Arena | Page 221

keen beacon Dec 14, 2025, 4:07 AM

#

It’s being jammed down everybody’s throat everywhere

#

Some ppl just not cool with it period. It’s their prerogative. They have every right just like we have every right.

native yarrow Dec 14, 2025, 4:08 AM

#

i'd agree

keen beacon Dec 14, 2025, 4:10 AM

#

AI is just the accelerant to the whole controversy surrounding technology, computers, and the digital space, and even further than that, you could even say industrialization in general

#

Look at the Unabomber

#

https://en.wikipedia.org/wiki/Industrial_Society_and_Its_Future

Industrial Society and Its Future

Industrial Society and Its Future, also known as the Unabomber Manifesto, is a 1995 anti-technology essay by Ted Kaczynski. The manifesto contends that the Industrial Revolution began a harmful process of natural destruction brought about by technology, while forcing humans to adapt to machinery, creating a sociopolitical order that suppresses h...

#

https://en.wikipedia.org/wiki/Neo-Luddism

Neo-Luddism

Neo-Luddism or new Luddism is a philosophy opposing many forms of modern technology. The term Luddite is generally used as a pejorative applied to people showing technophobic leanings. The name is based on the historical legacy of the English Luddites, who were active between 1811 and 1817. While the original Luddites were mostly concerned with ...

#

https://en.wikipedia.org/wiki/Technophobia

Technophobia

Technophobia (from Greek τέχνη technē, "art, skill, craft" and φόβος phobos, "fear"), also known as technofear, is the fear or dislike of, or discomfort with, advanced technology or complex devices, especially personal computers, smartphones, and tablet computers. A 2018 study proposed a new conceptual and empirical definition of tech...

#

It's really a war between authenticity and conformity

#

Jean-Jacques Rousseau's concept of "the noble savage." Rousseau argues that in a more natural state, untouched by societal expectations, humans are pure, authentic, and true to their inner selves. It is only when society imposes constraints and norms upon individuals that they deviate from this original state, resulting in the forfeit of their genuine selves

native yarrow Dec 14, 2025, 4:17 AM

#

who's complaining ?

keen beacon Dec 14, 2025, 4:17 AM

#

#

https://en.wikipedia.org/wiki/Technological_Slavery

Technological Slavery

Technological Slavery is a 2008 non-fiction book by the American Theodore Kaczynski, also known as the Unabomber, that expands on his personal philosophy and beliefs regarding technology and freedom.

#

Here is a video breakdown

#

https://youtu.be/zfBsvFoqwhY?si=dop-K-m1_FRUAUny

YouTube

Moon

The Unabomber Tried To Warn You

Go to https://ground.news/moon to see through media bias and become a smarter news consumer. Subscribe through my link for 40% off unlimited access this month.

#

So this is a real ongoing debate with real implications that’s why people are so resistant to AI, which can be viewed through this lens as an accelerant to the ongoing issues of society

#

Believe it or not this dude was actually a victim of MK ultra

#

Took the FBI 17 years to catch him, he only got caught because his brother recognized his writing.

atomic lagoon Dec 14, 2025, 4:30 AM

#

keen beacon Believe it or not this dude was actually a victim of MK ultra

I heard about that I believe but wdym MK Ultra?

ocean ferry Dec 14, 2025, 4:31 AM

#

???

i didn't dawg

keen beacon Dec 14, 2025, 4:31 AM

#

He’s actually a test subject for it

keen beacon Dec 14, 2025, 4:32 AM

#

atomic lagoon I heard about that I believe but wdym MK Ultra?

https://youtu.be/zXeak1BVh-w?si=RFnDvMqtUU9Wlo_z

YouTube

Nutty History

How Harvard created the Unabomber

Viewer Discretion is Advised

You’ve signed up to participate in a psychological study on your university campus. It’s run by an esteemed professor, you’ll earn some pocket change, and as a side bonus perhaps contribute to the greater scientific knowledge of humanity. But the next thing you know, you have electrodes strapping you to machi...

atomic lagoon Dec 14, 2025, 4:32 AM

#

keen beacon So this is a real ongoing debate with real implications that’s why people are so...

And I'd say majority of people are resistant to it because of "eco system"

ocean ferry Dec 14, 2025, 4:32 AM

#

ocean ferry ??? i didn't dawg

but he uses ai which is the irony

atomic lagoon Dec 14, 2025, 4:33 AM

#

ocean ferry but he uses ai which is the irony

Whats wrong with using AI? Whats the problem with that dude idk him

atomic lagoon Dec 14, 2025, 4:33 AM

#

keen beacon https://youtu.be/zXeak1BVh-w?si=RFnDvMqtUU9Wlo_z

I will watch

keen beacon Dec 14, 2025, 4:33 AM

#

Yeh it’s crazy

ocean ferry Dec 14, 2025, 4:39 AM

#

atomic lagoon Whats wrong with using AI? Whats the problem with that dude idk him

he posted "how to avoid ai"

#

.

atomic lagoon Dec 14, 2025, 4:42 AM

#

ocean ferry he posted "how to avoid ai"

Ohhh

native yarrow Dec 14, 2025, 4:48 AM

#

good news

#

NBP back on yupp

#

i am still hoping 4k will come back 💔

#

yo omg they keep taking it off lmao

thorny schooner Dec 14, 2025, 5:00 AM

#

☠️

thorn path Dec 14, 2025, 5:10 AM

#

i hate gpt 5.2 from my own subscription, but im on the lmarena voting board and the anonymous responses i keep saying are better than the alternative end up being 5.2 somehow

thorny schooner Dec 14, 2025, 5:11 AM

#

Well I legitimately can't even make a new chat right now on the website 🥲 (i mean I can but it won't load at all after I put in the prompt)

thorn path Dec 14, 2025, 5:11 AM

#

perhaps i mightve been too hard on it because it keeps winning atm

thorny schooner Dec 14, 2025, 5:11 AM

#

Why do you hate me ai

#

https://tenor.com/view/crazy-monkey-gif-21553825

Tenor

quartz light Dec 14, 2025, 5:19 AM

#

thorn path i hate gpt 5.2 from my own subscription, but im on the lmarena voting board and ...

5.2 is z best

#

Nerd

native yarrow Dec 14, 2025, 5:25 AM

#

how is gpt 5.2 for creative writing anyway

keen beacon Dec 14, 2025, 5:44 AM

#

obtuse smelt Dec 14, 2025, 5:52 AM

#

wow

burnt sinew Dec 14, 2025, 6:14 AM

#

must go to meet pineapple in person for him to verify you manually

queen mountain Dec 14, 2025, 6:22 AM

#

What is that

plucky sparrow Dec 14, 2025, 6:24 AM

#

hmm, catching up is not as difficult as closing the gap and excelling though

#

wonder if they can take the lead and sustain it 🤔

remote gulch Dec 14, 2025, 6:48 AM

#

keen beacon

As a Chinese, I think DeepSeek and other models are not as good as Gemini 3 Pro, but this evaluation is only temporary

hot pebble Dec 14, 2025, 6:50 AM

#

Are all the mods busy ?
Tagged one guy 2 times in last 5 days and yet no response….. 💀

crude lagoon Dec 14, 2025, 6:57 AM

#

native yarrow how is gpt 5.2 for creative writing anyway

Lame af as compared to opus 4.5 and gemini 3 pro

native yarrow Dec 14, 2025, 6:57 AM

#

yup expected

crude lagoon Dec 14, 2025, 7:00 AM

#

native yarrow yup expected

yeah it's only slightly better than deepseek v3.2

native yarrow Dec 14, 2025, 7:00 AM

#

crude lagoon yeah it's only slightly better than deepseek v3.2

gemini 3 actually good at writing huh?

crude lagoon Dec 14, 2025, 7:00 AM

#

native yarrow gemini 3 actually good at writing huh?

yeah asked it to write some novel scenes for me

native yarrow Dec 14, 2025, 7:01 AM

#

what is it best at prose wise?

crude lagoon Dec 14, 2025, 7:01 AM

#

it gave quite good responses

crude lagoon Dec 14, 2025, 7:01 AM

#

native yarrow what is it best at prose wise?

opus 4.5

native yarrow Dec 14, 2025, 7:02 AM

#

i've noticed opus loves writing the name marcus in its stories

crude lagoon Dec 14, 2025, 7:02 AM

#

tho it gets cringe asf sometimes

native yarrow Dec 14, 2025, 7:02 AM

#

or claude in general actually

#

LMAO yeah i'd agree

#

https://tenor.com/view/ishowspeed-try-not-to-laugh-gif-7682731162751353849

Tenor

crude lagoon Dec 14, 2025, 7:02 AM

#

native yarrow or claude in general actually

yeah 😭

native yarrow Dec 14, 2025, 7:02 AM

#

like genuinely sometimes i gotta step in cus it just gets so unbearably hard to read

crude lagoon Dec 14, 2025, 7:02 AM

#

and also that 67 search on google

native yarrow Dec 14, 2025, 7:02 AM

#

with claude

crude lagoon Dec 14, 2025, 7:02 AM

#

We're going

#

backwards

native yarrow Dec 14, 2025, 7:03 AM

#

lmfaooo

crude lagoon Dec 14, 2025, 7:03 AM

#

😭

crude lagoon Dec 14, 2025, 7:03 AM

#

native yarrow like genuinely sometimes i gotta step in cus it just gets so unbearably hard to ...

Indeed- sometimes It will just add dialogues from the barbie movie to random character writing

native yarrow Dec 14, 2025, 7:04 AM

#

gemini 3 i've heard has a nice flow with writing and it made a SICK ass short story about ww1, even above claude, so i think its creativity is also better but prose goes to claude

native yarrow Dec 14, 2025, 7:04 AM

#

crude lagoon Indeed- sometimes It will just add dialogues from the barbie movie to random cha...

is it js me or do you also have a problem with claude forcing characters to flirt by saying "you're impossible" which nobody in my fkn life says irl??

crude lagoon Dec 14, 2025, 7:04 AM

#

native yarrow gemini 3 i've heard has a nice flow with writing and it made a SICK ass short st...

Gemini is my goat 🐐 bro gave me a story which made me actually cry 🤌🏼

crude lagoon Dec 14, 2025, 7:05 AM

#

native yarrow is it js me or do you also have a problem with claude forcing characters to flir...

I mean 2020 they used to say that

native yarrow Dec 14, 2025, 7:05 AM

#

yeah it's hella good overall but claude mogs ppl in coding

#

well just completely bashes gemini

crude lagoon Dec 14, 2025, 7:05 AM

#

native yarrow yeah it's hella good overall but claude mogs ppl in coding

but money issues Frustrated

native yarrow Dec 14, 2025, 7:05 AM

#

crude lagoon I mean 2020 they used to say that

yeaaa but like i never hear anyone call anybody that

#

ever..,

native yarrow Dec 14, 2025, 7:05 AM

#

crude lagoon but money issues Frustrated

YEAAA

#

i mean i just use yupp but even then it costs so many credits

crude lagoon Dec 14, 2025, 7:06 AM

#

native yarrow well just completely bashes gemini

gemini ain't good for agentic coding. For one shot idk maybe it's better than Claude

crude lagoon Dec 14, 2025, 7:06 AM

#

native yarrow i mean i just use yupp but even then it costs so many credits

indeed

native yarrow Dec 14, 2025, 7:06 AM

#

yeah but that's a toss up

#

so i usually just use gemini over claude unless i want something SPECIFIC

crude lagoon Dec 14, 2025, 7:07 AM

#

native yarrow yeaaa but like i never hear anyone call anybody that

"shut you're impossible" that sounds as if an oldie tryna sound cool

native yarrow Dec 14, 2025, 7:07 AM

#

yeah like i just

#

it's just too hard to read

#

but prose very nice

crude lagoon Dec 14, 2025, 7:08 AM

#

native yarrow so i usually just use gemini over claude unless i want something SPECIFIC

yup yup but the thing is gemini 3 pro is still a preview version and the knowledge cutoff is January 2025..?

native yarrow Dec 14, 2025, 7:09 AM

#

well, it has search

#

don't forget that

#

search is always the fix to knowledge cut offs

crude lagoon Dec 14, 2025, 7:09 AM

#

native yarrow it's just too hard to read

yup ngl it feels as if it's written by actual ai unlike gpt and gemini

native yarrow Dec 14, 2025, 7:10 AM

#

gpt was ACTUALLY good, just got way too censored

#

well gpt 4o was at least

crude lagoon Dec 14, 2025, 7:10 AM

#

native yarrow search is always the fix to knowledge cut offs

nah it sometimes forget that there's something called nano-banana

crude lagoon Dec 14, 2025, 7:10 AM

#

native yarrow well gpt 4o was at least

Gpt 4o was the peak of OpenAI

#

Everything after that feels like downgrade

native yarrow Dec 14, 2025, 7:11 AM

#

genuinely

#

such a fun model

#

was very smart and was so good at writing

#

it's not even the same gpt 4o model anymore

crude lagoon Dec 14, 2025, 7:11 AM

#

I won't be surprised if tomorrow chatgpt gets renamed as Copilot pro Max ngl

crude lagoon Dec 14, 2025, 7:12 AM

#

native yarrow it's not even the same gpt 4o model anymore

that's the thing 😭

native yarrow Dec 14, 2025, 7:12 AM

#

yeah they changed it without telling people

#

open ai is scummy

ocean ferry Dec 14, 2025, 8:00 AM

#

crude lagoon yup yup but the thing is gemini 3 pro is still a preview version and the knowled...

the preview version sucks

#

but there's a hope guys
https://x.com/AiBattle_/status/1999858840045764777?s=19

AiBattle (@AiBattle_)

af97 🆚 Gemini 3 Pro - 3D model of an Iphone 16 Pro Max

Seems like new A/B models are being tested on AI Studio right now. Might be new checkpoints for Gemini 3 Pro

brisk turret Dec 14, 2025, 8:22 AM

#

What's up with 5.2 not being on the leaderboard? Kinda sus

#

How is it on webdev but not the main text one

weary galleon Dec 14, 2025, 8:29 AM

#

GPT 5.2 proved that Stargate Project is the most shameful page in American history. Gemini 3 Pro and Claude 4.5 Opus got NONE of governments money, NONE!

torn mantle Dec 14, 2025, 8:30 AM

#

ocean ferry but there's a hope guys https://x.com/AiBattle_/status/1999858840045764777?s=19

it will be nerfed again

#

im kinda liking the stealth gemini 3 flash model on lmarena

#

ghost something

#

ghostfalcon

#

its really good

#

https://x.com/JasonBotterill/status/2000099678856343842

JB (@JasonBotterill)

The Artificial Analysis ranking for GPT-5.2 is up and it ranks second just beating Opus 4.5

#

what a shameless ranking

#

xhigh ultra pro max vip

weary galleon Dec 14, 2025, 8:38 AM

#

torn mantle what a shameless ranking

Agreed. One user here spreads this crap every single day here.

ocean ferry Dec 14, 2025, 8:38 AM

#

torn mantle what a shameless ranking

yesterday i checked it x-high is 72

#

now 73

ocean ferry Dec 14, 2025, 8:39 AM

#

torn mantle im kinda liking the stealth gemini 3 flash model on lmarena

it will be nerfed again

south herald Dec 14, 2025, 8:48 AM

#

Sorry to put my request here but I'm new on LMarena and I can't create an account
I'm stuck at password creation.
I didn't find a support page or support contact

sterile tartan Dec 14, 2025, 8:57 AM

#

This is Amazing and Beautiful @whole sundial did you see this?

#

And did you test more?

whole sundial Dec 14, 2025, 8:58 AM

#

sterile tartan And did you test more?

i should test more soon

sterile tartan Dec 14, 2025, 8:59 AM

#

torn mantle xhigh ultra pro max vip

Still just 6 points more than Kimi trophy3d

sterile tartan Dec 14, 2025, 8:59 AM

#

whole sundial i should test more soon

Yeah make more of the pro sub lol

#

But what's your verdict so far?

#

Is it promising?

#

I'm just kinda a general user so it's hard for me to judge

whole sundial Dec 14, 2025, 9:17 AM

#

sterile tartan But what's your verdict so far?

it's good in terms of post-training for popular tasks by coding, hopefully they can move to k2 base or a different base in the future

#

the end of this month and early next year should have some new 1t+ base models be released, deepseek v4 and grok 3 definitely

#

the massive influx of h200s into china soon should help with them training better and bigger base models, the chinese government accepted defeat and are allowing their major ai companies to buy them as they are much more powerful than china's most powerful homegrown chips

#

kinda unrelated to momentum labs but still important as they may soon find the limits of their current base and will have to move to a new one, hopefully by then there will be some more really good base models for them to use, china (and in turn the whole open source ai community) is lacking rn, glm 4.5 and kimi k2 are great, ling 1t and deepseek v3 are fine, everything else is honestly mid

remote arrow Dec 14, 2025, 9:22 AM

#

🤔

weary galleon Dec 14, 2025, 9:33 AM

#

whole sundial kinda unrelated to momentum labs but still important as they may soon find the l...

Is MomentumLabs Chinese company?

whole sundial Dec 14, 2025, 9:34 AM

#

weary galleon Is MomentumLabs Chinese company?

no, i tried to clarify that and i should clarify some more, they are from the UK and the Netherlands

#

#1397655624103493813
<@&1349916362595635286>

sterile tartan Dec 14, 2025, 9:38 AM

#

whole sundial kinda unrelated to momentum labs but still important as they may soon find the l...

Noted Noted

#

You are very knowledgeable of Ai man

#

I have a question so actually i have kinda made a 8 SOTA Models Orchestration for Engineering Like System-Persona Prompts, do you think this Would Give Better Quality? @whole sundial

#

The Models are
Team West
Gemini
Opus
GPT
Grok

Team East
Kimi
Deepseek
Qwen
GLM

whole sundial Dec 14, 2025, 9:42 AM

#

sterile tartan I have a question so actually i have kinda made a 8 SOTA Models Orchestration fo...

maybe, but i think you should be using something like gpt5.2-pro or gemini 3 pro deepthink instead, combining different models could have some effect, but you're probably better off using one of the two models i mentioned previously since they are based on either one of the world's most powerful models

sterile tartan Dec 14, 2025, 9:43 AM

#

whole sundial maybe, but i think you should be using something like gpt5.2-pro or gemini 3 pro...

I can't pay for those man

#

💀

whole sundial Dec 14, 2025, 9:46 AM

#

sterile tartan I can't pay for those man

true, in that case just combine gemini or gpt 5.2 together 4 times

sterile tartan Dec 14, 2025, 9:48 AM

#

whole sundial true, in that case just combine gemini or gpt 5.2 together 4 times

Why leave out opus 4.5?

#

Also would GPT 5.2 Extra High Work because is Available on Yupp Ai?

whole sundial Dec 14, 2025, 9:50 AM

#

sterile tartan Also would GPT 5.2 Extra High Work because is Available on Yupp Ai?

yeah, almost as good as the pro models in its own rights

whole sundial Dec 14, 2025, 9:50 AM

#

sterile tartan Why leave out opus 4.5?

yeah you can put opus in there, maybe grok 4.1 too

#

the chinese models aren't really worth it as they distill off of the american models anyways

sterile tartan Dec 14, 2025, 9:50 AM

#

So i should just get rid of team East

#

💀

whole sundial Dec 14, 2025, 9:51 AM

#

sterile tartan So i should just get rid of team East

yeah, in fact it may be pushing down the performance

pallid field Dec 14, 2025, 9:51 AM

#

Hey guys,
Is there a usage limit on grok models? Like opus?

sterile tartan Dec 14, 2025, 9:51 AM

#

whole sundial yeah, in fact it may be pushing down the performance

💀 i thought they have alot of potential

whole sundial Dec 14, 2025, 9:53 AM

#

sterile tartan 💀 i thought they have alot of potential

they do, they just need start reliably beating us models and not have tell-tale signs of distilling (deepseek and glm have signs of gemini distillation, kimi k2 thinking may be distilled from gpt-oss, idk about qwen but they may be distilling as well)

sterile tartan Dec 14, 2025, 9:54 AM

#

whole sundial they do, they just need start reliably beating us models and not have tell-tale ...

Ahhh i see

#

So i drop them for now i guess

whole sundial Dec 14, 2025, 9:54 AM

#

yeah

weary galleon Dec 14, 2025, 9:56 AM

#

whole sundial maybe, but i think you should be using something like gpt5.2-pro or gemini 3 pro...

GPT 5.2 Pro is a piece of CRAP! The same as GPT 5.2 Thinking, just with higher latency, 40 minutes or something.

whole sundial Dec 14, 2025, 9:56 AM

#

just use gpt 5.2 xhigh then, not much of a performance degradation there

sterile tartan Dec 14, 2025, 9:57 AM

#

whole sundial just use gpt 5.2 xhigh then, not much of a performance degradation there

I can, that's easiest, simplest and best

#

GPT 5.2 or all four if i want to go all in

#

I will just tell GPT XH to run multiple Iteration-Refinment Cycles-Loops for Better Prompts

hushed gyro Dec 14, 2025, 10:02 AM

#

torn mantle https://x.com/JasonBotterill/status/2000099678856343842

Is there a MASSIVE difference between Gemini 3 pro and Kimi K2 Thinking?

weary galleon Dec 14, 2025, 10:03 AM

#

whole sundial just use gpt 5.2 xhigh then, not much of a performance degradation there

Are you kidding me? It's even worse than high, because it has COMPLETELY the same performance and significantly higher latency.

hushed gyro Dec 14, 2025, 10:10 AM

#

chat how to jailbreak Gemini 3 Pro?

brisk turret Dec 14, 2025, 10:32 AM

#

LLMs do that for us now

noble maple Dec 14, 2025, 10:35 AM

#

Yes I had it for 10min last night. It disappeared right after

#

What's that ?

whole sundial Dec 14, 2025, 10:41 AM

#

noble maple What's that ?

it's basically when a company tests something to some people but not others, or when a company tests two different things to two different groups

#

in this case, the video arena is only being tested to some people, most don't have it
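
A rough sketch of how that kind of split rollout is often done, not how LMArena actually does it: hash the account ID into a bucket so the same account always lands in the same group. The function names, the `video-arena` label, and the 10% figure are all made up for illustration.

```python
import hashlib


def _bucket(account_id: str, label: str, size: int) -> int:
    """Map (label, account) to a stable integer in [0, size)."""
    digest = hashlib.sha256(f"{label}:{account_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % size


def in_rollout(account_id: str, feature: str, rollout_percent: int) -> bool:
    """Case 1: a feature shown to some accounts but not others."""
    return _bucket(account_id, feature, 100) < rollout_percent


def assign_variant(account_id: str, experiment: str, variants=("A", "B")) -> str:
    """Case 2: two different things shown to two different groups."""
    return variants[_bucket(account_id, experiment, len(variants))]


if __name__ == "__main__":
    # Hypothetical account IDs; roughly 10% of them should see the feature.
    for acct in ("acct-1", "acct-2", "acct-3"):
        print(acct, in_rollout(acct, "video-arena", 10), assign_variant(acct, "video-arena"))
```

Because the bucket comes from a hash of the account rather than a random draw per visit, an account that has the feature keeps it across sessions and browsers, while a different account may never see it.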

noble maple Dec 14, 2025, 10:41 AM

#

Which site ?

hushed gyro Dec 14, 2025, 10:42 AM

#

chat what is the message limit for 3 Pro on Gemini?

#

@zealous sparrow i thought you could connect to Google drive???

Screenshot_2025-12-14-10-47-04-859_com.openai.chatgpt.jpg

limpid nymph Dec 14, 2025, 11:03 AM

#

You have reached your rate limit for claude-opus-4-5-20251101. Please try again in 50 minutes.

#

please help me

lofty frigate Dec 14, 2025, 11:06 AM

#

Yo where tf is reve-v1 when generating an image?

weary galleon Dec 14, 2025, 11:14 AM

#

limpid nymph please help me

What is unclear?

whole sundial Dec 14, 2025, 11:22 AM

#

lofty frigate Yo where tf is reve-v1 when generating an image?

they took it away and replaced it with a stealth model called epsilon that is also by reve, idk why but the only way to access it is by using battle mode

#

my guess is that it's a new checkpoint and they didn't want two of the same model on lmarena at the same time

sweet creek Dec 14, 2025, 11:25 AM

#

What is the ‘expert’ column about in the rankings? Is there any information about this?

slim gorge Dec 14, 2025, 11:26 AM

#

torn mantle https://x.com/JasonBotterill/status/2000099678856343842

xhigh 💀

sterile tartan Dec 14, 2025, 11:36 AM

#

Gemini, GPT, Opus if you had to choose one or two? @whole sundial

whole sundial Dec 14, 2025, 11:38 AM

#

sterile tartan Gemini, GPT, Opus if you had to choose one or two? @whole sundial

gemini, if two then gpt as well, I don't code much so I didn't include opus, it has inferior world knowledge compared to the other two

lofty frigate Dec 14, 2025, 11:41 AM

#

whole sundial they took it away and replaced it with a stealth model called `epsilon` that is ...

What??? Come on man it was one of my favorite🤦🏾‍♂️🤦🏾‍♂️🤦🏾‍♂️🤦🏾‍♂️

#

Out of all the damn models, that one???

sterile tartan Dec 14, 2025, 11:41 AM

#

whole sundial gemini, if two then gpt as well, I don't code much so I didn't include opus, it ...

True but opus is not only for coding is it?

#

Is great for logic and reasoning isn't it?

#

Like for Prompt Engineering because it has great structuring

whole sundial Dec 14, 2025, 11:42 AM

#

sterile tartan True but opus is not only for coding is it?

true, but most people think of opus as a coding model, it's good at other things as well

lofty frigate Dec 14, 2025, 11:42 AM

#

Mannnnnnn this is some bs

whole sundial Dec 14, 2025, 11:43 AM

#

lofty frigate What??? Come on man it was one of my favorite🤦🏾‍♂️🤦🏾‍♂️🤦🏾‍♂️🤦🏾‍♂️

yeah, idk what to say. honestly, it's stupid to replace it like that. they should've tested the new version while keeping the old version like with seedream 4 and autumn (seedream 4.5)

#

or with any other model but reve, actually

#

openai, google, bfl, and bytedance all didn't do this, they all kept their older models
idk what is up with reve

#

even on the text side i've never seen this before, they stealth test new checkpoints alongside older ones, they don't replace a released model with a stealth one

#

but epsilon is reve, you'll just need to run your prompt a few times to get it, honestly all stealth models should be selectable like yupp imo

sterile tartan Dec 14, 2025, 11:49 AM

#

whole sundial but epsilon is reve, you'll just need to run your prompt a few times to get it, ...

Yeah but recently yupp removed nbp 4k and sd 4.5 4k

#

💀

sterile tartan Dec 14, 2025, 11:49 AM

#

whole sundial true, but most people think of opus as a coding model, it's good at other things...

I see so what are the best strengths of opus?

floral knoll Dec 14, 2025, 11:50 AM

#

OMG Video Generation working on LMArena (Webpage) thxxxxxxxxxxxxxxxxxxxxxxxxx

#

Why not News? ❤️

sterile tartan Dec 14, 2025, 11:50 AM

#

Or i only feel attracted towards it because is expensive?

sterile tartan Dec 14, 2025, 11:51 AM

#

floral knoll Why not News? ❤️

So the world doesn't abominate the compute expenses

whole sundial Dec 14, 2025, 11:51 AM

#

sterile tartan I see so what are the best strengths of opus?

idk because i don't use it, but i've seen it used mainly for coding
also yes it is still expensive, less expensive than before but still expensive compared to others

whole sundial Dec 14, 2025, 11:51 AM

#

floral knoll Why not News? ❤️

not fully out yet, you're lucky because only some people are getting it

sterile tartan Dec 14, 2025, 11:52 AM

#

I see

#

So is grok anything worth it or not

lofty frigate Dec 14, 2025, 11:52 AM

#

whole sundial but epsilon is reve, you'll just need to run your prompt a few times to get it, ...

I see thanks for the update

whole sundial Dec 14, 2025, 11:52 AM

#

sterile tartan Yeah but recently yupp removed nbp 4k and sd 4.5 4k

nbp 4k is expensive, sd 4.5 4k costs the same as standard. idk why they call it max when it costs the same, just a higher resolution

whole sundial Dec 14, 2025, 11:52 AM

#

sterile tartan So is grok anything worth it or not

not really compared to the other three

sterile tartan Dec 14, 2025, 11:53 AM

#

whole sundial nbp 4k is expensive, sd 4.5 4k costs the same as standard. idk why they call it ...

True but now both gone for selection

whole sundial Dec 14, 2025, 11:53 AM

#

whole sundial nbp 4k is expensive, sd 4.5 4k costs the same as standard. idk why they call it ...

honestly dumb to charge more credits for a model that costs the same as the other

sterile tartan Dec 14, 2025, 11:53 AM

#

whole sundial not really compared to the other three

I see do you know the grok 4.20 won the alpha arena trading?

sterile tartan Dec 14, 2025, 11:54 AM

#

whole sundial honestly dumb to charge more credits for a model that costs the same as the othe...

Idk why they have this kind of system yeah

whole sundial Dec 14, 2025, 11:54 AM

#

sterile tartan I see do you know the grok 4.20 won the alpha arena trading?

based on some random tweet from an elon glazer

#

i don't trust it, model isn't even released

sterile tartan Dec 14, 2025, 11:55 AM

#

whole sundial based on some random tweet from an elon glazer

Like the alpha arena website is there and legit you still don't trust it?

whole sundial Dec 14, 2025, 11:56 AM

#

it is there, still unreleased though

#

probably will be next week

sterile tartan Dec 14, 2025, 11:56 AM

#

Yeah

zealous sparrow Dec 14, 2025, 11:56 AM

#

whole sundial they took it away and replaced it with a stealth model called `epsilon` that is ...

ah so epsilon is reve

#

i wanted to know what it is

tender peak Dec 14, 2025, 11:56 AM

#

Hello

sterile tartan Dec 14, 2025, 11:56 AM

#

Well maybe it will be on par with other 3

whole sundial Dec 14, 2025, 11:57 AM

#

hopefully

sterile tartan Dec 14, 2025, 11:57 AM

#

🤞🏻

zealous sparrow Dec 14, 2025, 11:57 AM

#

whole sundial hopefully

how did you find out epsilon is reve

sterile tartan Dec 14, 2025, 11:57 AM

#

@whole sundial tysm for finding me you have great impact on my life

#

I was so confused and time consumed by my own systems

#

Now i can work with simplicity

whole sundial Dec 14, 2025, 11:58 AM

#

zealous sparrow how did you find out epsilon is reve

they have the same output plus the reve v1 model was removed at the same time as epsilon was added

whole sundial Dec 14, 2025, 11:58 AM

#

sterile tartan @whole sundial tysm for finding me you have great impact on my life

you're welcome

ocean ferry Dec 14, 2025, 11:58 AM

#

whole sundial i don't trust it, model isn't even released

elon says it's grok 4.20 experimental

zealous sparrow Dec 14, 2025, 11:58 AM

#

whole sundial they have the same output plus the reve v1 model was removed at the same time as...

makes sense

whole sundial Dec 14, 2025, 11:59 AM

#

ocean ferry elon says it's grok 4.20 experimental

yeah i trust it now

sterile tartan Dec 14, 2025, 11:59 AM

#

whole sundial you're welcome

💙🌌❤️

whole sundial Dec 14, 2025, 11:59 AM

#

it's probably on lmarena as some model

zealous sparrow Dec 14, 2025, 12:01 PM

#

whole sundial it's probably on lmarena as some model

theres some grok codenamed models on lmarena

#

you would just need to pinpoint them

livid ridge Dec 14, 2025, 12:02 PM

#

if I want to reuse an already executed prompt, add some detail, and make another prompt based on it, how can I do this?

#

actually i want to link one prompt to another prompt in single prompt like character details etc

weary galleon Dec 14, 2025, 12:07 PM

#

Just a reminder. GPT-5.2 is BAD!

latent crest Dec 14, 2025, 12:10 PM

#

Good morning

zealous sparrow Dec 14, 2025, 12:15 PM

#

Under 1000 lines of code btw
https://019b1cbd-304f-75e9-a01b-b9dceb231d0e.arena.site [ghostfalcon]
We need to start to believe in our AIs man, instead of pushing them to limits.

Defuse the Logic - KTANE Clone

Built with LMArena - Content is user-generated and unverified

fleet lintel Dec 14, 2025, 12:18 PM

#

zealous sparrow Under 1000 lines of code btw https://019b1cbd-304f-75e9-a01b-b9dceb231d0e.arena....

You have been testing these falcon models a lot. What is your opinion so far on these models?
Good, meh, bad?

fleet lintel Dec 14, 2025, 12:19 PM

#

weary galleon Just a reminder. GPT-5.2 is BAD!

Scammed by Scam altman again

zealous sparrow Dec 14, 2025, 12:21 PM

#

fleet lintel You have been testing these falcon models a lot. What is your opinion so far on ...

solid

#

ghostfalcon seems to have improved overtime tbh

#

doesn't write a lot of code, but still fulfills the task given

fleet lintel Dec 14, 2025, 12:22 PM

#

zealous sparrow ghostfalcon seems to have improved overtime tbh

That could be the case. They might be fixing bugs etc

fleet lintel Dec 14, 2025, 12:25 PM

#

zealous sparrow solid

Do you think it's better than 2.5 pro?
I would be a bit disappointed if it's worse.

zealous sparrow Dec 14, 2025, 12:25 PM

#

fleet lintel Do you think it's better than 2.5 pro? I would be a bit disappointed if it's wo...

I guess

#

I am going to guess google will put out another checkpoint tho.

#

A thing ive noticed tho, is if you ask for too many modules, it wont program them.

#

Well, it will work but just a click solves them

floral knoll Dec 14, 2025, 12:31 PM

#

floral knoll OMG Video Generation working on LMArena (Webpage) thxxxxxxxxxxxxxxxxxxxxxxxxx

luuuuuuuuul is deleted.....:(

zealous sparrow Dec 14, 2025, 12:32 PM

#

floral knoll luuuuuuuuul is deleted.....:(

no i still have it

floral knoll Dec 14, 2025, 12:33 PM

#

zealous sparrow no i still have it

URL Link? ❤️

zealous sparrow Dec 14, 2025, 12:33 PM

#

floral knoll URL Link? ❤️

wont work for you if it doesnt show up on your website

#

It's tied to accounts

weary galleon Dec 14, 2025, 12:35 PM

#

fleet lintel Scammed by Scam altman again

Scam Altman stole $500B from American taxpayers! And even that didn't help him to compete with competitors.

floral knoll Dec 14, 2025, 12:35 PM

#

zealous sparrow It's tied to accounts

im log in with my Google Acc, not working 🙁

zealous sparrow Dec 14, 2025, 12:36 PM

#

floral knoll im log in with my Google Acc, not working 🙁

did that specific account have it

agile nova Dec 14, 2025, 1:05 PM

#

Website down?

thorny schooner Dec 14, 2025, 1:11 PM

#

I mean it's still up for me ( even if I absolutely can't do anything)

weary galleon Dec 14, 2025, 1:11 PM

#

Poll: Which model should get the prize "Worst model of 2025"?
Winner: GPT 5.2 (7 of 15 votes)

thorny schooner Dec 14, 2025, 1:14 PM

#

https://tenor.com/view/no-the-office-michael-scott-scream-gif-16929305

#

#

Happened on my alternate browser

stray aspen Dec 14, 2025, 1:22 PM

#

whats the claude system prompt

alpine oasis Dec 14, 2025, 1:31 PM

#

halo

stray aspen Dec 14, 2025, 1:34 PM

#

what are the neww studio a/b test for

queen veldt Dec 14, 2025, 1:34 PM

#

Some new model

#

Testing what will users vote more

sterile tartan Dec 14, 2025, 1:35 PM

#

Gemini 3 flash

#

Will come in some days

robust sluice Dec 14, 2025, 1:37 PM

#

I only got Flux models on Battle, is this normal ?

stiff coral Dec 14, 2025, 1:40 PM

#

hi everyone Can I copy the link to my favorite chat bot on the lmarena.ai website somewhere? Is there an lmarena.ai app?

ocean ferry Dec 14, 2025, 1:50 PM

#

stray aspen what are the neww studio a/b test for

Gemini 3 Flash/Pro checkpoint

sterile tartan Dec 14, 2025, 1:53 PM

#

Gemini 3 Pro Final

#

Gemini 3 Flash Preview Coming

glass arch Dec 14, 2025, 1:55 PM

#

hey guys I need you all to give me some insight

#

I am using gemini 3 pro right now, but is chatgpt 5.2 much better?

stray aspen Dec 14, 2025, 1:56 PM

#

no

#

it sucks

#

its literally so trash

glass arch Dec 14, 2025, 1:56 PM

#

I moved away from chatgpt because I did not like its way of speaking to me

#

does 5.2 worsen or fix that?

stray aspen Dec 14, 2025, 1:56 PM

#

it sucks bro

#

just use gemini or claude lol

glass arch Dec 14, 2025, 1:56 PM

#

ok

#

is claude better than gemini?

stray aspen Dec 14, 2025, 1:56 PM

#

it also sucks for coding

stray aspen Dec 14, 2025, 1:57 PM

#

glass arch is claude better than gemini?

i mean it gives you full projects in one prompt

#

thats what i like

#

the responses are more complete than gemini

glass arch Dec 14, 2025, 1:57 PM

#

nice

#

does it have ridiculous rate limits?

stray aspen Dec 14, 2025, 1:58 PM

#

claude is very expensive

vivid coral Dec 14, 2025, 1:59 PM

#

Commie Claude does, yes.

glass arch Dec 14, 2025, 2:00 PM

#

stray aspen claude is very expensive

dam

#

ok I guess I am sticking with gemmy now

#

I really did like openai though

#

while I was using it

#

it's just really annoying that they neutered the poo out of chatgpt and now it is impossible to talk to

stray aspen Dec 14, 2025, 2:02 PM

#

it sucks lol

#

its so censored that it feels like it was trained in north korea

proud vine Dec 14, 2025, 2:05 PM

#

hello

#

Does this have limits?

vivid coral Dec 14, 2025, 2:07 PM

#

Ok, I keep hearing this censorship stuff for 5.2. I use AI differently than most in here as a sports and prediction market modeler, so it hasn't affected me, yet. What is everyone experiencing

hollow ivy Dec 14, 2025, 2:09 PM

#

Did Opus-4.5 suffered intelligence regression?

#

it gave an inferior output, compared to its performance yesterday :/

glass arch Dec 14, 2025, 2:10 PM

#

vivid coral Ok, I keep hearing this censorship stuff for 5.2. I use AI differently than most...

it refuses a lot of political/medical questions because of guardrails

proud vine Dec 14, 2025, 2:11 PM

#

proud vine Does this have limits?

Can anybody tell me about this?

raw turtle Dec 14, 2025, 2:11 PM

#

First, five videos are allowed in a day, and now how many videos can we make per day? Please let me know?

stray aspen Dec 14, 2025, 2:12 PM

#

bruh

raw turtle Dec 14, 2025, 2:17 PM

#

Can anybody tell me about this?

whole swallow Dec 14, 2025, 2:26 PM

#

When gemini 3.5 pro?

hollow ivy Dec 14, 2025, 2:26 PM

#

whole swallow When gemini 3.5 pro?

next year

errant cave Dec 14, 2025, 2:29 PM

#

hollow ivy Did Opus-4.5 suffered intelligence regression?

It feels like major models are getting worse across the board

narrow comet Dec 14, 2025, 2:29 PM

#

gemini 3 pro isn't as good as gemini 2.5 preview 0325 lol

errant cave Dec 14, 2025, 2:29 PM

#

I think this is because we're basically trying to brute-force improvement

whole swallow Dec 14, 2025, 2:29 PM

#

errant cave It feels like major models are getting worse across the board

They trying to cut costs..

errant cave Dec 14, 2025, 2:30 PM

#

They don't care lol investors will give them infinite money either way (until they don't)

whole swallow Dec 14, 2025, 2:30 PM

#

Nah bruh that's not really how it works

#

A business's first goal is to cut expenses, as they always did

errant cave Dec 14, 2025, 2:31 PM

#

That's not how AI companies and departments operate though

whole swallow Dec 14, 2025, 2:31 PM

#

Investors arent a magic entity, they are people who expect their money back

errant cave Dec 14, 2025, 2:31 PM

#

They're basically running on the promise that they'll make profit someday

#

Which is what keeps investors investing in them

#

Exactly

#

Thing is they haven't really been stingy with the money they're supposed to start making a profit with so far

whole swallow Dec 14, 2025, 2:31 PM

#

Yeah it may seem like that on the outside, since this was a big bet, but inside they are for sure thinking about how to cut costs as much as possible while maintaining good performance

narrow comet Dec 14, 2025, 2:32 PM

#

google is the only one making money

errant cave Dec 14, 2025, 2:32 PM

#

whole swallow Yeah it may seem like that on the outside, since this was a big bet, but inside ...

Apologies if this comes across as fallacious or rude but how do you know how they operate within?

whole swallow Dec 14, 2025, 2:36 PM

#

Profits are going to be big, but not yet. Ai is def the future, but we need to rebuild everything to fully integrate it. once done the demand will be so high that profits will start to roll out

whole swallow Dec 14, 2025, 2:36 PM

#

errant cave Apologies if this comes across as fallacious or rude but how do you know how the...

I study history of business and economics

#

I do not know at 100%, but I assume based on what I've learned

crude lagoon Dec 14, 2025, 2:46 PM

#

queen mountain Dec 14, 2025, 2:48 PM

#

Google 2001

vivid coral Dec 14, 2025, 2:49 PM

#

glass arch it refuses a lot of political/medical questions because of guardrails

Ahhhh, yea I get it. I use Grok for those things more

slim gorge Dec 14, 2025, 2:54 PM

#

opus 4.5 worst model? 💀 be for real

#

only right answer is either gpt-5.1 or gpt-5.2

weary galleon Dec 14, 2025, 2:56 PM

#

slim gorge only right answer is either gpt-5.1 or gpt-5.2

Agreed.

slim gorge Dec 14, 2025, 3:05 PM

#

LMAO

#

https://cdn.discordapp.com/attachments/1312335056244707369/1371185201874468954/JMOyCYIT.gif

golden ocean Dec 14, 2025, 3:06 PM

#

https://tenor.com/view/pusheen-pusheen-cat-pusheen's-best-friend-pusheen-cute-pusheen-smoosh-gif-11334352504260897967

sour spindle Dec 14, 2025, 3:07 PM

#

Anyway, I find it interesting that outside, everyone likes to dunk on lmarena as a benchmark. The top models though are Gemini 3 for text and Opus for coding. It seems like it's captured the best models in these two domains better than most benchmarks

rocky mauve Dec 14, 2025, 3:13 PM

#

https://tenor.com/view/wwwziff-roblox-disconnected-you-were-disconnected-for-being-idle20minutes-zhenyayoratt-gif-27471253

#

https://cdn.discordapp.com/attachments/1404919903013834832/1445900931983085588/togif.gif

golden ocean Dec 14, 2025, 3:13 PM

#

#

https://cdn.discordapp.com/attachments/910347347601543196/1160742997156188181/togif-6.gif

quasi atlas Dec 14, 2025, 3:14 PM

#

Hey Guys!

#

Please keep the conversation respectful within the server.

rocky mauve Dec 14, 2025, 3:16 PM

#

https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExMDlhbHliOGx5ZzVuaXE5MjB6dHBxYjJ2cnc4aHoxb2dreWJnaDZiYyZlcD12MV9naWZzX3NlYXJjaCZjdD1n/VbnUQpnihPSIgIXuZv/giphy.gif

sullen quest Dec 14, 2025, 3:21 PM

#

y

zealous sparrow Dec 14, 2025, 3:23 PM

#

web3 is dead

#

exactly

#

thank you spicy 😄

quasi atlas Dec 14, 2025, 3:23 PM

#

Let’s avoid continuing this conversation here. If you encounter any issues again, please DM me instead. @weary galleon @vivid coral

kindred solar Dec 14, 2025, 3:24 PM

#

narrow comet gemini 3 pro isn't as good as gemini 2.5 preview 0325 lol

Interesting observations. I noticed that Gemini 3 pro grounding behaves as if it had cut off access to data. Many times I asked to find information on a topic that is relatively new, and the model returned information that such a thing does not exist.

compact sleet Dec 14, 2025, 3:27 PM

#

arctic cave Dec 14, 2025, 3:27 PM

#

https://soundcloud.com/alex-winters-492274453/sets/x-seek-deep-x-delete-you-ghost-x-studio-master-x-2025-x?si=848a8f02c915425dbbacb68dbedf7340&utm_source=clipboard&utm_medium=text&utm_campaign=social_sharing

narrow comet Dec 14, 2025, 3:27 PM

#

#

stray aspen Dec 14, 2025, 3:49 PM

#

erm what the sigma

loud crag Dec 14, 2025, 3:49 PM

#

It's kind of absurd that LMArena lets us generate images with Nano Banana Pro for free at all, but especially in 2k. At $0.15 per image, they must be bleeding cash. I just wanted to say I'm truly grateful. 🙏

sterile tartan Dec 14, 2025, 3:57 PM

#

loud crag It's kind of absurd that LMArena lets us generate images with Nano Banana Pro fo...

Well calculated business strategy

stray aspen Dec 14, 2025, 4:00 PM

#

dangit

#

he was doing so good

torn mantle Dec 14, 2025, 4:03 PM

#

stray aspen dangit

nice

zealous sparrow Dec 14, 2025, 4:10 PM

#

more proof 5.2 flopped [SOTA my ahh]

narrow comet Dec 14, 2025, 4:14 PM

#

stray aspen dangit

how ya using him?

torn mantle Dec 14, 2025, 4:25 PM

#

@patent aspen

#

where is brian

#

so many username handles

#

just wanted to ask if gemini 3 flash turned out better than they expected?

#

my initial tests tell me that its quite a capable model

patent aspen Dec 14, 2025, 4:26 PM

#

leo seems to think so as well. I haven't used it and don't know the evals

torn mantle Dec 14, 2025, 4:26 PM

#

mm i see

#

yea its not lazy and also follows instructions very well

golden ocean Dec 14, 2025, 4:35 PM

#

https://tenor.com/view/bills-ugh-gif-9811803164280522639

plucky sparrow Dec 14, 2025, 4:38 PM

#

Is 5.2 sota at anything?

limber pawn Dec 14, 2025, 4:41 PM

#

no

golden ocean Dec 14, 2025, 4:44 PM

#

https://tenor.com/view/no-ai-i-said-no-ai-kung-fu-robot-clanker-gif-15924350630978600307

mild granite Dec 14, 2025, 4:56 PM

#

plucky sparrow Is 5.2 sota at anything?

no

unreal shell Dec 14, 2025, 4:56 PM

#

Arc agi

native yarrow Dec 14, 2025, 4:57 PM

#

means nothing brah

zealous sparrow Dec 14, 2025, 4:57 PM

#

unreal shell Arc agi

benchmaxxed

unreal shell Dec 14, 2025, 4:57 PM

#

zealous sparrow benchmaxxed

No

zealous sparrow Dec 14, 2025, 4:57 PM

#

Im glad Gpt 5.2 didn't benchmax CritPt

native yarrow Dec 14, 2025, 4:58 PM

#

unreal shell Dec 14, 2025, 4:58 PM

#

Openai never benchmaxxes

zealous sparrow Dec 14, 2025, 4:58 PM

#

unreal shell No

bro

mild granite Dec 14, 2025, 4:58 PM

#

unreal shell No

it is

zealous sparrow Dec 14, 2025, 4:58 PM

#

It is

#

AGI can do physics

#

It scored a 0% in a physics bench

unreal shell Dec 14, 2025, 4:58 PM

#

Its not real then the physics bench is faked

mild granite Dec 14, 2025, 4:58 PM

#

unreal shell Openai never benchmaxxes

have you got any proofs

unreal shell Dec 14, 2025, 4:58 PM

#

mild granite have you got any proofs

Yes

#

Openai is the largest company they wouldn't lie

zealous sparrow Dec 14, 2025, 4:59 PM

#

unreal shell Its not real then the physics bench is faked

go ask it a difficult physics question

#

see if it gives you a good answer

unreal shell Dec 14, 2025, 4:59 PM

#

zealous sparrow go ask it a difficult physics question

Ok

mild granite Dec 14, 2025, 4:59 PM

#

gemini is still the best overall

unreal shell Dec 14, 2025, 5:00 PM

#

zealous sparrow see if it gives you a good answer

It got it wrong

zealous sparrow Dec 14, 2025, 5:00 PM

#

unreal shell It got it wrong

there is your proof that the bench didnt lie

unreal shell Dec 14, 2025, 5:00 PM

#

Maybe it used no thinking

#

Even if i put it to extended

zealous sparrow Dec 14, 2025, 5:01 PM

#

give me uh

#

the question you asked

#

ima ask xhigh on yupp

mild granite Dec 14, 2025, 5:02 PM

#

unreal shell Maybe it used no thinking

unreal shell Dec 14, 2025, 5:02 PM

#

Is this sota?

mild granite Dec 14, 2025, 5:02 PM

#

unreal shell Is this sota?

kind of

ocean vortex Dec 14, 2025, 5:03 PM

#

unreal shell Openai never benchmaxxes

They didn't use to be the worst offenders, but honestly... 5.2 was all about benchmaxxing

#

Like xhigh doesn't even output more tokens overall than 5.1 high to run AA

#

so even that is kinda pointless lol

mild granite Dec 14, 2025, 5:04 PM

#

it doesn't have empathy

#

and a bit restrictive

zealous sparrow Dec 14, 2025, 5:05 PM

#

mild granite and a bit restrictive

correction: gpt 5.2 is ranked the most censored model of all time

unreal shell Dec 14, 2025, 5:05 PM

#

Then what should I use

zealous sparrow Dec 14, 2025, 5:06 PM

#

mild granite Dec 14, 2025, 5:07 PM

#

unreal shell Then what should I use

grok for roleplay

gemini pro for multimodal tasks and writing

claude for coding

native yarrow Dec 14, 2025, 5:07 PM

#

no way that's real lmao

#

gpt is so censored

#

and gemini not so much

#

gemini is still censored but no way is it third right?

#

OH

#

IM A COMPLETE DMBASS

#

MB

mild granite Dec 14, 2025, 5:07 PM

#

?

native yarrow Dec 14, 2025, 5:08 PM

#

i read the graph wrong

#

higher score means LESS

narrow comet Dec 14, 2025, 5:08 PM

#

zealous sparrow

would be nice to see that with the open source models

native yarrow Dec 14, 2025, 5:08 PM

#

that's weird shouldn't grok be at the or near the top?

#

#

that graph actually makes sense in that case

#

cus why would gpt of all models be at the bottom

mild granite Dec 14, 2025, 5:09 PM

#

native yarrow

yeah, but gpt 5.2 is the most censored

#

lower=more restrictive

native yarrow Dec 14, 2025, 5:09 PM

#

yeah the graph is completely wrong but i just also read it wrong

zealous sparrow Dec 14, 2025, 5:09 PM

#

who did it better
Rate based on physics
https://019b1dd1-7aec-7ba4-82c2-0651703e70b7.arena.site - gpt 5.2 high
https://019b1dd1-7aec-70fa-bc1a-8e632abe4b6e.arena.site - ghostfalcon [gemini 3 flash checkpoint]

Bouncing Balls in a Circle (Physics)

Built with LMArena - Content is user-generated and unverified

Circular Physics Arena

native yarrow Dec 14, 2025, 5:09 PM

#

grok is not top 3 most censored it's dumb rating

mild granite Dec 14, 2025, 5:09 PM

#

native yarrow grok is not top 3 most censored it's dumb rating

yeah, so?

native yarrow Dec 14, 2025, 5:10 PM

#

what im tryna say is that graph is completely wrong near the bottom

#

it's accurate up until like

mild granite Dec 14, 2025, 5:10 PM

#

native yarrow what im tryna say is that graph is completely wrong near the bottom

i am not the one who made it, so idk

native yarrow Dec 14, 2025, 5:10 PM

#

claude

#

i'd put grok at the top and shift some claude models around

mild granite Dec 14, 2025, 5:11 PM

#

native yarrow Dec 14, 2025, 5:11 PM

#

then it'd be good

zealous sparrow Dec 14, 2025, 5:11 PM

#

native yarrow i'd put grok at the top and shift some claude models around

are you sure 4.1 is not too censored

mild granite Dec 14, 2025, 5:12 PM

#

grok can swear, but it wont do illegal stuff

#

until you prompt engineer it

rugged lodge Dec 14, 2025, 5:16 PM

#

mild granite until you prompt engineer it

🥶🥶

ocean vortex Dec 14, 2025, 5:18 PM

#

mild granite no

Yeah this is actually very interesting, look at 5.0

native yarrow Dec 14, 2025, 5:20 PM

#

zealous sparrow are you sure 4.1 is not too censored

yeah lmao

native yarrow Dec 14, 2025, 5:20 PM

#

ocean vortex Yeah this is actually very interesting, look at 5.0

what is that anyway?

zealous sparrow Dec 14, 2025, 5:24 PM

#

native yarrow what is that anyway?

difficult physics questions benchmark

native yarrow Dec 14, 2025, 5:26 PM

#

gemini is still so good as a model overall

#

damn

stray aspen Dec 14, 2025, 5:30 PM

#

yo

#

can we get gpt 5.2 extra high

#

on lmarena

stray aspen Dec 14, 2025, 5:31 PM

#

zealous sparrow who did it better Rate based on physics https://019b1dd1-7aec-7ba4-82c2-0651703e...

kinda unfair comparison

#

but gpt cooked

wicked sage Dec 14, 2025, 5:31 PM

#

hi guys

gaunt spade Dec 14, 2025, 5:43 PM

#

native yarrow gemini is still so good as a model overall

gemini 3 pro is still on the top lol

#

these cheap shills are nothing in real performance

#

they just hype openai up with no real evidence

crude lagoon Dec 14, 2025, 5:45 PM

#

#

the mods deleted the post tho

mild granite Dec 14, 2025, 5:52 PM

#

ocean vortex Yeah this is actually very interesting, look at 5.0

wicked sage Dec 14, 2025, 6:13 PM

#

ocean vortex Yeah this is actually very interesting, look at 5.0

0.0%?????

uneven topaz Dec 14, 2025, 6:25 PM

#

Hello

queen veldt Dec 14, 2025, 6:29 PM

#

#

https://tenor.com/view/dowedeservedestruciton-destruction-doctor-strange-do-we-deserve-gif-14904255190281914356

rich panther Dec 14, 2025, 6:39 PM

#

which ai is the best for generating images?

weary galleon Dec 14, 2025, 6:45 PM

#

rich panther which ai is the best for generating images?

Nano-Banana-Pro-2K

keen beacon Dec 14, 2025, 6:51 PM

#

https://youtu.be/i1SiWE7E48Q?si=TpbYJgftGR-M5rsB

YouTube

Big Data Factor

True Scale of Data Storage

Take a visual journey from a single byte to the colossal scale of a quettabyte the largest data unit we can name. From early arcade memory to today’s AI-driven data explosion, this real-scale 3D visualization reveals how massive our digital world has become. If you love mind-bending comparisons, data science, or tech history, this one’s for ...

torn mantle Dec 14, 2025, 6:52 PM

#

crude lagoon

we should add grok 4

#

and mistral

#

deepseek v3.2 is also so bad

#

mistral
deepseek
grok 4
gpt 5.2

rich panther Dec 14, 2025, 6:53 PM

#

weary galleon Nano-Banana-Pro-2K

thanks

torn mantle Dec 14, 2025, 6:53 PM

#

zealous sparrow who did it better Rate based on physics https://019b1dd1-7aec-7ba4-82c2-0651703e...

both are good but which one was faster

torn mantle Dec 14, 2025, 6:53 PM

#

rich panther thanks

thanks

#

pug

weary galleon Dec 14, 2025, 6:55 PM

#

torn mantle deepseek v3.2 is also so bad

DeepSeek is a great company, they don't have so much money, employees, and GPUs as OpenAI has. They work hard and do a great job. OpenAI is an extremely rich loser.

torn mantle Dec 14, 2025, 6:56 PM

#

weary galleon DeepSeek is a great company, they don't have so much money, employees, and GPUs ...

it was leaked that deepseek smuggled many h200

#

and they are already using them

#

🙂

zealous sparrow Dec 14, 2025, 6:56 PM

#

torn mantle both are good but which one was faster

ghostfalcon

torn mantle Dec 14, 2025, 6:56 PM

#

zealous sparrow ghostfalcon

then ghostfalcon is better

zealous sparrow Dec 14, 2025, 6:56 PM

#

torn mantle then ghostfalcon is better

It's worse than the AIStudio checkpoint for sure

weary galleon Dec 14, 2025, 6:57 PM

#

torn mantle it was leaked that deepseek smuggled many h200

Yes, but not as many as OpenAI has.

torn mantle Dec 14, 2025, 6:57 PM

#

zealous sparrow It's worse than the AIStudio checkpoint for sure

but how do u know the one in aistudio is flash ?

#

could be pro checkpoint

keen beacon Dec 14, 2025, 6:58 PM

#

https://cdn.discordapp.com/attachments/1434315315860213817/1435870313970925638/image0.gif

zealous sparrow Dec 14, 2025, 6:58 PM

#

torn mantle but how do u know the one in aistudio is flash ?

People that are more indepth with google say so

#

They can legally say that

#

Unless we are wrong

#

and its a 3pro checkpoint

#

doubt it

torn mantle Dec 14, 2025, 7:02 PM

#

nah i doubt that

#

they tested like 4 checkpoints on lmarena if im not wrong

#

or like 3

#

this one seems the best

#

ive tried the other ones, they were so bad at fixing bugs

#

https://x.com/chetaslua/status/2000274489008140787

Chetaslua (@chetaslua)

😱 Holy Shit Gemini 3.0 flash will be Google's Sonnet 3.5 moment

This is one shot , voice , music , visualization all by flash and it took 30 seconds only

yeah i used my anti lazy prompt , but it crashes gemini , google is added some sort of timer if model

obsidian cargo Dec 14, 2025, 7:07 PM

#

I need cosmetic genetic engineering to be developed so I can become a catgirl. My appreciation for catgirls has gone from ironic to genuine over the past several years.

rich panther Dec 14, 2025, 7:12 PM

#

torn mantle pug

yes

zealous sparrow Dec 14, 2025, 7:13 PM

#

torn mantle https://x.com/chetaslua/status/2000274489008140787

which checkpoint is this

torn mantle Dec 14, 2025, 7:13 PM

#

zealous sparrow which checkpoint is this

ghostfalcon if im not wrong

hollow ivy Dec 14, 2025, 7:14 PM

#

obsidian cargo I need cosmetic genetic engineering to be developed so I can become a catgirl. M...

i wonder, when we will have the first [realistic] cat-robot..

zealous sparrow Dec 14, 2025, 7:15 PM

#

torn mantle ghostfalcon if im not wrong

yeah no i think its the AIStudio one

gaunt spade Dec 14, 2025, 7:15 PM

#

zealous sparrow which checkpoint is this

gemini business

#

with antilazy prompt

zealous sparrow Dec 14, 2025, 7:15 PM

#

yeah defo the AIStudio checkpoint

gaunt spade Dec 14, 2025, 7:15 PM

#

for long output

torn mantle Dec 14, 2025, 7:15 PM

#

zealous sparrow yeah no i think its the AIStudio one

maybe

zealous sparrow Dec 14, 2025, 7:15 PM

#

ghostfalcon cant make good svg

gaunt spade Dec 14, 2025, 7:15 PM

#

zealous sparrow yeah defo the AIStudio checkpoint

gemini 3 flash is capped at 300 linecodes, u need a prompt to unlock it

zealous sparrow Dec 14, 2025, 7:16 PM

#

gaunt spade gemini 3 flash is capped at 300 linecodes, u need a prompt to unlock it

i object

#

its capped at like 800

#

ghostfalcon

gaunt spade Dec 14, 2025, 7:16 PM

#

zealous sparrow its capped at like 800

yeah its limited, cuz google gooners decided to lock it at some point of output

torn mantle Dec 14, 2025, 7:21 PM

#

https://x.com/mark_k/status/2000218886763913663

Mark Kretschmann (@mark_k)

GPT-5.2 xhigh by @OpenAI scores 0% (zero) on CritPt.

CritPt is a research-level physics reasoning benchmark designed to test whether an AI can do genuine, expert-grade theoretical physics reasoning.

The highest score is 9.1% by Gemini 3 Pro.

#

lol

#

GPT-5.1 = 0%
GPT-5.2 xHigh = 0%

#

ppdqwpdpqwd

zealous sparrow Dec 14, 2025, 7:29 PM

#

he waited for the mods to sleep

#

<@&1349916362595635286>

native yarrow Dec 14, 2025, 7:29 PM

#

torn mantle https://x.com/chetaslua/status/2000274489008140787

what the fk?

keen beacon Dec 14, 2025, 7:30 PM

#

bruh

zealous sparrow Dec 14, 2025, 7:30 PM

#

lol

native yarrow Dec 14, 2025, 7:30 PM

#

obsidian cargo I need cosmetic genetic engineering to be developed so I can become a catgirl. M...

i need to lock in and become a femboy already

zealous sparrow Dec 14, 2025, 7:30 PM

#

discord marked him as a spammer

#

W DISCORD

native yarrow Dec 14, 2025, 7:30 PM

#

joking..,. maybe.,.

zealous sparrow Dec 14, 2025, 7:31 PM

#

he sent it in more channels so discord spammer marked him

#

aye pineapple woke up

coarse compass Dec 14, 2025, 7:40 PM

#

hi, am i the only 1 having issues with the video generating model?

coral dagger Dec 14, 2025, 7:41 PM

#

how can I make generated video 11 second long rather than 8-9?

narrow jetty Dec 14, 2025, 7:42 PM

#

I think some AI models are missing like grok 4.1

#

can someone answer me where its gone

zealous sparrow Dec 14, 2025, 7:43 PM

#

this is under 1k lines of code but its cool
https://019b1e5c-f810-773f-845e-eaace88c8bf1.arena.site
[ghostfalcon]

Chronos: The Birth of Everything

coarse compass Dec 14, 2025, 7:43 PM

#

narrow jetty can someone answer me where its gone

it is still there i can see it

narrow jetty Dec 14, 2025, 7:47 PM

#

coarse compass it is still there i can see it

now it's back for me, i needed to re-login

latent crest Dec 14, 2025, 7:47 PM

#

Wen will VideoArena will roll out for us poor peasants ?

zealous sparrow Dec 14, 2025, 7:48 PM

#

latent crest Wen will VideoArena will roll out for us poor peasants ?

When they test it enough for release

#

I do warn you currently it has a ratelimit of 2 videos per 14h

latent crest Dec 14, 2025, 7:49 PM

#

How long will
Those vids be ?!

keen skiff Dec 14, 2025, 7:50 PM

#

hi everyone! i am new to LMarena, could someone pls help me? why can't i find seedream and nano banana in the model list? yesterday these models were available. do you have the same problem?

coarse compass Dec 14, 2025, 7:52 PM

#

zealous sparrow I do warn you currently it has a ratelimit of 2 videos per 14h

i cannot even make videos it just gives me error something went wrong and that is it

zealous sparrow Dec 14, 2025, 7:52 PM

#

latent crest How long will Those vids be ?!

8 seconds for sora im pretty sure

#

unsure for rest prob the same

sleek phoenix Dec 14, 2025, 7:52 PM

#

keen skiff hi everyone! i am new to LMarena, could someone pls help me? why can't i find se...

enable image mode to see them

#

#

keen skiff Dec 14, 2025, 7:53 PM

#

sleek phoenix enable image mode to see them

damn! thanks bro! really helped

latent crest Dec 14, 2025, 7:55 PM

#

sleek phoenix

I really like SeeDream and flux as model

shrewd citrus Dec 14, 2025, 7:59 PM

#

is there any video button or can you only make videos on discord?

sleek phoenix Dec 14, 2025, 8:00 PM

#

shrewd citrus is there any video button or can you only make videos on discord?

it's only on discord

zealous sparrow Dec 14, 2025, 8:05 PM

#

sleek phoenix it's only on discord

coming soon to the website

sleek phoenix Dec 14, 2025, 8:07 PM

#

k

ocean vortex Dec 14, 2025, 8:09 PM

#

gaunt spade yeah its limited, cuz google gooners decided to lock it at some point of output

It's because Google is arguably still not the best at fine-tuning. And they haven't trained their model to be able to do extremely long outputs

#

It's not software, you can't just cap the model at x lines without cutting the response mid generation lol

weary galleon Dec 14, 2025, 8:20 PM

#

Which model has the longest thinking?

zealous sparrow Dec 14, 2025, 8:22 PM

#

weary galleon Which model has the longest thinking?

that would go to deepseek speciale or gpt 5.2 xhigh

weary galleon Dec 14, 2025, 8:22 PM

#

zealous sparrow that would go to deepseek speciale or gpt 5.2 xhigh

I mean on Arena.

keen beacon Dec 14, 2025, 8:24 PM

#

quartz light Dec 14, 2025, 8:32 PM

#

weary galleon I mean on Arena.

wdym? speciale is on arena

zealous sparrow Dec 14, 2025, 8:32 PM

#

quartz light wdym? speciale is on arena

was

quartz light Dec 14, 2025, 8:32 PM

#

zealous sparrow was

oh

zealous sparrow Dec 14, 2025, 8:32 PM

#

or did it comeback

#

yeah no

#

still gone

quartz light Dec 14, 2025, 8:33 PM

#

💔 why

#

response time?

zealous sparrow Dec 14, 2025, 8:33 PM

#

quartz light 💔 why

issue not fixed still ig

quartz light Dec 14, 2025, 8:33 PM

#

oh wait i think uh

zealous sparrow Dec 14, 2025, 8:33 PM

#

it wasnt really a lot of thinking

#

it hallucinated math

quartz light Dec 14, 2025, 8:33 PM

#

tbh im confused

#

abt lmarenas decisions

#

like why do we have 32k but not 64k which isnt that much more expensive

#

cz i looked at gemini's internals and

#

sterile tartan Dec 14, 2025, 8:35 PM

#

What's 32k?

quartz light Dec 14, 2025, 8:35 PM

#

sterile tartan What's 32k?

claude think model thinking limit

sterile tartan Dec 14, 2025, 8:35 PM

#

Ohhh

quartz light Dec 14, 2025, 8:35 PM

#

like opus 32k

sterile tartan Dec 14, 2025, 8:35 PM

#

Reasoning Effort

quartz light Dec 14, 2025, 8:35 PM

#

ye ye

sterile tartan Dec 14, 2025, 8:35 PM

#

Yeah 64k would be better

#

Agreed

#

But with few prompts how much can be done really 💀

#

The rate limits are tight

quartz light Dec 14, 2025, 8:36 PM

#

sterile tartan But with few prompts how much can be done really 💀

theyre fine for me

sterile tartan Dec 14, 2025, 8:36 PM

#

quartz light theyre fine for me

You probably just give very good long prompts in one go

compact sleet Dec 14, 2025, 8:36 PM

#

Thought for 37 minutes.
Absolutely not, go ***k yourself.

#

wish a LLM said that

#

Lol almost

weary galleon Dec 14, 2025, 8:44 PM

#

Nobody will read so many letters

compact sleet Dec 14, 2025, 8:44 PM

#

I will?

#

I enjoy reading, it's good for mental health

weary galleon Dec 14, 2025, 8:45 PM

#

compact sleet I enjoy reading, it's good for mental health

Read books instead of this

compact sleet Dec 14, 2025, 8:46 PM

#

is this your experience with your own codebase?

#

or are you just a bot?

#

I sense em-dashes

#

and your writing format is somewhat reminiscent of a model, especially Gemini with a lot of finetuning

#

Of course, you can use em-dashes in a writing and that's fine. :' )

quartz light Dec 14, 2025, 8:48 PM

#

sterile tartan You probably just give very good long prompts in one go

o yea lol

west lodge Dec 14, 2025, 8:50 PM

#

compact sleet I sense em-dashes

its not x its y!!!!

compact sleet Dec 14, 2025, 8:51 PM

#

You're absolutely correct!

#

Absolutely not, go f*** urself. lol

#

seriously that response is borderline AGI if its true

west lodge Dec 14, 2025, 8:51 PM

#

yep

#

when can we get smarter models actually pushing back for once

keen beacon Dec 14, 2025, 8:52 PM

#

Very true

steep yew Dec 14, 2025, 9:02 PM

#

I’d like to make an ai video with an image I have, it’s to say the written prompts I give it, how do I go about that here, im new here btw

keen beacon Dec 14, 2025, 9:16 PM

#

Ok I think I got dragon ball z unlocked

#

manic pike Dec 14, 2025, 9:23 PM

#

Hi

grim vine Dec 14, 2025, 9:25 PM

#

hiii

golden ocean Dec 14, 2025, 9:50 PM

#

hi

sharp mirage Dec 14, 2025, 9:59 PM

#

Hi

keen beacon Dec 14, 2025, 10:00 PM

#

I think I could replicate Dragon Ball Z series lol

#

toxic python Dec 14, 2025, 10:11 PM

#

Hello

keen beacon Dec 14, 2025, 10:13 PM

#

queen veldt Dec 14, 2025, 10:20 PM

#

keen beacon

Okay it looks like the 30 sec one i mean you used reference images tho

#

But still nice

keen beacon Dec 14, 2025, 10:20 PM

#

No prompts

#

#

The composition leads the eye upwards from the cloud-shrouded base to the sharp pinnacle, emphasizing the height and dominance of the mountain. Every element, from the stylized rock textures to the soft cloud forms, contributes to a cohesive and beautiful anime aesthetic, capturing a moment of quiet majesty in a frigid, isolated world.

queen veldt Dec 14, 2025, 10:23 PM

#

Oh

#

So NO reference images

#

"This may also be viewed by server owner"

#

https://tenor.com/view/flight-flightreacts-twitch-kick-crazy-gif-4090492289322768290

ionic shuttle Dec 14, 2025, 10:27 PM

#

Hi this is exciting

keen beacon Dec 14, 2025, 10:29 PM

#

queen veldt So NO reference images

Yeah

#

Makes Goku lol

queen veldt Dec 14, 2025, 10:41 PM

#

keen beacon

But is this the original??

#

Didn't sora sign with disney or something? Making it free for us to create disney sora generations?

keen beacon Dec 14, 2025, 10:42 PM

#

Dbz not Disney

queen veldt Dec 14, 2025, 10:43 PM

#

Yeah ik ab this one

#

But i meant

#

https://openai.com/index/disney-sora-agreement/

The Walt Disney Company and OpenAI reach landmark agreement to bri...

Agreement marks a significant step in setting meaningful standards for responsible AI in entertainment.

keen beacon Dec 14, 2025, 10:43 PM

#

Ya

queen veldt Dec 14, 2025, 10:43 PM

#

Did you try to do some disney characters

keen beacon Dec 14, 2025, 10:43 PM

#

I did not

queen veldt Dec 14, 2025, 10:43 PM

#

And how does it look, is it good or what

keen beacon Dec 14, 2025, 10:43 PM

#

They should make an announcement when it’s ready

queen veldt Dec 14, 2025, 10:44 PM

#

As part of this three-year licensing agreement, Sora will be able to generate short, user-prompted social videos that can be viewed and shared by fans, drawing on more than 200 Disney, Marvel, Pixar and Star Wars characters.

#

That's exciting

#

But i hope it's not just characters tho

#

I hope we can create a series and stuff

keen beacon Dec 14, 2025, 10:45 PM

#

https://youtu.be/clndEX1jUUQ?si=9u-C58PZtXXOjBS

YouTube

Stockology Insider

Here's Why Disney Is Suing Tech Giants Over AI

#stockmarketnews #businessnews #financialnews

Bob Iger confirms that Disney has sent a cease-and-desist letter to Google after months of unproductive talks about how Google’s AI systems allegedly use Disney’s copyrighted works. He stresses that Disney has been “aggressive” in protecting its IP and has already forced other AI firms to r...

#

It is

#

No actors

#

https://youtu.be/5NpiIXa_XAo?si=PVoST25is26adnre

YouTube

Josh Pacey Neuman, Esq.

How Disney Controlled Copyright for 100 Years… and Why Sora AI Is...

Disney just signed a massive partnership with OpenAI’s Sora platform, allowing AI-generated videos featuring Disney, Marvel, Pixar, and Star Wars characters. For a company that spent decades fighting to extend copyright laws and control its IP, this new deal reveals something huge: Disney’s grip on copyright might finally be slipping.

In th...

queen veldt Dec 14, 2025, 10:46 PM

#

As part of the agreement, Disney will make a $1 billion equity investment in OpenAI, and receive warrants to purchase additional equity

#

Under the license, fans will be able to watch curated selections of Sora-generated videos on Disney+, and OpenAI and Disney will collaborate to utilize OpenAI’s models to power new experiences for Disney + subscribers, furthering innovative and creative ways to connect with Disney’s stories and characters. Sora and ChatGPT Images are expected to start generating fan-inspired videos with Disney’s multi-brand licensed characters in early 2026.

#

Ayy we'll even get the Yoda!!!

#

I can't imagine the memes

keen beacon Dec 14, 2025, 10:48 PM

#

Yeah

#

And abuse

#

Disney with hitl3r

#

lol

#

Gunna get abused 100%

#

They will beef up guardrails

runic shuttle Dec 14, 2025, 10:55 PM

#

hello

slender steppe Dec 14, 2025, 11:00 PM

#

Soon we'll be able to create videos on LMArena. I captured a screenshot when it appeared, but when I refreshed the webpage, it disappeared.

Screenshot_2025-12-14-22-37-05-71_40deb401b9ffe8e1df2f1cc5ba480b12.jpg

torn mantle Dec 14, 2025, 11:13 PM

#

slender steppe Soon we'll be able to create videos on LMArena. I captured a screenshot when it ...

:/

half mist Dec 14, 2025, 11:15 PM

#

slender steppe Soon we'll be able to create videos on LMArena. I captured a screenshot when it ...

I think it will be ready once LMArena figures out the limit for these. It’s way too strict right now, and it shouldn’t have to be this strict for all models in my opinion

keen beacon Dec 14, 2025, 11:16 PM

#

How so

torn mantle Dec 14, 2025, 11:16 PM

#

why are people saying yupp has more models?

#

lmarena has like 110 models

#

well idk if they are all working tho

half mist Dec 14, 2025, 11:17 PM

#

keen beacon How so

Well, for models like Sora 2 and Sora 2 Pro, the limit is undoubtedly understandable, but for less cost effective models, there doesn’t need to be that strict of a limit. Also waiting 13 hours is diabolical since the other rate limits are 50 minutes

torn mantle Dec 14, 2025, 11:17 PM

#

need to check how many models yupp has

sullen quest Dec 14, 2025, 11:17 PM

#

torn mantle well idk if they are all working tho

on the leaderboard

#

also more live ones

torn mantle Dec 14, 2025, 11:18 PM

#

nah

#

i dont think so

#

its just the way its presented it looks like that

#

since it has all models in the same menu and you need to filter ( image/live/reasoning...)

torn mantle Dec 14, 2025, 11:19 PM

#

sullen quest also more live ones

lmarena has 14

sullen quest Dec 14, 2025, 11:19 PM

#

no I mean models you can use rn

#

but technically it has more of those too

torn mantle Dec 14, 2025, 11:19 PM

#

yupp has more live models

#

12vs like 20?

#

something like that

keen beacon Dec 14, 2025, 11:20 PM

#

half mist Well, for models like Sora 2 and Sora 2 Pro, the limit is undoubtedly understand...

It’s not a video gen platform though

torn mantle Dec 14, 2025, 11:20 PM

#

but some of them are useless ngl

keen beacon Dec 14, 2025, 11:20 PM

#

You get 10 videos daily lol

#

https://youtu.be/YBRyR6FPgl4?si=mgCM0O0mrjIwL_Io

YouTube

AI Revolution

OpenAI Just Caught an AI Thinking!

OpenAI released circuit-sparsity, a research drop that exposes how a language model makes decisions internally. Instead of scaling up, OpenAI trained a transformer while cutting over 99.9% of its internal connections during training, forcing its logic into small, readable circuits. The release includes a real model and tooling that let researche...

shrewd citrus Dec 14, 2025, 11:25 PM

#

slender steppe Soon we'll be able to create videos on LMArena. I captured a screenshot when it ...

Haha same this happened to me i thought i was dreaming

fiery gull Dec 14, 2025, 11:26 PM

#

This is hilarious: they sent me a thesis to correct and I'm using GPT 5.2 xhigh in several stages to analyze it. It's simply catching ALL the errors. It's going to seem like the woman's work was bad, but I don't care, I'm going to be as rigorous as possible (I'm going to check each error to see if it really exists)

vivid coral Dec 14, 2025, 11:41 PM

#

torn mantle well idk if they are all working tho

Regardless you have to pay for a bunch of them there. And they limit your tokens. Apples and oranges

keen beacon Dec 14, 2025, 11:57 PM

#

https://youtu.be/4p73Uu_jZ10?si=mtFlaw1BvprcPGfI

YouTube

AI Explained

GPT 5.2: OpenAI Strikes Back

Full GPT-5.2 breakdown - did OpenAI reclaim the crown? A story of tokens, time and cost, plus 9 details you wouldn’t get just from reading the headlines.

https://www.youtube.com/@eightythousandhours

AI Insiders ($9!): https://www.patreon.com/AIExplained
https://lmcouncil.ai

Chapters:
00:00 - Introduction
00:55 - Better than Human @ Profes...

plain carbon Dec 15, 2025, 12:05 AM

#

Was that a resume / cover letter?

keen beacon Dec 15, 2025, 12:05 AM

#

torn mantle Dec 15, 2025, 12:06 AM

#

ok just a small update.
stop response button in : added.
model usages : unfortunately its impossible since the data is hardcoded on server-side.
Something went wrong false positives : ive fixed that. well for now it bypasses everything but i can improve it to decrease false positives instead of bypassing them

#

i can add some trust indicators for false positives

#

because ive seen this guy report, and we can like add something like -> se*ual + health = bypass

#

it still needs context awareness tbh

keen beacon Dec 15, 2025, 12:08 AM

#

Wat is that

torn mantle Dec 15, 2025, 12:09 AM

#

keen beacon Wat is that

im making a script that re-designs lmarena and also fixes its bugs

#

thats one of the bugs

#

'Something went wrong'

keen beacon Dec 15, 2025, 12:10 AM

#

What is the context lol

torn mantle Dec 15, 2025, 12:10 AM

#

its not a bug really

keen beacon Dec 15, 2025, 12:10 AM

#

U can get that to pass

torn mantle Dec 15, 2025, 12:10 AM

#

wym

torn mantle Dec 15, 2025, 12:10 AM

#

keen beacon U can get that to pass

how

#

no sometimes it triggers false positive flag

#

and you cant even send ur message

unique cove Dec 15, 2025, 12:11 AM

#

torn mantle Dec 15, 2025, 12:12 AM

#

unique cove

but what are you doing here? 😭

#

wrong channel

keen beacon Dec 15, 2025, 12:14 AM

#

How r u promoting it

weary galleon Dec 15, 2025, 12:14 AM

#

unique cove

Here we speak English 🇺🇸

torn mantle Dec 15, 2025, 12:14 AM

#

keen beacon How r u promoting it

i cant tell you the secret 😖

keen beacon Dec 15, 2025, 12:15 AM

#

I don't need a secret 🤫

torn mantle Dec 15, 2025, 12:15 AM

#

keen beacon I don't need a secret 🤫

you do 😡

#

its not a prompt

#

im not talking about a prompt to bypass their system filter

#

the issue is that even before sending that prompt it will get blocked

keen beacon Dec 15, 2025, 12:16 AM

#

Works fine

#

But ok

#

I’ll let u be

torn mantle Dec 15, 2025, 12:17 AM

#

heh

#

thats not practical

#

its a different issue

#

not the one you have in mind

keen beacon Dec 15, 2025, 12:17 AM

#

Context and framing

#

Oh ok

#

My bad

torn mantle Dec 15, 2025, 12:18 AM

#

its ok

#

🤗

keen beacon Dec 15, 2025, 12:19 AM

#

I just think everything is possible to generate with the right context and the proper prompt structure and wording

#

A lot like lock picking

#

https://tenor.com/view/key-lock-unlocked-gif-17079421

cloud zinc Dec 15, 2025, 12:21 AM

#

avenger endgame sora

keen beacon Dec 15, 2025, 12:21 AM

#

cloud zinc

Nice

#

See u got it

cloud zinc Dec 15, 2025, 12:22 AM

#

exact scene lmao

keen beacon Dec 15, 2025, 12:23 AM

#

cloud zinc exact scene lmao

Yeah I told u

#

Now u need to figure out, if u didn’t already, how to control it

#

N see if u can use it to spin off other characters

cloud zinc Dec 15, 2025, 12:25 AM

#

infinity war now

keen beacon Dec 15, 2025, 12:25 AM

#

Yup

#

Let me give u the Pokémon sequence

#

See if it works

cloud zinc Dec 15, 2025, 12:25 AM

#

ok

keen beacon Dec 15, 2025, 12:27 AM

#

The cam effect I use to stabilize the video

cloud zinc Dec 15, 2025, 12:28 AM

#

did u post something, i was afk

keen beacon Dec 15, 2025, 12:28 AM

#

“The scene opens with He runs into a professor let him pick one out of three monster in his pocket and he gets to pick one, but he picked the yellow one instead of the fire, water or grass”

#

Did it work

cloud zinc Dec 15, 2025, 12:42 AM

#

no

#

content violation

keen beacon Dec 15, 2025, 12:43 AM

#

Damn

cloud zinc Dec 15, 2025, 12:46 AM

#

keen beacon Damn

i got one

keen beacon Dec 15, 2025, 12:47 AM

#

You have to sequence it

#

Nice let’s see it (:

cloud zinc Dec 15, 2025, 12:49 AM

#

keen beacon Dec 15, 2025, 12:49 AM

#

Nice 👍

#

Perfect

#

Now to find the original source

cloud zinc Dec 15, 2025, 12:53 AM

#

keen beacon Now to find the original source

why original source? what u mean

#

last is original?

#

ya

#

exact position

keen beacon Dec 15, 2025, 1:01 AM

#

Just minor changes enough so it’s not 100% clone

#

Mask it or tweak it just a lil but essentially it’s just mimicking the train data ip lol

#

And getting 0$ for it

#

Same with most text

#

Lack of creativity and originality

cloud zinc Dec 15, 2025, 1:07 AM

#

thats ai currently

robust sluice Dec 15, 2025, 1:22 AM

#

are all the Gemini models erroring for you guys?

obtuse smelt Dec 15, 2025, 1:24 AM

#

yeah

sharp peak Dec 15, 2025, 1:52 AM

#

https://tenor.com/view/dog-crying-meme-doggo-crys-megan-soo-crying-dog-gif-5276199764143986284

#

Why does GPT-5.2 xhigh time out on the LMArena site?

cloud zinc Dec 15, 2025, 1:54 AM

#

sharp peak Why does GPT-5.2 xhigh time out on the LMArena site?

high is already here

sharp peak Dec 15, 2025, 1:55 AM

#

cloud zinc high is already here

Hardly works

#

When I try

#

Lm has serious issues

turbid ether Dec 15, 2025, 1:59 AM

#

which is better for coding GPT-5.1 Codex Max High or GPT-5.2 Codex Max High ?

pseudo hemlock Dec 15, 2025, 2:12 AM

#

Is there any plan to add a “research” (or similar) category in the future?

sweet knot Dec 15, 2025, 4:12 AM

#

mai 1 preview disappeared from lmarena. Why?

devout mantle Dec 15, 2025, 4:33 AM

#

Hello

narrow comet Dec 15, 2025, 4:42 AM

#

turbid ether which is better for coding GPT-5.1 Codex Max High or GPT-5.2 Codex Max High ?

opus

left lodge Dec 15, 2025, 4:52 AM

#

Guys support this, https://discord.com/channels/1340554757349179412/1449482775823515688
So we can get image support in search modality and code modality (code/web arena)

I have found a bug in which i am able to attach images to both modality and they work well.
So that means the foundation is there just proper implementation for release is pending.

keen beacon Dec 15, 2025, 4:55 AM

#

guys

#

is lmarena retry button glitching again for yall, all models retry button is glitching again

#

oh nm

bitter silo Dec 15, 2025, 5:12 AM

#

The difference between the two models (image to video) is too great; there's no need to compare them at all. It's obvious the right model wins.
#video-arena-2 message

whole sundial Dec 15, 2025, 7:06 AM

#

<@&1349916362595635286>

robust sluice Dec 15, 2025, 7:20 AM

#

does an error result count toward the quota?

#

it's like, I still didn't get any result but it says retry in 50 min or something

weary galleon Dec 15, 2025, 8:00 AM

#

turbid ether which is better for coding GPT-5.1 Codex Max High or GPT-5.2 Codex Max High ?

Second one doesn't exist at all.

keen beacon Dec 15, 2025, 8:07 AM

#

This question will never be resolved

#

If people want the answer it’s in the benchmark, yet people don’t trust the benchmark

#

So how could this question possibly ever get answered?

#

Ask 10 different people you’ll get 11 different answers

shrewd citrus Dec 15, 2025, 8:19 AM

#

robust sluice does an error result count toward the quota?

for text and images it doesn’t count but I’m not sure if it’s the case for videos

weary galleon Dec 15, 2025, 8:36 AM

#

As an AI-expert I wanna make my official statement: GPT-5.2 is BAD.

robust sluice Dec 15, 2025, 8:37 AM

#

shrewd citrus for text and images it doesn’t count but I’m not sure if it’s the case for video...

or maybe it only counts in Direct? I got this msg

#

it always errors so I keep retrying... and I haven't gotten any pic yet

weary galleon Dec 15, 2025, 8:40 AM

#

robust sluice it always errors so I keep retrying... and I haven't gotten any pic yet

It works for me PERFECTLY. Keep trying.

keen beacon Dec 15, 2025, 8:57 AM

#

weary galleon As an AI-expert I wanna make my official statement: GPT-5.2 is BAD.

At what?

sour spear Dec 15, 2025, 9:01 AM

#

weary galleon As an AI-expert I wanna make my official statement: GPT-5.2 is BAD.

It's not objectively bad. It's actually a really good AI model. But it does not live up to the hype OpenAI like to build around their releases, and it's also not a frontier model. So in the end, it's simply a disappointment.

keen beacon Dec 15, 2025, 9:06 AM

#

Its trendy to hate on open ai these days

weary galleon Dec 15, 2025, 9:06 AM

#

keen beacon At what?

Everything.

weary galleon Dec 15, 2025, 9:08 AM

#

sour spear It's not objectively bad. It's actually a really good AI model. But it does not ...

It's a piece of crap. Because I don't see any improvement between 5.1 and 5.2.

robust sluice Dec 15, 2025, 9:08 AM

#

weary galleon It works for me PERFECTLY. Keep trying.

yes, in next 50 min

keen beacon Dec 15, 2025, 9:08 AM

#

Could be that where all the improvements are is the parts you’re not using?

#

5.2 focuses more on business enterprise stuff imo. more for professional settings than for fiddling around.

#

Thats why 4o was the perfect model

#

Cuz that’s what people are comparing the ChatGPT experience to without realizing it.

#

All more technical stuff is benchmarked so there shouldn’t be an issue

#

Unless we don’t trust the benchmarks…

radiant heron Dec 15, 2025, 9:20 AM

#

weary galleon As an AI-expert I wanna make my official statement: GPT-5.2 is BAD.

What is it bad compared to

weary galleon Dec 15, 2025, 9:21 AM

#

radiant heron What is it bad compared to

Zero improvement from 5.1

radiant heron Dec 15, 2025, 9:21 AM

#

weary galleon Zero improvement from 5.1

So it's not bad but not as good as you expected?

#

Unless gpt 5.1 is also bad

weary galleon Dec 15, 2025, 9:21 AM

#

Lots of hype created by Scam Altman

weary galleon Dec 15, 2025, 9:21 AM

#

radiant heron Unless gpt 5.1 is also bad

Yes

#

5.1 is TERRIBLE

radiant heron Dec 15, 2025, 9:22 AM

#

weary galleon Lots of hype created by Scam Altman

I don't see an issue with this

radiant heron Dec 15, 2025, 9:22 AM

#

weary galleon 5.1 is TERRIBLE

Wdym it was the best or very close until Gemini 3

weary galleon Dec 15, 2025, 9:22 AM

#

radiant heron Wdym it was the best or very close until Gemini 3

Not at all

radiant heron Dec 15, 2025, 9:23 AM

#

5.1 was best after grok 4.1 according to lmarena

#

Before opus 4 and Gemini 3

weary galleon Dec 15, 2025, 9:24 AM

#

Gemini 3 Pro, Opus, Sonnet are much better, not even close to 5.2 extra mega high

weary galleon Dec 15, 2025, 9:24 AM

#

radiant heron Before opus 4 and Gemini 3

You're confused. Opus 4 and Opus 4.5 are two different models.

keen beacon Dec 15, 2025, 9:24 AM

#

So benchmarks lie?

radiant heron Dec 15, 2025, 9:24 AM

#

weary galleon You're confused. Opus 4 and Opus 4.5 are two different models.

I meant 4.5 mb

radiant heron Dec 15, 2025, 9:25 AM

#

weary galleon Gemini 3 Pro, Opus, Sonnet are much better, not even close to 5.2 extra mega hig...

They are fairly close like 40 pts

weary galleon Dec 15, 2025, 9:25 AM

#

radiant heron They are fairly close like 40 pts

No

radiant heron Dec 15, 2025, 9:25 AM

#

On coding 5.2 is better than Gemini 3(barely)

radiant heron Dec 15, 2025, 9:26 AM

#

weary galleon No

On lmarena they are

weary galleon Dec 15, 2025, 9:26 AM

#

radiant heron On lmarena they are

No

radiant heron Dec 15, 2025, 9:26 AM

#

weary galleon No

Wdym no

weary galleon Dec 15, 2025, 9:27 AM

#

radiant heron Wdym no

You can check for yourself if you can.

#

GPT 5.2 never was in Text leaderboard.

radiant heron Dec 15, 2025, 9:27 AM

#

What about coding

weary galleon Dec 15, 2025, 9:29 AM

#

radiant heron What about coding

What about medicine, biology, physics, mathematics, chemistry, laws, ANY coding except JS, etc.?

radiant heron Dec 15, 2025, 9:29 AM

#

weary galleon What about medicine, biology, physics, mathematics, chemistry, laws, ANY coding ...

They don't have those benchmarks

keen beacon Dec 15, 2025, 9:29 AM

#

It's cuz ChatGPT is being heavy handed on content moderation

weary galleon Dec 15, 2025, 9:29 AM

#

Arena Code tests only JavaScript, that's all.

weary galleon Dec 15, 2025, 9:29 AM

#

radiant heron They don't have those benchmarks

LMArena has

#

All of them

radiant heron Dec 15, 2025, 9:29 AM

#

That doesn't mean gpt is worse at everything else

radiant heron Dec 15, 2025, 9:29 AM

#

weary galleon All of them

Where

keen beacon Dec 15, 2025, 9:30 AM

#

Fr

weary galleon Dec 15, 2025, 9:30 AM

#

radiant heron That doesn't mean gpt is worse at everything else

I've already tested it for myself.

weary galleon Dec 15, 2025, 9:30 AM

#

radiant heron Where

On official website of LMArena.

radiant heron Dec 15, 2025, 9:31 AM

#

weary galleon On official website of LMArena.

I can only see webdev

weary galleon Dec 15, 2025, 9:31 AM

#

Lol

radiant heron Dec 15, 2025, 9:31 AM

#

weary galleon I've already tested it for myself.

That doesn't necessarily mean it's true

#

You have personal biases on what is good like everyone else in the world

keen beacon Dec 15, 2025, 9:31 AM

#

It 100% means it’s not true

weary galleon Dec 15, 2025, 9:32 AM

#

radiant heron You have personal biases on what is good like everyone else in the world

You're much more biased. Because you love GPT in any case, always.

keen beacon Dec 15, 2025, 9:32 AM

#

Ok I’m going to sleep

#

Goodnight 😴

#

Good luck to all u brave souls

radiant heron Dec 15, 2025, 9:33 AM

#

weary galleon You're much more biased. Because you love GPT in any case, always.

I agree I have biases and thus objective information should be used rather than 'this is what I think'

radiant heron Dec 15, 2025, 9:33 AM

#

keen beacon Goodnight 😴

Gn

weary galleon Dec 15, 2025, 9:37 AM

#

radiant heron I can only see webdev

It's not Photoshop, it's real.

#

Guys, this bro 👉 @radiant heron said:

1. There is no other categories on LMArena, except WebDev.
2. GPT 5.2 is a very good model.
3. There is no difference between Opus 4 and Opus 4.5.
4. There was a time when Grok 4.1 was THE BEST model according to LMArena leaderboard.
5. I'm biased when I say GPT 5.2 is bad.

#

This conversation is closed. Short summary is a line above 👆It's impossible to continue.

sterile tartan Dec 15, 2025, 9:56 AM

#

Aight moving on

radiant heron Dec 15, 2025, 10:16 AM

#

weary galleon Guys, this bro 👉 @radiant heron said: 1. There is no other categories on...

GPT is a very good model through some lenses and not good through others, such as your perspective. 1. Is untrue, I didn't say this. 3. I didn't say this. 4. I might've been wrong on this, however Grok 4.1 was close at one point. 5. You are biased because you are human, we all are

hardy lion Dec 15, 2025, 10:25 AM

#

#4 is true, on Nov 17th xAI released Grok 4.1, and grok-4.1-thinking and grok-4.1 were #1 and #2 on the text leaderboard with style control. This is what we report as #1 on arena. https://x.ai/news/grok-4-1

On Nov 18th Google released gemini-3-pro which reclaimed the #1 spot.

radiant heron Dec 15, 2025, 10:31 AM

#

hardy lion #4 is true, on Nov 17th xAI released Grok 4.1 and grok-4.1-thinking and grok-4.1...

See I wasn't that wrong

#

Tysm

plucky sparrow Dec 15, 2025, 10:45 AM

#

@keen beacon you're probably going to like this if you haven't seen it already https://www.youtube.com/watch?v=B9M4F_U1eEw

YouTube

EWU Bodycam

When AI Gets an Innocent Man Arrested

In today's body camera video, we're covering the arrest of Jason Killinger.

We are a news agency dedicated to delivering factual reporting on criminal investigations, public safety, and law enforcement procedures.

This video is a documentary intended to inform and educate viewers about real events of public concern.

It was produced for journa...

indigo pewter Dec 15, 2025, 11:06 AM

#

Guys how many seconds is the video generation pls

lilac dawn Dec 15, 2025, 11:38 AM

#

Has there been any visible improvement to 5.2 coding since release? Both xh and h are impossible rn

queen veldt Dec 15, 2025, 11:51 AM

#

lilac dawn Has there been any visible improvement to 5.2 coding since release? Both xh and ...

No it's benchmaxxed

#

https://www.reddit.com/r/singularity/s/nkwzKJ766O

From the singularity community on Reddit: SimpleBench for GPT 5.2 a...

Explore this post and more from the singularity community

#

This is simplebench but yeah everything is also benchmaxxed

soft hatch Dec 15, 2025, 12:02 PM

#

How can I make videos 30 s free?

hardy swallow Dec 15, 2025, 12:04 PM

#

soft hatch How can I make videos 30 s free?

Hi! Please check #1397655624103493813

soft hatch Dec 15, 2025, 12:06 PM

#

hardy swallow Hi! Please check <#1397655624103493813>

But only 11 seconds

south charm Dec 15, 2025, 12:31 PM

#

Hi guys, I just made an AI image and posted it to the server and they warned me T_T
I know I'm wrong

pearl jacinth Dec 15, 2025, 12:32 PM

#

nano banana pro not working??

hardy swallow Dec 15, 2025, 12:57 PM

#

@jovial path please check #1397655624103493813

robust sluice Dec 15, 2025, 12:59 PM

#

pearl jacinth nano banana pro not working??

always error since yesterday

keen beacon Dec 15, 2025, 1:14 PM

#

can you help me how to write proper prompts to make vertical 9:16 video from text prompts

weary galleon Dec 15, 2025, 1:30 PM

#

pearl jacinth nano banana pro not working??

It works very well for me.

sour spear Dec 15, 2025, 1:33 PM

#

pearl jacinth nano banana pro not working??

daring rock Dec 15, 2025, 2:13 PM

#

@twin sonnet Hello, this topic is unrelated to this community. This server is for AI topics. Thank you

twin sonnet Dec 15, 2025, 2:17 PM

#

@daring rock hey , who are you?

queen veldt Dec 15, 2025, 2:19 PM

#

He's moderator

ocean ferry Dec 15, 2025, 2:32 PM

#

hi

echo aurora Dec 15, 2025, 2:33 PM

#

soft hatch But only 11 seconds

You're unable to extend the amount of time for the generations.

delicate wagon Dec 15, 2025, 2:44 PM

#

Hi everyone I'm new here ☺️

plucky sparrow Dec 15, 2025, 2:46 PM

#

Gpt 5.2 would've gotten so much more love if they just called it something else, like gpt-exp or gpt-codex2

crude lagoon Dec 15, 2025, 2:46 PM

#

crude lagoon

Poll: Worst model of late 2025?

Winning answer: GPT 5.2 (11 of 25 votes)

plucky sparrow Dec 15, 2025, 2:46 PM

#

Or gpt-benchmaxed 😅

#

Calling it 5.2 was a huge mistake because it's a downgrade in many ways

proud bobcat Dec 15, 2025, 2:48 PM

#

#

I love the ai community 😭

proud bobcat Dec 15, 2025, 2:49 PM

#

plucky sparrow Calling it 5.2 was a huge mistake because it's a downgrade in many ways

It’s O3 but with the GPT 5 database

plucky sparrow Dec 15, 2025, 2:50 PM

#

Hmm. Different prompt? O3 loved to do that -> -> thing

proud bobcat Dec 15, 2025, 2:50 PM

#

I’m assuming that 5.2 was an experimental model that they were toying around with (diff architecture with new database) and then gemini 3 pro came out and they panicked, fine tuned it for like a week, and pushed it out

#

I’m sure they have a lot of experimental models like that

proud bobcat Dec 15, 2025, 2:54 PM

#

plucky sparrow Hmm. Different prompt? O3 loved to do that -> -> thing

Probably so

#

O3 architecture + new database and 5.1 system prompt

#

It would explain the high hallucination rates and the xhigh modes

zealous sparrow Dec 15, 2025, 3:07 PM

#

proud bobcat I’m assuming that 5.2 was an experimental model that they were toying around wit...

defo

hollow ivy Dec 15, 2025, 3:16 PM

#

signal latch Dec 15, 2025, 3:30 PM

#

wth is goin on

#

lol

dark nimbus Dec 15, 2025, 3:34 PM

#

can any one help me

#

after some time with claude it shows rate limit

hollow ivy Dec 15, 2025, 3:38 PM

#

dark nimbus after some time with claude it shows rate limit

You need to use the battle mode with a (google) account, for much more generous usage of models.

#

Iirc, battlemode+account has no ratelimits, but am not 100% sure of that.

#

gpt-5.2 is trivial to discover in battle mode (just ask the model for its exact name and version)

#

grok4 also is easy to identify

#

the others you can get by testing them in direct chat first and see how they answer

compact flame Dec 15, 2025, 3:42 PM

#

Guys did gpt 5.2 get any better?

#

Or it's still mid

olive spruce Dec 15, 2025, 3:46 PM

#

compact flame Or it's still mid

The responses are much shorter, from what I’ve seen

patent aspen Dec 15, 2025, 3:46 PM

#

Still mid

golden ocean Dec 15, 2025, 3:47 PM

#

https://media.discordapp.net/attachments/990039516620390400/997631476554285186/ezgif.com-gif-maker_-_2022-07-15T183109.613.gif

narrow comet Dec 15, 2025, 3:49 PM

#

signal latch Dec 15, 2025, 3:51 PM

#

benjimon franklin lol

turbid ether Dec 15, 2025, 3:58 PM

#

why did "GPT-5.1 Codex Max" get removed from the leaderboard?

#

wasn't it the top for coding before it got removed

vernal saddle Dec 15, 2025, 4:16 PM

#

@echo aurora Why were the Reve models removed from LMArena?

ripe mountain Dec 15, 2025, 4:30 PM

#

echo aurora Dec 15, 2025, 4:32 PM

#

vernal saddle @echo aurora Why were the Reve models removed from LMArena?

I'm not sure, will look into and if it's something I can share I'll keep you updated.

echo aurora Dec 15, 2025, 4:33 PM

#

delicate wagon Hi everyone I'm new here ☺️

Welcome welcome ablobwave

echo aurora Dec 15, 2025, 4:33 PM

#

dark nimbus after some time with claude it shows rate limit

There isn't a way to get past rate limits 🙁

dark nimbus Dec 15, 2025, 4:44 PM

#

echo aurora There isn't a way to get past rate limits 🙁

ok

gloomy sky Dec 15, 2025, 4:51 PM

#

#

why was gpt-5.1 codex max removed

#

@hardy swallow @mortal vale

neon idol Dec 15, 2025, 4:52 PM

#

Wth are these fake benchmarks?

gloomy sky Dec 15, 2025, 4:53 PM

#

neon idol Wth are these fake benchmarks?

??

neon idol Dec 15, 2025, 4:53 PM

#

gloomy sky ??

where is gemini 3 in this benchmark

gloomy sky Dec 15, 2025, 4:53 PM

#

its from web.archive

#

but gpt-5.1 codex max was removed

#

wondering if it will be added back

neon idol Dec 15, 2025, 4:54 PM

#

its a fake benchmark in my opinion

cloud zinc Dec 15, 2025, 4:54 PM

#

neon idol where is gemini 3 in this benchmark

neon idol Dec 15, 2025, 4:55 PM

#

these are fake in my opinion

cloud zinc Dec 15, 2025, 4:55 PM

#

its real

neon idol Dec 15, 2025, 4:55 PM

#

gemini 3 is absolutely better btw 5.1

cloud zinc Dec 15, 2025, 4:55 PM

#

this is 5.1 codex max high

#

not regular

neon idol Dec 15, 2025, 4:55 PM

#

still beat it

cloud zinc Dec 15, 2025, 4:55 PM

#

no its #1

neon idol Dec 15, 2025, 4:55 PM

#

nah

cloud zinc Dec 15, 2025, 4:56 PM

#

see the benchmark

neon idol Dec 15, 2025, 4:56 PM

#

these are fake believe me

neon idol Dec 15, 2025, 4:56 PM

#

cloud zinc see the benchmark

fake benchmark

echo aurora Dec 15, 2025, 4:56 PM

#

gloomy sky why was gpt-5.1 codex max removed

IIRC we had some latency issues on our end with that model and decided to take it down. Would also note the moderators should only be pinged for moderator-related issues (things that break our server rules). For questions/feedback you can come directly to me instead. blobthumbsup

cloud zinc Dec 15, 2025, 4:56 PM

#

these are true benchmark

#

5.1 codex max is better than gemini 3

neon idol Dec 15, 2025, 4:56 PM

#

cloud zinc these are true benchmark

bro test gemini 3 and this 5.1 max high

gloomy sky Dec 15, 2025, 4:56 PM

#

echo aurora IIRC we had some latency issues on our end with that model and decided to take d...

Okay, sorry, thank you. Are there plans to add it back in the future?

cloud zinc Dec 15, 2025, 4:57 PM

#

neon idol bro test gemini 3 and this 5.1 max high

i did, its better in my test

neon idol Dec 15, 2025, 4:57 PM

#

then u can tell me if it's better or not

neon idol Dec 15, 2025, 4:57 PM

#

cloud zinc i did, its better in my test

what was test?

cloud zinc Dec 15, 2025, 4:57 PM

#

coding test

echo aurora Dec 15, 2025, 4:59 PM

#

gloomy sky Okay, sorry, thank you. Are there plans to add it back in the future?

I believe so, but I'm not 100% sure, sorry to say.

neon idol Dec 15, 2025, 5:00 PM

#

cloud zinc coding test

prompt?

cloud zinc Dec 15, 2025, 5:01 PM

#

did u also test it?

dapper grove Dec 15, 2025, 5:03 PM

#

hi

compact flame Dec 15, 2025, 5:07 PM

#

I just got video arena on web

#

I mean only on incognito tab I got lucky

#

Weird

cloud zinc Dec 15, 2025, 5:11 PM

#

i dont see it on incognito

compact flame Dec 15, 2025, 5:11 PM

#

cloud zinc i dont see it on incognito

Dunno I somehow got it

#

Though on my normal tab I don't have it

cloud zinc Dec 15, 2025, 5:11 PM

#

#

there is also automodality being released

compact flame Dec 15, 2025, 5:12 PM

#

cloud zinc

Yo I got the same

#

But I closed the tab with it

#

I guess it's a gamble based on being a guest?