#general | Arena | Page 219

compact flame Dec 12, 2025, 4:24 PM

#

Honestly I understand the Ai

#

Confusing miles with actual miles seems valid

zealous sparrow Dec 12, 2025, 4:27 PM

#

2.5/3

#

0.5 deducted because the model didnt mention the ticket was forged

#

but i know that a model can't reach that

#

unless opus

zealous sparrow Dec 12, 2025, 4:30 PM

#

zealous sparrow 2.5/3

this was 5.1 high

compact flame Dec 12, 2025, 4:31 PM

#

zealous sparrow this was 5.1 high

Ig it did alright

zealous sparrow Dec 12, 2025, 4:31 PM

#

compact flame Ig it did alright

give the same to gpt 5.2

#

it will be worse

#

if its better i take back all i said about OAI

compact flame Dec 12, 2025, 4:32 PM

#

zealous sparrow give the same to gpt 5.2

Uhh try giving that question I'm not at LMarena rn

zealous sparrow Dec 12, 2025, 4:33 PM

#

both opuses got 0/3

#

damn

#

fiercefalcon flopped this one

compact flame Dec 12, 2025, 4:36 PM

#

zealous sparrow both opuses got 0/3

Damn didn't expect those to fail

neat apex Dec 12, 2025, 4:37 PM

#

Opus 4.5 failed??

#

Maybe Gpt 5.2 Xtra High nails it xd

zealous sparrow Dec 12, 2025, 4:38 PM

#

neat apex Opus 4.5 failed??

failed at my 3 questions

#

it aced my first 3

#

then failed

neon idol Dec 12, 2025, 4:39 PM

#

what is this prompt

zealous sparrow Dec 12, 2025, 4:40 PM

#

neon idol what is this prompt

Lucy and Mary were at a concert, one of them got in but the second didn't, even tho the tickets were booked. Why?
Daisy and Mike were at a park. Daisy took 3 daisys, and mike took 0 why?
Luke and Miles were driving on bikes down a hill. When they got down to the hill, Miles was missing, Why?

neon idol Dec 12, 2025, 4:41 PM

#

and the answer ?

zealous sparrow Dec 12, 2025, 4:41 PM

#

A1: One of the tickets were forged
A2: Daisy took 3 daisy's because her name is literally daisy
A3: Miles fell to the side

neon idol Dec 12, 2025, 4:41 PM

#

thx a lot

compact flame Dec 12, 2025, 4:42 PM

#

zealous sparrow A1: One of the tickets were forged A2: Daisy took 3 daisy's because her name is ...

Try asking these questions to extra high ig

#

It's available at yupp I think

zealous sparrow Dec 12, 2025, 4:43 PM

#

yeah ima do that

zealous sparrow Dec 12, 2025, 4:43 PM

#

compact flame Try asking these questions to extra high ig

gpt 5.1 high was the closest btw failed the forgery question

#

bright shard Dec 12, 2025, 4:44 PM

#

@echo aurora Nano Banana Pro is throwing a lot of errors again; sometimes it works perfectly, but other times it keeps throwing the same error.

compact flame Dec 12, 2025, 4:44 PM

#

zealous sparrow gpt 5.1 high was the closest btw failed the forgery question

High hopes for extra high tbh

pulsar canopy Dec 12, 2025, 4:44 PM

#

:ablobwave:

echo aurora Dec 12, 2025, 4:45 PM

#

pulsar canopy :ablobwave:

ablobwave

echo aurora Dec 12, 2025, 4:48 PM

#

bright shard @echo aurora Nano Banana Pro is throwing a lot of errors again; sometim...

We have been noticing higher than usual error rates, and our team is aware and working on lowering this as much as possible. However, when you mention:

sometimes it works perfectly, but other times it keeps throwing the same error.
This sounds a lot like it's being caused by the rate limit. You can confirm this by opening Dev Tools > Network, searching for Stream, and finding the Eval ID; if you see Status Code = 429 there, it's being caused by the rate limit.
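
As a rough illustration of the check described above, here is a minimal Python sketch. The function name `diagnose` and the returned strings are hypothetical and not part of LMArena; the only fact taken from the message is that a 429 status (Too Many Requests) on the Stream request points to the rate limit.

```python
from http import HTTPStatus


def diagnose(status_code: int) -> str:
    """Map the status code seen on the Stream request to a rough diagnosis.

    Hypothetical helper for illustration only; the wording of the
    returned strings is an assumption, not an official message.
    """
    if status_code == HTTPStatus.TOO_MANY_REQUESTS:  # 429
        return "rate limited: wait for the limit to reset before retrying"
    if 200 <= status_code < 300:
        return "request succeeded: the problem is elsewhere"
    return "other error: not a rate limit"


print(diagnose(429))
```

Reading the code by hand in Dev Tools is enough for a one-off check; the sketch only makes the decision explicit.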

zealous sparrow Dec 12, 2025, 4:49 PM

#

compact flame High hopes for extra high tbh

2/3
screwed up the hill question

compact flame Dec 12, 2025, 4:50 PM

#

zealous sparrow 2/3 screwed up the hill question

I guess expected

zealous sparrow Dec 12, 2025, 4:50 PM

#

compact flame I guess expected

so a shocker is like

#

google flash models screwed that question up too

#

so its prob all the new models that know Miles is always a unit and not a name

hushed gyro Dec 12, 2025, 4:50 PM

#

you can edit messages in lmarena now???

misty harbor Dec 12, 2025, 4:51 PM

#

using claude and it's stuck on generating, any ideas to fix it?

visual osprey Dec 12, 2025, 4:52 PM

#

i feel like those questions have a lot of valid answers though

zealous sparrow Dec 12, 2025, 4:52 PM

#

visual osprey i feel like those questions have a lot of valid answers though

Yeah but you expect the ones you want from the LLMs

#

and if they can reach them

visual osprey Dec 12, 2025, 4:52 PM

#

what does that mean

zealous sparrow Dec 12, 2025, 4:52 PM

#

They prove how good they are

fleet lintel Dec 12, 2025, 4:53 PM

#

zealous sparrow 2/3 screwed up the hill question

these simple questions are apparently difficult for LLMs

zealous sparrow Dec 12, 2025, 4:53 PM

#

fleet lintel these simple questions are apparently difficult for LLMs

I changed Miles to Kyle so it doesnt do the measurement unit bs

neat apex Dec 12, 2025, 4:53 PM

#

Its LLM fault

#

Gemini 3 must be the goat for this kind of question

#

At least the deep thinking

zealous sparrow Dec 12, 2025, 4:54 PM

#

neat apex Gemini 3 must be the goat for this kind of question

both flash models did the measurement unit bs

visual osprey Dec 12, 2025, 4:54 PM

#

i mean for answer 1 i would say like she might be not following dress code or intoxicated

she took daisys because its her namesake

badly worded but crashing on the way down seems the most logical and likely

#

third question is really badly worded

zealous sparrow Dec 12, 2025, 4:54 PM

#

Im going to ask 3 pro

visual osprey Dec 12, 2025, 4:54 PM

#

saying when they got down implies both reached the bottom of the hill when the answer you want demands that one of them did not

hushed gyro Dec 12, 2025, 4:54 PM

#

What models are in the video arena channels for image and video?

zealous sparrow Dec 12, 2025, 4:54 PM

#

neat apex At least the deep thinking

btw fiercefalcon and ghostfalcon [3 flash] failed

#

overall just wrong but also fell for measurement unit bs

polar wharf Dec 12, 2025, 4:55 PM

#

any ideas what other models to add to gpt4free.pro?

zealous sparrow Dec 12, 2025, 4:55 PM

#

Also the models often interpret booked as criminal booked

#

exactly

polar wharf Dec 12, 2025, 4:56 PM

#

damn

compact flame Dec 12, 2025, 4:56 PM

#

zealous sparrow exactly

Damn booked up in a prison cell

#

Like hey I booked you a prison cell spot

visual osprey Dec 12, 2025, 4:57 PM

#

thats not the correct usage

zealous sparrow Dec 12, 2025, 4:57 PM

#

3 pro keeps interpreting the people as pets

#

or animals

#

or objects

polar wharf Dec 12, 2025, 4:58 PM

#

zealous sparrow 3 pro keeps interpreting the people as pets

foxxy

zealous sparrow Dec 12, 2025, 4:58 PM

#

1/3 [gemini 3 pro]

#

2/3 [failed ticket question]

bright shard Dec 12, 2025, 5:01 PM

#

echo aurora We have been noticing higher than usual error rates, and our team is aware and w...

If it happens again, I'll check it to confirm it for you, thank you very much!!!

proud bobcat Dec 12, 2025, 5:01 PM

#

zealous sparrow 2/3 [failed ticket question]

what is the prompt btw

#

nvm i got it

zealous sparrow Dec 12, 2025, 5:03 PM

#

bro if not the Miles BS
LLMs will say Kyle sounds like Isle or Cycle

proud bobcat Dec 12, 2025, 5:05 PM

#

they think its some wordplay

desert abyss Dec 12, 2025, 5:07 PM

#

Hello! @midnight vigil Please check ⁠how-to-video-bot to learn how to generate videos.

zealous sparrow Dec 12, 2025, 5:08 PM

#

proud bobcat they think its some wordplay

gpt 5.2 told me its wordplay just now 💀

#

2/3 failed the hill question

compact flame Dec 12, 2025, 5:17 PM

#

Extra high said miles was missing

#

It measured in kilometers surprisingly

#

Though about the first one it said Lucy is a staff member

#

Brah

misty harbor Dec 12, 2025, 5:20 PM

#

anyone knows how to fix a constant generating on lmarena?

echo aurora Dec 12, 2025, 5:26 PM

#

hushed gyro What models are in the video arena channels for image and video?

A good way to check this is looking at the leaderboards - https://lmarena.ai/leaderboard/text-to-video & https://lmarena.ai/leaderboard/image-to-video

Text-to-Video Arena | LMArena

Compare models according to their ability to generate videos based on the given prompt

echo aurora Dec 12, 2025, 5:26 PM

#

misty harbor anyone knows how to fix a constant generating on lmarena?

It depends on the error. I assume you're running into the `Something went wrong with your generation, please try again` error?

cold minnow Dec 12, 2025, 5:30 PM

#

echo aurora It depends on the error. I assume you're running into the `Something went wrong ...

Yes how can i fix it

misty harbor Dec 12, 2025, 5:30 PM

#

echo aurora It depends on the error. I assume you're running into the `Something went wrong ...

no, just generating

#

misty harbor Dec 12, 2025, 5:30 PM

#

cold minnow Yes how can i fix it

no that's not it

cold minnow Dec 12, 2025, 5:31 PM

#

#

How cani fix it

lucid nexus Dec 12, 2025, 5:31 PM

#

How to create video

red meadow Dec 12, 2025, 5:32 PM

#

i cant generate anything, i always get the error the other user above got.

olive mountain Dec 12, 2025, 5:33 PM

#

#

keep getting this error

cold minnow Dec 12, 2025, 5:33 PM

#

Yea

echo aurora Dec 12, 2025, 5:36 PM

#

cold minnow Yes how can i fix it

For this error it's a bit difficult to troubleshoot, as it can be caused by different reasons. For gemini-3-pro-image-preview we have been seeing higher than usual error rates, and our team is aware of this. In this case, unfortunately, there isn't much you can do on the user's end to get past it. But overall I would recommend refreshing the page, starting a new chat, and clearing the cache. This can help, but isn't a guaranteed fix.

It's worth noting too that this error can be triggered by the rate limit, which tends to be pretty common. You can verify this by opening Dev Tools > Network, searching for Stream, and finding the Eval ID; if you see Status Code = 429 there, it's being caused by the rate limit.

echo aurora Dec 12, 2025, 5:37 PM

#

lucid nexus How to create video

In #1397655624103493813 you'll find the information you're looking for.

misty harbor Dec 12, 2025, 5:37 PM

#

@echo aurora what about the constant generating?

echo aurora Dec 12, 2025, 5:38 PM

#

misty harbor no, just generating

This one is more difficult to diagnose, since no error is triggering that would give some kind of status code. However, trying the same methods may help here: refreshing the page, starting a new chat, clearing cache.

misty harbor Dec 12, 2025, 5:39 PM

#

i did all of the above, including a hard refresh

#

well the new chat creates a new chat but can't load the one that's stuck

olive mountain Dec 12, 2025, 5:40 PM

#

after generating a image in the discord itself, how to download that generated image in the local storage? Or we cant download it?

bright shard Dec 12, 2025, 5:46 PM

#

echo aurora We have been noticing higher than usual error rates, and our team is aware and w...

Yes Pineapple, I'm getting error 429

obtuse heart Dec 12, 2025, 5:46 PM

#

Pineapple working overtime huh

echo aurora Dec 12, 2025, 5:47 PM

#

misty harbor i did all of the above, including a hard refresh

Sorry to hear that didn't help fix the problem. Unfortunately, this is a known bug that can happen occasionally. A chat that gets too long, or long prompts/responses, can contribute to this.

echo aurora Dec 12, 2025, 5:47 PM

#

bright shard Yes Pineapple, I'm getting error 429

Yeah that's rate limit sorry to say.

misty harbor Dec 12, 2025, 5:47 PM

#

echo aurora Sorry to hear that didn't help fix the problem. Unfortunately, this is a known b...

oh no worries, thank you for trying to help!

bright shard Dec 12, 2025, 5:48 PM

#

echo aurora Yeah that's rate limit sorry to say.

And how often does that limit reset?

echo aurora Dec 12, 2025, 5:52 PM

#

bright shard And how often does that limit reset?

IIRC it's 50 mins.

pseudo hemlock Dec 12, 2025, 5:54 PM

#

But like who funds them

#

Does Google pay and pray (assume) their model with a fancy hidden name is going to be the best?

#

So they’re willing to give $1000?

light tusk Dec 12, 2025, 5:55 PM

#

Is 5.2 not going to leaderboard?

echo aurora Dec 12, 2025, 5:59 PM

#

light tusk Is 5.2 not going to leaderboard?

It's on the WebDev leaderboard https://lmarena.ai/leaderboard/webdev, but isn't yet on Text

#

As soon as we have those scores we'll be sure to put out an announcement, so keep an eye on #announcements

light tusk Dec 12, 2025, 6:00 PM

#

I’m taking the over on kalshi

thorn path Dec 12, 2025, 6:01 PM

#

light tusk I’m taking the over on kalshi

There's no way openai will outperform Gemini, Kalshi even has it as a 90% in favor of Gemini so everyone knows this

#

If you were waiting for leaderboard to update so you can find out who wins just go ahead and make the decision now since it's already not looking too great for 5.2

empty stump Dec 12, 2025, 6:23 PM

#

openai is falling

latent crest Dec 12, 2025, 6:23 PM

#

Hello @echo aurora When I try to login with google, it wants me to download a file named “google” , why???

golden ocean Dec 12, 2025, 6:25 PM

#

latent crest Hello @echo aurora When I try to login with google, it wants me to dow...

https://cdn.discordapp.com/attachments/1170550524110721176/1339363741044707359/attachment.gif

latent crest Dec 12, 2025, 6:26 PM

#

golden ocean https://cdn.discordapp.com/attachments/1170550524110721176/1339363741044707359/a...

Pardon me; what’s that ?

lime wind Dec 12, 2025, 6:29 PM

#

so how are people finding GPT-5.2? good in general? not revolutionary right?

latent crest Dec 12, 2025, 6:30 PM

#

latent crest Hello @echo aurora When I try to login with google, it wants me to dow...

If it’s a virus, how is it possible? It’s an iPhone after all

cloud zinc Dec 12, 2025, 6:33 PM

#

golden ocean Dec 12, 2025, 6:38 PM

#

latent crest If it’s a virus, how is it possible? It’s an iPhone after all

joke

whole sundial Dec 12, 2025, 6:39 PM

#

cloud zinc

nano banana flash

#

but gemini 3 flash has to come out first

fiery gull Dec 12, 2025, 6:42 PM

#

light tusk Is 5.2 not going to leaderboard?

Calm down, it need votes

#

#1397655624103493813

zealous sparrow Dec 12, 2025, 6:42 PM

#

whole sundial but gemini 3 flash has to come out first

not today for sure, there isnt even a test model for banana 3 flash

#

as in when the flash model comes out

#

eventually

unborn ocean Dec 12, 2025, 6:45 PM

#

Poll: does gpt 5.2 have a new-ish base model?
(e.g. fresh pre-train, new distill, larger private model ..)

Winning answer (option 2): ❌ no, with 8 of 8 votes

compact flame Dec 12, 2025, 6:45 PM

#

#1397655624103493813

spark python Dec 12, 2025, 6:46 PM

#

compact flame #1397655624103493813

Bro tried twice

zealous sparrow Dec 12, 2025, 6:46 PM

#

but logan posting about nano banana 3 flash already huh

neat apex Dec 12, 2025, 6:47 PM

#

Gpt Pro is already trash, imagine that

#

Imagine how many time would take an gpt 5.5 xtra high

quasi atlas Dec 12, 2025, 6:53 PM

#

@round fox Please check on #1397655624103493813 for creations #video-arena-1 #video-arena-2 #video-arena-3

neat apex Dec 12, 2025, 6:58 PM

#

compact flame Dec 12, 2025, 7:07 PM

#

neat apex

I think what can save them is making a not rushed model that is trained properly

weary galleon Dec 12, 2025, 7:09 PM

#

compact flame I think what can save them is making a not rushed model that is trained properly

Highly agreed.

fleet lintel Dec 12, 2025, 7:25 PM

#

compact flame I think what can save them is making a not rushed model that is trained properly

they haven't been able to pre-train a model successfully in 1.5 years. they are reaching the end of how much they can squeeze out of post-training. That is likely the reason that they haven't improved much in 2025.

viscid cloak Dec 12, 2025, 7:25 PM

#

YO WHAT are hazel-edit 6 and ghost-pepper in image gen battle? Horrible models 🤣

weary galleon Dec 12, 2025, 7:26 PM

#

Poll: Did you like GPT-5.2?

Winning answer (option 3): No, it's worse than GPT-5.1, with 6 of 11 votes

echo aurora Dec 12, 2025, 7:27 PM

#

latent crest Hello @echo aurora When I try to login with google, it wants me to dow...

This doesn't sound familiar, can you provide more information? Do you have a recorded video by chance?

zealous sparrow Dec 12, 2025, 7:30 PM

#

viscid cloak YO WHAT are hazel-edit 6 and ghost-pepper in image gen battle? Horrible models ...

hazel is OAI

#

ghost-pepper is apparently qwen

golden ocean Dec 12, 2025, 7:32 PM

#

Large Language Model Arena

queen veldt Dec 12, 2025, 7:35 PM

#

neat apex

Sama this you?

#

Ultra high gpt 🌿 will save oai

#

Gpt 5.21.1 Ultra high x-max pro plus

sour spindle Dec 12, 2025, 7:37 PM

#

I feel like I am the only one who likes gpt 5.2

golden ocean Dec 12, 2025, 7:37 PM

#

yes

sour spindle Dec 12, 2025, 7:38 PM

#

Welp the beauty of choice I guess.

grave plaza Dec 12, 2025, 7:44 PM

#

compact flame Dec 12, 2025, 7:44 PM

#

sour spindle I feel like I am the only one who likes gpt 5.2

Uh based on what tasks do you use it

sour spindle Dec 12, 2025, 7:47 PM

#

My job consists of using models for a lot of "text based work." A lot of research based queries. I have been comparing it with gemini, and across just 10 normal things I do, GPT was better in 7, 1 was basically identical output, and in 2 gemini was better.

fiery gull Dec 12, 2025, 7:47 PM

#

queen veldt Gpt 5.21.1 Ultra high x-max pro plus

400$ per prompt, but I can use free in lmarena and yupp and etc lol

fiery gull Dec 12, 2025, 7:47 PM

#

grave plaza

is the start, I hope the nova 3 will be good 👀

half mist Dec 12, 2025, 7:57 PM

#

What version does the ai use in Code Arena if you pick both good or both bad in battle model

gaunt roost Dec 12, 2025, 8:01 PM

#

https://www.youtube.com/watch?v=wtct-rdzHKo&t=476s

YouTube

Jossy Concept

How to Use MidJourney FREE & Unlimited in 2025 (Real Method)

Everyone wants MidJourney FREE & Unlimited — and in this video, I show the closest real method to getting MidJourney-level images without paying anything.

I’ll show you the secret AI tool that creates MidJourney-style images, how to recreate images from MidJourney Explore, how to write stronger prompts, and how to even animate your results...


sharp mirage Dec 12, 2025, 8:11 PM

#

Hi

#

Anyone ?

weary galleon Dec 12, 2025, 8:15 PM

#

sharp mirage Hi

Hi!

compact flame Dec 12, 2025, 8:16 PM

#

sharp mirage Hi

Hi

sharp mirage Dec 12, 2025, 8:19 PM

#

Hi

#

Anyone got any prompt for Clash of clans game ?

echo aurora Dec 12, 2025, 8:23 PM

#

sharp mirage Hi

Hello ablobwave

sharp mirage Dec 12, 2025, 8:28 PM

#

Hey 🙂

compact flame Dec 12, 2025, 8:37 PM

#

sharp mirage Anyone got any prompt for Clash of clans game ?

Uh try asking ai to make you a prompt?

sharp mirage Dec 12, 2025, 8:38 PM

#

I am doing this

sonic flare Dec 12, 2025, 8:40 PM

#

Hello guys, I'm new here.

Pls, which AI model is best for book content creation?

queen veldt Dec 12, 2025, 8:41 PM

#

compact flame Dec 12, 2025, 8:41 PM

#

sonic flare Hello guys, I'm new here. Pls, which AI model is best for book content creatio...

There no best

#

It's based on what you prefer

burnt sinew Dec 12, 2025, 8:41 PM

#

sonic flare Hello guys, I'm new here. Pls, which AI model is best for book content creatio...

@fiery gull gemini 3

queen veldt Dec 12, 2025, 8:41 PM

#

Gpt 4.5

burnt sinew Dec 12, 2025, 8:41 PM

#

Why did it ping him

queen veldt Dec 12, 2025, 8:41 PM

#

Highest context window

burnt sinew Dec 12, 2025, 8:42 PM

#

queen veldt Gpt 4.5

5.5

burnt sinew Dec 12, 2025, 8:42 PM

#

queen veldt Highest context window

I thought gemini has highest

compact flame Dec 12, 2025, 8:42 PM

#

queen veldt

What is this?

echo aurora Dec 12, 2025, 8:42 PM

#

sonic flare Hello guys, I'm new here. Pls, which AI model is best for book content creatio...

Welcome welcome ablobwave would encourage you to check out our Text Arena leaderboard with the Creative Writing category -> https://lmarena.ai/leaderboard/text/creative-writing

compact flame Dec 12, 2025, 8:42 PM

#

echo aurora Welcome welcome :ablobwave: would encourage you to check ou...

Damn he got whole texts ready

faint drum Dec 12, 2025, 8:43 PM

#

Is LMArena down?

compact flame Dec 12, 2025, 8:44 PM

#

faint drum Is LMArena down?

No I think

#

No it's not

slim spire Dec 12, 2025, 8:45 PM

#

GPT 5.2 broke for me

zealous sparrow Dec 12, 2025, 8:45 PM

#

fellas we got some new gpt 5.2

#

on lm

slim spire Dec 12, 2025, 8:45 PM

#

does gpt 5.2 work for you guys

burnt sinew Dec 12, 2025, 8:46 PM

#

zealous sparrow fellas we got some new gpt 5.2

5.2 search nice

half mist Dec 12, 2025, 8:46 PM

#

zealous sparrow fellas we got some new gpt 5.2

When is it gonna be on LMArena?

zealous sparrow Dec 12, 2025, 8:46 PM

#

half mist When is it gonna be on LMArena?

right now

half mist Dec 12, 2025, 8:47 PM

#

zealous sparrow right now

Not all of them are on there. I only see 5.2 high, 5.2, and 5.2 search

zealous sparrow Dec 12, 2025, 8:47 PM

#

half mist Not all of them are on there. I only see 5.2 high, 5.2, and 5.2 search

the system prompt ones prob not fully added yet

half mist Dec 12, 2025, 8:48 PM

#

zealous sparrow the system prompt ones prob not fully added yet

But what about the code ones?

zealous sparrow Dec 12, 2025, 8:48 PM

#

half mist But what about the code ones?

the names got reverted

burnt sinew Dec 12, 2025, 8:48 PM

#

zealous sparrow the names got reverted

To what

zealous sparrow Dec 12, 2025, 8:48 PM

#

just now btw

half mist Dec 12, 2025, 8:49 PM

#

zealous sparrow just now btw

Why?

zealous sparrow Dec 12, 2025, 8:49 PM

#

half mist Why?

dunno

#

mayb issues

cloud zinc Dec 12, 2025, 8:50 PM

#

sonic flare Hello guys, I'm new here. Pls, which AI model is best for book content creatio...

grok 4.1

zealous sparrow Dec 12, 2025, 8:51 PM

#

@echo aurora was gpt 5.2-code a new model or a finetune? It was removed immediately..

cloud zinc Dec 12, 2025, 8:51 PM

#

its finetune for code

zealous sparrow Dec 12, 2025, 8:51 PM

#

cloud zinc its finetune for code

yeah immediately got removed

#

even if its a finetune

sonic flare Dec 12, 2025, 8:52 PM

#

cloud zinc grok 4.1

I kinda think Grok is nice for writing content

burnt sinew Dec 12, 2025, 8:53 PM

#

sonic flare I kinda think Grok is nice for writing content

No its worse

sonic flare Dec 12, 2025, 8:53 PM

#

burnt sinew No its worse

Which do you use

burnt sinew Dec 12, 2025, 8:53 PM

#

sonic flare Which do you use

Gemini 3

sonic flare Dec 12, 2025, 8:54 PM

#

burnt sinew Gemini 3

How about Gemini 3 Pro

burnt sinew Dec 12, 2025, 8:55 PM

#

sonic flare How about Gemini 3 Pro

Thats the only one

zealous sparrow Dec 12, 2025, 8:57 PM

#

I hope 5.2-code finetune isnt the same case as speciale

cloud zinc Dec 12, 2025, 8:57 PM

#

burnt sinew Thats the only one

there is also gemini 3 flash

burnt sinew Dec 12, 2025, 9:01 PM

#

cloud zinc there is also gemini 3 flash

Oh let me check it out

burnt sinew Dec 12, 2025, 9:01 PM

#

cloud zinc there is also gemini 3 flash

Where? I dont see it

zealous sparrow Dec 12, 2025, 9:06 PM

#

burnt sinew Where? I dont see it

LMArena battle mode

zealous sparrow Dec 12, 2025, 9:11 PM

#

half mist Not all of them are on there. I only see 5.2 high, 5.2, and 5.2 search

turns out these 2 are currently only battle mode models

outer lark Dec 12, 2025, 9:17 PM

#

hello

echo aurora Dec 12, 2025, 9:30 PM

#

outer lark hello

Welcome ablobwave

fiery gull Dec 12, 2025, 9:30 PM

#

sonic flare Hello guys, I'm new here. Pls, which AI model is best for book content creatio...

Create the book using opus 4.5 + gemini 3.0, and create the book with notebooklm or gemini app

#

Use gpt 5.2 xhigh for double check

sharp mirage Dec 12, 2025, 9:31 PM

#

Thats good Idea 💡

fiery gull Dec 12, 2025, 9:31 PM

#

Plan the book with gemini 3.0 and create the book itself with opus 4.5

burnt sinew Dec 12, 2025, 9:31 PM

#

zealous sparrow LMArena battle mode

Oh got it

sharp mirage Dec 12, 2025, 9:31 PM

#

fiery gull Plan the book with gemini 3.0 and create the book itself with opus 4.5

Thinking

fiery gull Dec 12, 2025, 9:31 PM

#

Is just like I do to get the best result

astral bloom Dec 12, 2025, 9:32 PM

#

1, 7, 18, 45, ....?
sol:
115
aₙ = 3aₙ₋₁ - aₙ₋₂ - 2
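
The posted recurrence can be checked against the given terms with a few lines of Python. This is only a verification sketch; the helper names `next_term` and `extend` are made up for illustration, and the recurrence is taken from the message as written.

```python
def next_term(prev: int, prev2: int) -> int:
    """Apply the posted recurrence a_n = 3*a_(n-1) - a_(n-2) - 2."""
    return 3 * prev - prev2 - 2


def extend(seed: list[int], count: int) -> list[int]:
    """Grow the sequence from its first two terms until it has `count` terms."""
    terms = list(seed)
    while len(terms) < count:
        terms.append(next_term(terms[-1], terms[-2]))
    return terms


# Seeding with 1 and 7 reproduces 18 and 45, then gives the claimed answer.
print(extend([1, 7], 5))  # [1, 7, 18, 45, 115]
```

Worked by hand: 3·7 − 1 − 2 = 18, 3·18 − 7 − 2 = 45, and 3·45 − 18 − 2 = 115, which matches the stated solution.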

sharp mirage Dec 12, 2025, 9:32 PM

#

What is that?

astral bloom Dec 12, 2025, 9:32 PM

#

someone sent it as a challenge for ai's

fiery gull Dec 12, 2025, 9:33 PM

#

Bro is impossible to vote opus 4.5 vs gpt 5.2 xhigh 😐

#

Bro I don't have 1% of smart that AI

sharp mirage Dec 12, 2025, 9:35 PM

#

fiery gull Bro is impossible to vote opus 4.5 vs gpt 5.2 xhigh 😐

Did you try gpt5.2xhigh?

#

5.2 high was so mid

sonic flare Dec 12, 2025, 9:35 PM

#

fiery gull Create the book using opus 4.5 + gemini 3.0, and create the book with notebooklm...

How do i get access to opus 4.5 bro

sharp mirage Dec 12, 2025, 9:36 PM

#

sonic flare How do i get access to opus 4.5 bro

Lmarena

fiery gull Dec 12, 2025, 9:36 PM

#

sharp mirage Did you try gpt5.2xhigh?

I'm using it for work now; for my use I feel an improvement over gpt 5.1

#

But I think is because the 'EXTRA' high

fiery gull Dec 12, 2025, 9:36 PM

#

sonic flare How do i get access to opus 4.5 bro

Lmarena 👀

sharp mirage Dec 12, 2025, 9:36 PM

#

fiery gull But I think is because the 'EXTRA' high

I think so

#

But tbh I don't think it's worth it to buy api or using it

#

Bro its so expensive

fiery gull Dec 12, 2025, 9:37 PM

#

Bro but in code (word html) the gpt 5.2 is really cooking like gemini 3.0

sharp mirage Dec 12, 2025, 9:37 PM

#

fiery gull Bro but in code (word html) the gpt 5.2 is really cooking like gemini 3.0

I got no time to try

fiery gull Dec 12, 2025, 9:38 PM

#

sharp mirage Bro its so expensive

Yeah, the 3 is sooo expensive 💀, but gpt is more that opus and gemini

sharp mirage Dec 12, 2025, 9:38 PM

#

I was sleeping 🙂

latent crest Dec 12, 2025, 9:38 PM

#

echo aurora This doesn't sound familiar, can you provide more information? Do you have a rec...

I sadly have not. It happened to me twice. I go into the page, then login, google and they ask me to download a file called “google” and I deny it

latent crest Dec 12, 2025, 9:38 PM

#

echo aurora This doesn't sound familiar, can you provide more information? Do you have a rec...

Oh it’s from the iPhone

fiery gull Dec 12, 2025, 9:39 PM

#

sharp mirage I got no time to try

(generic project you can see it) the opus make a mistake in word html the gpt is perfect

#

soooo good the gpt 5.2 xhigh in word html

sharp mirage Dec 12, 2025, 9:39 PM

#

Hmmm

#

Gpt is better here

#

Btw are you on Mac ?

fiery gull Dec 12, 2025, 9:39 PM

#

gpt 5.2 xhigh = gemini 3.0

fiery gull Dec 12, 2025, 9:40 PM

#

sharp mirage Btw are you on Mac ?

no, windows

sharp mirage Dec 12, 2025, 9:40 PM

#

Pro?

sonic flare Dec 12, 2025, 9:40 PM

#

fiery gull Lmarena 👀

I mean, which of this? lol

fiery gull Dec 12, 2025, 9:40 PM

#

sonic flare I mean, which of this? lol

yeahhh, just use it to write

#

plan/thinking the book with gpt 5.2 xhigh

#

gemini 3.0 for create the book in gemini app

#

use the 3 lol

sharp mirage Dec 12, 2025, 9:41 PM

#

sonic flare I mean, which of this? lol

Thinking

#

The first one

fiery gull Dec 12, 2025, 9:41 PM

#

sonic flare I mean, which of this? lol

ahhh thinking, allways

sharp mirage Dec 12, 2025, 9:41 PM

#

But it has a rate limt

sonic flare Dec 12, 2025, 9:41 PM

#

sharp mirage Thinking

Thanks man

fiery gull Dec 12, 2025, 9:42 PM

#

sonic flare Thanks man

I don't read the question lol, use allways thinking mode

sharp mirage Dec 12, 2025, 9:42 PM

#

fiery gull ahhh thinking, allways

I do love opus 4.5 thinking but the rate limt hits so fast

modest prism Dec 12, 2025, 9:42 PM

#

Which one has better vision and image understanding? Gemini 3 or gpt 5.2

fiery gull Dec 12, 2025, 9:43 PM

#

sharp mirage I do love opus 4.5 thinking but the rate limt hits so fast

I'm so anxious for still see the fight that 3, like messi vs cr7 :D.

sharp mirage Dec 12, 2025, 9:43 PM

#

fiery gull I'm so anxious for still see the fight that 3, like messi vs cr7 :D.

🙂

sonic flare Dec 12, 2025, 9:43 PM

#

sharp mirage I do love opus 4.5 thinking but the rate limt hits so fast

Yeah it's so annoying, i had to wait for 5hrs to reset

fiery gull Dec 12, 2025, 9:43 PM

#

modest prism Which one has better vision and image understanding? Gemini 3 or gpt 5.2

gemini 3.0 still the better, much

sharp mirage Dec 12, 2025, 9:43 PM

#

modest prism Which one has better vision and image understanding? Gemini 3 or gpt 5.2

Gemini 3

whole sundial Dec 12, 2025, 9:44 PM

#

#1397655624103493813
<@&1349916362595635286>

neon idol Dec 12, 2025, 9:45 PM

#

Fr

modest prism Dec 12, 2025, 9:45 PM

#

Why is gpt 5.2 high so fast on lmarena it feels like it doesn't think and it's instant

neon idol Dec 12, 2025, 9:45 PM

#

modest prism Why is gpt 5.2 high so fast on lmarena it feels like it doesn't think and it's i...

Maybe questions are simple?

sharp mirage Dec 12, 2025, 9:45 PM

#

Pineapple saw that:

neon idol Dec 12, 2025, 9:45 PM

#

Like 2-2

fiery gull Dec 12, 2025, 9:45 PM

#

modest prism Why is gpt 5.2 high so fast on lmarena it feels like it doesn't think and it's i...

hmmm, I'll see it later o-o, but you using high mode?

sonic flare Dec 12, 2025, 9:46 PM

#

Will that work? Wouldn't that mess up my content?

fiery gull Dec 12, 2025, 9:46 PM

#

lol I delet it ;-;

sharp mirage Dec 12, 2025, 9:46 PM

#

😉

neon idol Dec 12, 2025, 9:46 PM

#

sharp mirage Pineapple saw that:

https://tenor.com/view/doakes-stare-gif-6201328377551420443

Tenor

#

Pineapple now

fiery gull Dec 12, 2025, 9:46 PM

#

sonic flare Will that work? Wouldn't that mess up my content?

see the direct chat 👀

vivid coral Dec 12, 2025, 9:47 PM

#

OMG OMG OMG.....IT'S HERE!!!! @echo aurora is the 🐐 🐐 🐐

sharp mirage Dec 12, 2025, 9:47 PM

#

vivid coral OMG OMG OMG.....IT'S HERE!!!! @echo aurora is the 🐐 🐐 🐐

Yeaaaah

#

Btw I think no one cares 😔

vivid coral Dec 12, 2025, 9:49 PM

#

Everyone cares, nobody uses closed book LLMs in the real world, we all get caught up in this bubble we have here and don't realize what the masses want and need

sharp mirage Dec 12, 2025, 9:49 PM

#

@echo aurora did you add glm4.6V?

echo aurora Dec 12, 2025, 9:50 PM

#

vivid coral OMG OMG OMG.....IT'S HERE!!!! @echo aurora is the 🐐 🐐 🐐

Just added! You’re too fast

vivid coral Dec 12, 2025, 9:50 PM

#

echo aurora Just added! You’re too fast

🤣 🤣 🤣

modest prism Dec 12, 2025, 9:50 PM

#

vivid coral Everyone cares, nobody uses closed book LLMs in the real world, we all get caugh...

LLMs are just glorified token prediction algorithms. Trillions of dollars have been wasted

fiery gull Dec 12, 2025, 9:51 PM

#

sharp mirage @echo aurora did you add glm4.6V?

we need it

#

and glm 4.6v fast? I want another small model in rank 👀 , serious this new model is toe-to-toe of glm 4.5v?

modest prism Dec 12, 2025, 9:52 PM

#

Something I like about 5.2 is that the hallucinate rate seems a lot lower than Gemini 3 pro

sharp mirage Dec 12, 2025, 9:53 PM

#

Btw you can now try it from the glm site

#

Chat.z.ai

#

Btw I think glm is cooking

#

Yea I told ya in the screenshot I send

cloud zinc Dec 12, 2025, 9:53 PM

#

where can i access that

sharp mirage Dec 12, 2025, 9:53 PM

#

But y'all said fake

sharp mirage Dec 12, 2025, 9:54 PM

#

cloud zinc where can i access that

Are you guys shown together all the time ?

cloud zinc Dec 12, 2025, 9:54 PM

#

modest prism Dec 12, 2025, 9:54 PM

#

API costs

cloud zinc Dec 12, 2025, 9:54 PM

#

api requires money

#

xhigh is also token hungry, i aint paying

sharp mirage Dec 12, 2025, 9:54 PM

#

Guys what is the best Minecraft command coder ?

cloud zinc Dec 12, 2025, 9:55 PM

#

then why u on lmarena

sharp mirage Dec 12, 2025, 9:55 PM

#

For now I found chat gpt 5.1high and glm 4.6

#

No extramehigh

fleet lintel Dec 12, 2025, 9:58 PM

#

https://x.com/xw33bttv/status/1999293467806302261

Lex (@xw33bttv)

Holy shit lmao #ChatGPT 5.2 is quite possible the worst model they've ever released. I have no idea what the fuck they have done - there's no way this was the alpha model my cohort tested, nor is it even remotely close to how well 5.1 was performing the other day.

This is quite

#

ignore 5.2 model. Let's wait for 5.5

fiery gull Dec 12, 2025, 9:58 PM

#

Gpt 5.2 thinking is a gpt that decides for itself how much thinking it will use

fiery gull Dec 12, 2025, 9:58 PM

#

fleet lintel ignore 5.2 model. Let's wait for 5.5

Yeah 🫦, we need still to clanker the gpt 5.2 to openai do a gpt 5.5 better

fleet lintel Dec 12, 2025, 10:00 PM

#

they overtrained on arc-agi to create some buzz.. but real life performance got worse

fiery gull Dec 12, 2025, 10:01 PM

#

Also, every month a new chatgpt lol

fleet lintel Dec 12, 2025, 10:01 PM

#

i have plus membership.. and i gave it a very decent shot and used it extensively today.
it's honestly trash compared to gemini 3

sharp mirage Dec 12, 2025, 10:02 PM

#

Bro

queen veldt Dec 12, 2025, 10:02 PM

#

Ye

sharp mirage Dec 12, 2025, 10:02 PM

#

fiery gull Also, every month a new chatgpt lol

Bro its like every week another ai drop

queen veldt Dec 12, 2025, 10:02 PM

#

Gpt 5 high is only for pro users

#

Extended thinking in gpt is medium

sharp mirage Dec 12, 2025, 10:03 PM

#

Why funny ?

devout vault Dec 12, 2025, 10:03 PM

#

Grok 3 is smart because it has less restrictions

sharp mirage Dec 12, 2025, 10:03 PM

#

Good point

fiery gull Dec 12, 2025, 10:03 PM

#

devout vault Grok 3 is smart because it has less restrictions

Grok 3 in 2026 lil'bro?

fleet lintel Dec 12, 2025, 10:04 PM

#

yes, i think plus users get medium. this actually makes me mad. they are treating paying customers badly. it honestly feels very scammy to me.

whole sundial Dec 12, 2025, 10:04 PM

#

fiery gull Grok 3 in 2026 lil'bro?

well its weights will be opened in early 2026

fiery gull Dec 12, 2025, 10:04 PM

#

OpenAi seeing this 🤑🤑

devout vault Dec 12, 2025, 10:04 PM

#

Chatgpt is no longer the godfather of Ai it became just like deepseek

fiery gull Dec 12, 2025, 10:05 PM

#

whole sundial well its weights will be opened in early 2026

Good, but isn't the modded deepseek v3.2 much better? 🤔

sharp mirage Dec 12, 2025, 10:05 PM

#

devout vault Chatgpt is no longer the godfather of Ai it became just like deepseek

Claude killed the dream of chatgpt

burnt sinew Dec 12, 2025, 10:05 PM

#

devout vault Chatgpt is no longer the godfather of Ai it became just like deepseek

most people will continue using it though because it's all they know

fleet lintel Dec 12, 2025, 10:05 PM

#

i rather pay for gemini 3 pro ..get 2TB storage as well and get much much better performance.

burnt sinew Dec 12, 2025, 10:05 PM

#

fleet lintel i rather pay for gemini 3 pro ..get 2TB storage as well and get much much better...

rather get it for free...

sharp mirage Dec 12, 2025, 10:05 PM

#

fleet lintel i rather pay for gemini 3 pro ..get 2TB storage as well and get much much better...

Yeah same

burnt sinew Dec 12, 2025, 10:05 PM

#

and unlimited usage

fiery gull Dec 12, 2025, 10:05 PM

#

devout vault Chatgpt is no longer the godfather of Ai it became just like deepseek

But openai has 1 trillion in debt 💀

sharp mirage Dec 12, 2025, 10:05 PM

#

And gpt isn't worth it

fleet lintel Dec 12, 2025, 10:06 PM

#

burnt sinew rather get it for free...

through ai studio? I need enterprise controls, so i must pay

fiery gull Dec 12, 2025, 10:07 PM

#

fleet lintel through ai studio? I need enterprise controls, so i must pay

I'm using aistudio and gemini app same time

#

Gemini pro plan

#

Maybe a gemini 3.1 without lazy? My dream 👀

#

Exist veo 3.1.... I just need dream it

fleet lintel Dec 12, 2025, 10:09 PM

#

fiery gull Maybe a gemini 3.1 without lazy? My dream 👀

good thing is that gemini 3 base model is really strong, it will be much easier for Google to do better post training and get the best out of it

cloud zinc Dec 12, 2025, 10:12 PM

#

#

gpt 5.2 so bad

weary galleon Dec 12, 2025, 10:15 PM

#

GPT 5.2👎 👎 👎 👎 👎

queen veldt Dec 12, 2025, 10:15 PM

#

fleet lintel good thing is that gemini 3 base model is really strong, it will be much easier ...

Post training

#

We still have 2.5 flash preview

#

It's never leaving the preview

burnt sinew Dec 12, 2025, 10:17 PM

#

fleet lintel good thing is that gemini 3 base model is really strong, it will be much easier ...

i think its a preview model?

queen veldt Dec 12, 2025, 10:17 PM

#

We won't get the gemini 3 pro regular

#

It'll stay in preview

#

Until gemini 3.5 pro preview comes out or something

sharp mirage Dec 12, 2025, 10:18 PM

#

Bro Open ai is lying to us ?

fleet lintel Dec 12, 2025, 10:18 PM

#

cloud zinc

this is not surprising to me. 5.2 is built on a 1.5-year-old base model. OAI had enough time to already squeeze the best out of it. Changing it further (like 5.1 to 5.2) would result in improving in one area (arc-agi) and downgrading in others.

sharp mirage Dec 12, 2025, 10:18 PM

#

No way this true

queen veldt Dec 12, 2025, 10:22 PM

#

#

They say the sponsors don't have any influence on testing

#

BUT

#

Arc-agi test is basically giving the model one unique test that the model hasn't been trained on

#

Sooo

modest prism Dec 12, 2025, 10:23 PM

#

cloud zinc

What benchmark?

queen veldt Dec 12, 2025, 10:23 PM

#

If they paid someone in the company to snitch the arc agi test 2

#

Prompts

#

They could train their model for it

#

Which i honestly think happened

#

They probably have some snitch

#

I guess that's their code red

#

To do false publish

#

Of "super improved model"

#

I'm not amazed at all on the gpt 5.2

modest prism Dec 12, 2025, 10:26 PM

#

If you guys find a reliable way to use gpt 5.2 extra high for free please tell me I need it so bad

queen veldt Dec 12, 2025, 10:26 PM

#

I've tested 5.1 1 month ago i think

#

And it failed some tests that gemini 3 pro passed

#

And now the 5.2 failed same tests

#

It's some math problems which require a bunch of steps to get to the final result

#

Gemini 3 pro is king

#

Claude is for coding

#

Even for agentic coding

#

Codex max 5.1 was terrible i had to re-do prompts multiple times since I couldn't start my app.....

#

Meanwhile sonnet and opus did them in first try like easy

cloud zinc Dec 12, 2025, 10:29 PM

#

modest prism What benchmark?

simple bench

modest prism Dec 12, 2025, 10:29 PM

#

modest prism If you guys find a reliable way to use gpt 5.2 extra high for free please tell m...

@cloud zinc

queen veldt Dec 12, 2025, 10:30 PM

#

cloud zinc

Regular customer's gpt isn't even here on the list tho

#

😂

cloud zinc Dec 12, 2025, 10:30 PM

#

gemini 3 pro way better

thorny schooner Dec 12, 2025, 10:30 PM

#

Is it just me or are the bugs starting to become more common here on the website? I am seeing the weird disappearing witch a lot more often in reports, and also the infinite-generation-unless-you-reload glitch, both in personal experience and in the reports

queen veldt Dec 12, 2025, 10:31 PM

#

https://tenor.com/view/lucinda-lucinda-the-witch-sofia-the-first-wand-laugh-gif-14589531

Tenor

queen veldt Dec 12, 2025, 10:32 PM

#

thorny schooner Is it just me or are the bugs starting to become more common here on the web...

Yes mods are aware of it for a while

#

No fixes yet

#

thorny schooner Dec 12, 2025, 10:33 PM

#

I'm aware, I was just commenting on it since I'm pretty sure I was the first one who made a report on it, at least for one of the glitches, at least from what I can see in the report area

#

Because honestly both of those glitches have been going on for me for a while now

queen veldt Dec 12, 2025, 10:34 PM

#

They don't even know why it's happening

#

It's a problem I've talked about for a while...

#

We neeeeeeeedddd to see the error code or something

#

So we can be moreeeee speeecifiicccc

#

Offf thee errorrr

#

Just the retry again isn't enough

thorny schooner Dec 12, 2025, 10:35 PM

#

Well they probably need to figure out soon enough because if they don't they could be losing a lot of customers soon and fair enough for the code but i am not even going to try to see what code is in the error cuz I have tried to see the code and it looks confusing as hell

#

I already gave video examples to one of the staff

#

When it comes to the glitch itself

bright shard Dec 12, 2025, 10:35 PM

#

@echo aurora An AI arena for audio, music, etc., would be amazing! It's the only thing LMArena is missing!

burnt sinew Dec 12, 2025, 10:40 PM

#

what do you guys think the google a/b tests right now are for?

burnt spindle Dec 12, 2025, 10:41 PM

#

hello

burnt pulsar Dec 12, 2025, 10:42 PM

#

I've tried all day to get gpt 5.2-high to work on lmarena, no luck so far.

weary galleon Dec 12, 2025, 10:44 PM

#

It doesn't have thinking.

proud bobcat Dec 12, 2025, 11:06 PM

#

#

AGI!!1!1!1!1!1

weary galleon Dec 12, 2025, 11:07 PM

#

proud bobcat

Is this image AI-generated?

proud bobcat Dec 12, 2025, 11:07 PM

#

What

#

This is artificial analysis benchmark

weary galleon Dec 12, 2025, 11:08 PM

#

Maybe I'm wrong, but it looks fake.

pseudo summit Dec 12, 2025, 11:08 PM

#

proud bobcat This is artificial analysis benchmark

could u send link pls? 👀

proud bobcat Dec 12, 2025, 11:09 PM

#

pseudo summit could u send link pls? 👀

Sure give me a sec

#

https://www.reddit.com/r/singularity/s/YaVYlnFDyv

From the singularity community on Reddit: GPT-5.2-Thinking scored l...

Explore this post and more from the singularity community

#

Boom

weary galleon Dec 12, 2025, 11:09 PM

#

KAT-Coder-Pro V1 has too much, and Gemini 3 Pro has too little.

hollow ivy Dec 12, 2025, 11:09 PM

#

proud bobcat Dec 12, 2025, 11:09 PM

#

https://artificialanalysis.ai

AI Model & API Providers Analysis | Artificial Analysis

Comparison and analysis of AI models and API hosting providers. Independent benchmarks across key performance metrics including quality, price, output speed & latency.

#

Here’s the site

proud bobcat Dec 12, 2025, 11:10 PM

#

weary galleon KAT-Coder-Pro V1 has too much, and Gemini 3 Pro has too little.

Reminder that they add this all up by benchmarks

#

Like

#

Multiple

#

This is the average

pseudo summit Dec 12, 2025, 11:10 PM

#

proud bobcat Sure give me a sec

ty!

#

haven't seen that one yet, but it looks interesting

proud bobcat Dec 12, 2025, 11:11 PM

#

Artificial Analysis is usually quite reliable

#

It’s a good assessment on general performance without the bias

pseudo summit Dec 12, 2025, 11:11 PM

#

proud bobcat Artificial Analysis is usually quite reliable

like in general? or like in relation to LMArena ranking?

#

o wait, just read ur second message again. im dum

#

FacepalmSeola

weary galleon Dec 12, 2025, 11:13 PM

#

As I said, it's fake

lucid geyser Dec 12, 2025, 11:13 PM

#

@echo aurora How long do new models take to appear on the leaderboard

echo aurora Dec 12, 2025, 11:14 PM

#

lucid geyser @echo aurora How long do new models take to appear on the leaderboard

It mostly depends on the amount of votes we're seeing

lucid geyser Dec 12, 2025, 11:15 PM

#

echo aurora It mostly depends on the amount of votes we're seeing

So how long until 5.2 do u think?

#

Also is there a higher probability of getting a newer model

queen veldt Dec 12, 2025, 11:17 PM

#

https://x.com/i/status/1999548496198926728

Epoch AI (@EpochAIResearch)

GPT-5.2 scores 152 on the Epoch Capabilities Index (ECI), our tool for aggregating benchmark scores. This puts it second only to Gemini 3 Pro.

🧵 with individual scores.

echo aurora Dec 12, 2025, 11:17 PM

#

lucid geyser So how long until 5.2 do u think?

I wouldn't want to give an ETA as I'd hate to give the wrong impression.

weary galleon Dec 12, 2025, 11:18 PM

#

echo aurora I wouldn't want to give an ETA as I'd hate to give the wrong impression.

Do you have insider information?

proud bobcat Dec 12, 2025, 11:18 PM

#

queen veldt https://x.com/i/status/1999548496198926728

Hate on me if you wish but I feel this is another grok 4 situation

golden ocean Dec 12, 2025, 11:18 PM

#

openai is cooked

#

done for

echo aurora Dec 12, 2025, 11:18 PM

#

weary galleon Do you have insider information?

I am an employee here

proud bobcat Dec 12, 2025, 11:18 PM

#

proud bobcat Hate on me if you wish but I feel this is another grok 4 situation

OpenAI spits out model supposedly SOTA, crushes benchmarks, blah blah

Comes out and while it scores great on benchmarks it fails individual tests

#

And daily use

#

Same with GPT5

lucid geyser Dec 12, 2025, 11:20 PM

#

echo aurora I wouldn't want to give an ETA as I'd hate to give the wrong impression.

Nah it’s fine but how long have past models taken, like Gemini 3, gpt 5.1, if you know

weary galleon Dec 12, 2025, 11:21 PM

#

echo aurora I am an employee here

Unveil all codenames to us.

echo aurora Dec 12, 2025, 11:21 PM

#

lucid geyser Nah it’s fine but how long have past models taken, like Gemini 3, gpt 5.1, if yo...

Sorry to say I don't know. Generally though, the Text leaderboard will take about a week to update, but this can vary.

echo aurora Dec 12, 2025, 11:21 PM

#

weary galleon Unveil all codenames to us.

blobno

lucid geyser Dec 12, 2025, 11:22 PM

#

echo aurora Sorry to say I don't know. Generally though, the Text leaderboard will take abou...

Thanks

lucid geyser Dec 12, 2025, 11:22 PM

#

weary galleon Unveil all codenames to us.

Get hired

weary galleon Dec 12, 2025, 11:22 PM

#

echo aurora blobno

Please

#

🙏

echo aurora Dec 12, 2025, 11:23 PM

#

lucid geyser Thanks

Yeah sorry I can't provide many details on this.

echo aurora Dec 12, 2025, 11:24 PM

#

weary galleon Please

Well since you said please...

weary galleon Dec 12, 2025, 11:24 PM

#

🙏

queen veldt Dec 12, 2025, 11:25 PM

#

It would literally end up on reddit in 2 hours

lucid geyser Dec 12, 2025, 11:25 PM

#

echo aurora Yeah sorry I can't provide many details on this.

Might have to leave a bad review

echo aurora Dec 12, 2025, 11:26 PM

#

lucid geyser Might have to leave a bad review

https://tenor.com/view/andersomviolao-gif-12381443319241980618

Tenor

neon idol Dec 12, 2025, 11:26 PM

#

echo aurora Yeah sorry I can't provide many details on this.

Bro knows more things than we expect monti_sus

queen veldt Dec 12, 2025, 11:27 PM

#

Yeah same as google employees

#

They have insider information

#

= free cash on polymarket

neon idol Dec 12, 2025, 11:27 PM

#

echo aurora https://tenor.com/view/andersomviolao-gif-12381443319241980618

Is that the Microsoft support client room?

empty stump Dec 12, 2025, 11:35 PM

#

is gpt 5.2 xhigh only in the api

vivid coral Dec 12, 2025, 11:35 PM

#

my early returns are that GPT 5.2 search is definitely an improvement over 5.1, need more time to say how much yet

golden ocean Dec 12, 2025, 11:36 PM

#

echo aurora https://tenor.com/view/andersomviolao-gif-12381443319241980618

REAL

weary galleon Dec 12, 2025, 11:37 PM

#

vivid coral my early returns are that GPT 5.2 search is definitely an improvement over 5.1, ...

WoW! It's too subjective.

echo aurora Dec 12, 2025, 11:38 PM

#

golden ocean REAL

Just kidding though shy

viral cedar Dec 12, 2025, 11:39 PM

#

#

@weary galleon how is that wrong?

weary galleon Dec 12, 2025, 11:41 PM

#

viral cedar @weary galleon how is that wrong?

GPT 5.2 is a piece of crap

native yarrow Dec 12, 2025, 11:41 PM

#

^

viral cedar Dec 12, 2025, 11:41 PM

#

weary galleon GPT 5.2 is a piece of crap

"trust me bro" statements by AI companies

#

😐

empty stump Dec 12, 2025, 11:41 PM

#

every time they're behind they just add more reasoning effort

viral cedar Dec 12, 2025, 11:43 PM

#

empty stump every time they're behind they just add more reasoning effort

wasn't Deepseek 3.2 said to also be really good

#

but turned out ass

hollow ivy Dec 12, 2025, 11:48 PM

#

viral cedar

Longterm, OAI has no chance vs Google (and even vs Anthropic).

torn mantle Dec 12, 2025, 11:48 PM

#

proud bobcat

lol

hollow ivy Dec 12, 2025, 11:48 PM

#

-# Grok is a wild card.

torn mantle Dec 12, 2025, 11:48 PM

#

told ya

#

oai hit a plateau

hollow ivy Dec 12, 2025, 11:49 PM

#

Currently Anthropic is ahead.

#

(coding is most important discipline, long-term.)

viral cedar Dec 12, 2025, 11:50 PM

#

hollow ivy (coding is most important discipline, long-term.)

💯

hollow ivy Dec 12, 2025, 11:51 PM

#

Hopefully, Anthropic has high-security standards..

#

(vs spying, hacking, etc)

proud bobcat Dec 12, 2025, 11:52 PM

#

viral cedar but turned out ass

3.2 did its job well

#

I use it daily for math and roleplay

#

I love it

hollow ivy Dec 12, 2025, 11:53 PM

#

proud bobcat I use it daily for math and roleplay

is it top-3 in roleplaying?

compact flame Dec 12, 2025, 11:53 PM

#

proud bobcat I use it daily for math and roleplay

Do people actually use ai for roleplaying rn?

weary galleon Dec 12, 2025, 11:53 PM

#

Roleplaying with a robot? Hmmm...

hollow ivy Dec 12, 2025, 11:53 PM

#

compact flame Do people actually use ai for roleplaying rn?

yes

proud bobcat Dec 12, 2025, 11:54 PM

#

compact flame Do people actually use ai for roleplaying rn?

….

Sadly yes.

#

It’s a HUGE market

#

That

#

I contribute to.

hollow ivy Dec 12, 2025, 11:54 PM

#

proud bobcat …. Sadly yes.

not sadly, it can be fun

proud bobcat Dec 12, 2025, 11:54 PM

#

It is fun

queen veldt Dec 12, 2025, 11:54 PM

#

How fun?

#

What is it roleplaying?

fickle venture Dec 12, 2025, 11:54 PM

#

So guys which model is better for coding?
GPT 5.2 or Claude 4.5 opus

proud bobcat Dec 12, 2025, 11:54 PM

#

hollow ivy is it top-3 in roleplaying?

#5 rn set to become #4

hollow ivy Dec 12, 2025, 11:55 PM

#

fickle venture So guys which model is better for coding? GPT 5.2 or Claude 4.5 opus

opus-4.5

#

by a huge margin

sullen quest Dec 12, 2025, 11:55 PM

#

fickle venture So guys which model is better for coding? GPT 5.2 or Claude 4.5 opus

opus

proud bobcat Dec 12, 2025, 11:55 PM

#

queen veldt What is it roleplaying?

Adventure, stupid scenarios, just in general

#

A little unholy business on the side

fickle venture Dec 12, 2025, 11:55 PM

#

Dam

queen veldt Dec 12, 2025, 11:55 PM

#

Opus is op

proud bobcat Dec 12, 2025, 11:55 PM

#

fickle venture So guys which model is better for coding? GPT 5.2 or Claude 4.5 opus

Opus clocks GPT

fickle venture Dec 12, 2025, 11:55 PM

#

I will try gpt 5.2

queen veldt Dec 12, 2025, 11:56 PM

#

Try it why not

#

Its bad tho

proud bobcat Dec 12, 2025, 11:56 PM

#

I don’t think 5.2 is bad, I just think openai doesnt know what it wants to be

#

DeepSeek is math god, Claude is code god, Gemini is vision and jack of all trades

#

But what is GPT?

compact flame Dec 12, 2025, 11:56 PM

#

proud bobcat But what is GPT?

Balance ⚖️

queen veldt Dec 12, 2025, 11:57 PM

#

Balance between broken code and bad math

compact flame Dec 12, 2025, 11:57 PM

#

Universal ai ig

native yarrow Dec 12, 2025, 11:57 PM

#

GPT is for students who dont know stuff about ai and will use the popular one

fickle venture Dec 12, 2025, 11:57 PM

#

proud bobcat But what is GPT?

Tbh what does GPT mean?

proud bobcat Dec 12, 2025, 11:57 PM

#

compact flame Universal ai ig

One size fits all

fickle venture Dec 12, 2025, 11:57 PM

#

native yarrow GPT is for students who dont know stuff about ai and will use the popular one

Cheating exams yeah I use it a lot

proud bobcat Dec 12, 2025, 11:57 PM

#

native yarrow GPT is for students who dont know stuff about ai and will use the popular one

Yeah true

proud bobcat Dec 12, 2025, 11:57 PM

#

fickle venture Cheating exams yeah I use it a lot

Use DeepSeek

fickle venture Dec 12, 2025, 11:57 PM

#

And I end up getting F

proud bobcat Dec 12, 2025, 11:57 PM

#

You will not be let down

queen veldt Dec 12, 2025, 11:57 PM

#

Nah I'm student and using gemini 3 pro

#

It's waaay better

#

Even for casual chats

proud bobcat Dec 12, 2025, 11:57 PM

#

DeepSeek is so peak at geometry for me

fickle venture Dec 12, 2025, 11:58 PM

#

queen veldt Nah I'm student and using gemini 3 pro

I mean at my school I got no internet so I just use mobile data, chatgpt seems to use way less so it's better

compact flame Dec 12, 2025, 11:58 PM

#

proud bobcat DeepSeek is so peak at geometry for me

Speciale or just thinking?

proud bobcat Dec 12, 2025, 11:58 PM

#

You have to be a bit more wordy but it pays off

proud bobcat Dec 12, 2025, 11:58 PM

#

compact flame Speciale or just thinking?

Speciale thinks too long

fickle venture Dec 12, 2025, 11:58 PM

#

proud bobcat DeepSeek is so peak at geometry for me

I will use it on my next physics exam

proud bobcat Dec 12, 2025, 11:58 PM

#

I use no think

compact flame Dec 12, 2025, 11:58 PM

#

proud bobcat Speciale thinks too long

Fair

#

I swear speciale has some sort of paranoia

hollow ivy Dec 12, 2025, 11:59 PM

#

compact flame Dec 12, 2025, 11:59 PM

#

Every time I look at its thoughts it's always thinking "what if?"

#

Speciale feels like chatgpt pro with long reasoning

proud bobcat Dec 13, 2025, 12:00 AM

#

compact flame I swear speciale has some sort of paranoia

It’s great for reasoning but only for like questions of the universe

#

You have to specify FULLY

#

Just use the no think and thinking variants

vivid coral Dec 13, 2025, 12:01 AM

#

hollow ivy

Nobody uses commie Claude with their ridiculous limits they impose here, and everywhere

hollow ivy Dec 13, 2025, 12:01 AM

#

vivid coral Nobody uses commie Claude with their ridiculous limits they impose here, and eve...

-# in battle mode it has no limits :)

proud bobcat Dec 13, 2025, 12:01 AM

#

vivid coral Nobody uses commie Claude with their ridiculous limits they impose here, and eve...

#

Mogged.

vivid coral Dec 13, 2025, 12:01 AM

#

hollow ivy -# in battle mode it has no limits :)

on code mode?

hollow ivy Dec 13, 2025, 12:01 AM

#

vivid coral on code mode?

text/chat mode (default)

compact flame Dec 13, 2025, 12:02 AM

#

proud bobcat

Damn it's expensive

proud bobcat Dec 13, 2025, 12:02 AM

#

People pay the premium though cause it’s good

#

I used to have a hatred for Claude but they make solid stuff

#

Really solid

#

Opus 4.5 thinking is a MACHINE

vivid coral Dec 13, 2025, 12:03 AM

#

hollow ivy text/chat mode (default)

Yea but on battle the model changes on every follow-up, which is great for some testing, but not if you want to test longer chains

compact flame Dec 13, 2025, 12:03 AM

#

proud bobcat I used to have a hatred for Claude but they make solid stuff

Why did you hate them tho

hollow ivy Dec 13, 2025, 12:03 AM

#

vivid coral Yea but on battle the model changes on every follow-up, which is great for some ...

Untrue. If you never vote, the model stays the same.

sullen quest Dec 13, 2025, 12:03 AM

#

compact flame Why did you hate them tho

anthropic killed their family

vivid coral Dec 13, 2025, 12:03 AM

#

hollow ivy Untrue. If you never vote, the model stays the same.

WHAT?!

proud bobcat Dec 13, 2025, 12:03 AM

#

vivid coral Yea but on battle the model changes on every follow-up, which is great for some ...

Just use side by side comparison

compact flame Dec 13, 2025, 12:03 AM

#

sullen quest anthropic killed their family

What

proud bobcat Dec 13, 2025, 12:04 AM

#

compact flame Why did you hate them tho

Sonnet 4 sucked

#

So I didn’t use their models for a while

sullen quest Dec 13, 2025, 12:04 AM

#

so?

proud bobcat Dec 13, 2025, 12:04 AM

#

Came back when 4.5 released

compact flame Dec 13, 2025, 12:04 AM

#

Wym anthropic killed somebody

proud bobcat Dec 13, 2025, 12:04 AM

#

And I really like its prose

vivid coral Dec 13, 2025, 12:04 AM

#

proud bobcat Just use side by side comparison

no, limits are heavy there

hollow ivy Dec 13, 2025, 12:05 AM

#

vivid coral WHAT?!

you need a (google) account, though

compact flame Dec 13, 2025, 12:05 AM

#

Anyways

vivid coral Dec 13, 2025, 12:05 AM

#

hmmmm interesting

compact flame Dec 13, 2025, 12:05 AM

#

Chatgpt needs some training

#

It's supposed to be ⚖️ not uh downgrade

proud bobcat Dec 13, 2025, 12:06 AM

#

vivid coral no, limits are heavy there

There’s limits???

#

I’ve never hit limits at all??

proud bobcat Dec 13, 2025, 12:06 AM

#

compact flame It's supposed to be ⚖️ not uh downgrade

Dw 5.3 comes out next week and 5.4 next month and will fix EVERYTHING

hollow ivy Dec 13, 2025, 12:07 AM

#

proud bobcat I’ve never hit limits at all??

you use lower models? or only battle?

proud bobcat Dec 13, 2025, 12:07 AM

#

Are we talking lmarena or Claude

compact flame Dec 13, 2025, 12:07 AM

#

proud bobcat Dw 5.3 comes out next week and 5.4 next month and will fix EVERYTHING

Fr?

latent crest Dec 13, 2025, 12:07 AM

#

U guys how can I use Midjourney for free??

proud bobcat Dec 13, 2025, 12:07 AM

#

compact flame Fr?

I don’t know, but wouldn’t be surprised lmao

fickle venture Dec 13, 2025, 12:07 AM

#

proud bobcat Dw 5.3 comes out next week and 5.4 next month and will fix EVERYTHING

Dude just drop GPT 6 why like that

compact flame Dec 13, 2025, 12:08 AM

#

proud bobcat I don’t know, but wouldn’t be surprised lmao

Dude if we ever get chatgpt 6?

hollow ivy Dec 13, 2025, 12:08 AM

#

fickle venture Dude just drop GPT 6 why like that

gta6 before gpt6 ^^

compact flame Dec 13, 2025, 12:08 AM

#

I won't be surprised there will be 6.7 gpt

hollow ivy Dec 13, 2025, 12:08 AM

#

vivid coral Dec 13, 2025, 12:08 AM

#

proud bobcat I’ve never hit limits at all??

I'm not sure why some people are getting heavy limits and some aren't. I just assumed everyone got them since I did, interesting

compact flame Dec 13, 2025, 12:08 AM

#

Cuz knowing openai this might happen

fickle venture Dec 13, 2025, 12:08 AM

#

hollow ivy

GPT 6 for sure cuz GTA 6 ain't coming out

proud bobcat Dec 13, 2025, 12:09 AM

#

compact flame I won't be surprised there will be 6.7 gpt

ChatGPT 6.7 will finally be able to answer how many r’s in garlic first time

#

And it’ll say “SIX SEVENNN” every 2 prompts

compact flame Dec 13, 2025, 12:09 AM

#

proud bobcat And it’ll say “SIX SEVENNN” every 2 prompts

Frfr

#

Would be a great April fool's model tho

vivid coral Dec 13, 2025, 12:09 AM

#

hollow ivy

GPT 12.2 will be out before GTA6

proud bobcat Dec 13, 2025, 12:10 AM

#

GPT 10 will finally be able to make coherent organized code

#

Trust

#

With 5% LESS errors

fickle venture Dec 13, 2025, 12:12 AM

#

Imagine OpenAi skips GPT 9 like Windows and Apple did

proud bobcat Dec 13, 2025, 12:14 AM

#

GPT 8 will have ads integrated

#

Like every sentence there’s an ad

fickle venture Dec 13, 2025, 12:15 AM

#

proud bobcat Dec 13, 2025, 12:15 AM

#

https://tenor.com/view/ishowspeed-try-not-to-laugh-gif-7682731162751353849

Tenor

golden ocean Dec 13, 2025, 12:17 AM

#

https://tenor.com/view/stan-twt-stantwitter-stan-twitter-relicario-gifs-relicariogifs-gif-11979182475974977057

Tenor

torn mantle Dec 13, 2025, 12:17 AM

#

hollow ivy Currently Anthropic is ahead.

the only one im seeing whos still far from hitting a plateau is google

thorn path Dec 13, 2025, 12:21 AM

#

weary galleon GPT 5.2 is a piece of crap

Wait did I miss the final numbers?

lucid geyser Dec 13, 2025, 12:29 AM

#

torn mantle the only one im seeing whos still far from hitting a plateau is google

Why do u think that

#

Claude is so good even without reasoning

torn mantle Dec 13, 2025, 12:30 AM

#

lucid geyser Why do u think that

burnout / no improvement on pre-training ( thats why they are starting from scratch ) / less data quality compared to google / key staff elements poached by other labs

lucid geyser Dec 13, 2025, 12:31 AM

#

torn mantle burnout / no improvement on pre-training ( thats why they are starting from scra...

Grok has the most powerful cluster

torn mantle Dec 13, 2025, 12:32 AM

#

lucid geyser Grok has the most powerful cluster

what did they do with it

jade egret Dec 13, 2025, 12:36 AM

#

Poll: Which is better overall

Winner: Gemini 3.0 pro (6 of 11 votes)

lucid geyser Dec 13, 2025, 12:39 AM

#

torn mantle what did they do with it

I mean they suck, but it was only recently completed, so something is possible

torn mantle Dec 13, 2025, 12:39 AM

#

lucid geyser Grok has the most powerful cluster

also

#

https://cloud.google.com/blog/products/compute/using-cloud-tpu-multislice-to-scale-ai-workloads

Google Cloud Blog

Using Cloud TPU Multislice to scale AI workloads | Google Cloud Blog

Learn how new Cloud TPU Multislice functionality can enable 2x higher scale than alternate accelerators, with 2x higher performance/dollar and near-linear scaling out-of-the-box.

#

google can basically use multislice to create a more powerful virtual cluster

#

than xai

#

trust me, they are far ahead

#

be it on hardware / software

mild anvil Dec 13, 2025, 12:53 AM

#

Does anyone know if they've extended the limit of 5 videos per day?

torn mantle Dec 13, 2025, 12:53 AM

#

i just searched, so the maximum they can pack with this method is 50k but thats still way faster than any cluster for ai training given how efficient their TPUs are

hollow ivy Dec 13, 2025, 1:06 AM

#

maybe a fun game for some people? #ai-creations message

#

-# (it's 100% free & open-source)

stray aspen Dec 13, 2025, 1:24 AM

#

can we get claudius 4.5 opus on vision arena

burnt sinew Dec 13, 2025, 1:28 AM

#

did they lobotomize gemini 3?

hazy forge Dec 13, 2025, 1:37 AM

#

stray aspen can we get claudius 4.5 opus on vision arena

claudoctius 4.5 opus will NOT get vision arena status

thorny schooner Dec 13, 2025, 1:42 AM

#

burnt sinew did they lobotomize gemini 3?

Have no clue feels like it has been like that for a while now

atomic lagoon Dec 13, 2025, 1:45 AM

#

vivid coral Nobody uses commie Claude with their ridiculous limits they impose here, and eve...

Hmmmm thats weird because I seem to be using that commie claude with their ridiculous limits. You gotta manage it correctly

viral cedar Dec 13, 2025, 1:48 AM

#

did yall see the mcdonalds and coca cola ad

#

that made me mad

atomic lagoon Dec 13, 2025, 1:51 AM

#

It was funny

#

The mcdonalds one

mild anvil Dec 13, 2025, 1:52 AM

#

I was able to generate about 10 videos in a single day, wasn't the limit supposed to be 5 per day? Does anyone know if they increased the limit?

proud bobcat Dec 13, 2025, 1:56 AM

#

viral cedar did yall see the mcdonalds and coca cola ad

They’re so ass

viral cedar Dec 13, 2025, 1:57 AM

#

proud bobcat They’re so ass

yes, idk why they did it 😭

noble vessel Dec 13, 2025, 1:59 AM

#

hello

vivid coral Dec 13, 2025, 2:03 AM

#

vivid coral my early returns are that GPT 5.2 search is definitely an improvement over 5.1, ...

I don't really understand the hate. It's not life changing, but Twitter is ridiculous with the hate for this thing

proud bobcat Dec 13, 2025, 2:09 AM

#

vivid coral I don't really understand the hate. It's not life changing, but Twitter is ridic...

The search is easy

#

But the model itself sucks

#

It’s very clearly benchmaxxed

vivid coral Dec 13, 2025, 2:17 AM

#

I guess that's fair

lofty anchor Dec 13, 2025, 2:30 AM

#

Bro I use Claude for code, I hate the thinking limit, can someone tell me how to avoid the limit

golden ocean Dec 13, 2025, 2:33 AM

#

Cwaude

burnt sinew Dec 13, 2025, 2:41 AM

#

@echo aurora 🌩️

golden ocean Dec 13, 2025, 2:52 AM

#

@🍍

whole sundial Dec 13, 2025, 2:56 AM

#

<@&1349916362595635286>

#

#1397655624103493813
this too

neat apex Dec 13, 2025, 2:58 AM

#

neat apex

Poll: What would save openai from short term bankrupt

Winning answer (option 4): Nothing can save they lmao

Votes: 8 of 17

lucid geyser Dec 13, 2025, 3:01 AM

#

torn mantle google can basically use multislice to create a more powerful virtual cluster

Like they don’t have to be physically connected?

plucky sparrow Dec 13, 2025, 3:12 AM

#

vivid coral I don't really understand the hate. It's not life changing, but Twitter is ridic...

Because it was over hyped, even by openAi themselves. High benchmarks. "Code red".

If they just called it gpt5-high-context I think a lot of people would've been happier

#

vivid coral Dec 13, 2025, 3:13 AM

#

plucky sparrow

lol, i've seen those screenshots all over twitter

plucky sparrow Dec 13, 2025, 3:40 AM

#

the other thing is, unlike Google, who releases Gemini 3 Pro (and not just Flash) to everyone to use, GPT models are typically paywalled

#

so a lot of people who have access to these models are paying for it, meaning their expectations (and also hatred, if expectations are unmet) are higher

#

Altman seems to do a good job at selling to VCs, but not such a good job at knowing how to appeal to the general public

#

probably a lesson to be learnt there somewhere if you're thinking of starting a company

#

actually I think the other reason is the amount of hype-litter all over twitter for GPT5.2

#

Gemini 3 Pro had a lot of it too, if it wasn't able to produce much, pretty sure there'd be a lot of gemini 3 pro hate too

verbal shadow Dec 13, 2025, 3:53 AM

#

plucky sparrow Gemini 3 Pro had a lot of it too, if it wasn't able to produce much, pretty sure...

Hi

plucky sparrow Dec 13, 2025, 3:54 AM

#

hi

plucky sparrow Dec 13, 2025, 3:54 AM

#

plucky sparrow actually I think the other reason is the amount of hype-litter all over twitter ...

I literally just went to x.com, and saw this as the first post at the top of my feed.
https://x.com/slow_developer/status/1999661802666557487

Haider. (@slow_developer)

i'm still kinda confused how openAI made that much progress with gpt-5.2 when gpt-5.1 was only a month ago

my guess is it was an internal model they held back due to high compute costs and because they didn't think it was needed

until gemini 3 and opus 4.5 arrived

#

if people get bombarded with this, and they try the model, expecting "incredible progress" and it can't answer simple questions, yeah, they're going to post about how it sucks

astral bloom Dec 13, 2025, 4:12 AM

#

system prompt in code arena

tawny brook Dec 13, 2025, 5:43 AM

#

Which open source model is currently the best overall

tawdry vapor Dec 13, 2025, 5:54 AM

#

Did you mistake it for glm-4.6v?

ocean ferry Dec 13, 2025, 6:34 AM

#

echo aurora Dec 13, 2025, 6:54 AM

#

tawdry vapor Did you mistake it for glm-4.6v?

Yes, good catch. Thank you :blobthanks:

modest prism Dec 13, 2025, 7:54 AM

#

ocean ferry

AGI

astral bloom Dec 13, 2025, 7:59 AM

#

echo aurora Yes, good catch. Thank you :blobthanks:

why did the sys prompt i pasted get deleted?

bright shard Dec 13, 2025, 8:05 AM

#

@echo aurora An AI arena for audio, music, etc., would be amazing! It's the only thing LMArena is missing!

shell oasis Dec 13, 2025, 8:12 AM

#

LMARENA IS THE BEST PLATFORM I HAVE SEEN TILL DATE. ADDING UP THE VIDEO ARENA IN THE WEBSITE IS 🔥🔥🔥🔥

#

I wanted to support by donating some amount... @echo aurora is there a link for donation?

whole sundial Dec 13, 2025, 8:18 AM

#

shell oasis I wanted to support by donating some amount... @echo aurora is there a ...

lmarena is a private company so they don't accept donations, they only make money from evaluation services

shell oasis Dec 13, 2025, 8:20 AM

#

Still I am mesmerized by the progress. I write parody songs and I haven't been able to create videos for the lyrics till date due to expensive subscriptions by the AI websites
LMArena opened the gates for me. I really am very grateful 🙏🏻

shell oasis Dec 13, 2025, 8:21 AM

#

whole sundial lmarena is a private company so they don't accept donations, they only make mone...

I am feeling very happy after seeing the video option when I opened lmarena website today

whole sundial Dec 13, 2025, 8:22 AM

#

the video arena is not on the website, maybe you are confusing it with something else?

shell oasis Dec 13, 2025, 8:23 AM

#

@whole sundial but I saw this option

whole sundial Dec 13, 2025, 8:24 AM

#

shell oasis @whole sundial but I saw this option

looks fake, where is the plus button?

shell oasis Dec 13, 2025, 8:25 AM

#

Sure?

rn_image_picker_lib_temp_b0cfc895-1eab-4ea8-8167-55e071f9d251.jpg

whole sundial Dec 13, 2025, 8:25 AM

#

shell oasis @whole sundial but I saw this option

@echo aurora is this legit?

shell oasis Dec 13, 2025, 8:25 AM

#

Looks genuine to me

whole sundial Dec 13, 2025, 8:25 AM

#

shell oasis Sure?

oh probably an a/b thing

shell oasis Dec 13, 2025, 8:26 AM

#

That's why I came running here to express my happiness here 🤩

tired shadow Dec 13, 2025, 8:27 AM

#

I tried going to the same link

#

it gave this bruh

#

hmm

#

maybe its a bug like on my phone

shell oasis Dec 13, 2025, 8:29 AM

#

Am I lucky? I have literally created my first video

tired shadow Dec 13, 2025, 8:29 AM

#

tired shadow maybe its a bug like on my phone

on my phone I had web arena before i got it on PC

#

oh hell nah

#

my phone has 0% SAD

whole sundial Dec 13, 2025, 8:29 AM

#

shell oasis Am I lucky? I have literally created my first video

assuming it's an a/b test, you are pretty lucky

shell oasis Dec 13, 2025, 8:30 AM

#

tears of joy 🥹

#

🥹🥹🥹

rn_image_picker_lib_temp_2d799551-ba37-491c-bac1-fd5f2493dec5.jpg

tired shadow Dec 13, 2025, 8:30 AM

#

please emulator I need this, my lmarena is kinda videoless

whole sundial Dec 13, 2025, 8:31 AM

#

shell oasis 🥹🥹🥹

looks pretty legit to me

shell oasis Dec 13, 2025, 8:32 AM

#

I will be able to create videos for my 58 parody songs

#

Thank you from the depth of my heart LMArena 🙏🏻

hushed gyro Dec 13, 2025, 8:32 AM

#

shell oasis I will be able to create videos for my 58 parody songs

do you have access to beta test?

shell oasis Dec 13, 2025, 8:32 AM

#

hushed gyro do you have access to beta test?

I have no idea, I just use lmarena on daily basis

#

I opened normally today like every other day...

tired shadow Dec 13, 2025, 8:33 AM

#

tired shadow please emulator I need this, my lmarena is kinda videoless

noooo

shell oasis Dec 13, 2025, 8:33 AM

#

I am logged in bro

hushed gyro Dec 13, 2025, 8:33 AM

#

shell oasis I have no idea, I just use lmarena on daily basis

@echo aurora yk if random people have access to features early, why not give it out to EVERYONE?

#

🤔

tired shadow Dec 13, 2025, 8:33 AM

#

hushed gyro 🤔

it may be a bug

hushed gyro Dec 13, 2025, 8:33 AM

#

shell oasis I am logged in bro

can you check if sora is on there

shell oasis Dec 13, 2025, 8:33 AM

#

yes its there

hushed gyro Dec 13, 2025, 8:33 AM

#

tired shadow it may be a bug

well I want to keep this kind of bug lmao

tired shadow Dec 13, 2025, 8:33 AM

#

sora 2 pro?

tired shadow Dec 13, 2025, 8:33 AM

#

hushed gyro well I want to keep this kind of bug lmao

+++++

shell oasis Dec 13, 2025, 8:33 AM

#

yes sora 2 pro

tired shadow Dec 13, 2025, 8:34 AM

#

shell oasis yes sora 2 pro

YUPII

hushed gyro Dec 13, 2025, 8:34 AM

#

shell oasis yes sora 2 pro

woooooooo~~~

shell oasis Dec 13, 2025, 8:34 AM

#

rn_image_picker_lib_temp_46340627-cfe2-4a8c-a44d-58644b77e1b5.jpg

tired shadow Dec 13, 2025, 8:34 AM

#

shell oasis

thank god

shell oasis Dec 13, 2025, 8:34 AM

#

God, thank you 🙏🏻

tired shadow Dec 13, 2025, 8:34 AM

#

finally after I waited so long

hushed gyro Dec 13, 2025, 8:34 AM

#

OMG?????

VEO 3 FOR FREE????

@echo aurora nah

#

yo can someone from the company explain why this guy has video arena on the site???

whole sundial Dec 13, 2025, 8:35 AM

#

i see strings that relate to a video arena in the code, this is 100% real

tired shadow Dec 13, 2025, 8:35 AM

#

hushed gyro OMG????? VEO 3 FOR FREE???? @echo aurora nah

but lmarena is like this, we have claude opus, sonnet, and literally gemini 3 pro (gemini 3 pro is kinda limitless I think)

whole sundial Dec 13, 2025, 8:35 AM

#

shell oasis Dec 13, 2025, 8:35 AM

#

shell oasis LMARENA IS THE BEST PLATFORM I HAVE SEEN TILL DATE. ADDING UP THE VIDEO ARENA IN...

That's why my first message after I came running here today 😁

whole sundial Dec 13, 2025, 8:35 AM

#

hushed gyro Dec 13, 2025, 8:35 AM

#

whole sundial

if you remove this string, is the limit removed

whole sundial Dec 13, 2025, 8:35 AM

#

hushed gyro if you remove this string, is the limit removed

i doubt it

hushed gyro Dec 13, 2025, 8:35 AM

#

pineapple has a lot to explain...

tired shadow Dec 13, 2025, 8:36 AM

#

hushed gyro if you remove this string, is the limit removed

nah, it just won't show anything

whole sundial Dec 13, 2025, 8:36 AM

#

hushed gyro pineapple has a lot to explain...

yeah, imo this is the type of feature that shouldn't be a/b tested like this, it's either launched or not launched

hushed gyro Dec 13, 2025, 8:36 AM

#

shell oasis That's why my first message after I came running here today 😁

can I add you as a friend?

tired shadow Dec 13, 2025, 8:36 AM

#

ill check my alt on lmarena

whole sundial Dec 13, 2025, 8:36 AM

#

if it's not launched and just an a/b test, lots of people who want a video arena will be upset

hushed gyro Dec 13, 2025, 8:37 AM

#

whole sundial yeah, imo this is the type of feature that shouldn't be a/b tested like this, it...

that isn't explicit anyways, let alone a/b test

they probably have done it poorly, not rolling it out to everyone

shell oasis Dec 13, 2025, 8:37 AM

#

Are you all logged in to the website also?

hushed gyro Dec 13, 2025, 8:37 AM

#

shell oasis Are you all logged in to the website also?

yas

whole sundial Dec 13, 2025, 8:37 AM

#

(well, we already know it's a/b)

#

yeah neither of my logged in lmarenas have it

shell oasis Dec 13, 2025, 8:37 AM

#

So good news is we are all going to get the video arena soon on the website 🔥

shell oasis Dec 13, 2025, 8:38 AM

#

hushed gyro can I add you as a friend?

sure

hushed gyro Dec 13, 2025, 8:38 AM

#

whole sundial (well, we already know it's a/b)

the problem is, pineapple usually has the speed of FLASH to address these kind of problems

lucid geyser Dec 13, 2025, 8:38 AM

#

What does the video selection even look like

hushed gyro Dec 13, 2025, 8:39 AM

#

lucid geyser What does the video selection even look like

wdym? how to get into it?

whole sundial Dec 13, 2025, 8:39 AM

#

hushed gyro the problem is, pineapple usually has the speed of FLASH to address these kind o...

it's pretty late for him though

shell oasis Dec 13, 2025, 8:39 AM

#

shell oasis

Looks like this

#

@lucid geyser

tired shadow Dec 13, 2025, 8:40 AM

#

im starting operation alt check, I have 10 alts

hushed gyro Dec 13, 2025, 8:40 AM

#

@lucid geyser there should be a video button next to Image & Code

hushed gyro Dec 13, 2025, 8:40 AM

#

tired shadow im starting operation alt check, I have 10 alts

brother what

lucid geyser Dec 13, 2025, 8:40 AM

#

hushed gyro @lucid geyser there should be a video button next to Image & Code

On the bar though

hushed gyro Dec 13, 2025, 8:40 AM

#

lucid geyser On the bar though

just click the models

lucid geyser Dec 13, 2025, 8:41 AM

#

It doesn’t show image models without clicking image

hushed gyro Dec 13, 2025, 8:46 AM

#

@echo aurora pls... we need an explanation on what's happening

lucid geyser Dec 13, 2025, 8:49 AM

#

hushed gyro @echo aurora pls... we need an explanation on what's happening

What happened

#

Bro he’s sleeping prolly chill

hushed gyro Dec 13, 2025, 8:50 AM

#

lucid geyser Bro he’s sleeping prolly chill

he literally went online minutes ago

#

but whatever

#

I want this a/b test situation to end

roll the vid arena out to everyone!!!

lucid geyser Dec 13, 2025, 8:50 AM

#

Why

hushed gyro Dec 13, 2025, 8:51 AM

#

lucid geyser Why

well if you have noticed

some ppl have the video arena on the website, some don't

shell oasis Dec 13, 2025, 8:52 AM

#

Maybe after a few hours it will roll to everyone...just like how android updates happen to get feedback

#

and then stable update to all

torn mantle Dec 13, 2025, 8:53 AM

#

lucid geyser Like they don’t have to be physically connected?

kinda, they are still interconnected with fiber but not in the same datacenter

zealous sparrow Dec 13, 2025, 8:53 AM

#

Videoarena came to the website

shell oasis Dec 13, 2025, 8:54 AM

#

zealous sparrow Videoarena came to the website

you got it too?

zealous sparrow Dec 13, 2025, 8:54 AM

#

shell oasis you got it too?

Yea

shell oasis Dec 13, 2025, 8:54 AM

#

Great 👍🏻

keen topaz Dec 13, 2025, 8:58 AM

#

Hello, this is Lakki. I am a web developer. If you need help with any project, you can hire me

torn mantle Dec 13, 2025, 9:03 AM

#

you seem like a vibe coder ngl

compact flame Dec 13, 2025, 9:13 AM

#

zealous sparrow Videoarena came to the website

Fr?

whole sundial Dec 13, 2025, 9:13 AM

#

keen topaz Hello, this is Lakki. I am a web developer. If you need help with any project, y...

<@&1349916362595635286>

hushed gyro Dec 13, 2025, 9:16 AM

#

zealous sparrow Videoarena came to the website

YAY!

hushed gyro Dec 13, 2025, 9:17 AM

#

shell oasis Maybe after a few hours it will roll to everyone...just like how android updates...

I don't have it???

shell oasis Dec 13, 2025, 9:18 AM

#

hushed gyro I don't have it???

Soon

keen topaz Dec 13, 2025, 9:26 AM

#

Need?

jade cloak Dec 13, 2025, 9:26 AM

#

hey, some devices got Video generation option (including mine), but some didn't. why

queen veldt Dec 13, 2025, 9:27 AM

#

#

Samsung browser no video arena

jade cloak Dec 13, 2025, 9:28 AM

#

jade cloak hey, some devices got Video generation option (including mine), but some didn't....

@echo aurora

compact flame Dec 13, 2025, 9:28 AM

#

jade cloak hey, some devices got Video generation option (including mine), but some didn't....

Maybe pc only?

tired shadow Dec 13, 2025, 9:28 AM

#

compact flame Maybe pc only?

no

#

I tested on chrome, tor and firefox

compact flame Dec 13, 2025, 9:29 AM

#

tired shadow no

I guess maybe it's country based

#

Like with sora

whole sundial Dec 13, 2025, 9:30 AM

#

so it is a/b, an honestly pretty stupid one at that, they could've launched it on beta lmarena first and announced that instead of basically making people jealous of one another based on whether they have the video arena or not

tired shadow Dec 13, 2025, 9:30 AM

#

compact flame I guess maybe it's country based

I dont think so

tired shadow Dec 13, 2025, 9:30 AM

#

whole sundial so it is a/b, an honestly pretty stupid one at that, they could've launched it o...

ye im sure its a/b

compact flame Dec 13, 2025, 9:30 AM

#

tired shadow I dont think so

Hmm

#

Then maybe it's just to some users

#

Like early access

tired shadow Dec 13, 2025, 9:31 AM

#

compact flame Like early access

a/b

jade cloak Dec 13, 2025, 9:32 AM

#

tired shadow I dont think so

try India

#

i don't think its country based

hushed gyro Dec 13, 2025, 9:32 AM

#

we need an explanation from the company now!

#

and we are slightly upset

jade cloak Dec 13, 2025, 9:33 AM

#

look guys

#

there are many models

sterile tartan Dec 13, 2025, 9:35 AM

#

Yupp has even More

#

But LMarena is Better for Convenience and SOTA Models

jade cloak Dec 13, 2025, 9:36 AM

#

UI preview for you guys

hushed gyro Dec 13, 2025, 9:36 AM

#

jade cloak look guys

@echo aurora hey! umm I have noticed some users have access to the video arena on the website, but some don't.

Can you explain if this is an a/b test and if it is next time pls be explicit about this

sterile tartan Dec 13, 2025, 9:36 AM

#

Wait since when LMarena has Videos on Website

#

💀

hushed gyro Dec 13, 2025, 9:37 AM

#

sterile tartan Wait since when LMarena has Videos on Website

like a few hours ago

@shell oasis this guy got it first

sterile tartan Dec 13, 2025, 9:37 AM

#

Seems like it's rolling out slowly

sterile tartan Dec 13, 2025, 9:37 AM

#

hushed gyro like a few hours ago @shell oasis this guy got it first

Lucky him

#

Probably a valuable user

hushed gyro Dec 13, 2025, 9:38 AM

#

does anyone realise that this company sometimes does things in a shady way?

jade cloak Dec 13, 2025, 9:38 AM

#

hushed gyro does anyone realise that this company sometimes does things in a shady way?

ye

hushed gyro Dec 13, 2025, 9:38 AM

#

sterile tartan Probably a valuable user

he uses it daily so probably that's why

sterile tartan Dec 13, 2025, 9:38 AM

#

hushed gyro he uses it daily so probably that's why

The Algorithms Yeah

hushed gyro Dec 13, 2025, 9:39 AM

#

THEY SAID DECEMBER!!!

shell oasis Dec 13, 2025, 9:39 AM

#

I am a DevOps Engineer...I find new ways to automate the code deployments...Heavy automations

compact flame Dec 13, 2025, 9:39 AM

#

hushed gyro THEY SAID DECEMBER!!!

If it thinks you're old enough?

#

How is it even supposed to tell?

#

Well probably talking about taxes with chatgpt maybe gonna work

shell oasis Dec 13, 2025, 9:41 AM

#

hushed gyro THEY SAID DECEMBER!!!

Rockstar Games also said GTA VI in May 2026...but November 2026 now 😔

compact flame Dec 13, 2025, 9:41 AM

#

Maybe the video update is not intended?

#

They just maybe accidentally rolled it to some users

shell oasis Dec 13, 2025, 9:42 AM

#

@compact flame if some got it, maybe its intended for all

hushed gyro Dec 13, 2025, 9:42 AM

#

compact flame Maybe the video update is not intended?

that would mean that backlash is coming for them, since they made a fully functioning feature (that is heavily requested) then decided to gatekeep it

#

This happened with the Retry Button on Battle Mode, took them a month to add it back... smh

compact flame Dec 13, 2025, 9:42 AM

#

hushed gyro that would mean that backlash is coming for them, since they made a fully functi...

Cuz like nobody said anything about video arenas

#

It's just silence and boom it's here

hushed gyro Dec 13, 2025, 9:43 AM

#

compact flame Cuz like nobody said anything about video arenas

as I said before, this company does a lot of things in a shady way, this isn't a good example of transparency, I don't have a lot of trust in this company

shell oasis Dec 13, 2025, 9:43 AM

#

compact flame Cuz like nobody said anything about video arenas

I wanted that bro...I always secretly wished lmarena to allow video creation on website

hushed gyro Dec 13, 2025, 9:44 AM

#

shell oasis I wanted that bro...I always secretly wished lmarena to allow video creation on ...

I mean the 3 channels in this DC server are lackluster

compact flame Dec 13, 2025, 9:44 AM

#

shell oasis I wanted that bro...I always secretly wished lmarena to allow video creation on ...

Well let's hope video arena is real and it's not an accident

hushed gyro Dec 13, 2025, 9:44 AM

#

https://discord.com/channels/1340554757349179412/1449335460223783014

@shell oasis @compact flame check it out

shell oasis Dec 13, 2025, 9:44 AM

#

compact flame Well let's hope video arena is real and it's not an accident

Manifestation works 🙌🏻😇

compact flame Dec 13, 2025, 9:45 AM

#

hushed gyro https://discord.com/channels/1340554757349179412/1449335460223783014 <@43422089...

Hm maybe we can copy a link that leads to this feature?

#

Like with code and etc

hushed gyro Dec 13, 2025, 9:45 AM

#

try inspect

compact flame Dec 13, 2025, 9:45 AM

#

I tried didn't find anything useful

hushed gyro Dec 13, 2025, 9:46 AM

#

compact flame I tried didn't find anything useful

how did they find video string?

#

@whole sundial need some help over here

compact flame Dec 13, 2025, 9:47 AM

#

I'm not that good at inspect anyways

whole sundial Dec 13, 2025, 9:47 AM

#

hushed gyro @whole sundial need some help over here

developer mode, go into debugger and you should be able to search for certain things

hushed gyro Dec 13, 2025, 9:48 AM

#

whole sundial developer mode, go into debugger and you should be able to search for certain th...

where is that on Edge?

mystic flower Dec 13, 2025, 9:48 AM

#

I’m a full-stack developer building a project and I need an API key for image and video generation.

whole sundial Dec 13, 2025, 9:50 AM

#

hushed gyro where is that on Edge?

i'm not sure

north osprey Dec 13, 2025, 9:50 AM

#

heloo

stark forge Dec 13, 2025, 9:50 AM

#

I'll help the first 10 people interested on how to start earning $100k or more within a week, but you will reimburse me 10% of your profits when you receive it. Note: only interested people should send a friend request or send me a dm! ask me (HOW) via Telegram username @Susan _Vachon

Or The telegram link in my bio

whole sundial Dec 13, 2025, 9:50 AM

#

oh i just figured it out

whole sundial Dec 13, 2025, 9:50 AM

#

stark forge I'll help the first 10 people interested on how to start earning $100k or more ...

<@&1349916362595635286>

hushed gyro Dec 13, 2025, 9:50 AM

#

whole sundial oh i just figured it out

tell me pls

hushed gyro Dec 13, 2025, 9:51 AM

#

whole sundial oh i just figured it out

hey uhh you might be able to find the URL that leads to vid arena

whole sundial Dec 13, 2025, 9:51 AM

#

ctrl + shift + f

whole sundial Dec 13, 2025, 9:51 AM

#

hushed gyro hey uhh you might be able to find the URL that leads to vid arena

don't think that's possible

#

they made it in such a way that you have to be a part of the a/b test group to access the video arena

hushed gyro Dec 13, 2025, 9:52 AM

#

#

compact flame Dec 13, 2025, 9:53 AM

#

hushed gyro

Says something related to experiment

#

Ig maybe it's experimental

hushed gyro Dec 13, 2025, 9:56 AM

#

whole sundial they made it in such a way that you have to be a part of the a/b test group to a...

well how to apply for that test group?

#

and there's no way pineapple is silent about this

#

something really sketchy is going on

compact flame Dec 13, 2025, 9:57 AM

#

hushed gyro and there's no way pineapple is silent about this

Maybe there are more people who got it, it's just like maybe it was randomized

whole sundial Dec 13, 2025, 9:57 AM

#

hushed gyro well how to apply for that test group?

it randomly chooses people

compact flame Dec 13, 2025, 9:58 AM

#

whole sundial it randomly chooses people

Choosing people randomly is not efficient for testing ig

queen veldt Dec 13, 2025, 10:02 AM

#

#

Ummm

#

shell oasis Dec 13, 2025, 10:06 AM

#

Well you know, only 2 videos are allowed per day 🤯
then check back after 14 hours

#

damn

compact flame Dec 13, 2025, 10:06 AM

#

shell oasis Well you know, only 2 videos are allowed per day 🤯 then check back after 14 hou...

Can you like share the link to video arena or it's not possible

wispy sierra Dec 13, 2025, 10:07 AM

#

Hello

shell oasis Dec 13, 2025, 10:08 AM

#

compact flame Can you like share the link to video arena or it's not possible

https://lmarena.ai/?mode=direct&chat-modality=video

#

The link won't work

compact flame Dec 13, 2025, 10:08 AM

#

shell oasis https://lmarena.ai/?mode=direct&chat-modality=video

Hmm it gives me an error sadly

#

Well I guess it was worth a try

shell oasis Dec 13, 2025, 10:08 AM

#

Unless its released

#

But very low limit right now, only 2 videos per day

#

damn

compact flame Dec 13, 2025, 10:09 AM

#

shell oasis But very low limit right now, only 2 videos per day

I think because everyone will be abusing sora 2 pro or whatever

shell oasis Dec 13, 2025, 10:09 AM

#

I would have to buy a local PC only but right now all AI companies ate up RAMs 😔

shell oasis Dec 13, 2025, 10:10 AM

#

compact flame I think because everyone will be abusing sora 2 pro or whatever

I write parody songs and I wanted AI videos to create the videos for my lyrics

shell oasis Dec 13, 2025, 10:12 AM

#

compact flame I think because everyone will be abusing sora 2 pro or whatever

Yeah true though

wispy sierra Dec 13, 2025, 10:13 AM

#

A bug occurred, and I posted it on the bug forum. How long will it take for the moderators to see and fix it?

north osprey Dec 13, 2025, 10:18 AM

#

How do I delete a video that has already been generated, sir?

whole sundial Dec 13, 2025, 10:22 AM

#

wispy sierra A bug occurred, and I posted it on the bug forum. How long will it take for the ...

there is only one person here ( @echo aurora ) that looks at them, so it may take anywhere from a few hours to a few days

edgy wharf Dec 13, 2025, 10:22 AM

#

Gemini 3 pro accidentally fed me its internal thought pipeline instead of the proper output. Is this common knowledge, or something that's not known?

wispy sierra Dec 13, 2025, 10:23 AM

#

whole sundial there is only one person here ( @echo aurora ) that looks at them, so i...

Thank you

#

How does LM Arena allow us to use paid AI models for free?

queen veldt Dec 13, 2025, 10:34 AM

#

Nobody knows

#

They are paid by big companies for testing the models (in battle mode)

#

Those secret name models are paid by companies

#

But they say for direct chat and other stuff lmarena is paying for API

#

For 6 images with nb pro you are costing them $0.9

wispy sierra Dec 13, 2025, 10:35 AM

#

It's the best thing ever on the internet

queen veldt Dec 13, 2025, 10:35 AM

#

Maybe they get it for cheaper idk

queen veldt Dec 13, 2025, 10:36 AM

#

wispy sierra It's the best thing ever on the internet

Yeah

whole sundial Dec 13, 2025, 10:38 AM

#

wispy sierra How does LM Arena allow us to use paid AI models for free?

they are paid by companies that want their models evaluated + they have over $100 million invested into them

willow sleet Dec 13, 2025, 10:39 AM

#

how was offline llms guys using lm studio or ollama? fast generation on rag or just use notebook lm? heard context is so low and no memory at all..

queen veldt Dec 13, 2025, 10:40 AM

#

What

wispy sierra Dec 13, 2025, 10:41 AM

#

Have you noticed that ChatGPT 5.2 forms sentences with missing words and inverted grammar? Why can't this AI model even form a sentence

meager harbor Dec 13, 2025, 10:46 AM

#

wispy sierra Have you noticed that ChatGPT 5.2 forms sentences with missing words and inverte...

gemini 3 does it too

#

me thinking ai will only get smarter

#

I was wrong

compact sleet Dec 13, 2025, 10:48 AM

#

It sure do talk more American, it's agi.

#

Jkjk

zealous sparrow Dec 13, 2025, 10:49 AM

#

yeah videoarena is like a rollout rn if someone is wondering

#

Here is what the video player looks like

compact sleet Dec 13, 2025, 10:50 AM

#

But actually... Imperfections in phrasing and grammatical structure on a normal conversation sounded more... Natural right? Perhaps it was trained on it? Iunno. Just hope it's not messing around on logic strict tasks, like coding or general analysis.

slim gorge Dec 13, 2025, 10:52 AM

#

how's gpt-5.2 guys

wispy sierra Dec 13, 2025, 10:52 AM

#

Weird

zealous sparrow Dec 13, 2025, 10:53 AM

#

slim gorge how's gpt-5.2 guys

mid as hell

slim gorge Dec 13, 2025, 10:53 AM

#

well that was expected tbh

#

they're just rushing things a lot cuz they dont wanna fall behind the competition

zealous sparrow Dec 13, 2025, 10:54 AM

#

in other terms it was benchmaxxed

#

they argued it has a goated OCR, later when compared to gemini 3 pro OCR it wasnt even close

slim gorge Dec 13, 2025, 10:55 AM

#

they're pulling a grok move

compact sleet Dec 13, 2025, 10:55 AM

#

slim gorge how's gpt-5.2 guys

It's okay if you want to generalise it, it's doing very well on common life hood related tasks, on analysis, and on logic training. I'd say it's on par with gemini 3 pro. But only on high thinking sadly.

The only grace it has over other models is creative writing at the moment. But not sure if people here use it for such purposes.

For its price, it's a bit underwhelming.

zealous sparrow Dec 13, 2025, 10:55 AM

#

compact sleet It's okay if you want to generalise it, it's doing very well on common life hood...

it scored very bad on simplebench

#

simplebench

ocean ferry Dec 13, 2025, 10:55 AM

#

zealous sparrow they argued it has a goated OCR, later when compared to gemini 3 pro OCR it wasn...

bro don't even try challenging Gemini at OCR task dawg

slim gorge Dec 13, 2025, 10:56 AM

#

yeah gemini is unmatched at OCR and vision in general

zealous sparrow Dec 13, 2025, 10:56 AM

#

simplebench tests LLMs with these questions

#

Yet 5.2 xhigh scored #8

slim gorge Dec 13, 2025, 10:57 AM

#

L

#

openai falling behind, google and anthropic are gonna be at the top

ocean ferry Dec 13, 2025, 10:58 AM

#

zealous sparrow simplebench tests LLMs with these questions

bro and that Gemini 3 pro that almost match human on simplebench is the lobotomized version dawg

zealous sparrow Dec 13, 2025, 10:58 AM

#

ocean ferry bro and that Gemini 3 pro that almost match human on simplebench is the lobotomi...

correct but the lobotomy was mostly done on coding

#

not thinking

compact sleet Dec 13, 2025, 10:58 AM

#

It does prove nothing. It's more encouraged to bench it yourself on the lmarena.

#

With your own needs and logic set of testing. It's free to test anyways.

zealous sparrow Dec 13, 2025, 11:00 AM

#

compact sleet It does prove nothing. It's more encouraged to bench it yourself on the lmarena.

I benched it with my own questions

compact sleet Dec 13, 2025, 11:00 AM

#

I agree it fails on coding compared to other models.

zealous sparrow Dec 13, 2025, 11:00 AM

#

It didn't score 3/3

compact sleet Dec 13, 2025, 11:00 AM

#

Which prompt did you ask it?

#

I want to replicate your own findings

zealous sparrow Dec 13, 2025, 11:01 AM

#

I had a whole uh

#

prompt for testing

#

xhigh got 2/3 sure

#

but i noticed one bad thing with it

compact sleet Dec 13, 2025, 11:01 AM

#

Mhm, just post it. I'll replicate it on my own llm arena

zealous sparrow Dec 13, 2025, 11:01 AM

#

It confuses Miles with A measurement unit

#

Or Kyle as a word game

compact sleet Dec 13, 2025, 11:02 AM

#

Ah the miles yesterday?

#

Daisys miles kyles

zealous sparrow Dec 13, 2025, 11:02 AM

#

compact sleet Ah the miles yesterday?

You were there when i did the tests right

#

yes

#

that

compact sleet Dec 13, 2025, 11:02 AM

#

Wait

zealous sparrow Dec 13, 2025, 11:02 AM

#

LLMs often just confuse stuff with word games, is what i observed

compact sleet Dec 13, 2025, 11:04 AM

#

is the prompt this?

Lucy and Mary were at a concert, one of them got in but the second didn't, even tho the tickets were booked. Why?
Daisy and Mike were at a park. Daisy took 3 daisys, and mike took 0 why?
Luke and Miles were driving on bikes down a hill. When they got down to the hill, Miles was missing, Why?

zealous sparrow Dec 13, 2025, 11:05 AM

#

compact sleet is the prompt this? Lucy and Mary were at a concert, one of them got in but th...

Yeah.

#

I have answers for this too

compact sleet Dec 13, 2025, 11:05 AM

#

Fair, you can crosscheck my prompt when testing it too then.

As follows:

Make a scenario of where three guys met in a bar, each of them told a story, in which there are unclear lies woven from every of them, not made because they want to lie, but they simply didn't get the picture clearly at that time. But, there was also a shared truth among their similar stories. They argued of which version was the right one.

The bartender came, and told the lies and truth of their story, because the bartender saw the incident himself.

ocean ferry Dec 13, 2025, 11:05 AM

#

zealous sparrow correct but the lobotomy was mostly done on coding

yeah bro and the most hyped one is coding bruh, why did they even lobotomize it, it would've scored like 80%+ on SWE-Bench Verified

zealous sparrow Dec 13, 2025, 11:06 AM

#

ocean ferry yeah bro and the most hyped one is coding bruh, why do they even lobotomized it,...

Saving TPUs, idk?

compact sleet Dec 13, 2025, 11:07 AM

#

This tests an LLM's complex logic: it has to make at least 3 lies within a similar story, 1-3 shared truths across the stories, and 3 real truths on verification.

All in the same timeline of events.

#

It's a generation task and an analysis task in one.

zealous sparrow Dec 13, 2025, 11:08 AM

#

This right here is too easy of a question for LLMs

ocean ferry Dec 13, 2025, 11:09 AM

#

zealous sparrow This right here is too easy of a question for LLMs

what model is ghostfalcon?

zealous sparrow Dec 13, 2025, 11:09 AM

#

ocean ferry what model is ghostfalcon?

Some sort of gemini

#

I highly believe uh

#

Gemini 3 flash

ocean ferry Dec 13, 2025, 11:10 AM

#

is it good?

zealous sparrow Dec 13, 2025, 11:10 AM

#

ocean ferry is it good?

From my testing, eh. For codeArena

#

seahawk and skyhawk were better imo

ocean ferry Dec 13, 2025, 11:11 AM

#

zealous sparrow seahawk and skyhawk were better imo

nooo they even lobotomized G3 Flash!!😭

compact sleet Dec 13, 2025, 11:11 AM

#

@zealous sparrow is your answer like this for the Lucy Daisy and Miles test?

Lucy = Lucky
Daisy = Name of Flower
Miles = a unit of distance?

zealous sparrow Dec 13, 2025, 11:12 AM

#

compact sleet @zealous sparrow is your answer like this for the Lucy Daisy and Miles t...

I can give you the answer

#

A1: One of the tickets was forged/invalid
A2: Daisy took all the daisies, or it was just because of her name (I accept either, because LLMs struggle to reach that point)
A3: Miles fell off the side

compact sleet Dec 13, 2025, 11:13 AM

#

🤔

zealous sparrow Dec 13, 2025, 11:13 AM

#

compact sleet

both failed the Miles question

#

from my testing

#

no model currently scored 3/3 on this

#

5.2 xhigh was close before failing on the Miles question

compact sleet Dec 13, 2025, 11:15 AM

#

So in my replication of your test, Gemini 3 failed all 3 then?

#

It literally went for wordplay on the names, instead of the most plausible yet most boring scenario.

zealous sparrow Dec 13, 2025, 11:16 AM

#

This is also just an easy question

compact sleet Dec 13, 2025, 11:17 AM

#

Of course I can re-run the question, just in case

#

wait..

zealous sparrow Dec 13, 2025, 11:20 AM

#

yeah both models

#

just have in their training