#general | Arena | Page 266

wicked sage Feb 19, 2026, 4:28 PM

#

thanks trey

stray aspen Feb 19, 2026, 4:28 PM

#

it sucks

echo aurora Feb 19, 2026, 4:28 PM

#

Time moved forward

stray aspen Feb 19, 2026, 4:28 PM

#

ill just stick to claudius

wicked sage Feb 19, 2026, 4:28 PM

#

claudius

modest prism Feb 19, 2026, 4:28 PM

#

Is Gemini 3.1 good

wicked sage Feb 19, 2026, 4:28 PM

#

best name oat

lost patrol Feb 19, 2026, 4:28 PM

#

i bet 3.5 earliest on google I/O. So in on May 19th

thorny schooner Feb 19, 2026, 4:28 PM

#

Going to be real though i keep forgetting how much direct chat get get so disrupted by random competitive mode it's crazy

sick mantle Feb 19, 2026, 4:28 PM

#

Pineapple WHEN IS SEEDANCE 2.0 COMEING

stone tundra Feb 19, 2026, 4:29 PM

#

why si 3.1

#

so low

cloud zinc Feb 19, 2026, 4:29 PM

#

modest prism Feb 19, 2026, 4:29 PM

#

cloud zinc

It is AGI

cloud zinc Feb 19, 2026, 4:29 PM

#

inner relic Feb 19, 2026, 4:29 PM

#

huh

#

bro

echo aurora Feb 19, 2026, 4:29 PM

#

sick mantle Pineapple WHEN IS SEEDANCE 2.0 COMEING

Sorry going to have to give the boring answer -> I won't be able to share details about what new models or features are upcoming until we're ready to share more. Would recommend to keep an eye on our announcement channel.

cloud zinc Feb 19, 2026, 4:29 PM

#

80.6 on swe

loud verge Feb 19, 2026, 4:29 PM

#

Thanks socky

sick mantle Feb 19, 2026, 4:29 PM

#

echo aurora Sorry going to have to give the boring answer -> I won't be able to share detail...

Pineapple last queston can u check my dms

stray aspen Feb 19, 2026, 4:29 PM

#

is the gemini on lm arena high thinking

errant zodiac Feb 19, 2026, 4:30 PM

#

Why do I not have permissions to use the video arena channels? For some reason, Arena won't let me use those channels.

cunning birch Feb 19, 2026, 4:30 PM

#

Gemini pro is just so dumb... More stupid than GLM

sick mantle Feb 19, 2026, 4:30 PM

#

errant zodiac Why do I not have permissions to use the video arena channels? For some reason, ...

Its deleted.

rigid holly Feb 19, 2026, 4:30 PM

#

I dont mean to be a nag, but its been like 2 days and yet the thinking model for sonnet 4.6 still did not come out while usually the previous ones came out withing the hour

Is it like a whole intentional thing or what @echo aurora

echo sinew Feb 19, 2026, 4:30 PM

#

errant zodiac Why do I not have permissions to use the video arena channels? For some reason, ...

Note that Video Arena has been removed from the server. More information can be found in this announcement.

stray aspen Feb 19, 2026, 4:30 PM

#

nah

inner relic Feb 19, 2026, 4:30 PM

#

what are ya yapping about, this thing just released

#

yes

stray aspen Feb 19, 2026, 4:30 PM

#

claude is better

inner relic Feb 19, 2026, 4:30 PM

#

it created better

#

building

cunning birch Feb 19, 2026, 4:30 PM

#

At all !!!

modest prism Feb 19, 2026, 4:30 PM

#

rigid holly I dont mean to be a nag, but its been like 2 days and yet the thinking model f...

Because sonnet 4.6 thinks for so long. The API cost would be too high

echo aurora Feb 19, 2026, 4:31 PM

#

sick mantle Pineapple last queston can u check my dms

Yeah I will when I get to it.

errant zodiac Feb 19, 2026, 4:31 PM

#

sick mantle Its deleted.

What does that mean? The channel is deleted? Or my permission is deleted? And if my permission is deleted - why?

celest orchid Feb 19, 2026, 4:31 PM

#

stray aspen claude is better

and cheaper)

golden ocean Feb 19, 2026, 4:31 PM

#

errant zodiac What does that mean? The channel is deleted? Or my permission is deleted? And if...

existensial crisis

sick mantle Feb 19, 2026, 4:31 PM

#

errant zodiac What does that mean? The channel is deleted? Or my permission is deleted? And if...

The bot its gone

echo aurora Feb 19, 2026, 4:31 PM

#

errant zodiac What does that mean? The channel is deleted? Or my permission is deleted? And if...

The bot was removed from the server.

rigid holly Feb 19, 2026, 4:31 PM

#

Ah. Ok. Atleast i got a answer.

echo aurora Feb 19, 2026, 4:31 PM

#

Moving Video Arena off of Discord and just onto the site.

inner relic Feb 19, 2026, 4:31 PM

#

Okay it's your opinion

errant zodiac Feb 19, 2026, 4:31 PM

#

echo aurora The bot was removed from the server.

Oh. Does Arena plan to add it back?

#

Ah. I understand.

echo aurora Feb 19, 2026, 4:32 PM

#

rigid holly I dont mean to be a nag, but its been like 2 days and yet the thinking model f...

The request for thinking versions has been raised. There are various reasons why we will or won't have specific models added.

stray aspen Feb 19, 2026, 4:32 PM

#

bro this thing has early deepseek reasoning times

#

takes forever

echo aurora Feb 19, 2026, 4:32 PM

#

errant zodiac Oh. Does Arena plan to add it back?

No, not at this time. We're too limited with the bot. Unable to add new functionality to Video Arena so it's best we have it just on the site so we can focus efforts.

sick mantle Feb 19, 2026, 4:32 PM

#

echo aurora Yeah I will when I get to it.

stray aspen Feb 19, 2026, 4:32 PM

#

why did bro send his pfp

wheat thorn Feb 19, 2026, 4:33 PM

#

thorny schooner Feb 19, 2026, 4:33 PM

#

echo aurora No, not at this time. We're too limited with the bot. Unable to add new function...

You did see my PIN right in feedback ( just want to make sure)

remote vapor Feb 19, 2026, 4:33 PM

#

why send fake chicken.,.,

errant zodiac Feb 19, 2026, 4:33 PM

#

echo aurora No, not at this time. We're too limited with the bot. Unable to add new function...

Thanks for replying and letting me know.

rugged abyss Feb 19, 2026, 4:33 PM

#

Really interest in coding performance in real tasks

sick mantle Feb 19, 2026, 4:33 PM

#

remote vapor why send fake chicken.,.,

ai chicken

stray aspen Feb 19, 2026, 4:34 PM

#

that chiken aint real

arctic basin Feb 19, 2026, 4:34 PM

#

echo aurora No, not at this time. We're too limited with the bot. Unable to add new function...

with video removal, new users intro process to this discord is perma locked on "Generate a video"

remote vapor Feb 19, 2026, 4:34 PM

#

sick mantle ai chicken

yes exactly

sick mantle Feb 19, 2026, 4:34 PM

#

remote vapor yes exactly

We got ai chicken before gta 6

remote vapor Feb 19, 2026, 4:34 PM

#

yea fair-

stray aspen Feb 19, 2026, 4:35 PM

#

<@&1349916362595635286>

remote vapor Feb 19, 2026, 4:35 PM

#

oh come on...

sick mantle Feb 19, 2026, 4:35 PM

#

Fake

remote vapor Feb 19, 2026, 4:35 PM

#

sick mantle Fake

just like chicken?

stray aspen Feb 19, 2026, 4:35 PM

#

is anyone even still falling for that scam

celest orchid Feb 19, 2026, 4:35 PM

#

stray aspen <@&1349916362595635286>

Thank You

sick mantle Feb 19, 2026, 4:35 PM

#

Ofc he is

proud bobcat Feb 19, 2026, 4:35 PM

#

Oh shi

#

Gemini 3.1 pro

echo aurora Feb 19, 2026, 4:36 PM

#

sick mantle

Tried to combine both our pfp

proud bobcat Feb 19, 2026, 4:36 PM

#

It’s not supposed to be groundbreaking

sick mantle Feb 19, 2026, 4:36 PM

#

echo aurora Tried to combine both our pfp

YOOOO

stray aspen Feb 19, 2026, 4:36 PM

#

this coding is terrible

sick mantle Feb 19, 2026, 4:36 PM

#

SICK

proud bobcat Feb 19, 2026, 4:36 PM

#

It’s a refresh

stray aspen Feb 19, 2026, 4:36 PM

#

theres no order

#

gemini 3.1 sucks

inner relic Feb 19, 2026, 4:37 PM

#

proud bobcat Feb 19, 2026, 4:37 PM

#

Hold on let me check this out

sick mantle Feb 19, 2026, 4:37 PM

#

echo aurora Tried to combine both our pfp

I love it!

proud bobcat Feb 19, 2026, 4:37 PM

#

Let me give it a pretty simple task

small lintel Feb 19, 2026, 4:37 PM

#

remote vapor Feb 19, 2026, 4:38 PM

#

why peeps posting slop?

proud bobcat Feb 19, 2026, 4:38 PM

#

I think people forget video arena moved

cunning birch Feb 19, 2026, 4:38 PM

#

stray aspen this coding is terrible

Same problem 😂 😂

proud bobcat Feb 19, 2026, 4:38 PM

#

Yeah Gemini code was never good

#

Me and my friend tested it out and it always was just messy

#

Did it work? Yeah

#

But it’s still ahh

lost patrol Feb 19, 2026, 4:39 PM

#

proud bobcat Feb 19, 2026, 4:39 PM

#

lost patrol

HELL NAH

#

HELLLLLLL NAHHH

sick mantle Feb 19, 2026, 4:39 PM

#

But pineapple remove limts

proud bobcat Feb 19, 2026, 4:39 PM

#

I DONT BELIEVE THAT AT ALL

sonic swallow Feb 19, 2026, 4:39 PM

#

/vídeo

fickle venture Feb 19, 2026, 4:39 PM

#

LMAO GEMINI 3.1 IS GOOD

stray aspen Feb 19, 2026, 4:39 PM

#

seems like it bunched up a lot of code in a single line to reduce the total lines by 700

cunning birch Feb 19, 2026, 4:39 PM

#

I thought Google will destroy all her competitors..
Now the competitors are safe 😆 😆 😆

fickle venture Feb 19, 2026, 4:39 PM

#

It will get nerfed tomorrow

inner relic Feb 19, 2026, 4:39 PM

#

there's a thinking level

#

medium

#

I think yall using high thinking

cunning birch Feb 19, 2026, 4:40 PM

#

fickle venture It will get nerfed tomorrow

It is terrible today what about tomorrow

fickle venture Feb 19, 2026, 4:40 PM

#

cunning birch It is terrible today what about tomorrow

Probably more terrible

proud bobcat Feb 19, 2026, 4:41 PM

#

sonic swallow /vídeo

Video arena is moved to the website

lost patrol Feb 19, 2026, 4:41 PM

#

lost patrol

source
https://x.com/ArtificialAnlys/status/2024518545510662602

Artificial Analysis (@ArtificialAnlys)

Google is once again the leader in AI: Gemini 3.1 Pro Preview leads the Artificial Analysis Intelligence Index, 4 points ahead of Claude Opus 4.6 while costing less than half as much to run

@GoogleDeepMind gave us pre-release access to Gemini 3.1 Pro Preview. It leads 6 of the

proud bobcat Feb 19, 2026, 4:41 PM

#

Yeah I know it’s from artificial analysis

#

Yeah no this is ridiculous

#

It’s a literal refresh model

#

INSANELY slow too

shrewd citrus Feb 19, 2026, 4:42 PM

#

yeah lol like it should’ve achieved 1st place on the arena leaderboard

proud bobcat Feb 19, 2026, 4:42 PM

#

Depends

shrewd citrus Feb 19, 2026, 4:42 PM

#

with those “insane” benchmark results

thorny schooner Feb 19, 2026, 4:42 PM

#

Well from what I'm hearing I guess it was a good thing I did not use that most recent AI the granted it probably wouldn't had help right now with the compares and stuff so still funny to hear some of the complaining ( cuz well I mean I was So nice to see I'm not the only one)

proud bobcat Feb 19, 2026, 4:42 PM

#

GPT 5.2 isn’t a refresh

lost patrol Feb 19, 2026, 4:42 PM

#

hallucination gone back it seams

proud bobcat Feb 19, 2026, 4:42 PM

#

That’s a whole ass new architecture

#

“it actually is” 5.1 was a refresh

#

5.2 was completely diff

#

Yeah

proud bobcat Feb 19, 2026, 4:43 PM

#

lost patrol hallucination gone back it seams

Hold let me check this out

#

How is Gemini 3.1 pro number 4 in speed this is slow as hell

#

Number one in coding?

lost patrol Feb 19, 2026, 4:43 PM

#

at least at AA 😉

proud bobcat Feb 19, 2026, 4:43 PM

#

Nahhh

harsh flume Feb 19, 2026, 4:44 PM

#

we at a point of dinimishing returns. More compute, better inference, better artificial data, but still same paradigms

#

There wont be another GPT3 moment in a long time

low patrol Feb 19, 2026, 4:44 PM

#

W GEMINI in the chat

#

spam w gemini

celest orchid Feb 19, 2026, 4:44 PM

#

nice

low patrol Feb 19, 2026, 4:44 PM

#

celest orchid nice

use in aistudio

royal sail Feb 19, 2026, 4:44 PM

#

lost patrol at least at AA 😉

hopefully this shows in usage

lost patrol Feb 19, 2026, 4:45 PM

#

yeah

royal sail Feb 19, 2026, 4:45 PM

#

older models hallucinated all the damn time

proud bobcat Feb 19, 2026, 4:45 PM

#

Wow this is slow dude

#

Holy hell

harsh flume Feb 19, 2026, 4:45 PM

#

but thats nitpicking, in general compute is increasing at a linear pace

#

Opus 4.6 wasn't a gamechanger. This is what I mean, diminishing returns

plucky sparrow Feb 19, 2026, 4:46 PM

#

is this good? 🤔 doesn't seem that impressive to me:

sick mantle Feb 19, 2026, 4:46 PM

#

plucky sparrow is this good? 🤔 doesn't seem that impressive to me:

Yea

lost patrol Feb 19, 2026, 4:46 PM

#

lets see, how it's on simple bench 😄

stray aspen Feb 19, 2026, 4:47 PM

#

plucky sparrow is this good? 🤔 doesn't seem that impressive to me:

average

echo aurora Feb 19, 2026, 4:47 PM

#

celest orchid nice

Since it's going to be a new/high demand models there errors may happen, team is monitoring this

stray aspen Feb 19, 2026, 4:47 PM

#

they migrated from the video arena channels

echo aurora Feb 19, 2026, 4:48 PM

#

I think it's people trying to use Video Arena

sick mantle Feb 19, 2026, 4:48 PM

#

@echo aurora I wanna use sora on arena but it chooses for me random model please fix

royal sail Feb 19, 2026, 4:48 PM

#

harsh flume Opus 4.6 wasn't a gamechanger. This is what I mean, diminishing returns

I think we've been at a point of diminishing returns for a bit, but improvements are definitely still meaningful

celest orchid Feb 19, 2026, 4:48 PM

#

plucky sparrow is this good? 🤔 doesn't seem that impressive to me:

WOW

echo aurora Feb 19, 2026, 4:48 PM

#

sick mantle <@283397944160550928> I wanna use sora on arena but it chooses for me random mod...

Isn't broken, that's intended

low patrol Feb 19, 2026, 4:48 PM

#

safety censor has reduced

#

in gemini website and aistudio

stray aspen Feb 19, 2026, 4:48 PM

#

plucky sparrow is this good? 🤔 doesn't seem that impressive to me:

is this lmarena or ai studio

proud bobcat Feb 19, 2026, 4:49 PM

#

3.1 PRO PASSED THE CAR WASH TEST

#

YESSSSS

stray aspen Feb 19, 2026, 4:49 PM

#

proud bobcat 3.1 PRO PASSED THE CAR WASH TEST

lmarena or studio

proud bobcat Feb 19, 2026, 4:49 PM

#

Arena

low patrol Feb 19, 2026, 4:49 PM

#

stray aspen lmarena or studio

why are you asking? is studio worse?

stray aspen Feb 19, 2026, 4:49 PM

#

yes

#

it gave me a non working script

#

lmarena gave me one that works

harsh flume Feb 19, 2026, 4:49 PM

#

There's a chance compute is too high, I predict there'll be a inflexion point of algorithmic paradigm shift way more radical than what deepseek did and we get to a way higher joules efficiency

sick mantle Feb 19, 2026, 4:49 PM

#

echo aurora Isn't broken, that's intended

I know i just want side by side video

quartz light Feb 19, 2026, 4:50 PM

#

my screenshot btw

#

:(

harsh flume Feb 19, 2026, 4:51 PM

#

I don't mean compute is high in an absolute sense, rather energy output towards compute is too high

surreal zephyr Feb 19, 2026, 4:51 PM

#

ok so gemini 3.1 pro is comparable to 4.6 sonnet nonthinking

harsh flume Feb 19, 2026, 4:51 PM

#

The human brain is enough evidence of that

surreal zephyr Feb 19, 2026, 4:51 PM

#

but much worse than sonnet high or opus high

low patrol Feb 19, 2026, 4:51 PM

#

surreal zephyr ok so gemini 3.1 pro is comparable to 4.6 sonnet nonthinking

i thought it is better than them

#

in leaderboard artifical analysis

proud bobcat Feb 19, 2026, 4:51 PM

#

Artificial analysis is benchmarked

low patrol Feb 19, 2026, 4:52 PM

#

proud bobcat Artificial analysis is benchmarked

what does that mean

patent bane Feb 19, 2026, 4:52 PM

#

proud bobcat 3.1 PRO PASSED THE CAR WASH TEST

thought all the models get it?

surreal zephyr Feb 19, 2026, 4:52 PM

#

low patrol i thought it is better than them

not even close

frosty lava Feb 19, 2026, 4:52 PM

#

how does 3.1 not really better than 3 ?

proud bobcat Feb 19, 2026, 4:52 PM

#

low patrol what does that mean

I

#

Sorry

surreal zephyr Feb 19, 2026, 4:52 PM

#

patent bane thought all the models get it?

gpt models fail, google and claude models pass

proud bobcat Feb 19, 2026, 4:52 PM

#

Benchmaxxed

proud bobcat Feb 19, 2026, 4:52 PM

#

patent bane thought all the models get it?

Not at all

simple kayak Feb 19, 2026, 4:52 PM

#

frosty lava Feb 19, 2026, 4:52 PM

#

when we got opus 4-6 it was much better than 4-5 and gpt 5.3 codex is much better than gpt 5.2 then why not for gemini 3.1

low patrol Feb 19, 2026, 4:53 PM

#

so it is worse than gemini 3?

#

or better

proud bobcat Feb 19, 2026, 4:53 PM

#

Uhhh

#

It’s like

stray aspen Feb 19, 2026, 4:53 PM

#

great i just lost my massive manually made prompt to a somethnig went wrong error

proud bobcat Feb 19, 2026, 4:53 PM

#

Marginal improvement

low patrol Feb 19, 2026, 4:53 PM

#

🥺

quartz light Feb 19, 2026, 4:53 PM

#

stray aspen great i just lost my massive manually made prompt to a somethnig went wrong erro...

wrote a 9k char prompt with no new lines today without noticing

half mist Feb 19, 2026, 4:53 PM

#

simple kayak

I would say Gemini 3.1 Pro is very good besides coding

frosty lava Feb 19, 2026, 4:54 PM

#

they released a new model knowing it was just a small improvement ?

proud bobcat Feb 19, 2026, 4:54 PM

#

Benchmarks wise it’s defo a lot better

#

Actual use?

#

Eh.

#

Defo a LOT more token efficient

stray aspen Feb 19, 2026, 4:55 PM

#

benchmarks say theres a massive leap

harsh flume Feb 19, 2026, 4:55 PM

#

proud bobcat Actual use?

Thats why LMArena is the goat

sick mantle Feb 19, 2026, 4:55 PM

#

harsh flume Thats why LMArena is the goat

True

frosty lava Feb 19, 2026, 4:55 PM

#

stray aspen benchmarks say theres a massive leap

if that was really the case then we should see it

half mist Feb 19, 2026, 4:55 PM

#

frosty lava they released a new model knowing it was just a small improvement ?

That’s why it’s Gemini 3.1 Pro instead of Gemini 3.5 Pro

simple kayak Feb 19, 2026, 4:55 PM

#

half mist I would say Gemini 3.1 Pro is very good besides coding

claude is probably still better lol

proud bobcat Feb 19, 2026, 4:55 PM

#

To be fair it is a refresh

frosty lava Feb 19, 2026, 4:55 PM

#

half mist That’s why it’s Gemini 3.1 Pro instead of Gemini 3.5 Pro

like i said for example opus 4-6 is much better than 4-5

#

same for gpt

harsh flume Feb 19, 2026, 4:55 PM

#

half mist That’s why it’s Gemini 3.1 Pro instead of Gemini 3.5 Pro

in b4 Gemini 3.105 Pro

proud bobcat Feb 19, 2026, 4:56 PM

#

Gemini 3.1.2 pro xhigh ultra spark codex

half mist Feb 19, 2026, 4:56 PM

#

frosty lava like i said for example opus 4-6 is much better than 4-5

Except for use on Code Arena which is very glitchy like every other new model added to Code Arena

proud bobcat Feb 19, 2026, 4:56 PM

#

Gets 0.2% increase in swebench

frosty lava Feb 19, 2026, 4:56 PM

#

half mist Except for use on Code Arena which is very glitchy like every other new model ad...

but that's arena problem

#

not the model

proud bobcat Feb 19, 2026, 4:56 PM

#

Yeah this is a joke Gemini 3.1 pro refuses to code

half mist Feb 19, 2026, 4:56 PM

#

frosty lava not the model

I still think Claude is better at coding, and I use it for my personal projects

proud bobcat Feb 19, 2026, 4:57 PM

#

OH

#

WAIT

#

ITS DOINF SOMETHING

frosty lava Feb 19, 2026, 4:57 PM

#

i would liked to see as much improvement from gemini 3 to 3.1 than what we saw for opus and gpt codex

half mist Feb 19, 2026, 4:57 PM

#

proud bobcat ITS DOINF SOMETHING

I tried Gemini 3.1 Pro, and it couldn’t even get the start button right 💀

proud bobcat Feb 19, 2026, 4:58 PM

#

I’m asking it to make a simple 3D fps shooter and it’s installed 6 packages

#

I’m sorry?????

surreal zephyr Feb 19, 2026, 4:58 PM

#

half mist I tried Gemini 3.1 Pro, and it couldn’t even get the start button right 💀

same, it sucsk

hoary elbow Feb 19, 2026, 4:58 PM

#

Is it still optimized or is it like when 3.0 first came out?

#

Cause after it came out, it got Nerfed

proud bobcat Feb 19, 2026, 4:58 PM

#

What’s optimized

#

OH

half mist Feb 19, 2026, 4:58 PM

#

surreal zephyr same, it sucsk

Just use Claude, or even Gemini 3 Flash (which is somehow better at coding than Gemini 3.1 Pro)

proud bobcat Feb 19, 2026, 4:58 PM

#

Quantization

hoary elbow Feb 19, 2026, 4:58 PM

#

Oh yeah, is it still quantized?

proud bobcat Feb 19, 2026, 4:59 PM

#

Uhhh no looks like it’s working at full precision

hoary elbow Feb 19, 2026, 4:59 PM

#

Or is it like Gemini 3.0 prime

proud bobcat Feb 19, 2026, 4:59 PM

#

As a tradeoff

Abysmally slow

surreal zephyr Feb 19, 2026, 4:59 PM

#

half mist Just use Claude, or even Gemini 3 Flash (which is somehow better at coding than ...

quite sad imo

plucky sparrow Feb 19, 2026, 4:59 PM

#

it's quantized

proud bobcat Feb 19, 2026, 4:59 PM

#

No it’s not

#

It’s slow as hell dude

surreal zephyr Feb 19, 2026, 4:59 PM

#

hoary elbow Or is it like Gemini 3.0 prime

not even close

proud bobcat Feb 19, 2026, 4:59 PM

#

That’s a full precision model

plucky sparrow Feb 19, 2026, 4:59 PM

#

probably just cause everyone is using it. it doesn't seem much better

proud bobcat Feb 19, 2026, 4:59 PM

#

plucky sparrow probably just cause everyone is using it. it doesn't seem much better

Not how compute works

#

Google doesn’t do batch inference

celest orchid Feb 19, 2026, 5:00 PM

#

its fun

surreal zephyr Feb 19, 2026, 5:00 PM

#

celest orchid its fun

bad

pulsar crystal Feb 19, 2026, 5:00 PM

#

plucky sparrow it's quantized

how do you know?

proud bobcat Feb 19, 2026, 5:00 PM

#

Looks fine

proud bobcat Feb 19, 2026, 5:00 PM

#

pulsar crystal how do you know?

He doesn’t

#

Lmao

surreal zephyr Feb 19, 2026, 5:00 PM

#

proud bobcat He doesn’t

honestly

#

it probably is quantized

plucky sparrow Feb 19, 2026, 5:00 PM

#

he's right, I don't, I'm just basing it on the fact there was a model 1-2 weeks ago that was miles ahead of the current release

surreal zephyr Feb 19, 2026, 5:00 PM

#

doesnt mean we are ever getting the real model lol

proud bobcat Feb 19, 2026, 5:01 PM

#

If it was quantized it would be fast

stray aspen Feb 19, 2026, 5:01 PM

#

i hate this so much

hoary elbow Feb 19, 2026, 5:01 PM

#

I’m getting mixed reactions. Can I make a poll here? I can’t find rules anymore and the last time I read, it was a long time ago

surreal zephyr Feb 19, 2026, 5:01 PM

#

proud bobcat If it was quantized it would be fast

its quantized and slow

frosty lava Feb 19, 2026, 5:01 PM

#

plucky sparrow he's right, I don't, I'm just basing it on the fact there was a model 1-2 weeks ...

cause that was more like a new thinking mode no ?

surreal zephyr Feb 19, 2026, 5:01 PM

#

it was very big before release

proud bobcat Feb 19, 2026, 5:01 PM

#

surreal zephyr its quantized and slow

https://tenor.com/view/patrick-spongebob-spongebob-meme-patrick-meme-dumb-patrick-gif-9974665538168463324

Tenor

#

Two conflicting points

#

Models are quantized to be faster

surreal zephyr Feb 19, 2026, 5:02 PM

#

proud bobcat https://tenor.com/view/patrick-spongebob-spongebob-meme-patrick-meme-dumb-patric...

opus 4.5 is quantized too

proud bobcat Feb 19, 2026, 5:02 PM

#

How do you think opus is so fast

#

Yeah exactly

surreal zephyr Feb 19, 2026, 5:02 PM

#

proud bobcat Models are quantized to be faster

or cheaper

surreal zephyr Feb 19, 2026, 5:02 PM

#

proud bobcat How do you think opus is so fast

its not fast 💀

#

codex spark is fast

#

opus isnt

proud bobcat Feb 19, 2026, 5:02 PM

#

surreal zephyr its not fast 💀

Yeah it is?

harsh flume Feb 19, 2026, 5:02 PM

#

For Claude users, what is the best EV in getting usage? Their plan or just go to Openrouter API?

proud bobcat Feb 19, 2026, 5:02 PM

#

For a behemoth of a model?

#

Absolutely fast

surreal zephyr Feb 19, 2026, 5:02 PM

#

LOL

proud bobcat Feb 19, 2026, 5:02 PM

#

surreal zephyr LOL

LMAO

stray aspen Feb 19, 2026, 5:02 PM

#

harsh flume For Claude users, what is the best EV in getting usage? Their plan or just go to...

this one

proud bobcat Feb 19, 2026, 5:02 PM

#

Dude even Qwen crushes Gemini

#

This is bad

surreal zephyr Feb 19, 2026, 5:03 PM

#

proud bobcat Dude even Qwen crushes Gemini

SIDEF VIEW

#

LOL

harsh flume Feb 19, 2026, 5:03 PM

#

stray aspen this one

i'm too dumb to get the joke

cyan harbor Feb 19, 2026, 5:03 PM

#

why the "remove" option is deleted?

surreal zephyr Feb 19, 2026, 5:03 PM

#

2.5 flash does better than this bro

harsh flume Feb 19, 2026, 5:03 PM

#

ill wait for the subtitled version

frosty lava Feb 19, 2026, 5:03 PM

#

i think its lazy as hell

proud bobcat Feb 19, 2026, 5:03 PM

#

surreal zephyr SIDEF VIEW

Reminds me of that one panzer

frosty lava Feb 19, 2026, 5:03 PM

#

you have to prompt it well

proud bobcat Feb 19, 2026, 5:04 PM

#

🥀

#

I KNOW WHY IT SUCKS

surreal zephyr Feb 19, 2026, 5:04 PM

#

proud bobcat I KNOW WHY IT SUCKS

?

proud bobcat Feb 19, 2026, 5:04 PM

#

WAIT

#

YEAH

#

ITS BEING TOKEN EFFICIENT

proud bobcat Feb 19, 2026, 5:04 PM

#

surreal zephyr ?

It’s being aggressively token efficient

#

The files it’s making don’t even go above 100 lines of code

#

It’s strangling itself

surreal zephyr Feb 19, 2026, 5:05 PM

#

proud bobcat The files it’s making don’t even go above 100 lines of code

the tank has 300

proud bobcat Feb 19, 2026, 5:05 PM

#

Not enough at all

#

See what I mean?

#

It’s been trained to use the least amount of tokens as possible to copy Claude

surreal zephyr Feb 19, 2026, 5:05 PM

#

proud bobcat Not enough at all

the noita prompt had like 3 times repeated to never take any shortcuts and such

#

900 LOC

#

not remotely close to what opus did

royal sail Feb 19, 2026, 5:06 PM

#

proud bobcat ITS BEING TOKEN EFFICIENT

A lot of people have been noticing this with 3.1 and I have too

surreal zephyr Feb 19, 2026, 5:06 PM

#

LOL

royal sail Feb 19, 2026, 5:07 PM

#

The model literally feels lazy

#

it also makes tons of silly syntax errors

surreal zephyr Feb 19, 2026, 5:07 PM

#

gemini worse than gpt 5.2 low?

stray aspen Feb 19, 2026, 5:08 PM

#

this is grok levels of disappointment

royal sail Feb 19, 2026, 5:08 PM

#

I mean

#

The model isn't terrible by any means

#

It just doesn't feel great with some coding tasks

surreal zephyr Feb 19, 2026, 5:09 PM

#

royal sail The model isn't terrible by any means

syntax errors

#

lol

hoary elbow Feb 19, 2026, 5:09 PM

#

arena

lost patrol Feb 19, 2026, 5:09 PM

#

i think google is positioning it more for common user and maybe into the scienece direction.
Not for coding

royal sail Feb 19, 2026, 5:10 PM

#

surreal zephyr syntax errors

i'll admit that's pretty bad for an LLM to have lol

#

But majority of the time, it generates pretty reasonable code

stray aspen Feb 19, 2026, 5:10 PM

#

i think claude could do this better

royal sail Feb 19, 2026, 5:10 PM

#

stray aspen i think claude could do this better

high 🥶

stray aspen Feb 19, 2026, 5:10 PM

#

considering it made me this

sick mantle Feb 19, 2026, 5:11 PM

#

<@&1349916362595635286>

mortal vale Feb 19, 2026, 5:12 PM

#

@twilit sable Note that Video Arena has been removed from the server. More information can be found in this #announcements

stray aspen Feb 19, 2026, 5:12 PM

#

this is way better

surreal zephyr Feb 19, 2026, 5:14 PM

#

welp

stray aspen Feb 19, 2026, 5:14 PM

#

holy cook

surreal zephyr Feb 19, 2026, 5:14 PM

#

the models not even close

stray aspen Feb 19, 2026, 5:15 PM

#

lmao

analog steeple Feb 19, 2026, 5:16 PM

#

god forbid third world countries from touching latest tech 🙏

whole swallow Feb 19, 2026, 5:18 PM

#

WHEN DID 3.1 PRO COME OUT

proud bobcat Feb 19, 2026, 5:19 PM

#

surreal zephyr welp

Try qwen 3.5 lmao

bleak lake Feb 19, 2026, 5:20 PM

#

whole swallow Feb 19, 2026, 5:20 PM

#

Is opus 4.6 that good?

bleak lake Feb 19, 2026, 5:20 PM

#

echo aurora Feb 19, 2026, 5:22 PM

#

whole swallow WHEN DID 3.1 PRO COME OUT

TODAY

proud bobcat Feb 19, 2026, 5:22 PM

#

whole swallow Is opus 4.6 that good?

Yes

shell pewter Feb 19, 2026, 5:24 PM

#

anyone want to share how does gemini 3.1 vs opus 4.6 feel?

bleak lake Feb 19, 2026, 5:24 PM

#

shell pewter anyone want to share how does gemini 3.1 vs opus 4.6 feel?

Benchmarks or tests?

stray aspen Feb 19, 2026, 5:25 PM

#

@echo auroracan max route you to gemini 3.1

proud bobcat Feb 19, 2026, 5:26 PM

#

shell pewter anyone want to share how does gemini 3.1 vs opus 4.6 feel?

Gemini 3.1 is great in everything except coding

#

Claude Opus 4.6 just gives you the same performance

#

Oh and Gemini is slow ahh hell

shell pewter Feb 19, 2026, 5:27 PM

#

bleak lake Benchmarks or tests?

as in did any user test it extensively to have some own opinion, cus the benchmark from google said it really good but the rank on arena is a mixed bag innit

shell pewter Feb 19, 2026, 5:27 PM

#

proud bobcat Gemini 3.1 is great in everything except coding

thx man its always you haha 😂 🙏 👍

proud bobcat Feb 19, 2026, 5:27 PM

#

Got no babes…

#

Only ai…

shell pewter Feb 19, 2026, 5:28 PM

#

proud bobcat Got no babes…

same lmaooo 💀

proud bobcat Feb 19, 2026, 5:28 PM

#

I love how my breakup was the reason I became invested into llms

#

I literally had no other hobby

shell pewter Feb 19, 2026, 5:28 PM

#

proud bobcat Oh and Gemini is slow ahh hell

really gemini 3.0 was the fastest among gpt and opus for me
it gotten slower?

proud bobcat Feb 19, 2026, 5:29 PM

#

Gemini 3.0 still peak

shell pewter Feb 19, 2026, 5:29 PM

#

proud bobcat Gemini 3.0 still peak

but for grounding i like gpt 5.2 more than gemini 3 search tho

stray aspen Feb 19, 2026, 5:30 PM

#

grok 4.2 is the best research model

proud bobcat Feb 19, 2026, 5:30 PM

#

GPT 5.2 search is great if it would USE IT MORE OFTEN

shell pewter Feb 19, 2026, 5:30 PM

#

stray aspen grok 4.2 is the best research model

is it i haven't use it actually

proud bobcat Feb 19, 2026, 5:30 PM

#

stray aspen grok 4.2 is the best research model

Agreed

#

Grok makes the best research models

limber panther Feb 19, 2026, 5:30 PM

#

penguin riding a motorbike

shell pewter Feb 19, 2026, 5:30 PM

#

proud bobcat Grok makes the best research models

alright guess i gotta check it out then thx bud

stray aspen Feb 19, 2026, 5:30 PM

#

shell pewter is it i haven't use it actually

yeah sometimes it goes through 250+ sources

odd geyser Feb 19, 2026, 5:31 PM

#

Why is Claude 4.6 better than Gemini 3.0 pro in the text?

proud bobcat Feb 19, 2026, 5:32 PM

#

odd geyser Why is Claude 4.6 better than Gemini 3.0 pro in the text?

It’s just peak

crystal mica Feb 19, 2026, 5:32 PM

#

i will honestly say. gemini 3.1 pro is great in terms of solving math and etc. problems, but i hate it speech type and position

odd geyser Feb 19, 2026, 5:33 PM

#

proud bobcat It’s just peak

Oh, good then. I just didn't understand why it was better if it looked like it was going to be worse.

proud bobcat Feb 19, 2026, 5:33 PM

#

crystal mica i will honestly say. gemini 3.1 pro is great in terms of solving math and etc. p...

Haven’t tried it too much yet

#

But it really goes crazy in coding

Not in a good way.

#

It installs way too many packages

#

Is slow

sonic swallow Feb 19, 2026, 5:34 PM

#

gerar Prompt de vídeo profissional de uma pessoa que faz flexão rápido

proud bobcat Feb 19, 2026, 5:34 PM

#

Lane

whole swallow Feb 19, 2026, 5:34 PM

#

crystal mica i will honestly say. gemini 3.1 pro is great in terms of solving math and etc. p...

Overwhelming right?

rigid copper Feb 19, 2026, 5:34 PM

#

hi guys

proud bobcat Feb 19, 2026, 5:34 PM

#

It took it a solid 3 seconds to make one line of code

odd geyser Feb 19, 2026, 5:34 PM

#

crystal mica i will honestly say. gemini 3.1 pro is great in terms of solving math and etc. p...

Idk. I like the style of gemini models.

odd geyser Feb 19, 2026, 5:34 PM

#

rigid copper hi guys

Hi.

rigid copper Feb 19, 2026, 5:35 PM

#

odd geyser Hi.

i'm trying to deal with the error in my arena.ai side lol

proud bobcat Feb 19, 2026, 5:35 PM

#

Also Gemini 3.1 is heavily token efficient but not in a good way

#

It has too many shortcuts when coding

#

Claude is thorough

#

Every other model is thorough

rigid copper Feb 19, 2026, 5:36 PM

#

umm @echo aurora i would like to get some help with this

stray aspen Feb 19, 2026, 5:36 PM

#

proud bobcat It has too many shortcuts when coding

it puts like 30 lines of code in a single one just to reduce total line count

#

🥀

surreal zephyr Feb 19, 2026, 5:36 PM

#

bleak lake

proud bobcat Feb 19, 2026, 5:36 PM

#

stray aspen it puts like 30 lines of code in a single one just to reduce total line count

Real

odd geyser Feb 19, 2026, 5:36 PM

#

rigid copper i'm trying to deal with the error in my arena.ai side lol

Hmm. Idk. Maybe the limit, but I'm not sure.

rigid copper Feb 19, 2026, 5:36 PM

#

rigid copper umm <@283397944160550928> i would like to get some help with this

the error was already like 3-4 hours and the conditions still the same

surreal zephyr Feb 19, 2026, 5:36 PM

#

stray aspen it puts like 30 lines of code in a single one just to reduce total line count

maybe garbage system prompt?

proud bobcat Feb 19, 2026, 5:37 PM

#

THE CODE FOR THE MAP IT MADE IS BARELY 64 LINES LONG

#

WHATTT

rigid copper Feb 19, 2026, 5:37 PM

#

odd geyser Hmm. Idk. Maybe the limit, but I'm not sure.

i don't think it's rate limit

royal sail Feb 19, 2026, 5:37 PM

#

proud bobcat Oh and Gemini is slow ahh hell

I feel like it's actually not bad speed wise considering it's so good in regards of verbosity

#

Definitely faster than Opus 4.6

proud bobcat Feb 19, 2026, 5:37 PM

#

Opus 4.6 was faster for me

echo aurora Feb 19, 2026, 5:37 PM

#

rigid copper i'm trying to deal with the error in my arena.ai side lol

It's hard for me to say for sure with the information I have, but based off this image I wouldn't be surprised if this was rate limit. Regardless, you'll want to follow the steps outlined in this message: #1417174113092374689 message

rigid copper Feb 19, 2026, 5:37 PM

#

if it's a rate limit, it shouldn't affect other model like claude since i often use gemini 3 pro

echo aurora Feb 19, 2026, 5:38 PM

#

rigid copper if it's a rate limit, it shouldn't affect other model like claude since i often ...

Note that there is a rate limit for specific model use, but also a rate limit for overall modality use.

proud bobcat Feb 19, 2026, 5:38 PM

#

One thing I do like is that Gemini 3.1 bug checks after it makes files

#

That’s good

echo aurora Feb 19, 2026, 5:39 PM

#

proud bobcat One thing I do like is that Gemini 3.1 bug checks after it makes files

Without being asked to?

proud bobcat Feb 19, 2026, 5:39 PM

#

Mhmm

#

Yeah real

#

Oh

#

It finished coding

#

Let’s see

#

Yeah it’s garbage

#

It’s complete garbage

#

Oh my god

#

Hold on let me send the link

soft matrix Feb 19, 2026, 5:40 PM

#

Opus 4.6 is good with some languages💗

proud bobcat Feb 19, 2026, 5:40 PM

#

https://019c76f6-61ee-793f-938e-620cf50eca9d.arena.site/

3D FPS Survival Game

Check out what I built in Arena's Code Arena - Content is user-generated and unverified

#

This is god awful

shell pewter Feb 19, 2026, 5:41 PM

#

proud bobcat This is god awful

kekw

proud bobcat Feb 19, 2026, 5:41 PM

#

Lmarena isn’t nerfed

#

It’s the same model from Google api

shell pewter Feb 19, 2026, 5:42 PM

#

so code is still opus (how good is sonnet btw?)
search is grok?

proud bobcat Feb 19, 2026, 5:42 PM

#

Studio sucks right now

#

Arena has seemingly been better

#

Lmao let me get Claude sonnet to do the same thing

shell pewter Feb 19, 2026, 5:43 PM

#

its really hard to just use logic without any search, or is it just me?

#

i agree actually

proud bobcat Feb 19, 2026, 5:43 PM

#

Gemini losing the plot

rigid copper Feb 19, 2026, 5:43 PM

#

echo aurora Note that there is a rate limit for specific model use, but also a rate limit fo...

it's normal for specific model to hit rate limit, but at my side, all model are unusable due to error as well (it happen to all model and even the model that i have not being interacting before)

proud bobcat Feb 19, 2026, 5:44 PM

#

Claude is cooking

shell pewter Feb 19, 2026, 5:44 PM

#

proud bobcat Gemini losing the plot

hey brada would you say
code is still opus and research is grok?
what else? how good is sonnet btw?

#

yes 100% this

proud bobcat Feb 19, 2026, 5:45 PM

#

shell pewter hey brada would you say code is still opus and research is grok? what else? how...

Sonnet is fire

#

I think a mix of opus and kimi right now is goated

#

Maybe GLM 5?

#

Research is always grok

#

Grok crushes research

#

Oh my god sonnet is COOKING

echo aurora Feb 19, 2026, 5:45 PM

#

rigid copper it's normal for specific model to hit rate limit, but at my side, all model are ...

Yeah this sounds like it could be something else. I assume this is happening with different modalities as well?

Note I would like to move this convo to #1417174113092374689 so we can keep #general clear for chatting.

plucky sparrow Feb 19, 2026, 5:46 PM

#

no opus deep think is best at research

#

but it's also super expensive

shell pewter Feb 19, 2026, 5:46 PM

#

proud bobcat Sonnet is fire

how much is sonnet performance of opus?

shell pewter Feb 19, 2026, 5:46 PM

#

plucky sparrow but it's also super expensive

yup this

inner relic Feb 19, 2026, 5:46 PM

#

Any leaks about deepseek v4?

#

The whale is really quiet

proud bobcat Feb 19, 2026, 5:46 PM

#

shell pewter how much is sonnet performance of opus?

I’d say 90%

proud bobcat Feb 19, 2026, 5:46 PM

#

inner relic Any leaks about deepseek v4?

Hopefully today?

wind hinge Feb 19, 2026, 5:46 PM

#

/image

rigid copper Feb 19, 2026, 5:46 PM

#

echo aurora Yeah this sounds like it could be something else. I assume this is happening wit...

possibility

shell pewter Feb 19, 2026, 5:47 PM

#

proud bobcat I’d say 90%

thats really good then, consider that 4.6 sonnet is already better than 4.5 opus right?

rigid copper Feb 19, 2026, 5:47 PM

#

tried all model and it won't work :/

proud bobcat Feb 19, 2026, 5:47 PM

#

shell pewter thats really good then, consider that 4.6 sonnet is already better than 4.5 opus...

A bit yes

#

I’d say sonnet is 5-10% better than 4.5 opus

stone tundra Feb 19, 2026, 5:47 PM

#

why is 3.1

#

so low

#

😢

proud bobcat Feb 19, 2026, 5:47 PM

#

stone tundra 😢

Gemini sucks…

#

It’s joever

shell pewter Feb 19, 2026, 5:47 PM

#

I have been using AI for research and code, so for now i guess i only need to use grok and opus/sonnet?

stone tundra Feb 19, 2026, 5:48 PM

#

text leaderboard

proud bobcat Feb 19, 2026, 5:48 PM

#

LMAO I IUST REALIZED GEMINI 3.1 PRO SKIPPED MAKING THE ACTUAL HUD AND GAMEPLAY

#

BRUHHH

#

Ass model

shell pewter Feb 19, 2026, 5:48 PM

#

stone tundra text leaderboard

i guess we need to give it a little bit of time for people to vote more?🤔

#

thx mate 🙏

stone tundra Feb 19, 2026, 5:49 PM

#

lol

astral vortex Feb 19, 2026, 5:49 PM

#

rigid copper possibility

me too

stone tundra Feb 19, 2026, 5:49 PM

#

yeah

#

idk how it fialed os badly

lofty quartz Feb 19, 2026, 5:49 PM

#

मनुष्य किसी भी दुःख को, सहन कर सकता हैं, लेकिन गृह क्लेश उसकी, आत्मा को निचोड़ देती हैं..!

inner relic Feb 19, 2026, 5:49 PM

#

I have to agree, Gemini 3.1 isnt that good

proud bobcat Feb 19, 2026, 5:49 PM

#

It’s such a downgrade

#

Remember how crazy 3.0 was at coding to the point we were blown away

shell pewter Feb 19, 2026, 5:49 PM

#

inner relic I have to agree, Gemini 3.1 isnt that good

the benchmark numbers from google looks good idk 🤣

crystal mica Feb 19, 2026, 5:49 PM

#

odd geyser Idk. I like the style of gemini models.

i like style of all gemini models , except gemini 3.1 pro

astral vortex Feb 19, 2026, 5:50 PM

#

inner relic I have to agree, Gemini 3.1 isnt that good

isnt is nano banana pro

astral vortex Feb 19, 2026, 5:50 PM

#

crystal mica i like style of all gemini models , except gemini 3.1 pro

why

crystal mica Feb 19, 2026, 5:50 PM

#

whole swallow Overwhelming right?

it refuses to understand my meta-thought

inner relic Feb 19, 2026, 5:50 PM

#

Alright, Let me guess, I think deepseek v4 releases tommorow

shell pewter Feb 19, 2026, 5:50 PM

#

proud bobcat Remember how crazy 3.0 was at coding to the point we were blown away

nah i think it wasn't that good still, the point for me was opus 4.5, that was actually usable good

astral vortex Feb 19, 2026, 5:50 PM

#

crystal mica it refuses to understand my meta-thought

I make use of json prompts to enhance and make it understand

proud bobcat Feb 19, 2026, 5:50 PM

#

DUDE

crystal mica Feb 19, 2026, 5:50 PM

#

astral vortex I make use of json prompts to enhance and make it understand

can you tell me any?

proud bobcat Feb 19, 2026, 5:50 PM

#

SONNET COOKED GEMINI BY A MILE

#

https://019c76ff-e7e8-7a61-aa48-00b1950d77cb.arena.site/

NEXUS STRIKER — 3D FPS

Check out what I built in Arena's Code Arena - Content is user-generated and unverified

#

LOOK AT THIS

rigid copper Feb 19, 2026, 5:51 PM

#

astral vortex me too

not sure is it i'm the only one who get this problem in arena.ai or anyone else got the same

inner relic Feb 19, 2026, 5:51 PM

#

proud bobcat https://019c76ff-e7e8-7a61-aa48-00b1950d77cb.arena.site/

Gemini 3.1 is cooked

balmy mist Feb 19, 2026, 5:51 PM

#

https://x.com/OfficialLoganK/status/2024516665740833156

Logan Kilpatrick (@OfficialLoganK)

Introducing Gemini 3.1 Pro, our new SOTA model across most reasoning, coding, and stem use cases!

astral vortex Feb 19, 2026, 5:51 PM

#

crystal mica can you tell me any?

📎 message.txt

balmy mist Feb 19, 2026, 5:52 PM

#

anyone tried g3.1?

astral vortex Feb 19, 2026, 5:52 PM

#

rigid copper not sure is it i'm the only one who get this problem in arena.ai or anyone else ...

no it is general

shell pewter Feb 19, 2026, 5:52 PM

#

proud bobcat https://019c76ff-e7e8-7a61-aa48-00b1950d77cb.arena.site/

yeah thats pretty good eh

proud bobcat Feb 19, 2026, 5:52 PM

#

balmy mist anyone tried g3.1?

It’s ass

#

Like

#

I’m not even hating

#

It’s just ass

balmy mist Feb 19, 2026, 5:52 PM

#

how??

#

like worse than g3?

proud bobcat Feb 19, 2026, 5:52 PM

#

Terrible coding ability

#

Slow as hell

astral vortex Feb 19, 2026, 5:53 PM

#

crystal mica can you tell me any?

have u seen it

crystal mica Feb 19, 2026, 5:53 PM

#

astral vortex have u seen it

yes, thanks

proud bobcat Feb 19, 2026, 5:53 PM

#

balmy mist like worse than g3?

Yep

balmy mist Feb 19, 2026, 5:53 PM

#

wtf

#

no way

cunning birch Feb 19, 2026, 5:53 PM

#

balmy mist like worse than g3?

Yeah

balmy mist Feb 19, 2026, 5:53 PM

#

smh

proud bobcat Feb 19, 2026, 5:53 PM

#

Sonnet cooked it

cunning birch Feb 19, 2026, 5:53 PM

#

Google always make bad models on february

#

Gemini2 and 3.1

royal sail Feb 19, 2026, 5:54 PM

#

proud bobcat Slow as hell

This is just slamming the model for nothing lol

proud bobcat Feb 19, 2026, 5:54 PM

#

If this was Gemini 3.1 flash it would’ve been understandable

royal sail Feb 19, 2026, 5:54 PM

#

The model is not that bad

proud bobcat Feb 19, 2026, 5:54 PM

#

It is

#

It IS

#

Sonnet wiped the floor with it in less time and 10 times higher quality code

royal sail Feb 19, 2026, 5:54 PM

#

The only problem it has is the verbosity and uncommon syntax errors

royal sail Feb 19, 2026, 5:54 PM

#

proud bobcat Sonnet wiped the floor with it in less time and 10 times higher quality code

10 times higher quality based off what?

surreal zephyr Feb 19, 2026, 5:55 PM

#

WILD

proud bobcat Feb 19, 2026, 5:55 PM

#

royal sail 10 times higher quality based off what?

For one the game actually

Functions
Has working dynamic lighting
High quality code
Multiple files
Robust systems

#

Gemini 3.1 made a mess

#

https://019c76ff-e7e8-7a61-aa48-00b1950d77cb.arena.site/

NEXUS STRIKER — 3D FPS

Check out what I built in Arena's Code Arena - Content is user-generated and unverified

#

Here’s sonnet

royal sail Feb 19, 2026, 5:56 PM

#

It's a bit unfair to base an entire model's performance off 1 prompt no?

surreal zephyr Feb 19, 2026, 5:56 PM

#

proud bobcat Gemini 3.1 made a mess

it has awful system prompt

#

gemini sys prompt SUCKS

proud bobcat Feb 19, 2026, 5:56 PM

#

royal sail It's a bit unfair to base an entire model's performance off 1 prompt no?

Tried it twice

#

Failed both times

royal sail Feb 19, 2026, 5:56 PM

#

surreal zephyr gemini sys prompt SUCKS

this i can agree with lol

royal sail Feb 19, 2026, 5:56 PM

#

proud bobcat Tried it twice

same prompt?

proud bobcat Feb 19, 2026, 5:56 PM

#

Mhm

royal sail Feb 19, 2026, 5:56 PM

#

That's the problem

proud bobcat Feb 19, 2026, 5:56 PM

#

The third time the game functioned

royal sail Feb 19, 2026, 5:56 PM

#

You're basing your entire opinion of a model on a single prompt

proud bobcat Feb 19, 2026, 5:56 PM

#

On the same prompt

proud bobcat Feb 19, 2026, 5:57 PM

#

royal sail You're basing your entire opinion of a model on a single prompt

Not only that but community impressions here

#

Like ampro’s tank

surreal zephyr Feb 19, 2026, 5:57 PM

#

proud bobcat Like ampro’s tank

look at this

proud bobcat Feb 19, 2026, 5:57 PM

#

For a SOTA model that looks awful

surreal zephyr Feb 19, 2026, 5:57 PM

#

gemini is BETTER if you not use system prompt

#

its WAY worse with system prompt

proud bobcat Feb 19, 2026, 5:57 PM

#

Wow it really is

#

Oh my god

meager tinsel Feb 19, 2026, 5:57 PM

#

I do not think 2 hours of testing is nearly enough time to form a general finalized opinion on a large multi-modal model like Gemini. For me it's generated some pretty nice stuff but in completely different medium than what everyone else is posting.

proud bobcat Feb 19, 2026, 5:57 PM

#

Safety fine tuning fluff

fading atlas Feb 19, 2026, 5:58 PM

#

OK

surreal zephyr Feb 19, 2026, 5:58 PM

#

heres with system prompt

proud bobcat Feb 19, 2026, 5:58 PM

#

Wow you aren’t even lying

#

God damn

balmy mist Feb 19, 2026, 5:58 PM

#

did gemini deep think get updated as well?

surreal zephyr Feb 19, 2026, 5:58 PM

#

gemini is actually better if you tell it to work hard and ignore sys prompt

#

the sys prompt DEMANDS it to be cost efficient and token efficient

gleaming heath Feb 19, 2026, 5:59 PM

#

Fotor: Easy-to-use AI imaging editor at your fingertips. https://mobile.fotor.com/

Fotor

simple perch Feb 19, 2026, 6:00 PM

#

proud bobcat It’s ass

I was wondering why my photos don't look good in terms of quality anymore. I thought my app was bugged because it wasn't saving in 2K. Lol.

rigid copper Feb 19, 2026, 6:02 PM

#

@echo aurora starting a new chat does the fix, but that means i can't use the old chat anymore because my project (usually from code arena) still going on.

#

anyway thanks for the tip

astral vortex Feb 19, 2026, 6:02 PM

#

rigid copper <@283397944160550928> starting a new chat does the fix, but that means i can't u...

Kk

rigid copper Feb 19, 2026, 6:03 PM

#

astral vortex Kk

try to start the new chat in arena.ai, but that means you will have to start it over.

mortal vale Feb 19, 2026, 6:09 PM

#

@random ginkgo Note that Video Arena has been removed from the server. More information can be found in this #announcements

surreal zephyr Feb 19, 2026, 6:10 PM

#

echo aurora Feb 19, 2026, 6:11 PM

#

We’re exploring how occasional Battles in Direct chat might work. Our mission is to measure and advance the frontier of AI for real-world use, and integrating Battles into Direct is a meaningful step in that direction. The help center's article about the experiment can be found here.

Arena Experiments: Battles in Direct

We are currently experimenting with a new feature: Battles in Direct. When chatting In Direct mode, occasional Battles may appear. After casting

proud bobcat Feb 19, 2026, 6:13 PM

#

Gemini subreddit glazing the hell out of 3.1 pro

#

#

Woah it can make an svg…

coral axle Feb 19, 2026, 6:13 PM

#

Hey everyone. Trying to use opus4-6 here in the arena, but it's clearly not the right model. It's running some version that doesn't even know its own existence lol. Any routing or deployment issues going on?

shadow prairie Feb 19, 2026, 6:13 PM

#

Hi everyone, what can these lags be related to?

fringe carbon Feb 19, 2026, 6:14 PM

#

ugh

#

anyone surprised 3.1 didn't top

surreal zephyr Feb 19, 2026, 6:15 PM

#

fringe carbon anyone surprised 3.1 didn't top

look

shadow prairie Feb 19, 2026, 6:15 PM

#

I have stupid lags.

#

as shown in the screenshot

rotund seal Feb 19, 2026, 6:15 PM

#

fringe carbon anyone surprised 3.1 didn't top

no, it did?

fringe carbon Feb 19, 2026, 6:15 PM

#

surreal zephyr look

look at what?

echo aurora Feb 19, 2026, 6:15 PM

#

shadow prairie Hi everyone, what can these lags be related to?

You'll want to follow the steps in this article - #1417174113092374689 message

half mist Feb 19, 2026, 6:15 PM

#

echo aurora We’re exploring how occasional Battles in Direct chat might work. Our mission is...

This is single‑handedly one of the least helpful features I’ve come across.

surreal zephyr Feb 19, 2026, 6:16 PM

#

fringe carbon look at what?

its worse than gpt 4o with default prompt

fringe carbon Feb 19, 2026, 6:16 PM

#

surreal zephyr its worse than gpt 4o with default prompt

you didn't label the pics

echo aurora Feb 19, 2026, 6:16 PM

#

coral axle Hey everyone. Trying to use opus4-6 here in the arena, but it's clearly not the ...

This is a typical hallucination. Essentially the model isn't able to perceive it's own existence when it's being trained, as it doesn't yet exist.

mild dagger Feb 19, 2026, 6:17 PM

#

why i could not make video?

half mist Feb 19, 2026, 6:17 PM

#

half mist This is single‑handedly one of the least helpful features I’ve come across.

Battles should only appear if the user is in Battle mode

echo aurora Feb 19, 2026, 6:17 PM

#

mild dagger why i could not make video?

The Video Arena bot was removed from the server. Video Arena is only on the site now.

shadow prairie Feb 19, 2026, 6:17 PM

#

echo aurora You'll want to follow the steps in this article - https://discord.com/channels/1...

I'll try to follow the guides, but it's unlikely to work out.

#

for Russia ❤️

mild dagger Feb 19, 2026, 6:18 PM

#

echo aurora The Video Arena bot was removed from the server. Video Arena is only on the site...

Link?

robust sonnet Feb 19, 2026, 6:18 PM

#

Yo guys getting like mad errors

surreal zephyr Feb 19, 2026, 6:19 PM

#

fringe carbon you didn't label the pics

I did but ur in dark mode

echo aurora Feb 19, 2026, 6:19 PM

#

shadow prairie I'll try to follow the guides, but it's unlikely to work out.

If it doesn't work be sure to follow Step #3. This is the step that helps us by getting us more information. Think of the Something went wrong error like a car's check engine light. A mechanic wouldn't be able to tell you what's wrong with just with the information that the check engine light is on.

echo aurora Feb 19, 2026, 6:19 PM

#

mild dagger Link?

https://arena.ai/?chat-modality=video

Arena | Benchmark & Compare the Best AI Models

Chat with multiple AI models side-by-side. Compare ChatGPT, Claude, Gemini, and other top LLMs. Crowdsourced benchmarks and leaderboards.

surreal zephyr Feb 19, 2026, 6:19 PM

#

fringe carbon you didn't label the pics

shadow prairie Feb 19, 2026, 6:20 PM

#

echo aurora If it doesn't work be sure to follow Step #3. This is the step that helps us by ...

It's just that I have a huge project there, and if I create a chat, I can throw him a repository that he would copy?

stray aspen Feb 19, 2026, 6:21 PM

#

surreal zephyr

thats insane

echo aurora Feb 19, 2026, 6:21 PM

#

half mist This is single‑handedly one of the least helpful features I’ve come across.

I'm sorry to hear you're not a fan. It's worth noting that this is still in the experiment phase, so there could be changes before being fully rolled out.

surreal zephyr Feb 19, 2026, 6:21 PM

#

stray aspen thats insane

yup

coral axle Feb 19, 2026, 6:22 PM

#

I get the pre-training argument, but in production, a properly fine-tuned model should have its identity explicitly defined in the System Prompt. If it's guessing whether it's Sonnet or Opus, it usually means the system prompt injection is either missing or misconfigured on your end, not just a random pre-training hallucination. Can you check the prompt wrapper?

surreal zephyr Feb 19, 2026, 6:22 PM

#

echo aurora I'm sorry to hear you're not a fan. It's worth noting that this is still in the ...

i agree with him - the whole point of direct is no battles

half mist Feb 19, 2026, 6:23 PM

#

echo aurora I'm sorry to hear you're not a fan. It's worth noting that this is still in the ...

At least make it so you can turn the feature off somewhere (if it does get fully rolled out)

drowsy mural Feb 19, 2026, 6:24 PM

#

https://discord.com/channels/1340554757349179412/1474108826390106124

#

I don't even want to say anything—this feature is utterly ridiculous.

coral axle Feb 19, 2026, 6:27 PM

#

@echo aurora ?

fringe carbon Feb 19, 2026, 6:29 PM

#

when you vote in side by side does that count?

#

or does it need to be in battle

#

i thought only battle counted

#

but it lets you vote side by side which is weird

echo aurora Feb 19, 2026, 6:30 PM

#

coral axle I get the pre-training argument, but in production, a properly fine-tuned model ...

I can flag this, but this tends to be pretty common. I've seen this happen with a lot of different models.

half mist Feb 19, 2026, 6:30 PM

#

No, no it does not. It’s a new Experiment that puts Battles in Direct Chat that is pointless in my opinion

echo aurora Feb 19, 2026, 6:30 PM

#

fringe carbon when you vote in side by side does that count?

Only votes in Battle are what power the leaderboards.

fringe carbon Feb 19, 2026, 6:30 PM

#

echo aurora Only votes in Battle are what power the leaderboards.

what is the vote for in side by side then?

sleek crow Feb 19, 2026, 6:30 PM

#

bruh

coral axle Feb 19, 2026, 6:33 PM

#

@echo aurora Hey team, following up on the model identity issue with some visual proof.

Image 1: The official UI. The model knows exactly what it is (3.1 Pro) because the system prompt is properly configured.
Image 2: Your gemini-3.1-pro-preview endpoint. It's defaulting to a generic, outdated response (claiming it might be 1.5) because it clearly lacks a proper system prompt wrapper.

By exposing these raw endpoints without injecting the correct identity context, you're essentially lobotomizing state-of-the-art models. Users come here to benchmark, and when a model acts lost, they blame the AI companies, not the platform's infrastructure. It's tarnishing the models' reputations. You really need to update your system prompts for these new endpoints.

stray aspen Feb 19, 2026, 6:34 PM

#

where can i utilize gemini 3.1 no system prompt

stray aspen Feb 19, 2026, 6:34 PM

#

coral axle <@283397944160550928> Hey team, following up on the model identity issue with so...

holy yap

proud bobcat Feb 19, 2026, 6:36 PM

#

coral axle <@283397944160550928> Hey team, following up on the model identity issue with so...

Standard hallucination

#

Models don’t know what they are

mortal vale Feb 19, 2026, 6:36 PM

#

@nocturne turtle Note that Video Arena has been removed from the server. More information can be found in this #announcements

stray aspen Feb 19, 2026, 6:36 PM

#

proud bobcat Models don’t know what they are

yeah what is tguy yapping aobut

whole swallow Feb 19, 2026, 6:37 PM

#

coral axle <@283397944160550928> Hey team, following up on the model identity issue with so...

Buddy that's not how it works

proud bobcat Feb 19, 2026, 6:37 PM

#

stray aspen yeah what is tguy yapping aobut

Probably jus doesn’t know about llms

whole swallow Feb 19, 2026, 6:37 PM

#

If I tell gemini that it's gemini 4.5 it doesn't become magically more powerful

coral axle Feb 19, 2026, 6:37 PM

#

@proud bobcat Models don't know what they are unless you pass a proper system prompt at inference time. The official UI injects it. Your API wrapper clearly doesn't. Calling a missing system prompt a 'standard hallucination' is wild for a testing platform. But anyway, good luck with the benchmarks.

#

@whole swallow I never said a system prompt magically boosts capabilities. That's a strawman. I said it grounds the model's identity. If you don't inject the correct context (like the official UI does), the model falls back to older SFT data and hallucinates an outdated version. It's a basic context injection issue, not a capability debate. But I'll leave it at that.

ancient elk Feb 19, 2026, 6:40 PM

#

📎 message.txt

lost basalt Feb 19, 2026, 6:41 PM

#

This is the worst possible thing arena.ai could do with there loyal members, lost everything.

stray aspen Feb 19, 2026, 6:41 PM

#

@surreal zephyr where can i use the best version of gemini 31

surreal zephyr Feb 19, 2026, 6:41 PM

#

stray aspen <@1035834558681186347> where can i use the best version of gemini 31

api

#

and tell it to put max effort ect

stray aspen Feb 19, 2026, 6:42 PM

#

lm arena?

surreal zephyr Feb 19, 2026, 6:42 PM

#

no shortcuts

#

yea

stray aspen Feb 19, 2026, 6:42 PM

#

ok

surreal zephyr Feb 19, 2026, 6:42 PM

#

its trained to be lazy for money

hollow imp Feb 19, 2026, 6:42 PM

#

stray aspen <@1035834558681186347> where can i use the best version of gemini 31

Vertex

surreal zephyr Feb 19, 2026, 6:42 PM

#

you have to tell it to put more effort

hollow imp Feb 19, 2026, 6:42 PM

#

lost basalt This is the worst possible thing arena.ai could do with there loyal members, los...

How did you lose everything

coral axle Feb 19, 2026, 6:43 PM

#

Look, I was just trying to give you a heads-up so you could actually update your system prompts. I have zero interest in using this outdated, misconfigured crap anyway. I'll just take this to social media to expose how you're doing false advertising and dragging these AI companies' names through the mud by serving crippled models. Have fun with your broken benchmarks. I'm out.

stray aspen Feb 19, 2026, 6:43 PM

#

coral axle Look, I was just trying to give you a heads-up so you could actually update your...

dude quit yapping

#

do wahtever you want

hollow imp Feb 19, 2026, 6:43 PM

#

@surreal zephyr exhaustively enjoy 3.1 while you can, it's gonna be nerfed to trash after 2 days

stray aspen Feb 19, 2026, 6:44 PM

#

ive never seen someone so upset because a model doesnt tell you its name

surreal zephyr Feb 19, 2026, 6:44 PM

#

hollow imp <@1035834558681186347> exhaustively enjoy 3.1 while you can, it's gonna be nerfe...

yup

lost basalt Feb 19, 2026, 6:44 PM

#

hollow imp How did you lose everything

I had a month old chat history which i build with time, now i can't build it again. It took alot of time. i tried to create a master prompt for a new chat but it doesn't work the same.

hollow imp Feb 19, 2026, 6:45 PM

#

lost basalt I had a month old chat history which i build with time, now i can't build it aga...

What problem you're facing mr white444

lost basalt Feb 19, 2026, 6:45 PM

#

It randomly starts a battle and now claude 4.6 also has a rate limit which asks to wait for 21 minutes.

hollow imp Feb 19, 2026, 6:45 PM

#

lost basalt It randomly starts a battle and now claude 4.6 also has a rate limit which asks ...

Why can't you wait

#

Do you have to catch a train?

lost basalt Feb 19, 2026, 6:46 PM

#

hollow imp What problem you're facing mr white444

& Something went wrong with this response, please try again.

stray aspen Feb 19, 2026, 6:46 PM

#

lost basalt It randomly starts a battle and now claude 4.6 also has a rate limit which asks ...

its what it is buddy

sick mantle Feb 19, 2026, 6:46 PM

#

lost basalt It randomly starts a battle and now claude 4.6 also has a rate limit which asks ...

Happends with Video too

stray aspen Feb 19, 2026, 6:46 PM

#

if you want more pay aa claude subcription

coral axle Feb 19, 2026, 6:46 PM

#

kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk

hollow imp Feb 19, 2026, 6:47 PM

#

lost basalt & Something went wrong with this response, please try again.

If the something went wrong with this response error is not fixing after repeated generations switch the model and continue the chat

#

You get enough rate limits that you can copy paste every single message from the error chat instead of scrolling instagram and wasting time

coral axle Feb 19, 2026, 6:47 PM

#

just to say it's one thing when it clearly isn't

lost basalt Feb 19, 2026, 6:47 PM

#

stray aspen if you want more pay aa claude subcription

you're right but you can't make these changes suddenly.

hollow imp Feb 19, 2026, 6:47 PM

#

1 blocked message

stray aspen Feb 19, 2026, 6:47 PM

#

wdym its always been there

pale sonnet Feb 19, 2026, 6:47 PM

#

3.1 pro is soo good

lost basalt Feb 19, 2026, 6:48 PM

#

hollow imp You get enough rate limits that you can copy paste every single message from the...

Already have done it but it starts battle mode during chat chats which is more annoying

sick mantle Feb 19, 2026, 6:48 PM

#

hollow imp 1 blocked message

'A blocked message from taco bell.'

stray aspen Feb 19, 2026, 6:48 PM

#

pale sonnet 3.1 pro is soo good

where are you using it

hollow imp Feb 19, 2026, 6:48 PM

#

stray aspen where are you using it

Vertex bro

#

Vertex ai

stray aspen Feb 19, 2026, 6:48 PM

#

vertex aint free

hollow imp Feb 19, 2026, 6:48 PM

#

It is

sick mantle Feb 19, 2026, 6:48 PM

#

stray aspen vertex aint free

Yea yea we know

hollow imp Feb 19, 2026, 6:48 PM

#

200$ google cloud free credits

pale sonnet Feb 19, 2026, 6:48 PM

#

stray aspen where are you using it

9/10 ragebait

hollow imp Feb 19, 2026, 6:48 PM

#

sick mantle Yea yea we know

Chicken go lay eggs

sick mantle Feb 19, 2026, 6:49 PM

#

hollow imp Chicken go lay eggs

Already did, i have 1000s of eggs

proud bobcat Feb 19, 2026, 6:49 PM

#

coral axle <@690672903292977153> Models don't know what they are unless you pass a proper s...

Dude

lost basalt Feb 19, 2026, 6:49 PM

#

Can only wait, maybe they will fix it. we can only criticize

proud bobcat Feb 19, 2026, 6:49 PM

#

Even with a system prompt

#

It still doesn’t know

#

Gemini has told me it was 1.5 pro on the official app

shadow prairie Feb 19, 2026, 6:49 PM

#

Oh gods, nothing helped how to live?

stray aspen Feb 19, 2026, 6:50 PM

#

lmarena's gemini 3.1 is way better than the one on gemini app and ai studio

pale sonnet Feb 19, 2026, 6:50 PM

#

shadow prairie Oh gods, nothing helped how to live?

skill issue?

proud bobcat Feb 19, 2026, 6:50 PM

#

stray aspen lmarena's gemini 3.1 is way better than the one on gemini app and ai studio

Yeah real

sick mantle Feb 19, 2026, 6:50 PM

#

lost basalt Can only wait, maybe they will fix it. we can only criticize

Bro, it does not only happen with claude but with video too.

shadow prairie Feb 19, 2026, 6:50 PM

#

pale sonnet skill issue?

the problem with the error

lost basalt Feb 19, 2026, 6:50 PM

#

pale sonnet skill issue?

Nah, everyone is struggling here

quartz light Feb 19, 2026, 6:50 PM

#

stray aspen lmarena's gemini 3.1 is way better than the one on gemini app and ai studio

claude models feel like that too but its probably the system prompt idk

hollow imp Feb 19, 2026, 6:51 PM

#

@stray aspen vertex ai bro

#

https://docs.cloud.google.com/free/docs/free-cloud-features

Google Cloud Documentation

Free Google Cloud features and trial offer | Google Cloud Free ...

Learn about Google Cloud's free offerings and free trials, plus how to upgrade your account.

proud bobcat Feb 19, 2026, 6:51 PM

#

Claude processing 5 billion safety parameters before Something went wrong with this response, please try again.

shadow prairie Feb 19, 2026, 6:51 PM

#

He just can't give me the code! I've been doing this for an hour now.

proud bobcat Feb 19, 2026, 6:51 PM

#

shadow prairie He just can't give me the code! I've been doing this for an hour now.

Reset browser cookies

shadow prairie Feb 19, 2026, 6:51 PM

#

proud bobcat Reset browser cookies

I've tried

sick mantle Feb 19, 2026, 6:52 PM

#

hollow imp https://docs.cloud.google.com/free/docs/free-cloud-features

Gemini gives a free trial on gemini business for 30 days

hollow imp Feb 19, 2026, 6:52 PM

#

sick mantle Gemini gives a free trial on gemini business for 30 days

No this is different

#

300$ api credit

#

90 days

gleaming roost Feb 19, 2026, 6:53 PM

#

😊

stray aspen Feb 19, 2026, 6:53 PM

#

gemini business is overloaded

#

it sucks

pale sonnet Feb 19, 2026, 6:54 PM

#

https://tenor.com/view/patrick-spongebob-spongebob-meme-patrick-meme-dumb-patrick-gif-9974665538168463324

Tenor

shadow prairie Feb 19, 2026, 6:54 PM

#

Now endless coding

echo aurora Feb 19, 2026, 6:56 PM

#

lost basalt This is the worst possible thing arena.ai could do with there loyal members, los...

The rate limit just needs to be waited out and should start working again.

midnight marlin Feb 19, 2026, 6:56 PM

#

Are the battles still in direct mode? 💩

stray aspen Feb 19, 2026, 6:57 PM

#

yes

lost basalt Feb 19, 2026, 6:57 PM

#

echo aurora The rate limit just needs to be waited out and should start working again.

what about the Something went wrong with this response, please try again. ERROR?

#

shadow prairie Feb 19, 2026, 6:58 PM

#

lost basalt

IT'S THE SAME FOR ME

sick mantle Feb 19, 2026, 6:58 PM

#

lost basalt

I think pineapple has to delete the limt

stray aspen Feb 19, 2026, 6:58 PM

#

no

#

these models aint cheap

proud bobcat Feb 19, 2026, 6:59 PM

#

shadow prairie Now endless coding

Oh it’s SLOW

#

trust me

sick mantle Feb 19, 2026, 6:59 PM

#

lost basalt

Yea its the same for me

shadow prairie Feb 19, 2026, 6:59 PM

#

proud bobcat Oh it’s SLOW

bruuuhhh

#

Xs what should I do, I need to download the repository

echo aurora Feb 19, 2026, 7:00 PM

#

lost basalt what about the Something went wrong with this response, please try again. ERROR?

You'll want to follow these steps: #1417174113092374689 message Also note that the rate limit can result in the Something went wrong error.

echo aurora Feb 19, 2026, 7:01 PM

#

shadow prairie Oh gods, nothing helped how to live?

Yeah this is likely rate limit, spamming the chat when you're getting this error probably isn't helping.

hollow imp Feb 19, 2026, 7:02 PM

#

@surreal zephyr @stray aspen

#

https://gemini.google.com/share/5689d70d3c7c

stuck orchid Feb 19, 2026, 7:02 PM

#

Is Gemini 3.1 better than CLaude Opus 4.6 according to your tests?

stray aspen Feb 19, 2026, 7:02 PM

#

insane

#

whats the prompt

quartz light Feb 19, 2026, 7:03 PM

#

hollow imp https://gemini.google.com/share/5689d70d3c7c

really efficient too

surreal zephyr Feb 19, 2026, 7:04 PM

#

stuck orchid Is Gemini 3.1 better than CLaude Opus 4.6 according to your tests?

with proper prompt?

#

its not even close

#

bro

#

its literally AGI

quartz light Feb 19, 2026, 7:04 PM

#

hollow imp https://gemini.google.com/share/5689d70d3c7c

wow, it works on firefox so this could be used as liquid glass

surreal zephyr Feb 19, 2026, 7:04 PM

#

quartz light really efficient too

https://019c7741-bf0d-7a29-b05d-7f7e191208a3.arena.site/ see this water

Arena Web Dev App

Check out what I built in Arena's Code Arena - Content is user-generated and unverified

surreal zephyr Feb 19, 2026, 7:05 PM

#

surreal zephyr https://019c7741-bf0d-7a29-b05d-7f7e191208a3.arena.site/ see this water

gemini

quartz light Feb 19, 2026, 7:05 PM

#

surreal zephyr https://019c7741-bf0d-7a29-b05d-7f7e191208a3.arena.site/ see this water

bad test

stuck orchid Feb 19, 2026, 7:05 PM

#

Claude Opus 4.6 > gemini 3.1 pro?

#

Okay

hollow imp Feb 19, 2026, 7:05 PM

#

surreal zephyr its literally AGI

I have gemini deepthink which is based on 3.1 and it is NOT AGI

olive mesa Feb 19, 2026, 7:05 PM

#

3.1 pro was agi

#

before it got nerfed

stray aspen Feb 19, 2026, 7:05 PM

#

hollow imp https://gemini.google.com/share/5689d70d3c7c

so thats deepthink

#

?

hollow imp Feb 19, 2026, 7:05 PM

#

stray aspen so thats deepthink

No

#

Normal

stray aspen Feb 19, 2026, 7:05 PM

#

hollow imp No

slide your ptompt

#

prompt

#

bro what hte hell is trinity large

hollow imp Feb 19, 2026, 7:06 PM

#

stray aspen so thats deepthink

If you want to test Deepthink send me your prompt

woven peak Feb 19, 2026, 7:06 PM

#

what is trinity larg

#

large

#

is it good

brittle tiger Feb 19, 2026, 7:06 PM

#

https://x.com/voxelbench/status/2024558946023854437

Voxelbench (@voxelbench)

We have a new leader.
Gemini 3.1 Pro!

quartz light Feb 19, 2026, 7:06 PM

#

stray aspen prompt

WTF ITS ONLY 350 LINES LONG

woven peak Feb 19, 2026, 7:06 PM

#

stray aspen bro what hte hell is trinity large

exactly never even heard of it before

quartz light Feb 19, 2026, 7:06 PM

#

THE CODE

#

hushed gyro Feb 19, 2026, 7:07 PM

#

guys what the hell is trinity

quartz light Feb 19, 2026, 7:07 PM

#

quartz light Feb 19, 2026, 7:07 PM

#

hushed gyro guys what the hell is trinity

us based ai company

#

bad models

#

why?

#

oh they just released it on arena?

#

loooooooo

#

so late

hollow imp Feb 19, 2026, 7:08 PM

#

brittle tiger https://x.com/voxelbench/status/2024558946023854437

Opus 4.6 non thinking 😭

fickle venture Feb 19, 2026, 7:08 PM

#

Tf is trinity

quartz light Feb 19, 2026, 7:08 PM

#

fickle venture Tf is trinity

us based ai company

echo aurora Feb 19, 2026, 7:08 PM

#

woven peak is it good

Try it out and let us know

quartz light Feb 19, 2026, 7:08 PM

#

trinity large released a while ago

#

its not very good

#

but

fickle venture Feb 19, 2026, 7:09 PM

#

quartz light us based ai company

USA?

quartz light Feb 19, 2026, 7:09 PM

#

decent for first release

quartz light Feb 19, 2026, 7:09 PM

#

fickle venture USA?

yea?

#

why

fickle venture Feb 19, 2026, 7:09 PM

#

quartz light yea?

United States of America?

#

The heck that's a spying model

turbid timber Feb 19, 2026, 7:09 PM

#

is kimi better than chatgpt

woven peak Feb 19, 2026, 7:09 PM

#

turbid timber is kimi better than chatgpt

depends on the task and the chatgpt model you are talking about

fickle venture Feb 19, 2026, 7:10 PM

#

turbid timber is kimi better than chatgpt

If your a free user yes
If you are paid user no

woven peak Feb 19, 2026, 7:10 PM

#

woven peak depends on the task and the chatgpt model you are talking about

I think coding yes that's at least from what I've heard

stuck orchid Feb 19, 2026, 7:10 PM

#

brittle tiger Feb 19, 2026, 7:11 PM

#

i've been using 4.6 opus in antigravity for days and 3.1 pro is fixing problems opus wouldn't

celest orchid Feb 19, 2026, 7:11 PM

#

Who is trinity large

woven peak Feb 19, 2026, 7:11 PM

#

celest orchid Who is trinity large

USA company

olive mesa Feb 19, 2026, 7:11 PM

#

even the quantized version of 3.1 is better than 4.6

quartz light Feb 19, 2026, 7:12 PM

#

fickle venture The heck that's a spying model

yea idk why people are happy that a us based company released an ai model

celest orchid Feb 19, 2026, 7:12 PM

#

woven peak USA company

It's good ai?

olive mesa Feb 19, 2026, 7:12 PM

#

which is what we have rn

stray aspen Feb 19, 2026, 7:12 PM

#

celest orchid It's good ai?

no

#

just another open source model

woven peak Feb 19, 2026, 7:12 PM

#

olive mesa even the quantized version of 3.1 is better than 4.6

thinking or not?

#

(opus)

olive mesa Feb 19, 2026, 7:13 PM

#

woven peak thinking or not?

yes

#

at least from my and other people's tests

fickle venture Feb 19, 2026, 7:13 PM

#

quartz light yea idk why people are happy that a us based company released an ai model

Cuz now they aren't gonna see a random Chinese worker watching them and selling their data 😭

olive mesa Feb 19, 2026, 7:14 PM

#

the unquantized version which was only around for a short time at launch was also noticeably better than the 3.1 we currently have access to

surreal zephyr Feb 19, 2026, 7:14 PM

#

stuck orchid

bleak lake Feb 19, 2026, 7:14 PM

#

celest orchid It's good ai?

Trinity-Large-Preview is a frontier-scale open-weight language model from Arcee, built as a 400B-parameter sparse Mixture-of-Experts with 13B active parameters per token using 4-of-256 expert routing.

It excels in creative writing, storytelling, role-play, chat scenarios, and real-time voice assistance, better than your average reasoning model usually can. But we’re also introducing some of our newer agentic performance. It was trained to navigate well in agent harnesses like OpenCode, Cline, and Kilo Code, and to handle complex toolchains and long, constraint-filled prompts.

woven peak Feb 19, 2026, 7:14 PM

#

fickle venture Cuz now they aren't gonna see a random Chinese worker watching them and selling ...

idk if this is crazy to say but I'd rather let china collect my data not USA cuz at least it's not going to israel or something

bleak lake Feb 19, 2026, 7:14 PM

#

512k tokens btw

surreal zephyr Feb 19, 2026, 7:14 PM

#

olive mesa the unquantized version which was only around for a short time at launch was als...

try api

woven peak Feb 19, 2026, 7:15 PM

#

olive mesa the unquantized version which was only around for a short time at launch was als...

are u thinking about Deepthink or base model? cuz how could they quantize the day it releases, if u are thinking about base model then I'm assuming the short time is before it got added to Google ai studio?

olive mesa Feb 19, 2026, 7:15 PM

#

woven peak are u thinking about Deepthink or base model? cuz how could they quantize the da...

i'm talking about 3.1 pro w/ thinking

turbid timber Feb 19, 2026, 7:16 PM

#

fickle venture If your a free user yes If you are paid user no

what i really like about kimi is the agent mode

fickle venture Feb 19, 2026, 7:16 PM

#

woven peak idk if this is crazy to say but I'd rather let china collect my data not USA cuz...

Correct

turbid timber Feb 19, 2026, 7:16 PM

#

but i think ur right

zealous sparrow Feb 19, 2026, 7:16 PM

#

https://019c774c-8fa7-7a3f-bc95-f01faad869df.arena.site/
someone fight me

Click Battle: Multiplayer

Check out what I built in Arena's Code Arena - Content is user-generated and unverified

turbid timber Feb 19, 2026, 7:17 PM

#

chatgpt is definitely more worth paying for generally

fickle venture Feb 19, 2026, 7:17 PM

#

turbid timber what i really like about kimi is the agent mode

Btw are you using CLI agents cuz I got something interesting for you

zealous sparrow Feb 19, 2026, 7:17 PM

#

who beat me

fickle venture Feb 19, 2026, 7:18 PM

#

You can use GPT 5.3 codex xhigh for free (idk about limit)

If you got macos install codex
If you got windows / linux install codex CLI
or opencode and link your account there

shadow prairie Feb 19, 2026, 7:19 PM

#

echo aurora Yeah this is likely rate limit, spamming the chat when you're getting this error...

And what should I do in this case? Wait?

rocky mauve Feb 19, 2026, 7:19 PM

#

zealous sparrow https://019c774c-8fa7-7a3f-bc95-f01faad869df.arena.site/ someone fight me

Game code

sterile sun Feb 19, 2026, 7:19 PM

#

@echo aurora Why do you come with so much armour?

zealous sparrow Feb 19, 2026, 7:19 PM

#

rocky mauve Game code

i left you can make your lobby and share codes

stuck orchid Feb 19, 2026, 7:19 PM

#

zealous sparrow https://019c774c-8fa7-7a3f-bc95-f01faad869df.arena.site/ someone fight me

479f2ec5-b5f9-47fd-9a38-2495390686f3

echo aurora Feb 19, 2026, 7:19 PM

#

shadow prairie And what should I do in this case? Wait?

You should follow the steps in this message: #1417174113092374689 message

echo aurora Feb 19, 2026, 7:20 PM

#

sterile sun <@283397944160550928> Why do you come with so much armour?

pikaconfused

sterile sun Feb 19, 2026, 7:20 PM

#

echo aurora <:pikaconfused:398202117493620740>

Pineapple’s armour

rocky mauve Feb 19, 2026, 7:21 PM

#

stuck orchid 479f2ec5-b5f9-47fd-9a38-2495390686f3

Ggs u just smoked me

sterile sun Feb 19, 2026, 7:21 PM

#

I bet if a pineapple would have fell on netwon, we all would be floating today

leaden ivy Feb 19, 2026, 7:22 PM

#

Hey

echo aurora Feb 19, 2026, 7:23 PM

#

sterile sun I bet if a pineapple would have fell on netwon, we all would be floating today

Gravity defeated

hushed gyro Feb 19, 2026, 7:29 PM

#

oh no.... i thought this is a dream... BATTLE MODE IN DIRECT???!!!

proud bobcat Feb 19, 2026, 7:30 PM

#

surreal zephyr https://019c7741-bf0d-7a29-b05d-7f7e191208a3.arena.site/ see this water

its fine

normal abyss Feb 19, 2026, 7:30 PM

#

hushed gyro oh no.... i thought this is a dream... BATTLE MODE IN DIRECT???!!!

mhm yeah, sorry bro

surreal zephyr Feb 19, 2026, 7:30 PM

#

proud bobcat its fine

Its really good

proud bobcat Feb 19, 2026, 7:30 PM

#

hold on let me try studio

surreal zephyr Feb 19, 2026, 7:31 PM

#

sterile sun I bet if a pineapple would have fell on netwon, we all would be floating today

Pineapple is heavier so gravity would be stronger

surreal zephyr Feb 19, 2026, 7:31 PM

#

proud bobcat hold on let me try studio

Studio is bugged

#

Llmarena is best

hushed gyro Feb 19, 2026, 7:31 PM

#

normal abyss mhm yeah, sorry bro

dw side by side will work for me '

surreal zephyr Feb 19, 2026, 7:31 PM

#

Make sure to force it to think much

proud bobcat Feb 19, 2026, 7:31 PM

#

surreal zephyr Studio is bugged

without sys prompt

surreal zephyr Feb 19, 2026, 7:31 PM

#

proud bobcat without sys prompt

Lma has no sys either

normal abyss Feb 19, 2026, 7:31 PM

#

hushed gyro oh no.... i thought this is a dream... BATTLE MODE IN DIRECT???!!!

if you specificly want a clean chat with no interuptions, google AI studio is now the current best option. (if you like gemini)

surreal zephyr Feb 19, 2026, 7:31 PM

#

But repeat many times to think excessively

#

Its trained to be lazy

hushed gyro Feb 19, 2026, 7:31 PM

#

has anyone noticed that the quality of NB Pro has degraded after the errors?

hushed gyro Feb 19, 2026, 7:32 PM

#

normal abyss if you specificly want a clean chat with no interuptions, google AI studio is no...

does it have NB Pro for free?

normal abyss Feb 19, 2026, 7:32 PM

#

hushed gyro does it have NB Pro for free?

no i dont think so, higgsfield does though, but its pretty limited

hushed gyro Feb 19, 2026, 7:32 PM

#

omg why does it have errors again STOP THE TORTURE...!!!!

proud bobcat Feb 19, 2026, 7:32 PM

#

surreal zephyr Lma has no sys either

oh

#

damn

drowsy mural Feb 19, 2026, 7:32 PM

#

hushed gyro oh no.... i thought this is a dream... BATTLE MODE IN DIRECT???!!!

Welcome, fresh victim. You're just in time for the worst update ever. Highlights include: destroying your conversation flow.

robust sluice Feb 19, 2026, 7:33 PM

#

NB still errors but come with low resolution when it works

surreal zephyr Feb 19, 2026, 7:33 PM

#

https://019c775e-143e-7242-92b4-0d936da78f8c.arena.site/ ?

Interactive Black Hole Simulation

Check out what I built in Arena's Code Arena - Content is user-generated and unverified

hushed gyro Feb 19, 2026, 7:33 PM

#

normal abyss no i dont think so, higgsfield does though, but its pretty limited

for infinite generation glitch on arena: no login, generate 2 prompts, save pictures, clear cookies, repeat

hushed gyro Feb 19, 2026, 7:34 PM

#

robust sluice NB still errors but come with low resolution when it works

the quality has degraded since the errors

proud bobcat Feb 19, 2026, 7:34 PM

#

"create a detailed 3d simulation of a tanks suspension with a map, driving controls, accurate physics, and polish. do not cheap out on code, think to the maximum and install as many packages as needed."

#

trying this out

#

lets race it against sonnet lmao

normal abyss Feb 19, 2026, 7:34 PM

#

hushed gyro for infinite generation glitch on arena: no login, generate 2 prompts, save pict...

whats your opinion on this?

robust sluice Feb 19, 2026, 7:35 PM

#

did they buff on Flux or something

hushed gyro Feb 19, 2026, 7:35 PM

#

normal abyss whats your opinion on this?

yes

#

based, W

robust sluice Feb 19, 2026, 7:35 PM

#

never seen them work so fast

echo aurora Feb 19, 2026, 7:35 PM

#

robust sluice NB still errors but come with low resolution when it works

If you have experiences a lowering of quality we are looking for examples in #1470529575367868498, the top of the thread has instructions on what we're looking for.

robust sluice Feb 19, 2026, 7:36 PM

#

so its a bug ?

hushed gyro Feb 19, 2026, 7:36 PM

#

robust sluice did they buff on Flux or something

i feel like after nano banana pro released, all the other models suddenly become dogwater compared to NB

proud bobcat Feb 19, 2026, 7:36 PM

#

the hell

#

lmarena is down

#

echo aurora Feb 19, 2026, 7:36 PM

#

robust sluice so its a bug ?

I'm not sure, we're trying to gather more examples

hushed gyro Feb 19, 2026, 7:37 PM

#

this is dogwater lmao

sick mantle Feb 19, 2026, 7:37 PM

#

hushed gyro this is dogwater lmao

Yea

hushed gyro Feb 19, 2026, 7:37 PM

#

sick mantle Yea

and it's NB Pro 🤣

normal abyss Feb 19, 2026, 7:37 PM

#

echo aurora I'm not sure, we're trying to gather more examples

is battle in direct still classed as an experiment or a main element of the site now?

echo aurora Feb 19, 2026, 7:37 PM

#

proud bobcat lmarena is down

Testing out Code Arena now, can't say I'm seeing the same (stll mid generation), but doesn't appear to be having issues.

whole sundial Feb 19, 2026, 7:37 PM

#

I should bump this again as it is still heavily discussed here, please remove the battles in direct

proud bobcat Feb 19, 2026, 7:37 PM

#

odd.

#

maybe i overly specified to install packages?

echo aurora Feb 19, 2026, 7:38 PM

#

normal abyss is battle in direct still classed as an experiment or a main element of the site...

It's currently considered and experiment.

normal abyss Feb 19, 2026, 7:38 PM

#

echo aurora It's currently considered and experiment.

kk thanks

echo aurora Feb 19, 2026, 7:39 PM

#

proud bobcat

Oh wait now I'm seeing it. Reporting to the team. Thanks blobthanks

hollow imp Feb 19, 2026, 7:39 PM

#

brittle tiger i've been using 4.6 opus in antigravity for days and 3.1 pro is fixing problems ...

proud bobcat Feb 19, 2026, 7:39 PM

#

echo aurora Oh wait now I'm seeing it. Reporting to the team. Thanks <:blobthanks:8254448354...

no problem

echo aurora Feb 19, 2026, 7:40 PM

#

proud bobcat no problem

What model did this happen for you btw? I'm using battle and it appears is only happening with one of the options.

proud bobcat Feb 19, 2026, 7:40 PM

#

sonnet and 3.1 pro

#

more specifically sonnet 4.6

hushed gyro Feb 19, 2026, 7:42 PM

#

WHAT?!

Today vs... 2 weeks ago

fickle venture Feb 19, 2026, 7:43 PM

#

@echo aurora edited: nevermind the scam got deleted

split kayak Feb 19, 2026, 7:43 PM

#

Gemini3.1

fickle venture Feb 19, 2026, 7:43 PM

#

Oh sorry it got deleted

robust sluice Feb 19, 2026, 7:43 PM

#

hushed gyro i feel like after nano banana pro released, all the other models suddenly become...

yes sadly I rarely see them in battle

echo aurora Feb 19, 2026, 7:44 PM

#

fickle venture Oh sorry it got deleted

All good. Btw don't hesitate to use the @ Moderator ping for mod purposes.

fickle venture Feb 19, 2026, 7:44 PM

#

echo aurora All good. Btw don't hesitate to use the @ Moderator ping for mod purposes.

I thought that doesn't work?

echo aurora Feb 19, 2026, 7:44 PM

#

hushed gyro WHAT?! Today vs... 2 weeks ago

If you can share in #1470529575367868498 and include the Eval IDs for both sessions that'd be helpful.

echo aurora Feb 19, 2026, 7:44 PM

#

fickle venture I thought that doesn't work?

No it does

fickle venture Feb 19, 2026, 7:44 PM

#

Alr I'll try it when I see a scam again

hushed gyro Feb 19, 2026, 7:45 PM

#

echo aurora If you can share in <#1470529575367868498> and include the Eval IDs for both ses...

the prompts for the second one... were deleted

echo aurora Feb 19, 2026, 7:46 PM

#

hushed gyro the prompts for the second one... were deleted

Okay good to know, if you notice this again don't hesitate to use that thread and flag the images/Eval IDs.

golden ocean Feb 19, 2026, 7:47 PM

#

DOG WATER

hushed gyro Feb 19, 2026, 7:51 PM

#

better or no?

surreal zephyr Feb 19, 2026, 7:56 PM

#

https://019c775e-143e-7242-92b4-0d936da78f8c.arena.site/ the gravitational lensing is insane

Interactive Black Hole Simulation

Check out what I built in Arena's Code Arena - Content is user-generated and unverified

hollow imp Feb 19, 2026, 7:56 PM

#

stray aspen slide your ptompt

Create a 3D simulation of a blackhole including accurate light bending, you should be able to pan the camera with the mouse, the blackhole should have a accurate accretion disk, web based

#

@quartz light

stray aspen Feb 19, 2026, 7:56 PM

#

surreal zephyr https://019c775e-143e-7242-92b4-0d936da78f8c.arena.site/ the gravitational lensi...

what model

surreal zephyr Feb 19, 2026, 7:57 PM

#

stray aspen what model

3.1 pro

#

opus didnt get even close

surreal zephyr Feb 19, 2026, 8:00 PM

#

hollow imp Create a 3D simulation of a blackhole including accurate light bending, you shou...

simple kernel Feb 19, 2026, 8:02 PM

#

echo aurora Okay good to know, if you notice this again don't hesitate to use that thread an...

Hey I've been waiting two weeks I don't see this feature in claude ai if it's going to be added soon?

hushed gyro Feb 19, 2026, 8:12 PM

#

again... @surreal zephyr

hollow ivy Feb 19, 2026, 8:13 PM

#

hybrid osprey Feb 19, 2026, 8:16 PM

#

Is nano banana being nerfed it’s generating 600kb files instead of the usual 6mb and failing to generate much more frequently

meager tinsel Feb 19, 2026, 8:17 PM

#

hollow ivy

They are good for different tasks.

crystal mica Feb 19, 2026, 8:17 PM

#

guys, someone know how to export lmarena chat .txt?

hushed gyro Feb 19, 2026, 8:23 PM

#

bro nano banana pro across all platforms are dogwater lmaoooo

rapid otter Feb 19, 2026, 8:24 PM

#

.

half mist Feb 19, 2026, 8:25 PM

#

hollow ivy

They are not equal

rapid otter Feb 19, 2026, 8:27 PM

#

crystal mica guys, someone know how to export lmarena chat .txt?

i can send u code for this

#

add to friends

rapid otter Feb 19, 2026, 8:28 PM

#

rapid otter i can send u code for this

(i just needed it for myself)

crystal mica Feb 19, 2026, 8:33 PM

#

rapid otter i can send u code for this

sure

surreal zephyr Feb 19, 2026, 8:35 PM

#

hollow ivy

via api? gemini is infinitely better.
normally? opus

quaint trail Feb 19, 2026, 8:40 PM

#

hushed gyro bro nano banana pro across all platforms are dogwater lmaoooo

it just never works on arena

golden ocean Feb 19, 2026, 8:44 PM

#

pale sonnet Feb 19, 2026, 8:45 PM

#

#

rip🥹🕊️

random violet Feb 19, 2026, 8:46 PM

#

Remove battle mode in direct chat... ☠️🤌

gloomy onyx Feb 19, 2026, 8:46 PM

#

hollow ivy

I doubt there is a public model which is able to overperform Opus-4.6-thinking in pure coding tasks.

Maybe Gemini 3 Deep Think Could actually be better, but only by reading its description card provided by Google, it is more of a research tool than a model made only for coding.

scarlet spire Feb 19, 2026, 8:47 PM

#

gloomy onyx I doubt there is a public model which is able to overperform Opus-4.6-thinking i...

GPT-5.3-Codex was pretty competitive in a fair few marks, wasn't it?

#

Given it itself anecdotally way outperforms GPT-5.2 and at least matches GPT-5.2-Highest in abilities

pale sonnet Feb 19, 2026, 8:49 PM

#

Γεια σου!

quaint trail Feb 19, 2026, 8:50 PM

#

everytime i try to use nano banana pro i get this error lol

surreal zephyr Feb 19, 2026, 8:50 PM

#

https://019c779f-32f1-7eab-83f0-e1cfe2cf7544.arena.site/ custom wormhole shader

gloomy onyx Feb 19, 2026, 8:51 PM

#

scarlet spire GPT-5.3-Codex was pretty competitive in a fair few marks, wasn't it?

I guess so, but as Anthropic stated in its benchmark outputs, Opus 4.6 manages to be extremely good in agentic coding environments, extremely long context coherence and tool interaction / call

#

Those are parameters of evaluation which in my opinion are worth more than a single benchmark when we’re talking about pure vibecoding in real scenarios

scarlet spire Feb 19, 2026, 8:52 PM

#

gloomy onyx I guess so, but as Anthropic stated in its benchmark outputs, Opus 4.6 manages t...

as Anthropic stated
^
first party rankings should always be externally verified for cherry-picking or unrealistic comparison factors

gloomy onyx Feb 19, 2026, 8:53 PM

#

scarlet spire > as Anthropic stated ^ first party rankings should always be externally verifie...

Yeah, true, I still remember Google faking their benchmarks for Gemini 3 pro. I just hope Anthropic isn’t that naïve 💀

#

It’s like steroids for AI at this point

scarlet spire Feb 19, 2026, 8:54 PM

#

I'm not saying faking but I'm moreso saying if you use a metric that's a subset of something else that can only ever be measured as a whole, what does that tell us? nothing much.

quaint trail Feb 19, 2026, 8:54 PM

#

scarlet spire Feb 19, 2026, 8:54 PM

#

gloomy onyx Yeah, true, I still remember Google faking their benchmarks for Gemini 3 pro. I ...

Because the context of the metric matters. If you remove its context, it can look phenomenal. But if it's doing a phenomenal job at X while being subpar at X, Z, U, V and W, it's mostly pointless.

scarlet spire Feb 19, 2026, 8:55 PM

#

quaint trail

Congrats! You've asked the model for their identity thinkies

gloomy onyx Feb 19, 2026, 8:55 PM

#

quaint trail

Classic hallucination case

scarlet spire Feb 19, 2026, 8:55 PM

#

It's not a hallucination if you mean "GPT-4 class"

quaint trail Feb 19, 2026, 8:55 PM

#

what does it mean by gpt-4 class

scarlet spire Feb 19, 2026, 8:56 PM

#

It's aaaah. An unintended lens into how OpenAI got to it. It just means that GPT-5 is, as we know, a further-trained version of GPT-4.

gloomy onyx Feb 19, 2026, 8:56 PM

#

I guess so, the model does not actually know the specific version of itself

scarlet spire Feb 19, 2026, 8:56 PM

#

It doesn't

#

It never "knows". It isn't told.

stray aspen Feb 19, 2026, 8:56 PM

#

ai studio gemini 3.1 sucks

quaint trail Feb 19, 2026, 8:56 PM

#

scarlet spire It's aaaah. An unintended lens into how OpenAI got to it. It just means that GPT...

id assume that

#

i have no idea how you even begin to train AI models, i can understand how they work

#

but training makes no sense

#

also coding in arena has recently just never worked for me, one time today but every time i get this error, and this error is common when using nano banana pro. not sure why

#

nano banana pro works fine when i use the actual gemini

scarlet spire Feb 19, 2026, 8:59 PM

#

quaint trail but training makes no sense

Simple concept of a Transformer:

You give it random noise created from adding noise to a real piece of data
You punish it until it guesses what the original was correctly every time.
You do this concurrently with a f"ckton of other data! :)

scarlet spire Feb 19, 2026, 8:59 PM

#

quaint trail also coding in arena has recently just never worked for me, one time today but e...

Have you tried clicking "Re-generate"?

quaint trail Feb 19, 2026, 9:00 PM

#

scarlet spire Have you tried clicking "Re-generate"?

after multiple tries before i have, but i only tried on nano banana pro. ill see if coding with opus will work

surreal zephyr Feb 19, 2026, 9:00 PM

#

scarlet spire Have you tried clicking "Re-generate"?

it happens often

muted bolt Feb 19, 2026, 9:01 PM

#

I dont like the idea that when we generate we can choice to have 1 generated at a time.. by the time I generate the 2nd image its saying im up to my limit which is wrong bc I normally get 5 chances

surreal zephyr Feb 19, 2026, 9:01 PM

#

<@&1349916362595635286>

vocal axle Feb 19, 2026, 9:01 PM

#

i dont speak english

mortal vale Feb 19, 2026, 9:01 PM

#

@vocal axle Note that Video Arena has been removed from the server. More information can be found in this #announcements

scarlet spire Feb 19, 2026, 9:01 PM

#

surreal zephyr it happens often

Yes it might. That means you'll have to try clicking re-generate "often" as well! :)

quaint trail Feb 19, 2026, 9:01 PM

#

scarlet spire Simple concept of a Transformer: 1. You give it random noise created from addin...

when they punish it, do they cut the power

scarlet spire Feb 19, 2026, 9:02 PM

#

quaint trail when they punish it, do they cut the power

lol
no, they don't. It's a high level overview.

vocal axle Feb 19, 2026, 9:02 PM

#

i am nexs user

quaint trail Feb 19, 2026, 9:02 PM

#

https://tenor.com/view/silly-funny-too-silly-beating-beaten-gif-1299076303322216660

Tenor

surreal zephyr Feb 19, 2026, 9:02 PM

#

scarlet spire Yes it might. That means you'll have to try clicking re-generate *"often"* as we...

quota waste

gloomy onyx Feb 19, 2026, 9:02 PM

#

scarlet spire It never "knows". It isn't told.

There should be an identity dataset where they train the self perception of the model about its metadata, but I guess that to preserve the efficiency of the hardware allocated to the training session, it just reinforces the company name and maybe the model series

scarlet spire Feb 19, 2026, 9:02 PM

#

surreal zephyr quota waste

Then just sulk, be my guest. Let me know how you get on and continue. :')

light sleet Feb 19, 2026, 9:03 PM

#

When gpt 5.3 Codex😭

scarlet spire Feb 19, 2026, 9:03 PM

#

gloomy onyx There should be an identity dataset where they train the self perception of the ...

This has nothing to do with it haha.

They can tell the model.
They usually do in the system prompt.
The API doesn't inject one.
Arena doesn't add it either. because that would be antithetical to the point.

gloomy onyx Feb 19, 2026, 9:04 PM

#

gloomy onyx There should be an identity dataset where they train the self perception of the ...

But across all the training data (which, in case of ChatGPT we’re talking about terabytes of stuff), its impact is very limited

scarlet spire Feb 19, 2026, 9:04 PM

#

Ask GPT-5.3 on the ChatGPT.com website and you'll see that it has no problem telling you.

gloomy onyx Feb 19, 2026, 9:04 PM

#

scarlet spire This has nothing to do with it haha. They can tell the model. They usually do i...

I doubt it’s just prompt engineering

quaint trail Feb 19, 2026, 9:04 PM

#

scarlet spire Have you tried clicking "Re-generate"?

it worked on claude, but gemini 3.1 pro keeps failing to create anything

scarlet spire Feb 19, 2026, 9:04 PM

#

gloomy onyx But across all the training data (which, in case of ChatGPT we’re talking about ...

That's not really how the training data works. Seeing the training data as "this is two kB and this is ten TB" is trying to count apples by moving oranges on your counting screen.

quaint trail Feb 19, 2026, 9:04 PM

#

https://019c77a5-8f3a-7afc-a599-d7115b44f4ba.arena.site/

gloomy onyx Feb 19, 2026, 9:04 PM

#

Otherwise it would be easier for the model to “remember” that stuff

scarlet spire Feb 19, 2026, 9:05 PM

#

The system prompt is not included on Arena, as I said

#

That's why it doesn't truly know to tell you its identity. It's intentionally not part of the post-training.

coral axle Feb 19, 2026, 9:06 PM

#

gloomy onyx Feb 19, 2026, 9:07 PM

#

scarlet spire That's not really how the training data works. Seeing the training data as "this...

It matters to the point where the model learns to have a better inference performance if the training dataset is generic enough (not implying that a lot of data = more generic, but it was an assumption that I made for how it’s trained a fundational model like ChatGPT)

scarlet spire Feb 19, 2026, 9:07 PM

#

coral axle

Good show! Yes, the ChatGPT website (first image) will indeed be able to tell you its identity just fine. It's told about what its identity is and what the date today is as well if I'm not mistaken.

scarlet spire Feb 19, 2026, 9:08 PM

#

gloomy onyx It matters to the point where the model learns to have a better inference perfor...

I'm saying that bytes aren't a thing in the neural net

quaint trail Feb 19, 2026, 9:08 PM

#

coral axle

i remember in like 2023-2024 using chatgpt when the website was green themed, and i kept asking it the year for its training data

#

i was pissed when it said october 2022 or 2023

coral axle Feb 19, 2026, 9:08 PM

#

too

quaint trail Feb 19, 2026, 9:08 PM

#

also it didnt even have web search back then

coral axle Feb 19, 2026, 9:08 PM

#

fk lie

scarlet spire Feb 19, 2026, 9:08 PM

#

Byte count doesn't scale from binary storage count to nonbinary neural net gradients. Your comparison of "compared to many terabytes" is one that doesn't hold up because the data when trained, is not transformed into bytes of any quantity.

quartz light Feb 19, 2026, 9:09 PM

#

quaint trail also coding in arena has recently just never worked for me, one time today but e...

on opus its cuz of timeout

quaint trail Feb 19, 2026, 9:09 PM

#

quartz light on opus its cuz of timeout

whats timeout

lilac nest Feb 19, 2026, 9:09 PM

#

As a free user I've been using Gemini Flash 3 as my preferred everyday model. I use the limited free Pro for more complex tasks. Interestingly, Sonnet 4.6 from Claude was made the default for even Claude Pro users, so I thought I'd give it a try.

Initially, my thoughts are that it seems pretty good, but can't tell if it's better than Gemini Flash. I was wondering when it would be added to the leaderboard?

scarlet spire Feb 19, 2026, 9:09 PM

#

quaint trail i was pissed when it said october 2022 or 2023

That was likely correct for the dataset though. A year of delay between dataset termination and post-training completion is not unreasonable

quartz light Feb 19, 2026, 9:10 PM

#

quaint trail whats timeout

ai providers have different timeouts and basically if the response is taking longer than the timeout plan theyre paying for then it just stops

coral axle Feb 19, 2026, 9:10 PM

#

I was very upset, swearing I was using opus 4-6 when in fact it's a clear derivation of 3-5. Shame!

scarlet spire Feb 19, 2026, 9:10 PM

#

lilac nest As a free user I've been using Gemini Flash 3 as my preferred everyday model. I ...

Once it accumulates enough votes and ELO confidence

quaint trail Feb 19, 2026, 9:10 PM

#

scarlet spire That was likely correct for the dataset though. A year of delay between dataset ...

that was when chatgpt started getting viral and nobody knew how LLM's worked

scarlet spire Feb 19, 2026, 9:10 PM

#

coral axle I was very upset, swearing I was using opus 4-6 when in fact it's a clear deriva...

You'll understand why this is the case when you conceptualise what the models are based on. They don't start from scratch every time.

quaint trail Feb 19, 2026, 9:10 PM

#

so we just assumed chatgpt was like google if it could talk back

#

well, it basically is that nowadays

#

since it can use the web

scarlet spire Feb 19, 2026, 9:11 PM

#

quaint trail well, it basically is that nowadays

No, nowadays it's a teenager that talks back (and brattily so). hehea

lilac nest Feb 19, 2026, 9:12 PM

#

Ah so it's available to be voted on currently then. I searched the change log for it but I guess that must not include when things are added for initial voting

stray aspen Feb 19, 2026, 9:13 PM

#

is gemini 3.1 nerfed already?

coral axle Feb 19, 2026, 9:13 PM

#

@scarlet spire Dude, I'm developing my own model in Brazilian Portuguese from scratch. I've already developed my rag Online using some free models, and you can clearly see the difference when you use the latest templates directly from their respective platforms to the Arena platform.

gloomy onyx Feb 19, 2026, 9:14 PM

#

stray aspen is gemini 3.1 nerfed already?

Performance drift time record?

scarlet spire Feb 19, 2026, 9:14 PM

#

coral axle <@158602923180425217> Dude, I'm developing my own model in Brazilian Portuguese ...

Okay! That's cool. How are you finding the experience?

coral axle Feb 19, 2026, 9:15 PM

#

@scarlet spire Look, I'm using Antigravity with Opus 4-5 and I was sometimes using the arena because I don't have the paid Opus plan, and I noticed that some days it always sends the same mess instead of fixing what it should do to continue training... then I just asked it to find out the version and realized I was using something that clearly wasn't it.

scarlet spire Feb 19, 2026, 9:16 PM

#

lilac nest Ah so it's available to be voted on currently then. I searched the change log fo...

The scoreboard displays a vote count, but the actual ranking isn't based on raw votes. The system that underlies this ranking is based on a longer-term heuristic-type averaging, the "elo" system.

scarlet spire Feb 19, 2026, 9:16 PM

#

coral axle <@158602923180425217> Look, I'm using Antigravity with Opus 4-5 and I was someti...

Look... at? Look at what exactly?

#

You're sending me random pings with absolutely no reference to what you are commenting on. You sound to me like you're trying to reply to something but are effectively failing to use the reply feature? Use the reply feature.

stray aspen Feb 19, 2026, 9:17 PM

#

what is this ai conversation

coral axle Feb 19, 2026, 9:17 PM

#

@scarlet spire Just configure the correct API to use the correct version and stop running those crappy system prompts from older versions.

honest verge Feb 19, 2026, 9:18 PM

#

BRO I CAN'T TAKE GEMINI 3 FLASH ANYMORE

#

It forgets everything

#

After 3 prompts

#

I need 3.1