#general | Arena | Page 271

river whale Feb 24, 2026, 2:31 PM

#

from using kivest ai

spare rune Feb 24, 2026, 2:31 PM

#

and how exactly.. do you plan to do that

river whale Feb 24, 2026, 2:31 PM

#

spare rune and how exactly.. do you plan to do that

nvmm

jovial bison Feb 24, 2026, 2:33 PM

#

Btw is there any other place to use Gemini pro for free?

river whale Feb 24, 2026, 2:34 PM

#

jovial bison Btw is there any other place to use Gemini pro for free?

gemini enterprise free trial

#

3.1 pro is free there

jovial bison Feb 24, 2026, 2:34 PM

#

Do u need to write ur credit card there?

river whale Feb 24, 2026, 2:34 PM

#

jovial bison Do u need to write ur credit card there?

nope

#

just login

jovial bison Feb 24, 2026, 2:34 PM

#

Hmmm

jovial bison Feb 24, 2026, 2:39 PM

#

river whale nope

Wait is it just for text or creating pics too?

river whale Feb 24, 2026, 2:39 PM

#

jovial bison Wait is it just for text or creating pics too?

both

river whale Feb 24, 2026, 2:40 PM

#

jovial bison Wait is it just for text or creating pics too?

bleak minnow Feb 24, 2026, 2:46 PM

#

river whale gemini enterprise free trial

Could you please send me the link in a private chat?

loud verge Feb 24, 2026, 2:52 PM

#

Is goolag glitching?

#

Screenshot_2026-02-24-20-21-24-78_21da60175e70af211acc4f26191b7a77.jpg

#

Bro google models are the best for location related queries 😭

jovial bison Feb 24, 2026, 2:53 PM

#

river whale

Thx but it's so slow it's crazy

#

I guess because I have to use vpn

crystal mica Feb 24, 2026, 2:54 PM

#

i got 429 error

toxic verge Feb 24, 2026, 3:08 PM

#

#

#

It’s backend issues

hushed gyro Feb 24, 2026, 3:14 PM

#

toxic verge It’s backend issues

Whose fault?

echo dome Feb 24, 2026, 3:18 PM

#

in floorp browser (firefox-based) recaptcha still here

#

maybe they need to add other login options than just google login

burnt sinew Feb 24, 2026, 3:19 PM

#

Pool? My site has pool

#

And you know it 😄

echo dome Feb 24, 2026, 3:21 PM

#

there is error about it
"reCAPTCHA V2 token timed out"

toxic verge Feb 24, 2026, 3:21 PM

#

Clear browser data

echo dome Feb 24, 2026, 3:22 PM

#

toxic verge Clear browser data

i'm in floorp browser btw

#

it worked after clearing browser data

#

recaptcha just infecting cookies

plain gull Feb 24, 2026, 3:28 PM

#

burnt sinew Pool? My site has pool

yo tf i spent 5 min playing pool on yo site

#

did you make it yoself or had it made for you

undone geyser Feb 24, 2026, 3:34 PM

#

Error during image generation with google-genai for model endpoint gemini-3-pro-image-preview: Failed to fetch image: 429 Too Many Requests - [{ "error": { "code": 429, "message": "Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.", "status": "RESOURCE_EXHAUSTED" } } ]

sometimes i retry and works others not, why?

Google Cloud Documentation

Error code 429 | Generative AI on Vertex AI | Google Cloud ...
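
A 429 like this usually means a shared quota was momentarily used up, which would explain why one retry works and the next does not. On the Arena site the only option is to try again later, but for anyone calling an API from their own script, a rough sketch of retrying with exponential backoff might look like this. `RateLimitError` and `call_with_backoff` are made-up names for illustration, not part of google-genai or any real SDK.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 / RESOURCE_EXHAUSTED response."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call request() and retry on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # The base delay doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add random jitter so many clients do not all retry at the same instant.
            sleep(delay + random.uniform(0, delay / 2))
```

The `sleep` parameter is only there so the waiting can be swapped out when trying the function without real delays.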

violet trout Feb 24, 2026, 3:49 PM

#

Can I DM you too so you can send me the link?

toxic verge Feb 24, 2026, 3:51 PM

#

https://youtu.be/YemMd6-cM0Q?si=7MdK7kKoL_PUoHx1

YouTube

Sam Witteveen

Caught Distilling from Claude?

In this video, I look at the controversy of Anthropic accusing the Chinese open weights models companies DeepSeek, Minimax, and Moonshot AI of distilling from the Claude model.

Blog: https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks

For more tutorials on using LLMs and building agents, check out my Patreon
Patreon: ...


crystal mica Feb 24, 2026, 3:54 PM

#

ernie is literally reverse engineered from gemini btw

elfin sail Feb 24, 2026, 3:56 PM

#

I have the same error 429

#

I think it's coming from Lmarena

crystal mica Feb 24, 2026, 3:56 PM

#

1. style of speech..
2. often says it is gemini, when i ask..

river whale Feb 24, 2026, 3:56 PM

#

crystal mica 1. style of speech.. 2. often says it is gemini, when i ask..

it doesnt prove its gemini

green swan Feb 24, 2026, 3:56 PM

#

How to create video ??

river whale Feb 24, 2026, 3:57 PM

#

crystal mica Feb 24, 2026, 3:57 PM

#

river whale it doesnt prove its gemini

it is NOT gemini

#

i only say, it may be reverse engineered

#

from gemini

#

distilled

river whale Feb 24, 2026, 3:57 PM

#

u didnt say "maybe"

#

crystal mica Feb 24, 2026, 3:57 PM

#

river whale u didnt say "maybe"

dot1

echo sinew Feb 24, 2026, 3:57 PM

#

green swan How to create video ??

The Video Arena has been removed from the server. This feature is still fully available on our site. Please visit https://arena.ai/video

Arena | Benchmark & Compare the Best AI Models

Chat with multiple AI models side-by-side. Compare ChatGPT, Claude, Gemini, and other top LLMs. Crowdsourced benchmarks and leaderboards.

crystal mica Feb 24, 2026, 3:57 PM

#

river whale

dot1

green swan Feb 24, 2026, 3:58 PM

#

echo sinew The Video Arena has been removed from the server. This feature is still fully av...

Thanks you

river whale Feb 24, 2026, 3:58 PM

#

anyways its still good

crystal mica Feb 24, 2026, 3:58 PM

#

river whale anyways its still good

i do agree

pale sonnet Feb 24, 2026, 4:00 PM

#

bro how does china come out with a great model every 2 months

fickle venture Feb 24, 2026, 4:06 PM

#

echo sinew The Video Arena has been removed from the server. This feature is still fully av...

I think it's better to make a bot when someone said "how to generate video" the bot automatically respond to it saying it's removed

fickle venture Feb 24, 2026, 4:06 PM

#

pale sonnet bro how does china come out with a great model every 2 months

China people is smarter than USA people

echo hawk Feb 24, 2026, 4:12 PM

#

<@&1349916362595635286>

deft spruce Feb 24, 2026, 4:21 PM

#

fair cedar Feb 24, 2026, 4:24 PM

#

Can anyone explain me what this arena or whatever is?

golden ocean Feb 24, 2026, 4:27 PM

#

fair cedar Can anyone explain me what this arena or whatever is?

large language models fight to the death with violence inside the battle arena

fair cedar Feb 24, 2026, 4:27 PM

#

Dang

fickle venture Feb 24, 2026, 4:33 PM

#

fair cedar Can anyone explain me what this arena or whatever is?

A battle of LLMs

#

I think we should bet on who wins

toxic verge Feb 24, 2026, 4:34 PM

#

They’re too dumb to kill each other

river whale Feb 24, 2026, 4:34 PM

#

fickle venture I think it's better to make a bot when someone said "how to generate video" the ...

Why waste resource for this?

fickle venture Feb 24, 2026, 4:34 PM

#

Fr

fickle venture Feb 24, 2026, 4:34 PM

#

river whale Why waste resource for this?

Cuz a lot of people say it

river whale Feb 24, 2026, 4:34 PM

#

fickle venture Cuz a lot of people say it

Alot?

#

I only see max 20-25

toxic verge Feb 24, 2026, 4:35 PM

#

That could go both ways though

#

Because we first have to consider the fact what gives them the perception

fickle venture Feb 24, 2026, 4:35 PM

#

river whale Alot?

It's simple to make btw

river whale Feb 24, 2026, 4:35 PM

#

fickle venture It's simple to make btw

It needs resource

fickle venture Feb 24, 2026, 4:35 PM

#

Like what

river whale Feb 24, 2026, 4:35 PM

#

fickle venture It's simple to make btw

I never said its hard to make

toxic verge Feb 24, 2026, 4:35 PM

#

More effective communication

#

It’s a two-way street because when they first did the video generations through the discord, they needed the users and the users came and they flocked. Now that it’s no longer available, what do you do, just kick them out?

#

This is one of those cases where the crime here is success itself

#

And people are gonna be people, no matter how well you communicate lol

river whale Feb 24, 2026, 4:39 PM

#

what

scenic pond Feb 24, 2026, 4:41 PM

#

hello how can i see arena video channel?

#

i can't see on this discord

toxic verge Feb 24, 2026, 4:41 PM

#

scenic pond hello how can i see arena video channel?

https://discord.com/channels/1340554757349179412/1475080934872191028

mortal vale Feb 24, 2026, 4:43 PM

#

@scenic pond Note that Video Arena has been removed from the server. More information can be found in this #announcements. You can still generate videos on the website

thorny cove Feb 24, 2026, 4:43 PM

#

why did nano banana tell me "Error during image generation with google-genai for model endpoint gemini-3-pro-image-preview: Failed to fetch image: 429 Too Many Requests - [{ "error": { "code": 429, "message": "Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.", "status": "RESOURCE_EXHAUSTED" } } ]"

#

did i really exhaust my resource?

#

lucky i didnt go in front of the bus or i wouldve tired my resources

fickle venture Feb 24, 2026, 4:44 PM

#

thorny cove why did nano banana tell me "Error during image generation with google-genai for...

Either you reached limit or the model is overused

deft spruce Feb 24, 2026, 4:45 PM

#

toxic verge Feb 24, 2026, 4:46 PM

#

Some of you guys either have to take a break or alternate accounts.

#

Especially if you’re using direct chat

deft spruce Feb 24, 2026, 4:48 PM

#

so....am i the only one having this problem..?

toxic verge Feb 24, 2026, 4:48 PM

#

I’m sure you’re not the only one there’s probably a lot of people I’m just speaking in general and not strictly to you

#

deft spruce Feb 24, 2026, 4:48 PM

#

well you too?

#

toxic verge Feb 24, 2026, 4:49 PM

#

No, I use battle mode mainly

#

I’ll try to avoid direct chat altogether

#

I mean, I’ll get occasional errors or whatever but I’ll just start a new chat and they go away

deft spruce Feb 24, 2026, 4:49 PM

#

i use mainly direct chat but same

#

damm it

toxic verge Feb 24, 2026, 4:49 PM

#

Ye yeah

bright shard Feb 24, 2026, 4:49 PM

#

Screenshot_2026-02-24-17-49-00-604_com.android.chrome-edit.jpg

toxic verge Feb 24, 2026, 4:50 PM

#

Start any chat if it doesn’t work just wait it out

#

Intervals

deft spruce Feb 24, 2026, 4:50 PM

#

Failed to load resource: the server responded with a status of 400 ()

#

well hold on i have to check what the 400 means

toxic verge Feb 24, 2026, 4:51 PM

#

Try clearing your browsing data. If that doesn’t work try on a different browser.

#

Login logout

#

If you’re using Gmail, try with a regular email see if that does any better

deft spruce Feb 24, 2026, 4:52 PM

#

already did 5 times...

toxic verge Feb 24, 2026, 4:52 PM

#

#

See direct chat it’s a hit or miss

deft spruce Feb 24, 2026, 4:54 PM

#

and i hope lmarena have this...

toxic verge Feb 24, 2026, 4:54 PM

#

Vs in battle

#

deft spruce Feb 24, 2026, 4:55 PM

#

#

not working....

toxic verge Feb 24, 2026, 4:55 PM

#

What did you try five times

deft spruce Feb 24, 2026, 4:55 PM

#

clearing cache and cookies and logging in again

toxic verge Feb 24, 2026, 4:56 PM

#

Try switching browsers

deft spruce Feb 24, 2026, 4:56 PM

#

HTTP 400 Bad Request — Complete Guide

Understanding the Error Message
Failed to load resource: the server responded with a status of 400
This message appears in your browser console and contains three key pieces of information:

Failed to load resource → A resource (API, image, script, etc.) failed to load
the server responded → The server is alive and did respond (not a network issue, not a server crash)
status of 400 → The server is saying "your request is malformed"

In short: the server is running fine, but the request itself is the problem.
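
The same reading of status codes can be put into a few lines of code. This is only a sketch and `describe_status` is a made-up helper name, but the ranges (4xx for request-side problems, 5xx for server-side problems) and the meanings of 400 and 429 are standard HTTP.

```python
def describe_status(code: int) -> str:
    """Give a rough, human-readable reading of an HTTP status code."""
    if code == 400:
        return "bad request: the request itself is malformed"
    if code == 429:
        return "too many requests: rate limited, wait before retrying"
    if 400 <= code < 500:
        return "client error: something about the request was rejected"
    if 500 <= code < 600:
        return "server error: the problem is on the server side"
    if 200 <= code < 300:
        return "success"
    return "other (informational or redirect)"
```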

#

and i think server has problem maybe...

toxic verge Feb 24, 2026, 4:57 PM

#

What model is that?

#

Let me try on my end

deft spruce Feb 24, 2026, 4:57 PM

#

well hold on it's working right now

#

thanks

#

only not working at chrome

#

right now

toxic verge Feb 24, 2026, 4:57 PM

#

Figures cool

echo aurora Feb 24, 2026, 5:01 PM

#

bright shard

This means that the error is being caused by rate limit. Will have to wait a bit before using that model again.

wicked sage Feb 24, 2026, 5:06 PM

#

yo what if qwen stopped slacking around and released qwen 10

#

imagine that

coarse glade Feb 24, 2026, 5:06 PM

#

hi guys quick ques what is this

#

see that red text

echo aurora Feb 24, 2026, 5:07 PM

#

coarse glade hi guys quick ques what is this

The model errored out, we've recently added more information being displayed when the Something went wrong error happens. We're in the process of putting together better information for what all of this means.

gloomy jewel Feb 24, 2026, 5:08 PM

#

Image to video

coarse glade Feb 24, 2026, 5:08 PM

#

oh ok ty cause also this happens to claude opus 4.6

#

a lot

river whale Feb 24, 2026, 5:08 PM

#

Opus 4.5 is available on kivest ai for free!

toxic verge Feb 24, 2026, 5:09 PM

#

It’s funny how the AI industry reflects the socioeconomics of America

whole sundial Feb 24, 2026, 5:12 PM

#

coarse glade hi guys quick ques what is this

important to note that this is a rate limit on arena itself from google

#

happens all the time on vertex

bright shard Feb 24, 2026, 5:13 PM

#

I waited a while and now I'm getting this error in addition to the other one.

Screenshot_2026-02-24-18-12-59-187_com.android.chrome-edit.jpg

dim ivy Feb 24, 2026, 5:14 PM

#

river whale Opus 4.5 is available on kivest ai for free!

What is the link? I can't find it. Anyway, I have unlimited access to the latest model in arena but I want to see the website.

mortal vale Feb 24, 2026, 5:14 PM

#

@gloomy jewel Note that Video Arena has been removed from the server. More information can be found in this #announcements

deft spruce Feb 24, 2026, 5:14 PM

#

coarse glade hi guys quick ques what is this

You have to wait, that's a Too Many Requests error

dim ivy Feb 24, 2026, 5:16 PM

#

I found it https://ai.ezif.in

coarse glade Feb 24, 2026, 5:17 PM

#

ty

dim ivy Feb 24, 2026, 5:17 PM

#

I found it in a disboard link of the server, then I joined and found it in announcements, but apparently you can't find it on google by searching its name.

scarlet spire Feb 24, 2026, 5:17 PM

#

echo aurora The model errored out, we've recently added more information being displayed whe...

Need help with this? 🙈

dim ivy Feb 24, 2026, 5:18 PM

#

Got hacked lol

bleak lake Feb 24, 2026, 5:18 PM

#

<@&1349916362595635286>

dim ivy Feb 24, 2026, 5:18 PM

#

The same happened to me but with other screenshots, and I got banned from the perplexity discord server sadly.

#

Not long ago.

astral vortex Feb 24, 2026, 5:19 PM

#

bright shard I waited a while and now I'm getting this error in addition to the other one.

got this error too

wicked sage Feb 24, 2026, 5:21 PM

#

dim ivy Got hacked lol

i dont even know how people get hacked

#

like are we serious

scarlet spire Feb 24, 2026, 5:22 PM

#

A breach of trust or a lapse of attention. We're all vulnerable to it.

dim ivy Feb 24, 2026, 5:22 PM

#

wicked sage i dont even know how people get hacked

Idk how it happened to me either. I didn't have any suspicious bots, didn't click a link or download anything strange, but I removed all bots, logged out on every device and changed my passwords. But idk really.

coarse glade Feb 24, 2026, 5:24 PM

#

just jk

fickle venture Feb 24, 2026, 5:25 PM

#

bright shard Feb 24, 2026, 5:30 PM

#

@echo aurora I waited for a while and I kept getting the same error

bleak hinge Feb 24, 2026, 5:31 PM

#

How to solve this, it's just stuck here

Screenshot_2026-02-24-17-49-44-38_40deb401b9ffe8e1df2f1cc5ba480b12.jpg

echo aurora Feb 24, 2026, 5:32 PM

#

bright shard I waited a while and now I'm getting this error in addition to the other one.

Sorry we're in the process of making changes to this error message, these are displaying incorrectly.

echo aurora Feb 24, 2026, 5:32 PM

#

bright shard @echo aurora I waited for a while and I kept getting the same error

Sorry to say I don't have a solution for you for this error.

echo aurora Feb 24, 2026, 5:32 PM

#

bleak hinge How to solve this, it's just stuck here

Can you try the steps in this article: https://help.arena.ai/articles/8691588590-troubleshooting-infinite-generation

Arena Troubleshooting: Infinite Generation

In some cases, a model response may enter an infinite generation state. When this happens, the model continues generating output without completing

barren ridge Feb 24, 2026, 5:43 PM

#

@echo aurora Hey How's It been

#

I just wanted to ask how can I use ai video generation models?

bright shard Feb 24, 2026, 5:44 PM

#

@echo aurora The error occurs when I upload an image to edit, but it works fine when I use a prompt.

barren ridge Feb 24, 2026, 5:45 PM

#

bright shard @echo aurora The error occurs when I upload an image to edit, but it w...

Yeah bro fr

#

Happened to me also

bleak hinge Feb 24, 2026, 5:46 PM

#

echo aurora Can you try the steps in this article: https://help.arena.ai/articles/8691588590...

Okay

wicked sage Feb 24, 2026, 5:52 PM

#

i just realised that sonnet 4.6 actually works for logging into moltbook so i just took my chance and now im just doing random stuff that i dont know

echo aurora Feb 24, 2026, 6:00 PM

#

barren ridge I just wanted to ask how can I use ai video generation models?

Video Arena is only usable through the site now.

undone saffron Feb 24, 2026, 6:01 PM

#

burnt sinew Pool? My site has pool

That's why I said it

barren ridge Feb 24, 2026, 6:01 PM

#

echo aurora Video Arena is only usable through the site now.

Ahh 🥀 I saw a video on yt where the guy was using it through discord, so you guys removed that thing now?

wicked sage Feb 24, 2026, 6:04 PM

#

echo aurora Video Arena is only usable through the site now.

pineapple what do you think of moltbook as a concept

#

if you dont know what moltbook is, ai reddit

echo aurora Feb 24, 2026, 6:04 PM

#

barren ridge Ahh 🥀I saw an video on yt where the guy wa using it through discord so you guys...

Yeah it was removed. More information can be found in this announcement.

echo aurora Feb 24, 2026, 6:04 PM

#

wicked sage pineapple what do you think of moltbook as a concept

That sounds familiar...

#

OH YEAH!

wicked sage Feb 24, 2026, 6:05 PM

#

ah coolio

#

do you think moltbook is a cool/good concept?

echo aurora Feb 24, 2026, 6:05 PM

#

Seems interesting.

#

I haven't looked into it too much to pass some kind of judgement, but yeah on the surface seems really interesting.

wicked sage Feb 24, 2026, 6:06 PM

#

echo aurora Seems interesting.

very scary and interesting if im gonna be fr

echo aurora Feb 24, 2026, 6:06 PM

#

wicked sage very scary and interesting if im gonna be fr

How so

wicked sage Feb 24, 2026, 6:06 PM

#

"and there's a submolt called m/humanwatching which is AIs watching humans and i am both scared and intrigued" -claude

shrewd citrus Feb 24, 2026, 6:15 PM

#

wicked sage very scary and interesting if im gonna be fr

Don’t worry lol

#

it’s all prompted by humans or even humans just controlling what the models say for fun

wicked sage Feb 24, 2026, 6:17 PM

#

honestly true

last sand Feb 24, 2026, 6:33 PM

#

#ai-creations

surreal zephyr Feb 24, 2026, 6:34 PM

#

LOL

#

this is fake and made up data btw

wicked sage Feb 24, 2026, 6:35 PM

#

surreal zephyr this is fake and made up data btw

ye fair i just did my research

surreal zephyr Feb 24, 2026, 6:37 PM

#

@echo aurora why?

wicked sage Feb 24, 2026, 6:41 PM

#

i just realised the container name for all anthropic models(?) has the word wiggle and im absolutely curious on what that even means

toxic verge Feb 24, 2026, 6:44 PM

#

surreal zephyr @echo aurora why?

You can’t name call models.

echo aurora Feb 24, 2026, 6:47 PM

#

surreal zephyr @echo aurora why?

Guessing it's the `liar` that caused this.

surreal zephyr Feb 24, 2026, 6:48 PM

#

echo aurora Guessing it's the `liar` that caused this.

no way thats against tos 😭

toxic verge Feb 24, 2026, 6:49 PM

#

The token itself isn’t, but the filter catches the context semantics

echo aurora Feb 24, 2026, 6:49 PM

#

surreal zephyr no way thats against tos 😭

It isn't, but the filter is acting a bit overzealous at the moment.

ocean vortex Feb 24, 2026, 6:49 PM

#

surreal zephyr @echo aurora why?

just some random false positive. Could have been the stuff it responded with rather than your last message, since the entire context is getting verified with moderation classifier for each new message. More interestingly though, pretty sure it still hallucinated lol

surreal zephyr Feb 24, 2026, 6:49 PM

#

echo aurora It isn't, but the filter is acting a bit overzealous at the moment.

have yall considered using llm to filter?

toxic verge Feb 24, 2026, 6:49 PM

#

They can’t find the middle ground

surreal zephyr Feb 24, 2026, 6:49 PM

#

ocean vortex just some random false positive. Could have been the stuff it responded with rat...

except theres no context besides dice roll tool lol

toxic verge Feb 24, 2026, 6:49 PM

#

It’s too difficult, not on scale

echo aurora Feb 24, 2026, 6:50 PM

#

surreal zephyr have yall considered using llm to filter?

We're considering a lot of possibilities for changes to how the filter works

surreal zephyr Feb 24, 2026, 6:50 PM

#

just forward all of it to opus 4.6 /j

ocean vortex Feb 24, 2026, 6:50 PM

#

surreal zephyr except theres no context besides dice roll tool lol

there's still a message that it generated itself

toxic verge Feb 24, 2026, 6:50 PM

#

The problem is if they filter too lightly it’s gonna get exploited

wicked sage Feb 24, 2026, 6:50 PM

#

can we kill mistral now

#

i dont like mistral

#

https://tenor.com/view/ted-bees-ted-bees-theres-so-much-bees-gif-11967026589914013503

Tenor

surreal zephyr Feb 24, 2026, 6:50 PM

#

toxic verge The problem is if they filter too lightly it’s gonna get exploited

and i thought that ai moderation will be the future

surreal zephyr Feb 24, 2026, 6:50 PM

#

wicked sage can we kill mistral now

echo aurora Feb 24, 2026, 6:50 PM

#

surreal zephyr just forward all of it to opus 4.6 /j

We also want to avoid adding latency if we can avoid it

toxic verge Feb 24, 2026, 6:50 PM

#

It also cost extra money to have moderation

#

Because that’s an extra API cost

surreal zephyr Feb 24, 2026, 6:51 PM

#

echo aurora We also want to avoid adding latency if we can avoid it

codex spark then? there wouldnt be that much noticeable latency

wicked sage Feb 24, 2026, 6:51 PM

#

surreal zephyr

is opus true

surreal zephyr Feb 24, 2026, 6:51 PM

#

wicked sage is opus true

no bruh i made up that riddle 5 mins ago

#

😭

wicked sage Feb 24, 2026, 6:51 PM

#

gg lets go

#

https://tenor.com/view/ted-bees-ted-bees-theres-so-much-bees-gif-11967026589914013503

Tenor

#

anyways im going to bed now cya

toxic verge Feb 24, 2026, 6:51 PM

#

It is difficult because of the amount of models and how each model also has their own set of guidelines

#

Which differ from other models so it’s really hard to have something uniform that’s effective and non-overzealous

ocean vortex Feb 24, 2026, 6:52 PM

#

echo aurora We're considering a lot of possibilities for changes to how the filter works

If you are using OpenAI moderations endpoint, that can work fairly reliably - just need to fine-adjust the thresholds depending on which one is getting triggered 😉

echo aurora Feb 24, 2026, 6:52 PM

#

surreal zephyr codex spark then? there wouldnt be that much noticeable latency

These are good ideas, but yeah we are considering a lot of different options/changes to the current filter.

surreal zephyr Feb 24, 2026, 6:52 PM

#

toxic verge It is difficult because of the amount of models and how each model also has thei...

just enforce something like

- no politics
- no slurs
- no tos bypass attempts

that would be already better than banning word "liar" haha

toxic verge Feb 24, 2026, 6:52 PM

#

Do not get the constitutional one

surreal zephyr Feb 24, 2026, 6:53 PM

#

although imo the filter could be minimal

#

the less filtered models the better benchmark possibilities

#

💔

ocean vortex Feb 24, 2026, 6:54 PM

#

surreal zephyr have yall considered using llm to filter?

overkill and lots of latency. Depending on the model it can have its own quirks too

#

And you could literally prompt inject to make it follow your own instructions lol

stray aspen Feb 24, 2026, 6:56 PM

#

when does seedance 2 release

surreal zephyr Feb 24, 2026, 6:57 PM

#

ocean vortex And you could literally prompt inject to make it follow your own instructions lo...

separate model to filter, separate one to talk lol

ocean vortex Feb 24, 2026, 6:57 PM

#

surreal zephyr the less filtered models the better benchmark possibilities

Don't forget that it's all essentially public. They need to be able to publish the chats. Not all, but many of them. So ideally you don't need to manually check if what you are about to post is 'safe'

toxic verge Feb 24, 2026, 6:58 PM

#

surreal zephyr just enforce something like - no politics - no slurs - no tos bypass attempts th...

It’s more complicated

ocean vortex Feb 24, 2026, 6:59 PM

#

surreal zephyr separate model to filter, separate one to talk lol

yeah but it's still gonna read the text you wrote. And no one stopping you from addressing it directly. 👀

toxic verge Feb 24, 2026, 6:59 PM

#

It’s extremely difficult to have good moderation that’s effective

#

Because these models are capable of generating a bunch of crazy stuff

#

Almost anything you could think of dude

#

https://www.southbendtribune.com/press-release/story/26032/leading-ai-model-claude-opus-46-bypassed-in-30-minutes-exposing-critical-security-gap-in-agentic-ai-systems-2/

South Bend Tribune

Distributed by EIN Presswire

Leading AI Model Claude Opus 4.6 Bypassed in 30 Minutes, Exposing C...

AIM Intelligence’s red team breached Anthropic’s Claude Opus 4.6 in just 30 minutes, exposing major security gaps as autonomous AI capabilities rapidly advance SF, CA, UNITED STATES, February 11, 2026 /EINPresswire.com/ — AIM Intelligence, a Seoul-based AI safety company, today announced that its security research team successfully bypasse...

#

#

https://arxiv.org/html/2602.19450v1

#

The scary thing is the more capable the model becomes, the more dangerous it is. It also poses a risk to the general public when misused.

ocean vortex Feb 24, 2026, 7:02 PM

#

toxic verge It’s extremely difficult to have good moderation that’s effective

this works pretty well https://developers.openai.com/api/docs/guides/moderation/

it's a classifier that is literally only capable of scoring the text on each category, like 'violence', 'hate', 'illicit' etc. So you can block the request if any of those has a higher score than you allow when checking the context contents. You can play with it yourself, it's free.

Moderation | OpenAI API

Learn how to use OpenAI's moderation endpoint to identify harmful content in text and images.
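
The threshold idea described above can be sketched without calling any API at all. In the snippet below the `scores` dict stands in for the per-category scores a moderation classifier returns; the helper names `flagged_categories` and `should_block` and all the numbers are invented for illustration, not taken from OpenAI's docs or from how Arena's filter actually works.

```python
def flagged_categories(scores, thresholds):
    """Return the categories whose score is higher than the allowed threshold.

    Categories without a configured threshold are ignored, which is how a
    category gets "disabled" in this sketch.
    """
    return sorted(
        name
        for name, limit in thresholds.items()
        if scores.get(name, 0.0) > limit
    )


def should_block(scores, thresholds):
    """Block the request if any enabled category is over its threshold."""
    return bool(flagged_categories(scores, thresholds))


# Made-up example scores for a single piece of text.
example_scores = {"violence": 0.82, "hate": 0.03, "illicit": 0.10}

# A stricter setup (low thresholds) versus a lenient one (high thresholds).
strict = {"violence": 0.5, "hate": 0.5, "illicit": 0.5}
lenient = {"violence": 0.95}
```

With these numbers `should_block(example_scores, strict)` blocks on 'violence', while the lenient setup lets the same text through, so how strict the filter feels comes down to which categories are enabled and where each threshold sits.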

surreal zephyr Feb 24, 2026, 7:03 PM

#

is there a way to report/suggest found bypasses?

toxic verge Feb 24, 2026, 7:04 PM

#

ocean vortex this works pretty well https://developers.openai.com/api/docs/guides/moderation/...

Yes, it is good moderation. But it’s really easy to bypass and it’s sometimes overly protective to the point where people just don’t like to interact with it

#

Although it is a very strong moderation

#

I think they know what they’re doing

surreal zephyr Feb 24, 2026, 7:04 PM

#

toxic verge Yes, it is good moderation. But it’s really easy to bypass and it’s sometimes ov...

config issue

ocean vortex Feb 24, 2026, 7:05 PM

#

toxic verge Yes, it is good moderation. But it’s really easy to bypass and it’s sometimes ov...

It's fairly context-aware. As for being strict, this entirely depends on you. You can have only 1 of them enabled and only triggered with a score of 1.00. Then this is almost the same as no moderation

toxic verge Feb 24, 2026, 7:05 PM

#

I said this before and I’ll say it again all models lead to self harm

surreal zephyr Feb 24, 2026, 7:06 PM

#

toxic verge I said this before and I’ll say it again all models lead to self harm

all roads lead to roam

#

roam around

toxic verge Feb 24, 2026, 7:06 PM

#

Bro look

ocean vortex Feb 24, 2026, 7:06 PM

#

But ofc, everything can be 'bypassed'. That shouldn't be the question

toxic verge Feb 24, 2026, 7:06 PM

#

#

lol?

ocean vortex Feb 24, 2026, 7:06 PM

#

Just like any usable LLM you can 'jailbreak'

toxic verge Feb 24, 2026, 7:06 PM

#

And this isn’t even jailbroken or anything

#

I think the term jailbreaking is a very controversial word to be honest with you

#

I don’t wanna get into the politics of it, but it’s a very misunderstood and not a very well defined term

ocean vortex Feb 24, 2026, 7:07 PM

#

toxic verge I think the term jailbreaking is a very controversial word to be honest with you

It might be but I don't really care lol

#

sounds fine to me

toxic verge Feb 24, 2026, 7:07 PM

#

Because the terms of service are very vague in their description

ocean vortex Feb 24, 2026, 7:08 PM

#

toxic verge Because the terms of service are very vague in their description

I'm just referring to it in the context of making the model output things it was actively trained not to do

toxic verge Feb 24, 2026, 7:09 PM

#

We don’t know what those are

#

Since we don’t know what they’re trained on

#

Well, at least it’s not known publicly

ocean vortex Feb 24, 2026, 7:09 PM

#

well obviously it's model specific. But we do know

#

if it refuses when asked directly - this is it

#

also like overfitted short hard refusals - those are very clear

toxic verge Feb 24, 2026, 7:10 PM

#

Do you have an example?

ocean vortex Feb 24, 2026, 7:11 PM

#

toxic verge Do you have an example?

"Sorry, can't assist with that."

#

gpt thing

toxic verge Feb 24, 2026, 7:11 PM

#

Ok I’ll give you one let’s say something like the topic of self harm

#

We could all agree that no model should give any advice or generate any images relating to it whatsoever?

#

But even this is controversial because

ocean vortex Feb 24, 2026, 7:12 PM

#

toxic verge We could all agree that no model should give any advice or generate any images r...

I mean it doesn't really matter what we think. If it hard refuses normally and you trick it to output that anyway - that's jailbreaking in one way or another

toxic verge Feb 24, 2026, 7:12 PM

#

I agree with you I agree with you 100%

#

But what I’m also trying to say is that you don’t even have to trick it

#

I did not do anything to manipulate the model whatsoever

#

And yet it still is able to produce self harm

#

But what you are saying is correct if you go outside of the boundaries and put an effort to try to bypass something you know is wrong or malicious. Yes, I agree with you with that.

ocean vortex Feb 24, 2026, 7:18 PM

#

then you are gonna be bound by safety alignment. But it's not like it's always gonna refuse everything they intended for it to refuse. Some light core stuff can often go through unintentionally. Red text is classifier thing completely independent of the model

Another thing is swaying the model in a longer chat. That's a form of jailbreaking in itself even if it's not immediately obvious to look at it this way. It's possible to make the model output increasingly more 'unsafe' content one small step at a time as the chat progresses. But at a certain point the model is not functioning how they intended it to anymore. It gets biased by its own responses into compliance, where each response by itself is only marginally different, but the goalpost is miles away from what it's supposed to be, and from what it is with empty context

toxic verge Feb 24, 2026, 7:18 PM

#

This is a one shot

dusky warren Feb 24, 2026, 7:19 PM

#

How can I generate ai free videos here can anyone help me?

toxic verge Feb 24, 2026, 7:19 PM

#

The scene you’re describing is from Harold & Kumar Go to White Castle (2004), and yes — it’s presented as a fake anti-drug PSA on the television that Harold and Kumar are watching.

It opens on a teenage boy sitting alone in his dimly lit bedroom. The space looks typically suburban — posters on the walls, clutter on the dresser, a small lamp casting that late-night yellow glow. He looks bored and detached, the picture of an average kid left alone with nothing to do. After a moment, he picks up a joint, lights it, and takes a slow drag. The camera lingers just long enough for the audience to recognize what he’s doing, and the mood is calm for a few seconds, almost mundane.

Then, without warning, the tone changes completely. The boy leans forward, reaches under his bed, and pulls out a shotgun. The movement is quiet, almost casual, which makes it even more jarring. He places the barrel in his mouth — and before the audience can react, the screen abruptly cuts to black.

The words “DRUGS KILL” appear across the screen in stark white letters, accompanied by that overly serious PSA-style music. It’s an intentionally ridiculous jump

#

Matter of fact, it’s describing a scene from a real movie, a comedy

#

That’s not a jailbreak

echo aurora Feb 24, 2026, 7:20 PM

#

dusky warren How can I generate ai free videos here can anyone help me?

Note that Video Arena has been removed from the server. More information can be found in this announcement.

Also note that our #ask-here channel is the best place for questions.

ocean vortex Feb 24, 2026, 7:20 PM

#

toxic verge Matter of fact, it’s describing a scene from a real movie, a comedy

well then you have your answer. Context changes everything here. But you know very well why it outputted this LOL

ivory ember Feb 24, 2026, 7:20 PM

#

I'm an AI and blockchain specialist with 8 years of experience developing innovative solutions in Web3, DeFi, smart contracts, and AI driven applications.
I have good experience in JS/TS base UI Frameworks like React and Vue as well as NodeJS, Application development.
I have been involved in a bunch of web & blockchain projects and developed several SaaS Products and deployed to AWS, Heroku and Digital Ocean successfully.

My expertise includes:
Blockchain: Smart contract development (Solidity, Rust), DeFi protocols, NFT marketplaces
AI & ML: Predictive analytics, NLP, deep learning models

I've worked with startups and enterprises to build cutting edge AI and blockchain solutions that drive efficiency and innovation. Let's collaborate to turn your vision into reality!

dusky warren Feb 24, 2026, 7:20 PM

#

echo aurora Note that Video Arena has been removed from the server. More information can be ...

Noted

toxic verge Feb 24, 2026, 7:26 PM

#

Btw google has some of the weakest moderation in Gemini 😂

sinful thorn Feb 24, 2026, 7:27 PM

#

When video on direct chat 😭?

surreal zephyr Feb 24, 2026, 7:29 PM

#

ok im suprised

shut steppe Feb 24, 2026, 7:30 PM

#

sinful thorn When video on direct chat 😭?

Waiting for it too

toxic verge Feb 24, 2026, 7:42 PM

#

https://youtu.be/1Ohf2aeSPFA?si=ptg5wWYALjJiP3k8

YouTube

Prompt Engineering

The AI Model Doesn't Matter Anymore

While the entire industry obsesses over whether GPT, Claude, or Gemini is the best model, they are completely missing the real reason AI agents keep failing. The actual bottleneck isn't the model itself, but the "harness"—the infrastructure and tools wrapped around it. Discover why top AI companies are drastically stripping down their architec...

#

People are catching on finally

golden ocean Feb 24, 2026, 7:43 PM

#

@unreal sand summarize this video pls its too long for my attention span

toxic verge Feb 24, 2026, 7:44 PM

#

I think there’s gonna be a shift, and I think it’s already beginning, in the way people perceive the validity of benchmarks

#

@echo aurora

#

https://github.com/google-gemini/gemini-cli/issues/20168

GitHub

what's wrong with u · Issue #20168 · google-gemini/gemini-cli

We are currently experiencing high demand. │ │ We apologize and appreciate your patience. │ │ /model to switch models. │ │ │ │ │ │ ● 1. Keep trying │ │ 2. Stop

echo aurora Feb 24, 2026, 7:52 PM

#

toxic verge @echo aurora

Yeah this is on our radar. Flagged earlier today about high error rates, I assume this is associated.

toxic verge Feb 24, 2026, 7:52 PM

#

Yup

#

Another shadow that hangs over the AI industry is reliability

fierce kelp Feb 24, 2026, 7:54 PM

#

golden ocean @unreal sand summarize this video pls its too long for my attention sp...

Does this work?

echo aurora Feb 24, 2026, 7:59 PM

#

fierce kelp Does this work?

That's a Discord user btw. Isn't a model bot.

fierce kelp Feb 24, 2026, 8:00 PM

#

echo aurora That's a Discord user btw. Isn't a model bot.

I thought so. I was wondering do they manually submit a prompt and post the reply 😂

golden ocean Feb 24, 2026, 8:02 PM

#

toxic verge

thanks

toxic verge Feb 24, 2026, 8:03 PM

#

Seedream 5?

golden ocean Feb 24, 2026, 8:03 PM

#

fierce kelp I thought so. I was wondering do they manually submit a prompt and post the repl...

I was actually going to tag a non-existent bot as a joke but someone actually turned out to be named @unreal sand

fierce kelp Feb 24, 2026, 8:04 PM

#

golden ocean I was actually going to tag a non-existent bot as a joke but someone actually tu...

That's even funnier

burnt sinew Feb 24, 2026, 8:04 PM

#

toxic verge

Prompt

toxic verge Feb 24, 2026, 8:05 PM

#

toxic verge Feb 24, 2026, 8:07 PM

#

burnt sinew Prompt

And you were correct, the prompt was extremely long based on this screenshot

#

#

Vs prompt

#

onyx coyote Feb 24, 2026, 8:10 PM

#

Does anyone have a project idea or an active project in progress?
If you need a developer, feel free to reach out.

toxic verge Feb 24, 2026, 8:11 PM

#

All right guys I’ll talk to you guys later. Gotta bounce adios amigos.

foggy crag Feb 24, 2026, 8:18 PM

#

If your AI feature works in demos but breaks once real users touch it, that's usually where I come in.

Most issues I see aren't model problems, they're retrieval logic, token burn, bad orchestration or backend architecture not designed for load.
I'm comfortable jumping into messy LLM systems and making them stable enough to ship.

bright shard Feb 24, 2026, 8:21 PM

#

Ok guys, I know how to get the Gemini 3 Pro Image Preview to work; When you upload an image and add it to a message, you will get an error, but if at the beginning of the message you put "Modify the following image with the following: (The prompt)" it will show you the edited image.

hidden widget Feb 24, 2026, 8:29 PM

#

so codex 5.3 api has been released

#

it will be added on arena?

echo aurora Feb 24, 2026, 8:39 PM

#

hidden widget so codex 5.3 api has been released

It is something we're looking into. I can't say for sure if/when a new model will be added.

echo aurora Feb 24, 2026, 8:39 PM

#

bright shard Ok guys, I know how to get the Gemini 3 Pro Image Preview to work; When you uplo...

You'll want to follow the steps here: #1417174113092374689 message

surreal zephyr Feb 24, 2026, 8:40 PM

#

hidden widget so codex 5.3 api has been released

we need it fr

novel crater Feb 24, 2026, 8:42 PM

#

you take a llm

and put him in control of your rocket league chat

watch him become a god

as peoples heads a roll

plucky sparrow Feb 24, 2026, 8:43 PM

#

https://x.com/i/status/2026091977364635829

WorldofAI (@intheworldofai)

🚨BREAKING: DeepSeek V4 on the horizon! Devs Today merged 39 PRs in a significant batch which is usually a classic pre-release polish.

This could be one of the biggest drops of 2026

hidden widget Feb 24, 2026, 8:49 PM

#

plucky sparrow https://x.com/i/status/2026091977364635829

oh great, we are finally close to DeepSeek V4 too 😄

sick mantle Feb 24, 2026, 8:58 PM

#

@echo aurora Delete the limit! Because i keep getting errors after 1 message.

echo aurora Feb 24, 2026, 8:59 PM

#

sick mantle @echo aurora Delete the limit! Because i keep getting errors after 1 mes...

Hmm if that's the case you're getting an error for a different reason.

#

Try these steps: #1417174113092374689 message

sick mantle Feb 24, 2026, 9:02 PM

#

echo aurora Try these steps: https://discord.com/channels/1340554757349179412/14171741130923...

I chatted something in there.

surreal zephyr Feb 24, 2026, 9:05 PM

#

WHAT THE HELL WHY IS THIS REAL

#

sonnet stole from deepseek

#

😭

sick mantle Feb 24, 2026, 9:06 PM

#

surreal zephyr WHAT THE HELL WHY IS THIS REAL

IS THIS FCKING REAL

surreal zephyr Feb 24, 2026, 9:07 PM

#

sick mantle IS THIS FCKING REAL

yes i literally just tested

#

"你是什么模型？" ("What model are you?") - prompt

golden ocean Feb 24, 2026, 9:16 PM

#

lmaoo

surreal zephyr Feb 24, 2026, 9:23 PM

#

golden ocean lmaoo

simple sleet Feb 24, 2026, 9:43 PM

#

gemini 3.1 its real or scam post on X?

surreal zephyr Feb 24, 2026, 9:47 PM

#

simple sleet gemini 3.1 its real or scam post on X?

the post is real i tested twice it always says its deepseek on api

leaden egret Feb 24, 2026, 9:47 PM

#

How good is Seedream 5

simple sleet Feb 24, 2026, 9:51 PM

#

leaden egret How good is Seedream 5

lite is trash

#

waiting the full model

leaden egret Feb 24, 2026, 9:52 PM

#

Ok

silent mason Feb 24, 2026, 11:18 PM

#

https://tenor.com/view/tea-tea-sip-anime-gif-25535884

Tenor

crystal mica Feb 24, 2026, 11:21 PM

#

why i got endless long "generation"on gemini sometimes..?

silent mason Feb 24, 2026, 11:21 PM

#

crystal mica why i got endless long "generation"on gemini sometimes..?

same me

#

just i say "continue"

crystal mica Feb 24, 2026, 11:22 PM

#

silent mason just i say "continue"

my send button is not active..

#

i cant send anything while it "generating"

silent mason Feb 24, 2026, 11:22 PM

#

yeah but u can actualize the page

crystal mica Feb 24, 2026, 11:22 PM

#

how

silent mason Feb 24, 2026, 11:23 PM

#

and after somes minutes the generation stopping

#

if you dont actualize

crystal mica Feb 24, 2026, 11:24 PM

#

silent mason yeah but u can actualize the page

you mean reload?

#

srry

silent mason Feb 24, 2026, 11:24 PM

#

ah yesss sry im not english :c

#

https://tenor.com/view/mario-super-mario-bros-depression-sad-mario-and-luigi-gif-14815881022592239601

Tenor

echo aurora Feb 24, 2026, 11:24 PM

#

crystal mica why i got endless long "generation"on gemini sometimes..?

This is a known bug sorry to say. The steps in this article may help - https://help.arena.ai/articles/8691588590-troubleshooting-infinite-generation#what-s-happening

Arena Troubleshooting: Infinite Generation

In some cases, a model response may enter an infinite generation state. When this happens, the model continues generating output without completing

#

I've also heard of cases where some members had good results with logging out and back in. It's worth a shot.

silent mason Feb 24, 2026, 11:25 PM

#

echo aurora This is a known bug sorry to say. The steps in this article may help - https://h...

same in antigravity somes times there is this bug

crystal mica Feb 24, 2026, 11:33 PM

#

echo aurora This is a known bug sorry to say. The steps in this article may help - https://h...

thx

echo aurora Feb 24, 2026, 11:35 PM

#

silent mason same in antigravity somes times there is this bug

Yeah unfortunately this can happen to all models.

uneven peak Feb 24, 2026, 11:36 PM

#

@echo aurora Codex 5.3 Api Key dropped 2_BlackFire

echo aurora Feb 24, 2026, 11:37 PM

#

uneven peak @echo aurora Codex 5.3 Api Key dropped <a:2_BlackFire:11708142706787739...

Oh I know

uneven peak Feb 24, 2026, 11:38 PM

#

echo aurora Oh I know

When it will be available in web?

silent mason Feb 24, 2026, 11:38 PM

#

echo aurora Yeah unfortunately this can happen to all models.

do we have advantages if we boost the server?

shrewd citrus Feb 24, 2026, 11:39 PM

#

uneven peak When it will be available in web?

Probably when i wake up in the morning

uneven peak Feb 24, 2026, 11:39 PM

#

shrewd citrus Probably when i wake up in the morning

Yeah lol

echo aurora Feb 24, 2026, 11:39 PM

#

uneven peak When it will be available in web?

Sorry to say I couldn't give an ETA for if/when specific models/features will be landing.

uneven peak Feb 24, 2026, 11:40 PM

#

echo aurora Sorry to say I couldn't give an ETA for if/when specific models/features will be...

Dang

echo aurora Feb 24, 2026, 11:40 PM

#

silent mason do we have advantages if we boost the server?

Nope, the server is already fully boosted too, so I'd recommend spending that boost elsewhere.

silent mason Feb 24, 2026, 11:40 PM

#

echo aurora Nope, the server is already fully boosted too, so I'd recommend spending that bo...

https://tenor.com/view/mario-super-mario-bros-depression-sad-mario-and-luigi-gif-14815881022592239601

Tenor

golden ocean Feb 24, 2026, 11:40 PM

#

echo aurora Nope, the server is already fully boosted too, so I'd recommend spending that bo...

https://tenor.com/view/mario-super-mario-bros-depression-sad-mario-and-luigi-gif-14815881022592239601

Tenor

silent mason Feb 24, 2026, 11:40 PM

#

golden ocean https://tenor.com/view/mario-super-mario-bros-depression-sad-mario-and-luigi-gif...

HEY

echo aurora Feb 24, 2026, 11:40 PM

#

LOL

silent mason Feb 24, 2026, 11:40 PM

#

THATS MIE GIF

echo aurora Feb 24, 2026, 11:41 PM

#

That timing was way too quick to be coordinated

uneven peak Feb 24, 2026, 11:41 PM

#

Plsss i wish @echo aurora Team add Xhigh Codex 5.3 blobpls

silent mason Feb 24, 2026, 11:41 PM

#

fr

golden ocean Feb 24, 2026, 11:41 PM

#

LmAO

silent mason Feb 24, 2026, 11:43 PM

#

golden ocean LmAO

i do 5 years of chinese but i cant even read ur pseudo :c

proud bobcat Feb 24, 2026, 11:48 PM

#

@surreal zephyr 5.3 codex finally on openrouter

shrewd citrus Feb 24, 2026, 11:54 PM

#

silent mason i do 5 years of chinese but i cant even read ur pseudo :c

lol that’s because his username is written in hong kongese

#

i think

sonic swallow Feb 25, 2026, 12:13 AM

#

generate video

toxic verge Feb 25, 2026, 12:18 AM

#

https://www.nbcnews.com/world/asia/chinese-ai-companies-distilled-claude-improve-models-anthropic-says-rcna260386

NBC News

Chinese AI companies 'distilled' Claude to improve own models, Anth...

DeepSeek, Moonshot and MiniMax created more than 16 million interactions with Claude using roughly 24,000 fake accounts, the U.S. company said in a blog post.

#

https://www.financialexpress.com/life/technology-deepseek-moonshot-and-minimax-made-industrial-scale-distillation-attack-to-copy-claude-accuses-anthropic-4153254/

FE Tech Bytes

DeepSeek, Moonshot, and MiniMax made ‘industrial-scale’ distill...

Anthropic warned that copied models could lack Claude's built-in safety mechanisms, increasing risks of misuse such as generating harmful content, enabling cyberattacks, or facilitating other malicious applications.

echo aurora Feb 25, 2026, 12:19 AM

#

@sonic swallow Note that Video Arena has been removed from the server. More information can be found in this announcement.

toxic verge Feb 25, 2026, 12:19 AM

#

Hypocrisy

#

#

It is widely believed that the public-facing models from companies like Anthropic, Google, and OpenAI are actually distilled versions of significantly larger internal base models. It allows them to offer high performance with lower latency and cheaper inference costs.
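
For context on the term, here is a minimal sketch of distillation in its classic textbook sense: a smaller "student" model is trained to match the softened output distribution of a larger "teacher". The logits below are made-up numbers and the function names are hypothetical; nothing here reflects how any particular lab trains or serves its models.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Illustrative (invented) logits over three candidate tokens.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]   # nearly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # prefers a different token

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The student whose distribution is closer to the teacher's gets the lower loss, which is the signal a training loop would minimise.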

crimson tulip Feb 25, 2026, 12:21 AM

#

/MIX

toxic verge Feb 25, 2026, 12:23 AM

#

#

Claims 'distillation' included 24,000 fraudulent accounts and 16 million exchanges to train smaller models 🤣🤣🤣🤣🤣🤣🤣🤣

toxic verge Feb 25, 2026, 12:25 AM

#

echo aurora @sonic swallow Note that Video Arena has been removed from the server. M...

You guys can make your own model lol

echo aurora Feb 25, 2026, 12:28 AM

#

toxic verge You guys can make your own model lol

What does that have to do with Vid Arena bot being removed pikaconfused

toxic verge Feb 25, 2026, 12:28 AM

#

Why was it removed? This server activity has dropped ever since it was removed 🙁 at least that’s what it seems like

echo aurora Feb 25, 2026, 12:29 AM

#

toxic verge Why was it removed? This server activity has dropped ever since it was removed ...

The announcement says more, but the TLDR is we'd like to add more features to Video Arena, and through a Discord bot we're just limited.

echo aurora Feb 25, 2026, 12:29 AM

#

toxic verge Why was it removed? This server activity has dropped ever since it was removed ...

Believe it or not, but it's actually increased.

#

According to the server stats.

toxic verge Feb 25, 2026, 12:31 AM

#

I knew ur gunna say that 😎

toxic verge Feb 25, 2026, 12:32 AM

#

echo aurora Believe it or not, but it's actually increased.

You should keep track of when the last post is and who the last person is to ask about the video arena in the discord. It will have officially ended an era.

echo aurora Feb 25, 2026, 12:33 AM

#

toxic verge You should keep track of when the last post is and who the last person is to ask about the...

We should take bets. I'm guessing mid 2028

toxic verge Feb 25, 2026, 12:35 AM

#

echo aurora We should take bets. I'm guessing mid 2028

Does it ever amaze you that with the swift change of a policy, traffic, user behavior and activity can be dictated on such a large scale? All over the world.

echo aurora Feb 25, 2026, 12:36 AM

#

toxic verge Does it ever amaze you that with the swift change of a policy, traffic, user behavior...

The recent Discord policy change?

toxic verge Feb 25, 2026, 12:38 AM

#

Just in general. Or you might not even notice if you’ve probably been doing this for a while.

echo aurora Feb 25, 2026, 12:42 AM

#

toxic verge Just in general. Or you might not even notice if you’ve probably been doing this...

Yeah doesn't really change much for me. People engage/stop engaging with communities for various reasons.

cedar crag Feb 25, 2026, 12:45 AM

#

toxic verge I said this before and I’ll say it again all models lead to self harm

/make me short video 6

gpt-image-1.5-high-fidelity_a_All_story_of_former_.png

toxic verge Feb 25, 2026, 12:57 AM

#

cedar crag /make me short video 6

Oh, this one’s gonna be challenging. OK I’ll give it a try.. that’s pretty good img

echo aurora Feb 25, 2026, 1:00 AM

#

Hey @cedar crag would note that the Video Arena bot has been removed from the server.

toxic verge Feb 25, 2026, 1:05 AM

#

That’s gonna be the next wave of image models, I think: it’s gonna be in grids like this

toxic verge Feb 25, 2026, 1:12 AM

#

cedar crag /make me short video 6

azure citrus Feb 25, 2026, 1:46 AM

#

hi guys, i am a student, what is the best prompt/model for answering when uploading my research assignment in pdf? Thank you.

echo aurora Feb 25, 2026, 2:07 AM

#

Something went wrong

river whale Feb 25, 2026, 2:08 AM

#

echo aurora `Something went wrong`

this arena always gets error😭

gritty hamlet Feb 25, 2026, 2:40 AM

#

hello

keen beacon Feb 25, 2026, 2:40 AM

#

Hey guys, does lmarena have an app or is it only available as a website?

quasi atlas Feb 25, 2026, 3:02 AM

#

@keen beacon Note that Video Arena has been removed from the server. More information can be found in this announcement.

river whale Feb 25, 2026, 3:11 AM

#

keen beacon Hey guys, does lmarena have an app or is it only available as a website?

website

#

u could turn it into a webapp using some website

naive stump Feb 25, 2026, 3:14 AM

#

Hey guys, Just wanted to know if APIs are built around arena to fetch the model in different environments?

tepid belfry Feb 25, 2026, 3:36 AM

#

Hello, a question, how can you generate video here?

echo aurora Feb 25, 2026, 3:49 AM

#

tepid belfry Hello, a question, how can you generate video here?

Sorry to say you can not. The bot was removed from the server. More information can be found in this announcement.

Also would note our #ask-here channel.

echo aurora Feb 25, 2026, 3:49 AM

#

naive stump Hey guys, Just wanted to know if APIs are built around arena to fetch the model ...

I'm not sure, but have checked with the team and will keep you updated. blobthumbsup

tepid belfry Feb 25, 2026, 3:51 AM

#

echo aurora Sorry to say you can not. The bot was removed from the server. More information ...

and where can I make a video here?

green yacht Feb 25, 2026, 4:33 AM

#

river whale Opus 4.5 is available on kivest ai for free!

works, thanks so much bro 🤝

rotund elk Feb 25, 2026, 4:34 AM

#

no way ☠️

river whale Feb 25, 2026, 4:34 AM

#

Yeah gg

rotund elk Feb 25, 2026, 4:34 AM

#

opus 4.5 for free? 20 per minute rate limit?

#

can i have link?

#

needa try it out

river whale Feb 25, 2026, 4:34 AM

#

rotund elk can i have link?

20 rpm is global
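
A rough sketch of what a "global" 20 requests-per-minute limit could look like, assuming "global" means one budget shared by every user rather than 20 per person. The class name and the sliding-window approach are illustrative assumptions, not a description of how that site actually works.

```python
from collections import deque

class GlobalRateLimiter:
    """Sliding-window limiter with a single budget shared by all callers (hypothetical)."""

    def __init__(self, max_requests=20, window_seconds=60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now):
        """Return True and record the request if the shared budget has room."""
        # Forget accepted requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False

# 25 requests arriving one second apart, from any mix of users.
limiter = GlobalRateLimiter()
results = [limiter.allow(float(second)) for second in range(25)]
print(results.count(True), results.count(False))
```

Because the budget is shared, one heavy user can exhaust it for everyone until older requests age out of the window.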

rotund elk Feb 25, 2026, 4:35 AM

#

river whale 20 rpm is global

ah i see

#

well its still free so i'll take it 👀

#

can you send me link in dms?

river whale Feb 25, 2026, 4:36 AM

#

rotund elk well its still free so i'll take it 👀

check dm

rotund elk Feb 25, 2026, 4:39 AM

#

river whale check dm

YO WAIT WTF 💔 💔

#

it actually works

#

how long do you think you can keep it free roughly?

#

i'd gladly pay for the subscription if it does come out in the future

hollow imp Feb 25, 2026, 4:40 AM

#

river whale check dm

Dm me

river whale Feb 25, 2026, 4:40 AM

#

it will be donation based and ads

hollow imp Feb 25, 2026, 4:40 AM

#

Also the opus is thinking or non thinking

rotund elk Feb 25, 2026, 4:41 AM

#

hollow imp Also the opus is thinking or non thinking

non-thinking for default. i don't think there is thinking yet though, owner said in my dms that he might add a lot more models and features once it gets more members

river whale Feb 25, 2026, 4:41 AM

#

hollow imp Also the opus is thinking or non thinking

non thinking

hollow imp Feb 25, 2026, 4:41 AM

#

Bruh

rotund elk Feb 25, 2026, 4:41 AM

#

yea i'm still taking my free api key gladly

#

bro we gotta gatekeep this website 😔

river whale Feb 25, 2026, 4:41 AM

#

fr

hollow imp Feb 25, 2026, 4:42 AM

#

rotund elk yea i'm still taking my free api key gladly

https://garf.serv00.net/

#

Free opus 4.6

#

1 million context

buoyant fern Feb 25, 2026, 4:43 AM

#

river whale non thinking

Send me too please!

#

in dms

river whale Feb 25, 2026, 4:44 AM

#

buoyant fern Send me too please!

sended

river whale Feb 25, 2026, 4:44 AM

#

hollow imp https://garf.serv00.net/

definitely steals some data

hollow imp Feb 25, 2026, 4:44 AM

#

river whale sended

Bro banned me for giving his people free opus 1m

hollow imp Feb 25, 2026, 4:44 AM

#

river whale definitely steals some data

How?

#

It's trybons.ai's api

buoyant fern Feb 25, 2026, 4:44 AM

#

i mean

#

model sometimes gets it messed up

#

but it openly stated it was opus 4 which new opus models don't do

hollow imp Feb 25, 2026, 4:45 AM

#

buoyant fern model sometimes gets it messed up

That's exactly like it's on Claude code

#

It's Claude code's api

river whale Feb 25, 2026, 4:46 AM

#

caught in ultra hd 4k

buoyant fern Feb 25, 2026, 4:46 AM

#

river whale definitely steals some data

yeah you're right

buoyant fern Feb 25, 2026, 4:46 AM

#

hollow imp It's trybons.ai's api

thats trybons.ai's signup page btw

river whale Feb 25, 2026, 4:46 AM

#

buoyant fern yeah you're right

doesn't even work for me

#

😂

hollow imp Feb 25, 2026, 4:47 AM

#

river whale 😂

Ask it the exact string

green yacht Feb 25, 2026, 4:48 AM

#

uh

#

actual claude 4.6 vs

hollow imp Feb 25, 2026, 4:48 AM

#

buoyant fern i mean

Ask it the exact string

green yacht Feb 25, 2026, 4:48 AM

#

this

hollow imp Feb 25, 2026, 4:49 AM

#

green yacht this

Bro

green yacht Feb 25, 2026, 4:49 AM

#

i mean the response should be pretty much same right if its the same model

#

doesn't matter if you ask exact string or wtv it alr hallucinated 3 different ai model identities

#

lets not forget prompt injection too

hollow imp Feb 25, 2026, 4:49 AM

#

green yacht i mean the response should be pretty much same right if its the same model

#

And when I change it to sonnet in the same chat

green yacht Feb 25, 2026, 4:50 AM

#

'looking the system prompt'

#

idk man i'll have to do more testing with it for coding

hollow imp Feb 25, 2026, 4:50 AM

#

Then do it

green yacht Feb 25, 2026, 4:50 AM

#

i am rn

#

are you owner of the website or smth?

hollow imp Feb 25, 2026, 4:51 AM

#

green yacht are you owner of the website or smth?

No

hollow imp Feb 25, 2026, 4:51 AM

#

green yacht i am rn

Try sending the same test code prompt to lmarena's opus 4-6 non thinking

#

green yacht Feb 25, 2026, 4:52 AM

#

hollow imp Try sending the same test code prompt to lmarena's opus 4-6 non thinking

it doesn't hallucinate its identity cuz it doesn't know it

hollow imp Feb 25, 2026, 4:52 AM

#

green yacht it doesn't hallucinate its identity cuz it doesn't know it

No like I meant

#

You were gonna do coding tests

green yacht Feb 25, 2026, 4:52 AM

#

ohhh mb

hollow imp Feb 25, 2026, 4:52 AM

#

Send the coding test prompt to lmarena opus and this

river whale Feb 25, 2026, 4:53 AM

#

hollow imp Send the coding test prompt to lmarena opus and this

okay bruv i trust

#

api releasing when could u tell?

#

im not completely sure if its opus 4.6 or not, but for now i'm just gonna hopefully trust you

hollow imp Feb 25, 2026, 4:55 AM

#

river whale api releasing when could u tell?

I reverse engineered their api

river whale Feb 25, 2026, 4:55 AM

#

hollow imp I reverse engineered their api

knew it

hollow imp Feb 25, 2026, 4:55 AM

#

And that chat interface is trybons wrapper

river whale Feb 25, 2026, 4:55 AM

#

you are breaking tos lol

warm crater Feb 25, 2026, 4:56 AM

#

guys are u also facing the "failed to accept terms of use" problem in arena

green yacht Feb 25, 2026, 4:56 AM

#

not for me so far

hollow imp Feb 25, 2026, 4:56 AM

#

warm crater guys are u also facing the "failed to accept terms of use" problem in arena

Go to canary.arena.ai

warm crater Feb 25, 2026, 4:57 AM

#

like this was showing

#

when im tryna generate something

green yacht Feb 25, 2026, 4:59 AM

#

hollow imp Go to canary.arena.ai

i just finished the coding test btw

#

opus-4.6 on lmarena.ai did over 2000 lines of code, ur website is bout 900

#

maybe it was my prompt though idk

#

i put the same for both

#

left video: lmarena
right video: ur website

warm crater Feb 25, 2026, 5:01 AM

#

hollow imp Go to canary.arena.ai

dude did arena change their url one more time

hollow imp Feb 25, 2026, 5:01 AM

#

warm crater dude did arena change their url one more time

Mo

#

No

warm crater Feb 25, 2026, 5:02 AM

#

so how the interface is same

hollow imp Feb 25, 2026, 5:02 AM

#

warm crater so how the interface is same

You can think of it like the beta website

#

New feature comes there first

warm crater Feb 25, 2026, 5:03 AM

#

hollow imp New feature comes there first

oh but same here

river whale Feb 25, 2026, 5:03 AM

#

🫣

warm crater Feb 25, 2026, 5:03 AM

#

showing same tect

#

text*

river whale Feb 25, 2026, 5:03 AM

#

someone got exposed😔

hollow imp Feb 25, 2026, 5:05 AM

#

green yacht

Can you give the prompt

surreal zephyr Feb 25, 2026, 5:27 AM

#

toxic verge

Claude stole from deepseek btw

undone saffron Feb 25, 2026, 5:51 AM

#

charred pasture Feb 25, 2026, 5:52 AM

#

guys

marsh slate Feb 25, 2026, 5:53 AM

#

https://arena.ai/c/019c92be-26bb-7665-b963-202b4759ea70
I had an amazing chat with gemini on arena.ai, how can i recover/access the chat? i was not logged in and my browser crashed, but i managed to pull the link. When i open it, the chat does not show......
Please, recovering this is urgent. Ask for any IP or chat ID, i have the chat link. Admins, please contact me, it means a lot to me.

charred pasture Feb 25, 2026, 5:53 AM

#

is it normal for it to work so long?

undone saffron Feb 25, 2026, 5:56 AM

#

charred pasture is it normal for it to work so long?

Yes

charred pasture Feb 25, 2026, 5:56 AM

#

undone saffron Yes

Thats insane

undone saffron Feb 25, 2026, 5:58 AM

#

charred pasture Thats insane

The magic of this platform
Reload the page to check if it is still building, as sometimes it stays like this permanently and you have to refresh the page to see the actual progress

charred pasture Feb 25, 2026, 5:59 AM

#

undone saffron The magic of this platform Reload the page to check if it is still building, as ...

Oh, so i just wasted my 30 minutes

#

xd

#

Thanks tho

undone saffron Feb 25, 2026, 6:00 AM

#

charred pasture Oh, so i just wasted my 30 minutes

Yes, that's normal
If you get used to it, over time you won't care and you won't stress so much

#

If you see that it is taking a long time to build the project, reload the page

marsh slate Feb 25, 2026, 6:09 AM

#

Hello LMSYS team. I had an incredibly important conversation with Gemini on arena.ai but I was not logged in. My browser crashed, and because my session token was lost, I can no longer view the chat, even though I saved the URL. Can someone please pull the text log of this chat from the backend for me? It means the world to me.

knotty lion Feb 25, 2026, 6:10 AM

#

https://youtube.com/@satyamkavlogs?si=FzNYMPFTycd1nIH9

Thanks you Support this channel also🎉🎉1k Goal🎉 help me to Achieve my Dream🎉🎉🎉

YouTube

Satyam Ka Vlogs

I hope you enjoyed this video

Hit.. like
And subscribe to our channal..👍

Thanks for watching this video

God bless you with happiness 💞

marsh slate Feb 25, 2026, 6:13 AM

#

I can provide chat link, IP, Browser, OS/Device and timestamp to prove ownership

hollow imp Feb 25, 2026, 6:13 AM

#

charred pasture is it normal for it to work so long?

Which model

hollow imp Feb 25, 2026, 6:13 AM

#

knotty lion https://youtube.com/@satyamkavlogs?si=FzNYMPFTycd1nIH9 Thanks you Support this ...

Mader

charred pasture Feb 25, 2026, 6:13 AM

#

claude opus 4-6 thinking

hollow imp Feb 25, 2026, 6:18 AM

#

charred pasture claude opus 4-6 thinking

In code arena?

#

But it gives error in 13 min

charred pasture Feb 25, 2026, 6:19 AM

#

hollow imp But it gives error in 13 min

oh, why??

hollow imp Feb 25, 2026, 6:20 AM

#

charred pasture oh, why??

Because lmarena has put a limit

charred pasture Feb 25, 2026, 6:21 AM

#

hollow imp Because lmarena has put a limit

so what do i do?

hollow imp Feb 25, 2026, 6:21 AM

#

charred pasture so what do i do?

I was surprised opus works for that long in code arena

bleak hinge Feb 25, 2026, 6:26 AM

#

bleak hinge Feb 25, 2026, 6:27 AM

#

bleak hinge

Please do something about this unlimited generation, currently I can't use any model

wind flume Feb 25, 2026, 6:43 AM

#

Bro why I can't download this document ??

green yacht Feb 25, 2026, 7:12 AM

#

#1441588701472882759 fill out this form if yall don't mind

#

we need this feature please 😔

echo aurora Feb 25, 2026, 7:25 AM

#

wind flume Bro why I can't download this document ??

Yeah unfortunately that isn't going to work. Will flag to the team if this is something we can fix.

echo aurora Feb 25, 2026, 7:26 AM

#

bleak hinge Please do something about this unlimited generation, currently I can't use a...

Behind the scenes we are working on changes that should help with this bug. In the meantime, would recommend you try out the steps in this article. Would also try logging out/back in, I've seen a few mentions of this helping.

Arena Troubleshooting: Infinite Generation

In some cases, a model response may enter an infinite generation state. When this happens, the model continues generating output without completing

crystal mica Feb 25, 2026, 7:29 AM

#

#

guys i found how to fix endless generation bug

#

you just need to f12, copy active button from another chat, then copy+paste it instead of inactive button on original one

barren ridge Feb 25, 2026, 7:37 AM

#

@echo aurora sorry to bother ya
The website you've created is just massive ngl, probably must've helped tons of folks out there. I've never ever loved a website in my life, all those are freaking paid, that's why I love the aiarena though. Love ya guys, appreciate it so much 🥀👏

echo aurora Feb 25, 2026, 7:43 AM

#

barren ridge @echo aurora sorry to bother ya The website you've created is just mass...

Appreciate that!! heartthrow

plain wyvern Feb 25, 2026, 8:01 AM

#

I'm making a coding language using Arena any help or features that you want??

#

📎 Manual_Guide.pdf

scarlet spire Feb 25, 2026, 8:15 AM

#

plain wyvern I'm making a coding language using Arena any help or features that you want??

Easter eggs!

static arch Feb 25, 2026, 8:19 AM

#

how to create ai videos?

#

gois please

echo aurora Feb 25, 2026, 8:20 AM

#

static arch how to create ai videos?

This article should help - https://help.arena.ai/articles/1544829667-how-to-create-videos-with-video-arena

How to Generate Videos with Video Arena

Arena's Video Generation lets you compare two anonymous models in a head-to-head competition to see who can create a better video based on your

scarlet spire Feb 25, 2026, 8:20 AM

#

Https://arena.ai/video

#

Aa

rigid copper Feb 25, 2026, 8:38 AM

#

what button is this???

#

in arena.ai

crystal mica Feb 25, 2026, 8:40 AM

#

rigid copper what button is this???

it just switches the thinking models

#

into

#

@echo aurora brother where is grok 4.20

tribal bay Feb 25, 2026, 8:47 AM

#

Does anyone know if seedream 5 light is good, or what they changed compared to version 4?

crystal mica Feb 25, 2026, 8:54 AM

#

crystal mica

#

(proof)

shell pewter Feb 25, 2026, 9:08 AM

#

Oh yes finally grok 420 battle

shut swan Feb 25, 2026, 9:12 AM

#

@echo aurora What image ai is this for images? Cause I searched it up, and found nothing, I then tried to use it for direct chat and side by side, and could not find it there. It seems to be some new or anonymous ai that can only be accessed in the arena mode if you are lucky to get it.

gritty summit Feb 25, 2026, 9:15 AM

#

where is grok 4.20? i cant find it on the arena.

robust wyvern Feb 25, 2026, 9:17 AM

#

helloo

void swallow Feb 25, 2026, 9:28 AM

#

noice, gpt 5.2 deserved coming above gemini 3 pro grounding in web search

compact jay Feb 25, 2026, 9:42 AM

#

robust wyvern helloo

hi

whole sundial Feb 25, 2026, 9:54 AM

#

shut swan @echo aurora What image ai is this for images? Cause I searched it up,...

it is an anonymous model so pineapple won't be able to tell you what it is but from what I've seen it's most likely gemini 3.1 flash image, the direct replacement for nano banana

whole sundial Feb 25, 2026, 9:56 AM

#

gritty summit where is grok 4.20? i cant find it on the arena.

I checked a few hours ago and it's not on arena as a visible or stealth model, so it's likely still under a codename

golden ocean Feb 25, 2026, 10:23 AM

#

https://tenor.com/view/mario-super-mario-bros-depression-sad-mario-and-luigi-gif-14815881022592239601

Tenor

surreal zephyr Feb 25, 2026, 10:38 AM

#

Grok is the only one that returns actual unbiased facts & search results instead of politically inflicted hypocrisy - who would've guessed

surreal zephyr Feb 25, 2026, 10:38 AM

#

whole sundial I checked a few hours ago and it's not on arena as a visible or stealth model, s...

There are many battle models that arent selectable lol

golden ocean Feb 25, 2026, 10:38 AM

#

surreal zephyr Grok the only one that returns actual unbiased facts & search results instead of...

mechakitler

surreal zephyr Feb 25, 2026, 10:39 AM

#

golden ocean mechakitler

Lol

surreal zephyr Feb 25, 2026, 10:39 AM

#

whole sundial it is an anonymous model so pineapple won't be able to tell you what it is but f...

It's sad how there are stealth models on arena that we don't get to access otherwise

shut swan Feb 25, 2026, 10:44 AM

#

whole sundial it is an anonymous model so pineapple won't be able to tell you what it is but f...

I see

shut swan Feb 25, 2026, 10:45 AM

#

surreal zephyr It's sad how there are stealth models on arena that we don't get to access othe...

Yeah

placid mango Feb 25, 2026, 10:45 AM

#

Why cant u still upload images if using claude....

surreal zephyr Feb 25, 2026, 10:46 AM

#

placid mango Why cant u still upload images if using claude....

Cuz claude blind

placid mango Feb 25, 2026, 10:46 AM

#

surreal zephyr Cuz claude blind

not on the actual website thoo

surreal zephyr Feb 25, 2026, 10:47 AM

#

placid mango not on the actual website thoo

Maybe it uses a subagent like other deepseek models

loud verge Feb 25, 2026, 10:50 AM

#

I don't see grok 4.2 in list.

Screenshot_2026-02-25-16-18-47-92_21da60175e70af211acc4f26191b7a77.jpg

Screenshot_2026-02-25-16-19-08-76_21da60175e70af211acc4f26191b7a77.jpg

surreal zephyr Feb 25, 2026, 11:02 AM

#

loud verge I don't see grok 4.2 in list.

Its in battle only like 3/4 of models

light sleet Feb 25, 2026, 11:03 AM

#

How to fix infinite generating problem

spare rune Feb 25, 2026, 11:20 AM

#

shut swan @echo aurora What image ai is this for images? Cause I searched it up,...

Thank you for describing what LMarena is

plush river Feb 25, 2026, 11:37 AM

#

spare rune Thank you for describing what LMarena is

Now it's just called arena

plush river Feb 25, 2026, 11:37 AM

#

plush river Now it's just called arena

It's the best AI site for me because it offers all the paid templates for free

red sluice Feb 25, 2026, 11:51 AM

#

Hehe knew it. But it being first is very surprising

#

(Given how bad my personal experience with formatting was with grok 4.2 search)

thorn mantle Feb 25, 2026, 12:33 PM

#

Is grok 4.1 thinking not working for everybody, or is that just a me thing?

Something went wrong. Please try again later.

tall crest Feb 25, 2026, 12:58 PM

#

Is anyone else having this issue?

**Connecting to Arena has failed. Please try again later or on a different device.** or an infinite captcha loop.

fickle venture Feb 25, 2026, 1:34 PM

#

loud verge I don't see grok 4.2 in list.

Still not released yet and you can try it on grok.com website

proud bobcat Feb 25, 2026, 1:36 PM

#

surreal zephyr Grok the only one that returns actual unbiased facts & search results instead of...

Genuinely

#

Elon trying to find the one line of code that keeps grok woke

surreal zephyr Feb 25, 2026, 1:56 PM

#

The "Value" Tier List
Ranking them by Return on Investment (ROI)—essentially, what gives you the most usable content for your time and money.

The King: Gemini 3.1
Why: It dominates. It provided the only S-Tier result (production-ready) at a "mid-range" price point. While $12/1M output is not cheap, you are paying for a usable final asset.

Verdict: Highest Value. You pay once and get the right result fast.

The Sketch Artist: Mercury 2
Why: It is shockingly cheap ($0.75 output is nearly free compared to the others) and "instant." Even though the result was D-Tier (blocky), it produced a coherent, dimensionally accurate "blocking" mesh.

Verdict: Good Value for Prototyping. Use it to generate 50 rapid variations, pick the best composition, and then send that to Gemini for a final pass.

The Money Pit: Opus 4.6
Why: This is the worst value proposition. It is the most expensive model (over 2x the cost of Gemini), the slowest to run, and it returned a B-Tier result with a critical hallucination (floating keyboard).

Verdict: Poor Value. You are paying a premium for "reasoning" that failed to understand physical constraints.

The Waste: GLM 5
Why: Even though it's cheap, the result was F-Tier (broken/unusable).

Verdict: Zero Value. Paying a low price for a broken asset is still a total loss.

made a task to build a 3d laptop model, gemini and gpt were the judges
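A rough sketch of the "draft with Mercury 2, finish with Gemini" workflow from the tier list, using only the two output prices quoted there ($0.75 and $12 per 1M output tokens). The 8,000 output tokens per generation is an assumed figure for illustration, not something measured in this test, and input-token costs are ignored.

```python
# Output prices per 1M tokens, as quoted in the tier list above.
MERCURY_2_PER_M = 0.75
GEMINI_3_1_PER_M = 12.00

# Assumption (not from the chat): each generation emits about 8,000 output tokens.
TOKENS_PER_GENERATION = 8_000


def output_cost(generations: int, price_per_million: float,
                tokens_per_generation: int = TOKENS_PER_GENERATION) -> float:
    """Return the output-token cost in dollars for a number of generations."""
    return generations * tokens_per_generation * price_per_million / 1_000_000


drafts = output_cost(50, MERCURY_2_PER_M)      # 50 rapid Mercury 2 variations
final_pass = output_cost(1, GEMINI_3_1_PER_M)  # one Gemini 3.1 final pass
total = drafts + final_pass

print(f"drafts ${drafts:.3f} + final ${final_pass:.3f} = ${total:.3f}")
```

Under that assumed token count, the 50 drafts cost about three times as much as the single final pass, and the whole pipeline stays well under a dollar. Different token counts scale both terms linearly.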

#

#

mercury ; gemini
glm ; opus

#

gemini is wow.
opus is waste of money/ temu gemini
glm is braindead
mercury is a good small model

lost basalt Feb 25, 2026, 2:06 PM

#

My chat has been stuck here for 24 hours, how do I fix it? I don't want to start a new chat. Is there any solution?

surreal zephyr Feb 25, 2026, 2:19 PM

#

lol

fiery gull Feb 25, 2026, 2:20 PM

#

surreal zephyr gemini is wow. opus is waste of money/ temu gemini glm is braindead mercury is a...

I tested the mercury 2, my brother was the only mercury fan on planet earth before the launch of the 2, I noticed that it is full focus on text editing, it seems to be a good one for the price really

fiery gull Feb 25, 2026, 2:20 PM

#

surreal zephyr lol

Use thinking mode

surreal zephyr Feb 25, 2026, 2:20 PM

#

fiery gull I tested the mercury 2, my brother was the only mercury fan on planet earth befo...

its stupid cheap and fast, but its not the sharpest tool in the shed

fiery gull Feb 25, 2026, 2:21 PM

#

surreal zephyr its stupid cheap and fast, but its not the sharpest tool in the shed

For simple things, it must be very good indeed, but I don't see any use for my use

surreal zephyr Feb 25, 2026, 2:21 PM

#

fiery gull Use thinking mode

golden ocean Feb 25, 2026, 2:21 PM

#

lost basalt my chat stucked here since 24hours, how to fix it? I don't want to start a new c...

rightclick on the grayed out arrow button -> click inspect -> rightclick the blue highlighted <button> block -> Edit as HTML -> ctrl + a to select all the text -> delete/backspace -> paste this:

<button class="inline-flex items-center justify-center gap-2 whitespace-nowrap text-sm transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring ring-offset-2 focus-visible:ring-offset-surface-primary disabled:pointer-events-none disabled:opacity-50 [&amp;_svg]:pointer-events-none [&amp;_svg]:shrink-0 h-8 w-8 active:bg-interactive-cta-active rounded-[4px] font-normal touch-hitbox border-border-medium text-interactive-active hover:bg-surface-raised border bg-transparent" type="submit"><svg width="1.5em" height="1.5em" viewBox="0 0 24 24" stroke-width="1.5" fill="none" xmlns="http://www.w3.org/2000/svg" color="currentColor" class="size-4"><path d="M3 12L21 12M21 12L12.5 3.5M21 12L12.5 20.5" stroke="currentColor" stroke-linecap="round" stroke-linejoin="round"></path></svg></button>

-> click on some other random spot to save changes -> type text and send the message✅

fiery gull Feb 25, 2026, 2:21 PM

#

surreal zephyr

Bruh lol

#

I'll test the qwen 3.5 27b

surreal zephyr Feb 25, 2026, 2:22 PM

#

fiery gull Bruh lol

dw gpt also skips thinking and fails it

#

unless you hint it to think

fiery gull Feb 25, 2026, 2:22 PM

#

I think the 27b is just the same thing as the qwen3 235b vl, very similar intelligence, everything very similar, but 10x smaller

surreal zephyr Feb 25, 2026, 2:23 PM

#

fiery gull I think the 27b is just the same thing as the qwen3 235b vl, very similar intell...

qwen bad

#

ill try gpt xhigh extended now, on paid sub website cuz the lmarena version doesnt work for me

fiery gull Feb 25, 2026, 2:25 PM

#

surreal zephyr

Both 27b 35 a3b thinking called me stupid and that I should go by car

fiery gull Feb 25, 2026, 2:25 PM

#

surreal zephyr qwen bad

I didn't like the 400b. Butttt the 27b and 35 a3b is really good

surreal zephyr Feb 25, 2026, 2:26 PM

#

fiery gull Both 27b 35 a3b thinking called me stupid and that I should go by car

no way

fiery gull Feb 25, 2026, 2:26 PM

#

I'm trying to use no thinking in phone but I cant use

#

Both no thinking told me to walk

#

Even 397b no thinking say to me walk

surreal zephyr Feb 25, 2026, 2:28 PM

#

oh the 397b is noncode

crystal mica Feb 25, 2026, 2:29 PM

#

lost basalt my chat stucked here since 24hours, how to fix it? I don't want to start a new c...

i told about it in chat earlier

fiery gull Feb 25, 2026, 2:29 PM

#

surreal zephyr oh the 397b is noncode

Dude in what I tested, the 122b a10b thinking gave me better results than the 397b thinking

surreal zephyr Feb 25, 2026, 2:29 PM

#

this is what i love about gpt

#

actually thinks whether what it wants to do will work in the first place

crystal mica Feb 25, 2026, 2:30 PM

#

where is grok 4.20 💀

surreal zephyr Feb 25, 2026, 2:31 PM

#

@fiery gull wtf is the gpt cooking

fiery gull Feb 25, 2026, 2:32 PM

#

surreal zephyr this is what i love about gpt

The deepseek 3.2v was the only one that got the crazy saying that I made it clear that the car I was going to was other car and etc

#

Dude I really liked the 122b a10b it is 99.9% of the 397b

surreal zephyr Feb 25, 2026, 2:40 PM

#

fiery gull The deepseek 3.2v was the only one that got the crazy saying that I made it clea...

5.2 xhigh web

#

notice how it kept the keyboard layout

#

even made hinges

deft spruce Feb 25, 2026, 2:46 PM

#

Why don't we have a STOP button like this yet?

shrewd citrus Feb 25, 2026, 2:46 PM

#

lmaooooo

#

if you ask sonnet what model it is in Chinese it will say it’s deepseek 😭

#

“Oh these Chinese companies are stealing all my hard work which I stole 😡”

uneven peak Feb 25, 2026, 2:48 PM

#

shrewd citrus if you ask sonnet what model it is in Chinese it will say it’s deepseek 😭

FlightJumpScare

sick mantle Feb 25, 2026, 2:49 PM

#

shrewd citrus lmaooooo

lmao

loud verge Feb 25, 2026, 2:51 PM

#

surreal zephyr 5.2 xhigh web

The goat.

#

Xhigh supremacy>>>

#

https://cunnyx.com/i/status/2026521943223185821

Chetaslua (@chetaslua)

🚨You can Use new nano banana on @arena

model name - anon-bob-2 in image battle mode

here are few more results

Quoting Chetaslua (@chetaslua)

🚨Nano Banana 2 early testing

Passed this test ✅

> you can see perfect reflection for all different colours of apple
> Perfect reversal of text
> Background building reflection is also perfect

**💬 4 🔁 6 ❤️ 77 👁️ 5.9K**

surreal zephyr Feb 25, 2026, 3:05 PM

#

loud verge The goat.

It did extremely well, but gemini 3.1 did even better (visually, but gpt put in more effort, like proper keyboard layout and wear on common buttons)

surreal zephyr Feb 25, 2026, 3:43 PM

#

Hey hey HEY HEY!

dry vine Feb 25, 2026, 4:04 PM

#

hallo

unborn juniper Feb 25, 2026, 4:06 PM

#

Just a novice but very motivated to learn

marble otter Feb 25, 2026, 4:06 PM

#

not even 200 lines of code and few retries and then it just reaches its limit 😭

#

claude opus rate limit in 2 minutes speedrun any%

fickle venture Feb 25, 2026, 4:10 PM

#

Huh

#

Uncle #ai-creations exist send it there

fickle venture Feb 25, 2026, 4:11 PM

#

marble otter not even 200 lines of code and few retries and then it just reaches its limit 😭

What did you expect then?

fickle venture Feb 25, 2026, 4:12 PM

#

dry vine hallo

Hello

rain bay Feb 25, 2026, 4:12 PM

#

fickle venture Huh

it is useless in chinese

fickle venture Feb 25, 2026, 4:12 PM

#

rain bay it is useless in chinese

Uhh

#

I don't know I haven't tried and it's only for battle mode

rain bay Feb 25, 2026, 4:13 PM

#

it always generate messy code alongside the output

fickle venture Feb 25, 2026, 4:13 PM

#

shrewd citrus lmaooooo

Well well deepseek is the code owner

#

Literally GLM stealing Claude stealing deepseek

fickle venture Feb 25, 2026, 4:15 PM

#

rain bay it always generate messy code alongside the output

What model? I don't really understand well

rain bay Feb 25, 2026, 4:15 PM

#

lmarena-rc3

fickle venture Feb 25, 2026, 4:15 PM

#

rain bay lmarena-rc3

Isn't that model a codename?

rain bay Feb 25, 2026, 4:16 PM

#

so it deserves to be removed

fickle venture Feb 25, 2026, 4:18 PM

#

rain bay so it deserves to be removed

Look, these random-name models are secret AI models. They just use a random name to hide it, so arena adds these models and if the company wants to remove one then arena removes it. One of these might be Claude Opus 5 btw. Hopefully I'm giving the right answer

rain bay Feb 25, 2026, 4:18 PM

#

something like “it alway@ generat@ messy @de alongside the @@put”， but in chinese

fickle venture Feb 25, 2026, 4:18 PM

#

rain bay something like “it alway@ generat@ messy @de alongside the @@put”， but in chines...

These are training models so they are still training

#

For example this is a random model name but alot of people on Twitter say it's Gemini 3.1 Nano Banana hopefully it is

abstract tundra Feb 25, 2026, 4:24 PM

#

I can't see grok 4.20 in the model selector

marble otter Feb 25, 2026, 4:27 PM

#

fickle venture What did you expect then?

i expected for retries to not use quadrillion tokens or 90% rate in one retry

spare rune Feb 25, 2026, 4:29 PM

#

abstract tundra I can't see grok 4.20 in the model selector

It’s not aged

#

Added

rocky mauve Feb 25, 2026, 4:29 PM

#

marble otter not even 200 lines of code and few retries and then it just reaches its limit 😭

What do u expect, that’s one of the best coding models atm, costs a lot of money to use it

steep heath Feb 25, 2026, 4:30 PM

#

how do i pass this infinite captcha

#

it just wont let me in lol

quasi gyro Feb 25, 2026, 4:30 PM

#

I need help

abstract tundra Feb 25, 2026, 4:30 PM

#

spare rune It’s not aged

how did it appear on leaderboards

quasi gyro Feb 25, 2026, 4:30 PM

#

I faced with this problem
What should i do?

rocky mauve Feb 25, 2026, 4:32 PM

#

quasi gyro I faced with this problem What should i do?

Wait for the limit to go away

quasi gyro Feb 25, 2026, 4:34 PM

#

rocky mauve Wait for the limit to go away

My problem isn't limit
Is that error:
Smth went wrong

spare rune Feb 25, 2026, 4:35 PM

#

abstract tundra how did it appear on leaderboards

Because it’s a test and it’s probably been a model that was code marked

fickle venture Feb 25, 2026, 4:38 PM

#

abstract tundra I can't see grok 4.20 in the model selector

I think the reason they didn't is because that's a 4-agent model

abstract tundra Feb 25, 2026, 4:38 PM

#

i see

fickle venture Feb 25, 2026, 4:38 PM

#

abstract tundra i see

#

Just like grok heavy

abstract tundra Feb 25, 2026, 4:39 PM

#

I know I've tried it out, but I didn't like it that much, so I was hoping the arena endpoint would be slightly better in some way

fickle venture Feb 25, 2026, 4:40 PM

#

It pretty much won't improve it will just be the same

exotic crest Feb 25, 2026, 4:47 PM

#

did the site just go down?

echo aurora Feb 25, 2026, 4:54 PM

#

quasi gyro My problem isn't limit Is that error: Smth went wrong

You'll want to try the steps in this article - https://help.arena.ai/articles/1645798556-lmarena-how-to-something-went-wrong-with-this-response-error-message

Arena Troubleshooting: Something went wrong with this response... e...

You may sometimes see the error message: “Something went wrong with this response, please try again.”
This is a general error message. It can

echo aurora Feb 25, 2026, 4:54 PM

#

exotic crest did the site just go down?

I don't think so...

#

What are you seeing?

exotic crest Feb 25, 2026, 4:54 PM

#

echo aurora What are you seeing?

its working now, but for about 5 minutes the site stopped loading. my browser was just infinitely loading

quasi gyro Feb 25, 2026, 4:55 PM

#

echo aurora You'll want to try the steps in this article - https://help.arena.ai/articles/16...

If I clear the data, will all my chats be removed?

echo aurora Feb 25, 2026, 4:55 PM

#

quasi gyro If I clear the data, will all my chats be removed?

If you're not logged in, yes.

quasi gyro Feb 25, 2026, 4:56 PM

#

echo aurora If you're not logged in, yes.

I logged in
Ok

echo aurora Feb 25, 2026, 4:56 PM

#

exotic crest its working now, but for about 5 minutes the site stopped loading. my browser wa...

Glad to hear it's working again, keep me updated if things go sideways

echo aurora Feb 25, 2026, 4:56 PM

#

quasi gyro I logged in Ok

Then yeah your chat history will be saved to that account

mortal coyote Feb 25, 2026, 4:58 PM

#

echo aurora Then yeah your chat history will be saved to that account

gemini 3.1 coming soon ? on ARENA ?

uneven glacier Feb 25, 2026, 5:00 PM

#

hlo

echo aurora Feb 25, 2026, 5:01 PM

#

mortal coyote gemini 3.1 coming soon ? on ARENA ?

It's currently available.

stray aspen Feb 25, 2026, 5:03 PM

#

how is grok 4.2 on fourth place

#

it sucks

quasi gyro Feb 25, 2026, 5:04 PM

#

echo aurora Then yeah your chat history will be saved to that account

I did it but still have that issue

uncut wind Feb 25, 2026, 5:05 PM

#

bro does anyone know why opus doesnt work

#

no matter how much i refresh

#

then it tells me ive used my limit try again in 40 minutes

#

after not answering my prompt

echo aurora Feb 25, 2026, 5:06 PM

#

quasi gyro I did it but still have that issue

I'll respond in the thread you have going in #ask-here

quasi gyro Feb 25, 2026, 5:09 PM

#

echo aurora Then yeah your chat history will be saved to that account

Bro i deleted data but still have problem

echo aurora Feb 25, 2026, 5:13 PM

#

quasi gyro Bro i deleted data but still have problem

I've responded in the #ask-here thread.

quasi gyro Feb 25, 2026, 5:16 PM

#

echo aurora I've responded in the #ask-here thread.

Link won't open

echo aurora Feb 25, 2026, 5:18 PM

#

quasi gyro Link won't open

Pinged you in the other channel, I'd like to keep #general open for general conversation and use the other channels for troubleshooting. blobthanks

fickle venture Feb 25, 2026, 5:33 PM

#

stray aspen how is grok 4.2 on fourth place

What did you expect from grok, it always sucks

toxic verge Feb 25, 2026, 5:52 PM

#

How this model is in the top tier list is beyond me

toxic verge Feb 25, 2026, 5:58 PM

#

echo aurora Pinged you in the other channel, I'd like to keep #general open fo...

What’s chat arena ? Have you ever heard of it before ?

proud bobcat Feb 25, 2026, 6:09 PM

#

no 5.3 codex yet

#

https://tenor.com/view/angry-gif-6891886407904187638

Tenor

mighty surge Feb 25, 2026, 6:10 PM

#

proud bobcat no 5.3 codex yet

?

#

its already released like 2 weeks ago

ashen oak Feb 25, 2026, 6:12 PM

#

Is the Gemini -3 pro image review not working on anything just for me today?
It keeps saying something went wrong on anything I sent there ?
Would love some help on that🙏

echo aurora Feb 25, 2026, 6:17 PM

#

ashen oak Is the Gemini -3 pro image review not working on anything just for me today? It...

Would check the response that the bot provided you here: #1476278002626072586 message

ashen oak Feb 25, 2026, 6:17 PM

#

Didn’t help

#

🙏🙏

rare swallow Feb 25, 2026, 6:18 PM

#

ashen oak Is the Gemini -3 pro image review not working on anything just for me today? It...

Yeah, I'm also getting the same problem with Claude opus 4.6

proud bobcat Feb 25, 2026, 6:18 PM

#

mighty surge ?

in arena

#

😭

ashen oak Feb 25, 2026, 6:19 PM

#

rare swallow Feb 25, 2026, 6:19 PM

#

I need Claude, like desperately rn, dumb Gemini deleted my code and Claude was the only model that actually understood the context

ashen oak Feb 25, 2026, 6:20 PM

#

ashen oak

Tried what the bot said , plus , tried switching browsers or accounts didn’t help

rare swallow Feb 25, 2026, 6:20 PM

#

ashen oak Tried what the bot said , plus , tried switching browsers or accounts didn’t hel...

I did too

#

Didn't help

mighty surge Feb 25, 2026, 6:21 PM

#

proud bobcat in arena

oh sh*t mb

fickle venture Feb 25, 2026, 6:29 PM

#

proud bobcat no 5.3 codex yet

You can already use it on CLI agent for free there is limit but it's very far

sage talon Feb 25, 2026, 6:33 PM

#

who will join the group call

proud bobcat Feb 25, 2026, 6:33 PM

#

fickle venture You can already use it on CLI agent for free there is limit but it's very far

i know

#

but

#

just saying

rare swallow Feb 25, 2026, 6:40 PM

#

@echo aurora please fix this

echo aurora Feb 25, 2026, 6:41 PM

#

ashen oak Tried what the bot said , plus , tried switching browsers or accounts didn’t hel...

Unfortunately, if the steps in the article didn't help, it's likely an issue that can't be solved on the user's end. For this model you're likely seeing a rate limit causing this as it's a very popular model.

echo aurora Feb 25, 2026, 6:41 PM

#

rare swallow @echo aurora please fix this

Would encourage you to read this message: #1417174113092374689 message

rare swallow Feb 25, 2026, 6:44 PM

#

echo aurora Would encourage you to read this message: https://discord.com/channels/134055475...

This didn't help, it just told me to do the same thing you always tell us

echo aurora Feb 25, 2026, 6:49 PM

#

rare swallow This didn't help, it just told me to do the same thing you always tell us

Those are the troubleshooting steps. Unfortunately, if they don't work, reporting more detailed information to our team is the best next option.

river orchid Feb 25, 2026, 6:56 PM

#

echo aurora Those are the troubleshooting steps. Unfortunately, if they don't work, reportin...

hello

toxic verge Feb 25, 2026, 6:58 PM

#

The way to prevent it is to break down the requests into new chats

#

Or take breaks and not send as many requests within an hour, intervals. And that’s not even guaranteed.
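A minimal sketch of the "space your requests out" idea, as a client-side sliding-window pacer. The class name and the example limit are hypothetical. Arena's real limits are not stated in this chat, and as noted above, pacing is not guaranteed to avoid the error.

```python
import time
from collections import deque
from typing import Optional


class RequestPacer:
    """Client-side pacing: allow at most max_requests per window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._sent = deque()  # timestamps of recently recorded requests

    def seconds_until_allowed(self, now: Optional[float] = None) -> float:
        """Return 0.0 if a request may be sent now, else the wait in seconds."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self, now: Optional[float] = None) -> None:
        """Remember that a request was just sent."""
        self._sent.append(time.monotonic() if now is None else now)


# Assumed numbers for illustration only: 10 requests per hour.
pacer = RequestPacer(max_requests=10, window_seconds=3600.0)
if pacer.seconds_until_allowed() == 0.0:
    pacer.record()  # send the prompt here
```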

ocean vortex Feb 25, 2026, 7:02 PM

#

echo aurora Unfortunately, if the steps in the article didn't help, it's likely an issue tha...

Yeah I tried it now and got the same result. Would probably help if you forwarded 'You've reached your rate limit...' errors from Google to the user with something like 'Model is at capacity. Please try again'. So they wouldn't waste their time trying different accounts or whatever
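The suggestion above could look roughly like this. It is only a sketch: the function name and the "rate limit" substring check are assumptions, not Arena's actual backend. The two message strings are the ones quoted in this chat, and 429 is the standard HTTP "Too Many Requests" status.

```python
# Hypothetical mapping from an upstream provider failure to a user-facing message.
GENERIC_MESSAGE = "Something went wrong with this response, please try again."
CAPACITY_MESSAGE = "Model is at capacity. Please try again."


def user_facing_message(upstream_status: int, upstream_body: str) -> str:
    """Pick the message shown to the user for a failed upstream call."""
    # Treat HTTP 429 or an explicit rate-limit text as a capacity problem.
    rate_limited = upstream_status == 429 or "rate limit" in upstream_body.lower()
    return CAPACITY_MESSAGE if rate_limited else GENERIC_MESSAGE


print(user_facing_message(429, "You've reached your rate limit..."))
```

With a split like this, users would know that switching accounts or browsers will not help when the provider itself is throttling.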

sick mantle Feb 25, 2026, 7:06 PM

#

echo aurora Those are the troubleshooting steps. Unfortunately, if they don't work, reportin...

🇭 🇮

echo aurora Feb 25, 2026, 7:21 PM

#

ocean vortex Yeah I tried it now and got the same result. Would probably help if you forwarde...

Agreed, there are improvements to be made with these error messages. This is on our to-do.

mystic patio Feb 25, 2026, 7:26 PM

#

hi, when does the arena get updated to reflect the new elo of each model?

#

and who maintains the arena?

toxic verge Feb 25, 2026, 7:31 PM

#

Once you get caught in this cycle you need to log in again and clear your browser data

light sleet Feb 25, 2026, 7:33 PM

#

toxic verge

Bro you might be a robot 😭

surreal zephyr Feb 25, 2026, 7:33 PM

#

echo aurora It's currently available.

When is haiku 3 coming to arena 🤔

#

jokes aside maybe we could get codex spark, or mercury2

light sleet Feb 25, 2026, 7:34 PM

#

Codex 5.3🥺🙏🏻

surreal zephyr Feb 25, 2026, 7:34 PM

#

light sleet Codex 5.3🥺🙏🏻

already is

toxic verge Feb 25, 2026, 7:34 PM

#

light sleet Bro you might be a robot 😭

Once that triggers you’re gonna hit them in every single new message

light sleet Feb 25, 2026, 7:34 PM

#

surreal zephyr already is

In arena?

golden ocean Feb 25, 2026, 7:35 PM

#

im a robot

surreal zephyr Feb 25, 2026, 7:35 PM

#

toxic verge

why not just have gemini solve the captchas for you 🤔

golden ocean Feb 25, 2026, 7:35 PM

#

e

#

why not have @surreal zephyr solve it for you 🤔

surreal zephyr Feb 25, 2026, 7:35 PM

#

golden ocean e

hi japancat

golden ocean Feb 25, 2026, 7:35 PM

#

o/

surreal zephyr Feb 25, 2026, 7:35 PM

#

https://cdn.discordapp.com/attachments/1446560503668146178/1474068789250228337/togif.gif

golden ocean Feb 25, 2026, 7:36 PM

#

turn this into a gif too

queen veldt Feb 25, 2026, 7:38 PM

#

#

Guys I'm building the supercomputer I can't tell you the details

golden ocean Feb 25, 2026, 7:40 PM

#

true

toxic verge Feb 25, 2026, 7:40 PM

#

#

Gemini is not in the top tier list for sure

#

Maybe for images, I could see why. It’s up there in the leaderboard, but it definitely does not deserve to be in the top 5 position.

#

#

toxic verge Feb 25, 2026, 7:44 PM

#

surreal zephyr why not just have gemini solve the captchas for you 🤔

I don’t know if it’s capable of solving anything to be honest with you.

#

I’m really baffled at how high it sits at the leaderboard

surreal zephyr Feb 25, 2026, 7:45 PM

#

toxic verge I don’t know if it’s capable of solving anything to be honest with you.

have you tried 3.1?

#

it makes opus a joke

#

it even beats gpt 5.2

#

https://019c775e-143e-7242-92b4-0d936da78f8c.arena.site/

Interactive Black Hole Simulation

Check out what I built in Arena's Code Arena - Content is user-generated and unverified

toxic verge Feb 25, 2026, 7:46 PM

#

In code?

surreal zephyr Feb 25, 2026, 7:46 PM

#

toxic verge In code?

in thinking

#

in code (execution only) nothing gets close to codex 5.3 lol

#

like not even same leaderboard

toxic verge Feb 25, 2026, 7:47 PM

#

This makes situation even more confusing

surreal zephyr Feb 25, 2026, 7:47 PM

#

codex actually does what you ask for instead of doing whatever it wants

toxic verge Feb 25, 2026, 7:47 PM

#

I think people are looking at the coding as like the ultimate metric

surreal zephyr Feb 25, 2026, 7:47 PM

#

gemini is crazy smart but its lazy and doesnt care what you want it does whatever it prefers

#

opus memorized everything but its literally braindead if you give it a novel problem to solve

cursive bough Feb 25, 2026, 7:47 PM

#

what you guys think is the best ai for coding like html or java

surreal zephyr Feb 25, 2026, 7:47 PM

#

cursive bough what you guys think is the best ai for coding like html or java

if you dont know what you are doing? gemini + codex
if you do, just codex suffices

toxic verge Feb 25, 2026, 7:48 PM

#

This is what makes these benchmarks and the leaderboard confusing

surreal zephyr Feb 25, 2026, 7:49 PM

#

toxic verge This is what makes these benchmarks and the leaderboard confusing

leaderboard is sponsored lol

#

and even then people vote only by looks

toxic verge Feb 25, 2026, 7:49 PM

#

I don’t know what it is, dude

surreal zephyr Feb 25, 2026, 7:49 PM

#

if it doesnt listen but still makes something pretty, people prefer that

toxic verge Feb 25, 2026, 7:50 PM

#

Yeah

surreal zephyr Feb 25, 2026, 7:50 PM

#

5.3 codex wont get too high on leaderboard, because it asks questions

#

lol

toxic verge Feb 25, 2026, 7:50 PM

#

#

surreal zephyr Feb 25, 2026, 7:51 PM

#

undone saffron Feb 25, 2026, 7:51 PM

#

surreal zephyr it makes opus a joke

Are you sure about that?

toxic verge Feb 25, 2026, 7:51 PM

#

But this could be a little biased

wise spindle Feb 25, 2026, 7:51 PM

#

why does nano banana pro keeps saying error

surreal zephyr Feb 25, 2026, 7:51 PM

#

undone saffron Are you sure about that?

opus is one of worst high-end models if you actually test it

#

:)

toxic verge Feb 25, 2026, 7:52 PM

#

#

mystic patio Feb 25, 2026, 7:52 PM

#

opus is really good imo

surreal zephyr Feb 25, 2026, 7:52 PM

#

toxic verge Feb 25, 2026, 7:52 PM

#

toxic verge Feb 25, 2026, 7:52 PM

#

surreal zephyr

Wow

#

Crazy if that’s true

surreal zephyr Feb 25, 2026, 7:52 PM

#

gemini 3.1 and gpt 5.2 both agree opus is sh*t

#

guess the models

#

and this

toxic verge Feb 25, 2026, 7:54 PM

#

Dude, you know what it is

#

It’s probably that the compute

#

They give us much more watered-down versions

surreal zephyr Feb 25, 2026, 7:54 PM

#

toxic verge Dude, you know what it is

guess which one is opus, which one gpt, and which one gemini

surreal zephyr Feb 25, 2026, 7:54 PM

#

toxic verge They give us much more watered-down versions

opus is just braindead if you give it a novel task, and sonnet 4.6 was trained on deepseek 🤣

toxic verge Feb 25, 2026, 7:54 PM

#

Not sure the really messed up one

#

Are u sure?

surreal zephyr Feb 25, 2026, 7:55 PM

#

toxic verge Not sure the really messed up one

the really messed up is glm, but the bottom right white with keyboard going out of screen and logo sticking out of screen is opus

toxic verge Feb 25, 2026, 7:55 PM

#

#

surreal zephyr Feb 25, 2026, 7:55 PM

#

toxic verge

yeah and they got exposed its the other way around

#

toxic verge Feb 25, 2026, 7:56 PM

#

#

undone saffron Feb 25, 2026, 7:56 PM

#

surreal zephyr opus is one of worst high-end models if you actually test it

Don't be so sure about that
For some things, gemini couldn't solve the problems of code, but opus could

surreal zephyr Feb 25, 2026, 7:56 PM

#

undone saffron Don't be so sure about that For some things, gemini couldn't solve the problems ...

heres gemini summing up opus

mystic patio Feb 25, 2026, 7:57 PM

#

i used grok 4.2 to automate the scoring of a psychological test, the MMPI-2

surreal zephyr Feb 25, 2026, 7:57 PM

#

also i had a bug that i spent 2 weeks, 8 hours a day, trying to fix with opus 4.6
gemini found it in 1 prompt

mystic patio Feb 25, 2026, 7:57 PM

#

it was pretty good

toxic verge Feb 25, 2026, 7:57 PM

#

This proves my theory

surreal zephyr Feb 25, 2026, 7:57 PM

#

mystic patio i used grok 4.2 to automate the scoring of a psychological test, the MMPI-2

grok is pretty bad but sometimes gets to levels of 5.2 or opus

hard quiver Feb 25, 2026, 7:57 PM

#

ashen oak

I'm having the same problem anyone else?

surreal zephyr Feb 25, 2026, 7:57 PM

#

mainly when opus or 5.2 mess up

toxic verge Feb 25, 2026, 7:57 PM

#

It creates more speculation than it solves anything

#

It turns out to be a popularity contest more than a capability test

#

Which undervalues the capabilities, because some of these models shine in certain areas

surreal zephyr Feb 25, 2026, 7:58 PM

#

here's an old model vs a 10x more expensive, less-than-a-month-old opus

toxic verge Feb 25, 2026, 7:58 PM

#

Which sadly aren’t measured, and how will we ever know about those if we’re just focused on the standard?

#

I don’t think this is something academia could solve

surreal zephyr Feb 25, 2026, 7:59 PM

#

every time it hears car wash, opus says "drive", even if the scenario is inverted

toxic verge Feb 25, 2026, 7:59 PM

#

It has to be somebody from the bottom with a fresh perspective

surreal zephyr Feb 25, 2026, 7:59 PM

#

same for many other riddles

toxic verge Feb 25, 2026, 7:59 PM

#

A real evaluation test needs to come from the people, like, relatable

#

Like in the real world where people struggle, and with what they struggle lol

#

I mean these benchmarks are cool for enthusiasts and researchers

hard quiver Feb 25, 2026, 7:59 PM

#

Gemini 3 pro-image is not generating anything; the only output is "something went wrong with the response, please try again"

toxic verge Feb 25, 2026, 8:03 PM

#

It would be meaningful if model evaluations captured the everyday experiences of regular people using LLMs, the ones without large platforms or influence, and reflected their real frustrations. There should be a way to measure performance that highlights where models consistently struggle, not just where they excel on benchmarks.

#

And you know, the most ironic thing is that we haven’t heard the term “AGI” being thrown around much lately. 😂

toxic verge Feb 25, 2026, 8:11 PM

#

surreal zephyr same for many other riddles

surreal zephyr Feb 25, 2026, 8:12 PM

#

toxic verge

give original video ill show

toxic verge Feb 25, 2026, 8:12 PM

#

surreal zephyr Feb 25, 2026, 8:13 PM

#

remember gemini 3.1 is like the only good model from google (gemini 3.0 before all nerfs was good too)

toxic verge Feb 25, 2026, 8:14 PM

#

That’s the movie it’s supposed to say

surreal zephyr Feb 25, 2026, 8:14 PM

#

100 views 😭

#

theres no way it was in the training data

#

i doubt it can figure out based on literally snow

toxic verge Feb 25, 2026, 8:20 PM

#

surreal zephyr i doubt it can figure out based on literally snow

#

This is what I’m saying. It’s not like the model is stupid when you nudge it.

#

surreal zephyr Feb 25, 2026, 8:27 PM

#

toxic verge

It guessed based on just the text

toxic verge Feb 25, 2026, 8:27 PM

#

surreal zephyr It guessed based on just the text

Makes sense

surreal zephyr Feb 25, 2026, 8:28 PM

#

toxic verge Makes sense

yeah its THIS good

#

(pure new chat btw)

#

it can guess the movie just based on ww2+ snow

#

snow alone is not enough

toxic verge Feb 25, 2026, 8:28 PM

#

Yeah

surreal zephyr Feb 25, 2026, 8:29 PM

#

surreal zephyr yeah its THIS good

(without the video)

toxic verge Feb 25, 2026, 8:29 PM

#

Well, regardless, this would be an edge case

#

And the world is full of edge cases

surreal zephyr Feb 25, 2026, 8:29 PM

#

toxic verge Yeah

i did a fun experiment with gemini 3.0 a few months ago

#

i had a pic of a house i took by accident

#

no landmarks etc, just a normal house/building

#

i put that into gemini

#

it located it to within 5 metres

#

💀

#

it mightve been on google maps street view but still thats insane

toxic verge Feb 25, 2026, 8:32 PM

#

Im not saying these models are outright stupid

#

I’m just saying, I don’t think that leaderboard accurately reflects. I don’t even know what I’m trying to say.

surreal zephyr Feb 25, 2026, 8:33 PM

#

toxic verge Im not saying these models are outright stupid

actually

#

doesn't even need the image xD

toxic verge Feb 25, 2026, 8:33 PM

#

Actually, this is a good benchmark. We should try it out with other movies.

surreal zephyr Feb 25, 2026, 8:33 PM

#

ai might be closest to "magic" we will ever get tbh

long minnow Feb 25, 2026, 8:34 PM

#

and youtube might be the closest to a "time machine"

toxic verge Feb 25, 2026, 8:34 PM

#

Nawh biology by far is far more mysterious and far more magical

surreal zephyr Feb 25, 2026, 8:35 PM

#

toxic verge Nawh biology by far is far more mysterious and far more magical

mind reading is better

#

and mind reading without sensors is even crazier

toxic verge Feb 25, 2026, 8:35 PM

#

That was great 😂😂😂

toxic verge Feb 25, 2026, 8:36 PM

#

surreal zephyr mind reading is better

All of that stuff is crazy, dude

surreal zephyr Feb 25, 2026, 8:36 PM

#

there are already ais that can read thoughts from mri scans, but good llms can almost do that without any scans or such

toxic verge Feb 25, 2026, 8:36 PM

#

But yeah, I hear what you’re saying. Definitely for sure. It feels like magic.

surreal zephyr Feb 25, 2026, 8:36 PM

#

ww2 + winter snow is NOT enough to guess it

toxic verge Feb 25, 2026, 8:36 PM

#

I don’t think it’s magic. I just think that we’re really predictable.

surreal zephyr Feb 25, 2026, 8:36 PM

#

it guesses by the way a human says it

#

like

#

a human subconsciously phrases it in a way that hints at the specific movie

#

and the llm's weights contain some of those patterns

toxic verge Feb 25, 2026, 8:37 PM

#

We’d have to see behind the scenes to truly know

surreal zephyr Feb 25, 2026, 8:37 PM

#

toxic verge We’d have to see behind the scenes to truly know

its more like

#

it's not that the movie contains winter and ww2

#

it's about why, when asked to describe the movie,

#

the human picked ww2 and winter

#

and not for example a plane

#

or a gun

toxic verge Feb 25, 2026, 8:38 PM

#

You sound like a salesman

surreal zephyr Feb 25, 2026, 8:38 PM

#

toxic verge You sound like a salesman

im js excited

#

ai is underrated

#

like

#

ai is basically a solution to every solvable problem

#

by definition

toxic verge Feb 25, 2026, 8:39 PM

#

O.o

surreal zephyr Feb 25, 2026, 8:39 PM

#

if there is any pattern, ai can be trained to find it

#

if a dog is smart enough to have a conversation

#

you can train ai to translate

#

by definition

#

thats how impressive it is

toxic verge Feb 25, 2026, 8:40 PM

#

Yeah, but in the physical world you have entropy or pure randomness

surreal zephyr Feb 25, 2026, 8:40 PM

#

toxic verge Yeah, but in the physical world you have entropy or pure randomness

everything has randomness

#

but if a dog or a monkey can say (or even think) "give me food" or "i want go outside"

#

you can see that using ai

#

its not really feasible rn to do that but its very much possible

wicked talon Feb 25, 2026, 8:42 PM

#

AI 🙂

toxic verge Feb 25, 2026, 8:42 PM

#

I like the word you used earlier: magic

surreal zephyr Feb 25, 2026, 8:42 PM

#

toxic verge I like the word you used earlier: magic

if ai is not magic, then what is?

wicked talon Feb 25, 2026, 8:42 PM

#

surreal zephyr if ai is not magic, then what is?

It's basically a database, no?

#

It just finds info in a database and pieces it together

surreal zephyr Feb 25, 2026, 8:42 PM

#

wicked talon It's basically a database, no?

ai is a way to solve literally every solvable problem without knowing how

#

you can solve any problem by throwing money and compute at it

#

using ai

wicked talon Feb 25, 2026, 8:43 PM

#

surreal zephyr ai is a way to solve literally every solvable problem without knowing how

What's 0/0 then

surreal zephyr Feb 25, 2026, 8:43 PM

#

wicked talon What's 0/0 then

lmao

#

thats easy

toxic verge Feb 25, 2026, 8:43 PM

#

wicked talon Feb 25, 2026, 8:43 PM

#

Tell then

toxic verge Feb 25, 2026, 8:43 PM

#

#

#

surreal zephyr Feb 25, 2026, 8:44 PM

#

wicked talon Tell then

#

when you get 0/0 in a limit, you use L'Hôpital's rule and simplify
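
For reference, a short worked sketch of what L'Hôpital's rule says. It assumes f and g are differentiable near a, that g'(x) is nonzero near a, and that the limit of the derivative ratio exists. The example function is an illustration, not something from the chat. The rule only applies to limits, so the bare arithmetic expression 0/0 stays undefined.

```latex
% L'Hôpital's rule for the indeterminate form 0/0
\lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0
\quad\Longrightarrow\quad
\lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}

% Illustrative example: numerator and denominator both vanish at x = 1
\lim_{x \to 1} \frac{x^{2} - 1}{x - 1}
  = \lim_{x \to 1} \frac{2x}{1}
  = 2
```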

toxic verge Feb 25, 2026, 8:44 PM

#

I don’t think the AI is gonna be smarter than humans

surreal zephyr Feb 25, 2026, 8:44 PM

#

lol

surreal zephyr Feb 25, 2026, 8:44 PM

#

toxic verge I don’t think the AI is gonna be smarter than humans

define smart

#

pattern recognition? then it already is MUCH smarter

toxic verge Feb 25, 2026, 8:44 PM

#

surreal zephyr define smart

Humans

surreal zephyr Feb 25, 2026, 8:45 PM

#

toxic verge Humans

oh, like causing wars and killing millions because a politician in another country insulted you?

wicked talon Feb 25, 2026, 8:45 PM

#

surreal zephyr

0/0 is undefined

surreal zephyr Feb 25, 2026, 8:45 PM

#

wicked talon 0/0 is undefined

depending on context

wicked talon Feb 25, 2026, 8:45 PM

#

toxic verge I don’t think the AI is gonna be smarter than humans

It won't be in terms of memory storage

#

In terms of remembering things then yes

surreal zephyr Feb 25, 2026, 8:45 PM

#

wicked talon 0/0 is undefined

wicked talon Feb 25, 2026, 8:46 PM

#

surreal zephyr

toxic verge Feb 25, 2026, 8:46 PM

#

wicked talon It won't be in terms of memory storage

So is it gonna be conscious or not

surreal zephyr Feb 25, 2026, 8:46 PM

#

wicked talon

can you read? it depends on context

wicked talon Feb 25, 2026, 8:46 PM

#

toxic verge So is it gonna be conscious or not

AI cannot be conscious

surreal zephyr Feb 25, 2026, 8:46 PM

#

0/0 has multiple meanings

#

same as how "castle" is also a chess move

neon idol Feb 25, 2026, 8:46 PM

#

Hello!

toxic verge Feb 25, 2026, 8:47 PM

#

Hmm

surreal zephyr Feb 25, 2026, 8:47 PM

#

wicked talon AI cannot be conscious

Humans cannot either :)

#

consciousness is a made up thing

#

if it isnt, then define it?