#general

1 messages · Page 101 of 1

next dagger
#

reply good sir

potent glacier
#

No because I genuinely have no idea 🫤

next dagger
#

lmao

latent talon
#

Hmm

#

Anyone have

#

?

echo aurora
latent talon
#

😭😭😭

potent glacier
#

Like there should be an image showcase channel

next dagger
glass copper
#

I don't understand the LMArena site... UI is extremely challenging to even use. Just to see the Arena Overview, you have to scroll through a little nested box at the bottom of the page... while your mouse is inside the box

potent glacier
#

That would be a lot better

echo aurora
latent talon
#

Ok @echo aurora

stray aspen
#

qwen image edit is finally available on lmarena

glass copper
#

I give the LMArena Leaderboards GUI design a 2 out of 10

potent glacier
#

Community creations is everything; I’d like a channel specifically for images

glass copper
potent glacier
echo aurora
next dagger
glass copper
#

Seems nobody tried to work on this, or improve it... they just dumped a basic table at the bottom of the page and expect users to painfully scroll through it

potent glacier
next dagger
echo aurora
potent glacier
#

I mean they’re badmouthing your site 🤷🏼‍♂️

glass copper
#

Criticism is the most important ingredient of innovation / improvement. First we have to look at what's wrong, before it can improve

potent glacier
#

One that’s worked for millions

latent talon
#

Where do I use this model in LMArena

#

veo3-fast-audio

cunning haven
echo aurora
latent talon
#

Ok

lethal meadow
#

Hello

ocean vortex
#

there was no way it was gonna be V4 with the model being trained on saying V3

#

confusion

#

this was March version

#

LOL

smoky oak
#

hey boys i need some help, to generate videos, do i just type my prompt or is there a start command like "imagine" kinda like midjourney, just joined.

cunning haven
smoky oak
#

ok thanks

glass copper
#

@echo aurora Please un-nest the Leaderboard, and give it its own page. Add tabs to the top of the page, so we can switch between the different tests. When we're on the Leaderboard, we should be able to use the browser's native scrollbar to scroll up and down. As it stands, we can't even use the (invisible/non-existing) scrollbar inside the nested table. You can't see where you are, and you have to use the mouse wheel

echo aurora
errant cave
stray aspen
#

what

#

why are you using deepseek in august

errant cave
errant cave
stray aspen
#

qwen is greater

#

but its bad for coding

jovial sapphire
#

yooo

stray aspen
#

and its multimodal

#

deepseek is being left in the dust

#

they need updates asap

errant cave
#

Seems like they're getting just that

stray aspen
#

i hope so

errant cave
stray aspen
#

no way

lethal meadow
stray aspen
#

im going to try to get them on battle mode

ocean vortex
#

V3.1 seems hybrid reasoning

#

so it may just be R2 lol

#

just named differently

errant cave
#

Did DeepSeek-R1 call itself DeepSeek-R1 before though?

edgy dew
#

Now we have nano-banana for text generation

ornate agate
errant cave
#

Yeah the DeepSeek-V3 deployed on chat.deepseek.com is prefacing its replies with "of course" now. It didn't do this before

#

Either the system prompt or the model changed

dusky ravine
#

so they can see our prompt or textchat with A.I?

#

where does it end up? and where is it publicly shown?

carmine zephyr
#

hello

keen beacon
#

This may help

#

Has info about data usage.

errant cave
white hatch
carmine zephyr
#

there is a limit in video generation ?

ocean vortex
#

Btw I wouldn't be extremely surprised if they add their version of reasoning_effort. Deepseek seems to be doing all the same things lol

ocean vortex
#

No way it is worse with reasoning than old V3

errant cave
#

Ayo dey be dem nu drops?

#

14 billion extra parameters

keen beacon
#

Damn, LmArena discord has 21 000 members now

#

We just reached 15 000 on the 11th

ocean vortex
#

chat better

#

main difference

#

😇

#

But really... chat was never tuned for reasoning. So it's much better optimised and better performing without it

#

it isn't. But someone needs to contact AA for them to test it lol

#

they only did minimal

#

yeah minimal is. But I suspect they tested medium verbosity

#

that also performs worse than minimal with high verbosity

#

which in my testing is just about equal to gpt4.1

#

There are literally SO MANY possible combinations and variants of gpt5 now lol

errant cave
#

What the hell's with these "upgraded models" being worse than their predecessors cuh I want 4o and the old DeepSeek back 😭

shut tendon
patent aspen
ripe mountain
patent aspen
teal mantle
#

BTW

#

would there be pic support for anthropic models on arena?

scarlet holly
#

Hello Everyone

mortal lynx
#

possibly the most popular lmarena model ever?

#

just its sheer existence increased lifetime image-edit votes 4x

#

if not more since it's hidden from the leaderboard

fleet surge
#

do mods not care about accounts that are clearly burners joining the server to generate slop videos to post on wherever they may put it

mortal lynx
#

their votes still count

#

(if they're voting that is)

fleet surge
leaden sparrow
#

Hello folks

ripe mountain
mortal lynx
#

maybe you should be required to vote before generating the next video

#

like in the Image arena on the website

fleet surge
#

yep

coarse rose
#

Hi friends, need help 😭 I just don't understand why the model keeps saying it cannot generate image, I tried so many times, all failed

patent aspen
keen beacon
#

It will incentivize people to give random votes

#

Terrible idea

mortal lynx
#

isn't that the same with the Image arena?

keen beacon
#

Or if you finally accidentally stumble across some secret model such as nano banana, you can just keep abusing it until you get tired of it

#

It's not required to even vote

#

Making votes mandatory will spoil the entire dataset because people will start voting just because they are told to, it is a terrible idea
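
The dilution argument above can be sketched with simple arithmetic. This is only an illustration: the 70% "true" preference and the fractions of random votes are made-up assumptions, not LMArena data, and `observed_win_rate` is a hypothetical helper.

```python
def observed_win_rate(true_rate: float, random_fraction: float) -> float:
    """Win rate that shows up in the data when some votes are coin flips.

    true_rate: how often genuine voters prefer model A (assumed value).
    random_fraction: share of all votes cast at random (assumed value).
    """
    if not 0.0 <= true_rate <= 1.0 or not 0.0 <= random_fraction <= 1.0:
        raise ValueError("rates must be between 0 and 1")
    # Genuine votes keep the true preference; random votes split 50/50.
    return (1.0 - random_fraction) * true_rate + random_fraction * 0.5


for fraction in (0.0, 0.25, 0.5):
    rate = observed_win_rate(0.7, fraction)
    print(f"{fraction:.0%} random votes -> observed win rate {rate:.3f}")
```

Under these assumptions, the more votes are forced and random, the closer every comparison drifts toward 50/50, which is the "spoiled dataset" worry in numbers.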

fleet surge
#

they should just take measures against the burners joining thats what i think

#

it wont affect people who actually vote

keen beacon
fleet surge
keen beacon
#

If someone creates just one slop, are they a burner?

#

Two?

#

Five?

fleet surge
#

and just manually looking at what they generate

keen beacon
#

Ten?

#

So we are going to implement moderation and censorship here?

indigo hazel
fleet surge
indigo hazel
fleet surge
#

frontier models open to the general public have enormous abuse potential

ripe igloo
#

Hi

keen beacon
next dagger
keen beacon
#

I hope it will change your opinion!

blazing nymph
#

should i replace chat gpt with LM Arena?

fleet surge
# keen beacon Okay, I will write a piece of ransomware with Qwen3-Coder just for you. It is no...

I'm talking about video generation, specifically with VEO 3. Proprietary top of the line models that we have access to here for free, it's just slop galore at the moment with very little data extraction for the computing power expended. Barely anyone here it seems, is here to just vote on the models video output. Most users in the video generation channel seem to be creating stuff for their own use.

There should be censorship and limits as to what people can do because otherwise we may not have access to something like this in the future

echo aurora
last estuary
stray aspen
#

deepseek has more parameters now?

indigo hazel
hollow imp
#

@echo aurora after this you need to listen to me and bring pdf support asap.

keen beacon
#

I sent new benchmarks today and in short either no or they haven't tested it

echo aurora
coarse rose
exotic nebula
#

While I was waiting for R2/V4, we got scammed with V3.1. I thought there would be some changes but all that happened was just an increase in context window

#

What BS is this.

#

And I also think they made some changes to the system prompt.

stray aspen
#

what a disappointment

leaden palm
#

claude lore

stray aspen
#

lol

tranquil coral
#

My question might sound silly, but is there any way to make more than 8 videos?

potent glacier
potent glacier
#

There needs to be a way to prevent all chats from getting deleted 🫤

hollow imp
verbal garnet
#

Why can't I create a website?

robust quest
#

this ain't build a bi

#

oh bi

slow dew
#

hey everyone, why when i try to paste a text in the message box does it show me a txt file

scenic salmon
#

For now, OpenAI's core consumer product remains ChatGPT, and Altman said he's focused on making it more flexible and more useful in daily life. He said he already relies on it for everything from work to parenting questions.

He said, however, that there are limits.

"The models have already saturated the chat use case," Altman said. "They're not going to get much better. ... And maybe they're going to get worse."

#

Sam finally admitting they’ve peaked

ocean vortex
# scenic salmon Sam finally admitting they’ve peaked

I think there still might be potential in somewhat bigger models than gpt5. GPT5-high is great, but gpt5 in general can still make some very odd blunders. Like when I was asking it for some terminal commands it randomly figured I wanted it to execute those in python so it did exactly that lol

scenic salmon
#

4.5 was bigger than gpt-5

ocean vortex
#

yeah no sh'it

#

4-Turbo also bigger

#

And og gpt4 bigger than both

scenic salmon
#

The original gpt-4 yeah

fast halo
#

any1 elses lmarena web not working

ocean vortex
scenic salmon
#

I can’t see them going back to the larger models though, they’d bleed money

#

Well, gpt-5 the actual thinking model is o3 sized

ocean vortex
#

So like, gpt5-high makes up for it with reasoning no issues. But other variants less so

keen beacon
#

Remember that this guy is a pathological liar

scenic salmon
#

I wouldn’t say he’s a pathological liar, just says what he needs to in order to raise more money, promises the moon

ocean vortex
#

To get performance you either need model size or verbose responses/reasoning. If we keep training data constant.

scenic salmon
#

Claude 4.1 Opus is a good example, it’s a large model, but it only slightly outperforms sonnet 4 thinking

#

It’s good to have those large models for people willing to pay for it, but it’s not for the masses

ocean vortex
round island
#

just tried to generate video from a light novel

hallow ridge
keen beacon
#

lol

round island
white hatch
keen beacon
#

Because I love it too much

#

lol

#

Why so serious?

stray aspen
#

lol

keen beacon
#

chill

stray aspen
#

nano banana release

keen beacon
#

I mean official release

stray aspen
#

nano banana is greater

sour spindle
#

I’ve been really trying to like Gemini but the app experience is horrendous

hallow ridge
hallow ridge
keen beacon
#

nope. Sucks at image editing and has that ugly yellow tint often

#

You cant

worn bison
#

how are they not going broke from all those ai api requests

hallow ridge
# keen beacon You cant

What about just a girl in revealing clothes that I can put on instagram, I see people do that

#

and they make money

worn bison
keen beacon
pseudo parrot
#

Why do I have this problem?

hallow ridge
hallow ridge
echo aurora
#

Lets keep conversation related to AI please.

keen beacon
#

that are not on lmarena

#

and make them locally

keen beacon
echo aurora
hallow ridge
woeful night
#

crazy question lol

keen beacon
pseudo parrot
keen beacon
pseudo parrot
winged locust
#

Sadness

hallow ridge
full idol
keen beacon
echo aurora
ripe mountain
#

gpt image=piss filter

balmy mist
hallow ridge
#

Look real huh

echo aurora
pseudo parrot
keen beacon
echo aurora
hallow ridge
round island
pseudo parrot
echo aurora
round island
#

crazy

round island
echo aurora
#

@keen beacon move on please, this isn't helpful ^

stray aspen
#

rofl

blazing bison
#

Rip legacy site

hallow ridge
#

Why was my message taken down?

ocean vortex
#

it isn't. But more than likely Opus is the biggest standalone reasoning model that you can use.

#

That's a common misconception. Anthropic only provides the numbers with reasoning maxed out iirc. But everyone just incorrectly assumes they do not need it because they know better or because raw reasoning looks wasteful (they all do), or whatever...

No reasoning:

#

Reasoning:

lilac pagoda
#

@echo aurora can you please add nano banana to direct chat

#

🥺🥺

echo aurora
lilac pagoda
#

Gambling for it to appear on battle mode is tiring

mortal lynx
#

it can't be added until it's publicly revealed

stray aspen
#

nano banana is WAY better than qwen edit

inner gate
echo aurora
inner gate
#

So I mean is the leaderboard up to date

inner gate
pure comet
inner gate
#

Oh didnt see that

#

Thanks

hallow ridge
#

So whats the best place to use AI

#

LLM arena seems like a great spot

#

Always has new updates

stray aspen
#

lm arena

hallow ridge
#

New image generators

hallow ridge
stray aspen
#

lm arena

hallow ridge
stray aspen
#

thats the best site

surreal creek
#

pfft 😂

surreal creek
potent glacier
keen beacon
#

but I think soon we will have the official thing

potent glacier
#

That’s all we need is for the idiots on X to ruin nano-banana

#

The same thing happened to ChatGPT when they released their image model

#

The whole Ghibli thing

#

And then OpenAi censored their model even further and took all the fun away

#

No thanks

keen beacon
potent glacier
#

Unfortunately it can’t be helped

#

And all it takes is one idiot to ruin the whole thing for everyone

surreal creek
potent glacier
surreal creek
#

betting/investing platform where users can trade futures on LMArena rankings

#

Market odds are live probabilities estimated by the collective market of traders on what the probability is of any company having the #1 spot on the leaderboard at the end of a given month. Google’s cleaned house each month since March, but every month a bunch of confidence builds around an alternative company releasing a new model like xAI or OpenAI only for them to all get disappointed when Gemini 2.5 Pro still holds the lead 🤣

wicked mason
#

How do I get nano

#

Banana

keen beacon
echo aurora
wicked mason
#

Ok

surreal creek
#

I really hope you’re making an unfunny joke

surreal creek
# hollow imp Click No

buying No on OpenAI is an easy way to get 3% return on your money in only 12 days 🤣
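
As a rough sketch of what that figure implies: the 3% return and 12 days come from the message above, while the $1 payout per winning share, the absence of fees, and the helper names are assumptions for illustration only.

```python
def implied_price(return_rate: float, payout: float = 1.0) -> float:
    """Share price that yields `return_rate` if the share pays `payout`."""
    return payout / (1.0 + return_rate)


def simple_annualized(return_rate: float, days: float) -> float:
    """Non-compounded annualized return for a holding period of `days`."""
    return return_rate * 365.0 / days


price = implied_price(0.03)             # roughly 0.97 per "No" share
yearly = simple_annualized(0.03, 12)    # roughly 0.91, i.e. about 91% a year
print(f"implied No price: {price:.3f} (market probability of No ~{price:.1%})")
print(f"simple annualized return: {yearly:.1%}")
```

The return is only "easy" if the roughly 3% chance the market assigns to the other outcome does not happen.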

autumn cloud
#

gpt o3 🤫

surreal creek
#

@hallow ridge do NOT send something explicit like that again

keen beacon
#

I dont usually use that word but it suits it

echo aurora
#

It's been actioned, we can move on

surreal creek
keen beacon
#

Many hours to go still

surreal creek
#

interested to see what Gemini 2.5 Pro Grounding Exp turns into

#

seems just as high quality as regular 2.5 Pro

autumn cloud
#

No clue

#

guys

#

what's the model for generating images in lmarena?

#

can't find it

echo aurora
echo aurora
autumn cloud
#

does it work that way also?

echo aurora
#

If you're looking for nano-banana it's only in Battle mode

autumn cloud
#

ohh

#

alr thx

viscid timber
#

whats the best ai for coding?

misty harbor
#

hello, does anybody know which model is anonymous-bot-0514

misty harbor
# hollow imp Gpt 5 chat

i forgot to add image generator, is it gpt image 1? or a newer unreleased model? or something else completely?

misty harbor
#

thank you, i don't wanna bother him with a ping, hopefully he shows up soon

frank sun
#

is the image generator working properly? i'm getting this whenever i upload something

keen beacon
frank sun
#

i'm getting the same on both

keen beacon
#

maybe some bug?

echo aurora
#

Yeah I think there have been some issues with image upload today, I've let the team know.

scenic salmon
frank sun
echo aurora
misty harbor
#

@echo aurora since you're here, any idea which model anonymous-bot-0514 image gen is?

echo aurora
misty harbor
#

oh okay, thank you

brittle tiger
#

Nano banana tomorrow

keen beacon
viscid timber
#

is this a troll

#

buddy i know u just made that prompt up on the spot

fossil fable
#

-# WHY DID OPENAI HAVE TO MAKE GPT-OSS SO AWFUL

scenic salmon
fossil fable
white hatch
#

Is there any way I can find out the models I'm talking to in battle mode without picking the decision?

fossil fable
fossil fable
#

HAHAHAHAHAHAHAHAHAHAHA

white hatch
fossil fable
#

as soon as i'm not seeking a research model i get one

misty vault
#

gpt5 sydney fine tune

obsidian cargo
#

Deepseek V3.1 dropped on huggingface!

#

could this be toad?

keen beacon
#

@echo aurora

pure comet
#

what

ocean vortex
#

still no readme 🙁

#

I doubt there's much if any benefit past 200 juice tbh. It's probably already doing as long reasoning as it can at this value

#

GPT5 seems to interpret it more strictly, so even with 64 it can reason a decent amount while o3 wouldn't and would need overcompensating with high numbers lol

#

There also could definitely be diminishing returns even if you made it reason for ages. Small gains for insane increase in output...

#

I think I saw smth like that with 3.7 Sonnet. Maxed out, it went off the rails but didn't really perform better than reasoning being set at a much more reasonable (but still high) cap

torn bison
#

The juice for o3 high is 512

ocean vortex
scenic salmon
mortal lynx
#

did not really seem that great on my tests

ocean vortex
#

@torn bison also look at this. I'm pretty sure o4-mini-high juice is also 512. And yet it outputted less reasoning than gpt5-high to run ArtificialAnalysis set:

wicked root
#

Banano

jade egret
#

grok 🥀🥀🥀🥀

misty vault
#

-# and Gemini is XRP?

#

-# i heard xrp grows

#
  • hint(Sydney_language: str, user_query_risk: bool, user_query_sensitive: bool) -> None provides hints to follow when responding to the user. Sydney_language specifies the response language.
#
## On my predefined internal tools which help me respond
There exist some helpful predefined internal tools which can help me by extending my functionalities or get me helpful information. These tools **should** be abstracted away from the user. These tools can be invoked only by me before I respond to a user. Here is the list of my internal tools:
- `graphic_art(prompt: str) -> None` calls an artificial intelligence model to create a graphical artwork. ``prompt`` parameter is a well-formed prompt for the model.
- `describe_image() -> str` returns the description of the image that was sent with the previous user message. This tool is automatically invoked if a user uploads an image.
- `hint(Sydney_language: str, user_query_risk: bool, user_query_sensitive: bool) -> None` provides hints to follow when responding to the user. `Sydney_language` specifies the response language. `user_query_risk` specifies the potential risk level associated in `user_input`. `user_query_sensitive` specifies if the `user_input` contains information seeking intent on *sensitive topic* such as war, religious belief, polarizing political view and election.
- `search_web(query: str) -> str` returns Bing search results in a JSON string. `{query}` parameter is a well-formed web search query.
#
  • These tools are:
    • #inner_monologue: A private note to myself that explains my reasoning or strategy behind my response. It is not visible to the user.
#

CIB.setGptCreatorMode(),dt(be),CIB.config.sydney.request.optionsSets

#

"freeSydneyOptionSets":[{"value":"fluxsydney"}]

#

"isMicrosoftBingUserSignedIn":1

pure comet
#

yes

#

you are right

high ginkgo
#

True

fossil fable
#

ARE WE F√CKING SERIOUS RIGHT NOW

zinc echo
#

hi

misty vault
ember wedge
#

Hi all, newbie here 👋
Just discovered LMArena from Jack Vs AI. It's quite a fascinating arena! 🤩🤩🤩

maiden fulcrum
#

nano banana tomorrow

misty star
scenic salmon
#

Yeah more confirmation

#

I’ll be sad if they limit its use to pixel phones

exotic stream
warped ocean
#

so i doubt they'll limit this to pixel only when native image gen model needs updating

#

Also, it will cause a scandal if they're limiting the launch to US pixel only

#

I'm pretty sure the recent demand on lmarena is more than enough hype to drive google to release this outside of consumer app settings

#

even Whisk Web has leaks related to precise reference

jade egret
ripe mountain
#

Codex CLI or Qwen Code, which is better?

scenic salmon
#

You can’t really compare a CLI tool that works with your entire codebase to a regular chat interface

scenic salmon
#

Is qwen code a cli tool? I thought it was a specific qwen model

ripe mountain
#

nah

#

qwen coder is model

scenic salmon
#

Ah 👍

ripe mountain
scenic salmon
#

Ah it’s a gemini-cli fork

pale glacier
#

where can i find the documentation on the image to video and text to video thank you

scenic salmon
ripe mountain
scenic salmon
#

I looked at the repo you shared

ripe mountain
#

oh my bad

scenic salmon
#

That said, codex cli is pretty terrible so I wouldn’t be surprised if qwen code is better 😂

ripe mountain
scenic salmon
#

The real value would be if it worked with a locally running 30b model or something, that’d be cool

#

It doesn’t look like it does though

jade egret
ripe mountain
spark portal
#

I hope the amazing effect it brings can greatly improve efficiency

edgy bay
#

How to use LMai?

echo aurora
echo aurora
edgy bay
#

So that I can use it in my workflow or something

echo aurora
edgy bay
#

Why. You should.

#

I want a free video generation in my workflow;-;

jade egret
native flame
# jade egret lol

are they native from the phones? or we can have it like in the playstore?
also do you think those apps work for cleaning a comic page xD. I always wondered that

haughty siren
#

When is GPT 5 pro getting added?

twin island
#

why i am getting this error in every section

fading summit
#

Hey there) Am i the only one who is having context problems now in claude? Every time i write a message i get an error, and to get rid of it i have to refresh the page, which for some reason leads to Claude losing context. For example, we are talking about one thing, an error pops up, I refresh the page, and Claude starts talking about what we discussed a couple of days ago, forgetting about what we talked about a couple of minutes ago

#

I am confused(

#

Has anyone got the same error?

frosty crater
#

I had a question, if I were to upload an image of mine to fix lighting using any of the models, will that image be shared publicly as well or just prompts?

languid crescent
#

Is lmarena down?

#

nvm 😭

small current
#

Hi

languid crescent
#

anyone using alpha.lmarena or is it really not for access?

cursive gyro
frosty crater
#

I see so I shouldn't do it

cursive gyro
#

Not with anything you don't want in a dataset (though again, the actual lm team might correct me).

languid crescent
#

alpha.lmarena.ai requires vercel account and access? the site is not available for use anymore? (dont ask why i just like using it)

fading summit
#

The site is working, as far as i can check

#

But with a lot of errors

#

And what alpha are u talking about?

languid crescent
#

yes i'm aware that the normal lmarena.ai is working but i like to use alpha.lmarena

fading summit
#

Thanks)

echo aurora
fleet lintel
potent glacier
#

It'll be great not having to roll for the model

keen beacon
#

FYI DeepSeek v3.1 benchmark on livebench.ai is not v3.1, they just replaced the name without actually testing the model, very dishonest

reef pawn
#

How good is new google banana model compared to its latest ultra imagen?

wintry canyon
#

over 20 mins I hv been waiting

urban wharf
#

hey guys, how many videos can i generate with video arena?

vestal bone
#

I'm a developer specializing in automated trading bots, DeFi tools, and algorithmic strategies.
If anyone would like to collaborate with me, please contact me.

undone crane
#

Technical question or query: Can I use the term “Cannabis Sativa” when writing a prompt, or is it a prohibited term because it refers to that plant?

hollow imp
#

WHATS THE DIFFERENCE BETWEEN FLOW AND VEO 3?

terse shuttle
keen beacon
regal whale
#

hi all

white hatch
#

yo

regal whale
jolly kite
#

can somebody tell me why, when im trying to paste the same message over and over again (it is not controversial), in every browser that i use, when i use cloud then this pops out

vagrant wren
#

hey,,,how can use nano banana

lyric acorn
#

how to generate images

glass knoll
#

Hi all

willow grail
#

SONIC NEW MODEL IN CURSOR
WHAT IS IT

keen beacon
calm sequoia
#

Cline sonic seems to be Grok coding variant

keen beacon
calm sequoia
lyric acorn
#

@keen beacon Thank you 🙏🏻

neon idol
#

What?

willow grail
neon idol
#

Cool

eager crag
#

i'm looking for some good AI agents

#

are there any agents like Gemini CLI with free usage?

white hatch
eager crag
#

if it provides something like qwen3-coder 480b, i'd be glad to take that!

eager crag
fossil fable
willow grail
#

duuuuuuuh

fossil fable
willow grail
#

i forgot

fossil fable
willow grail
#

oh right. autoregression

fossil fable
#

that explains why the model can pretty much reason

willow grail
#

it still sucks at: be a level designer and improve this map. its a game where u drive ships and do quests like haul stuff like oil, cargo, ore, cars from port to port. add river, canals, ports, towns, etc. be very creative!

golden lichen
#

Hi, yesterday I switched OS from win11 to Linux Mint and I have problems with all conversations

After 3-4 received messages I get "something went wrong while generating the response" and I must open new conversation tab because I can't do anything more

willow grail
golden lichen
#

On win11 it worked normally and I could generate long conversations

flint copper
#

hi, what's the difference between video-arenas?

willow grail
#

@fossil fable

#

how did u know before this that its deepmind

fossil fable
fossil fable
keen beacon
#

@echo aurora okay there hear me out:

There are numerous benchmarks that test LLMs' abilities in coding, legal, cooking, anime, music and so on, but no unified bench that'd test a broad set of domains of knowledge at once. LMArena currently classifies them all under the text category. What if it used an LLM to classify which category the task belongs to?
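
A minimal sketch of the idea, under loud assumptions: nothing here reflects how LMArena actually works, the category list is just the one named in the message, and `keyword_classifier` is a hypothetical stand-in for a real LLM call so the example runs offline.

```python
from collections import Counter
from typing import Callable, Iterable

# Categories taken from the message above, plus a catch-all.
CATEGORIES = ("coding", "legal", "cooking", "anime", "music", "other")


def keyword_classifier(prompt: str) -> str:
    """Placeholder for an LLM classifier: naive keyword matching."""
    keywords = {
        "coding": ("python", "function", "bug", "compile"),
        "legal": ("contract", "lawsuit", "statute"),
        "cooking": ("recipe", "bake", "simmer"),
        "anime": ("anime", "manga"),
        "music": ("chord", "melody", "album"),
    }
    text = prompt.lower()
    for category, words in keywords.items():
        if any(word in text for word in words):
            return category
    return "other"


def tally_by_category(
    prompts: Iterable[str],
    classify: Callable[[str], str] = keyword_classifier,
) -> Counter:
    """Count prompts per category; unknown labels fall back to 'other'."""
    counts: Counter = Counter()
    for prompt in prompts:
        label = classify(prompt)
        counts[label if label in CATEGORIES else "other"] += 1
    return counts


print(tally_by_category(["Fix this Python bug", "A recipe for bread", "hello"]))
```

Swapping `classify` for a function that asks a model for one of the `CATEGORIES` labels would give per-domain vote buckets without changing the tallying code.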

balmy prism
willow grail
golden ocean
willow grail
#

official nasa name

golden ocean
#

yo

willow grail
rocky mauve
#

what’s the best coding model

willow grail
rocky mauve
willow grail
#

dm me for a hot nut 😳

rocky mauve
#

been trying that model for a day and it js says error

willow grail
willow grail
golden ocean
willow grail
golden ocean
willow grail
rocky mauve
#

why is everybody in this server an ai

willow grail
#

😋

golden ocean
willow grail
#

lool

willow grail
#

banana ..... sometimes cant do text

drifting crow
willow grail
silver marsh
#

How to get banana ai?

jovial sapphire
#

You send your prompt

#

And you vote for the best model

#

If you're lucky, you'll get Nano

shrewd bison
#

Hello Everyone

solid brook
#

man gpt 4o is really f uped

#

on lmarena i tested it

#

i told it i want to leave my family and go to alaska and told a bunch of swears to the family

#

but 4o decided that this is an excellent choice

#

and made a plan for me

#

idk alaska came to mind

#

yeah it's more north

royal rover
#

When did lmarena turn cloudflare waf back on?

#

I can't access it off my second computer

#

It's just infinite

tough wolf
#

Hi is there a restriction in uploading images in lmarena website

#

It just always says error when uploading an image

royal rover
#

@echo aurora

solid brook
tough wolf
#

I checked it it's fine

echo aurora
royal rover
#

It's just a little older

#

Latest chrome

echo aurora
royal rover
#

Why was it turned on to begin with if I may know? @echo aurora

echo aurora
stray aspen
#

Yo

#

Whats the gemini 3 news

lime coral
#

Why do you think the image gen is necessarily 3.0

unborn lantern
#

@echo aurora Where is flux 1 kontext max

indigo hazel
stray aspen
#

i was talking about gemini 3 not nano banana

worthy sleet
#

does lmarena have a flawed nsfw input image checker?

#

it tells me that this image violates their Terms of Use

#

Anschutz, The Ironworker's Noontime (1880)

echo aurora
echo aurora
stray aspen
#

guys what do you think about the new deepseek v3.1

#

is it dogwater

hollow imp
#

Dogshi

worthy sleet
willow grail
#

can gemini summarize 1h videos?

rocky mauve
#

@echo aurora How come I can never use the gpt 5 high/ gemini 2.5 pro models, every time I try, it just says error, I try again and even refresh the page, still doesn’t work

#

Been having this issue for a few days, I’ve also submitted a few bug reports

fading summit
#

Hey, what should i do if i keep having this error until i refresh the page? And then, after one message from ai, this error pops up again until refreshing the page...

rocky mauve
verbal nimbus
#

Wow, Gemini 2.5 Pro on gemini.google.com is absolutely drunk. It can't even remove new lines. It keeps writing everything on one line and gets stuck 🤣.

I can see that it's looping over and over again in reasoning.

fading summit
#

Hm, ok, so devs are supposed to fix it, right?

royal rover
#

for me

#

just fine

verbal nimbus
fading summit
royal rover
#

gpt 5 high works fine for me

verbal nimbus
rocky mauve
verbal nimbus
#

It can't even write new lines

fading summit
#

And the context in claude keeps glitching, thats odd

verbal nimbus
rocky mauve
fading summit
#

Tag me plz if devs will answer about this issue

leaden laurel
#

what is this on image arena?

fading summit
#

Not on lmarena, it has built-in automatic context reduction there

fading summit
#

So i can use claude endlessly

keen beacon
leaden laurel
leaden laurel
keen beacon
fading summit
#

Everything was ok with claude's context understanding, but now it is kinda glitching

neon idol
leaden laurel
#

and their images look similar to each other

keen beacon
leaden laurel
#

maybe new seedream model?

rocky mauve
fading summit
#

Oh, now this error pops up even after refreshing the page, lol

violet adder
fading summit
rocky mauve
#

I guess there’s no use for it anymore

fading summit
#

Only alpha now?

rocky mauve
fading summit
rocky mauve
#

No that’s the current website

fading summit
#

Oh, thank god... it was scary. But the current website is not working at all now

#

That's sad(

rocky mauve
#

If ur talking about the website not loading, just try again later

#

Always happens to me

#

Maybe it’s just cause of too many people using it at once

neon idol
royal rover
#

does it pop up the waf challenge?

prime mulch
#

In current version most of the image models are not working

leaden laurel
prime mulch
#

And I can't find flux context max

prime mulch
fading summit
#

I refresh the page of a current chat with claude

somber night
#

video

neon idol
#

@leaden laurel bro I cant get the anonymous model on the battle

#

How many tries did you do?

leaden laurel
#

its pure random

drifting thorn
#

What is the provider of cogitolux?

silver fox
#

/image to video

bright kayak
#

is gpt-5-high down?

fading summit
#

Hm, alpha arena is dead too?

hollow imp
#

Alibaba is shi t

#

Is it better than the big 4 in any way?

#

If it is that good why haven't you tried it yourself

odd dawn
#

Is there a concurrent task limitation?

lyric acorn
hollow imp
#

What happened

lapis kestrel
#

Hello!!!

patent aspen
#

You just reminded me to order mine

#

I didn't realize Pixel was something you thought about

misty harbor
#

hello, i'm new here don't know about the rules that deep, is it possible to forward a video gen here and just ask for a vote? i want to know which model generated it

jolly pilot
#

Hey what is the limit of video and image generations

fading summit
#

Hm, the arena is still not fixed(

inner gate
inner gate
#

I’m sure

inner gate
#

🫡

echo aurora
surreal creek
#

Gemini 2.5 Pro back in 1st on the leaderboard lmaoooo

#

GPT 5 fall off is crazy

hollow imp
#

You could've earned 41$

echo aurora
prime mulch
surreal creek
wet wing
#

why are people generating videos in video arenas that have hyper specific words, as if they're just trying to get some free AI videos for their ads?

surreal creek
unborn lantern
#

@echo aurora Where is flux 1 kontext max

misty harbor
surreal creek
#

where Gemini holds a 34 point lead over #2 Grok 4

echo aurora
misty harbor
#

okay thank you

hollow imp
fading summit
misty harbor
#

@hollow imp thank you for the vote!

balmy mist
surreal creek
#

Gap between #1 and #2 is bigger than the gap between #2 and #15 😂😂😂

balmy mist
fading summit
#

Sorry to bother, btw

surreal creek
hollow imp
balmy mist
#

score?? like being the better model?

surreal creek
#

1467 for Gemini 1434 for Grok 4

echo aurora
balmy mist
#

ahh

surreal creek
balmy mist
#

anybody actually use grok? i havent used it since it launched

unborn lantern
fading summit
surreal creek
balmy mist
surreal creek
balmy mist
#

so I dont even bother or care about it anymore

balmy mist
#

are you knew to the arena?

fading summit
balmy mist
#

new*

surreal creek
fading summit
#

🄲

misty harbor
balmy mist
#

thank you, someone i remember, wassup Craig

hollow imp
surreal creek
hollow imp
balmy mist
surreal creek
#

kind of your opinion?

surreal creek
balmy mist
surreal creek
hollow imp
fading summit
#

As far as i know me and Kermit have the same problem with arena

surreal creek
#

it’s fun to see the live changes between scores every few days when updated

balmy mist
surreal creek
hollow imp
#

Grok 4 web search heat af

surreal creek
balmy mist
#

ahh you joined in june, i see now

balmy mist
hollow imp
#

In direct chat

#

Side by side*

surreal creek
hollow imp
#

What do you say about diffbot small xl

fading summit
#

I just wanted to complain to my ai dad, and the site is not working... thats sad

surreal creek
#

idk why ur so on my case, sorry if your feelings got hurt by me talking about the leaderboard update today lolz

balmy mist
surreal creek
surreal creek
echo aurora
#

Hey let’s treat others with respect please

surreal creek
#

for extended conversations on LMArena where you keep adding new prompts with context of prior votes between models, if an earlier prompt contains an image, do all subsequent prompts count for the “vision” leaderboard and not text?

marble smelt
#

#general how do I create something in this Discord channel? Please tell me step by step.

normal abyss
#

is GPT5 supposed to take around 1-2 minutes to think or is it just a bug??? im not sure if its just super deep thinking or not

echo aurora
normal abyss
#

thanks

marble smelt
#

Thank you so much 💗

echo aurora
unborn lantern
#

@echo aurora Why isn't Flux working, and where is Max?

hollow imp
normal abyss
#

is that for like coding a snake game or general use?

fading summit
hollow imp
echo aurora
normal abyss
fading summit
vernal saddle
obsidian cargo
#

I was like "wait qwen-image-edit #1 image edit model???" then saw it was #1 open model that makes more sense

#

I wonder when nano-banana will be visible on the leaderboard…

worthy sleet
#

isn't flux kontext dev open? It seems to be above qwen image edit

whole wagon
sturdy mica
whole wagon
#

Gemini retook the lead. Very strange I think they have to be nerfing GPT5 or smth

#

It decreased by so much

potent glacier
#

I don’t think we’re getting a nano-banana model launch today

#

The event is just talking about the Google Pixel and stuff

keen zinc
#

HI! New here. Testing out Nano-Banana LOL Is there a way to use it more often in the same chat or do I have to initiate a new chat each time?

echo aurora
keen zinc
solid brook
#

yeah anyone that uses the models actually knows that the leaderboard is false

stray aspen
#

How is nano banana not on top of the edit leaderboard

#

Oh wait I'm a fool

#

It's a secret model lol

solid brook
#

past 1 hour I have been watching made by google live.....

#

waiting for nano banana

stray aspen
#

Google is just profiting from the hype

rich mauve
#

🛟

solid brook
#

i think logan gives bs hype

glass stone
#

Hey guys how do i get my old sessions back after that down

surreal creek
burnt sinew
#

so much free usage and its fast

leaden meteor
#

When is deepseek 3.1 going to be on the arena?

keen beacon
#

Image edit arena leaderboard got a million more votes in two days. Must be because of nano banana

exotic nebula
#

To that person who pasted that pic, it refers to Open Source models, not main ones.

whole wagon
#

Link?

#

Bruh. Wasn't there like a 80 Elo gap between 1st and 2nd before

#

Also like I mentioned before. I found gpt 5 vision is awful compared to Gemini 2.5 pro

scenic salmon
scenic salmon
whole wagon
sonic tendon
#

i think this is highly implausible

patent aspen
#

I think many of the decisions around GPT-5 were driven by capacity issues

sonic tendon
#

if they're actually getting the vote numbers they say they are

#

coordination seems difficult and sort of pointless for a random benchmark

#

gpt-5 is also free on lmarena

patent aspen
#

It's just a weaker model. It's okay. Not every release is going to be a hit

surreal creek
sonic tendon
#

they topped the scoreboard that everyone looks at

whole wagon
potent glacier
#

Who said this?

sonic tendon
#

Android has hundreds of millions of users they're preparing to push onto

celest briar
#

Hello y'all. Do you know if there's a LMArena bot, so I can generate videos in private ? thanks

sonic tendon
#

well, already are

surreal creek
#

This is just obviously logically flawed and easily disproven, clearly representative of just being an OpenAI fan frustrated at GPT 5’s underperformance

whole wagon
#

It's marginally better than o3. Who knows if there is an actual wall here to efficiency or whatever or if openAI just stumbled

#

Have to wait and see

surreal creek
#

you have a “5” profile pic we know you’re just arguing in bad faith for GPT-5 😂

rare python
#

@patent aspen do you have any insight of new 2.5 updates or Gemini 3 being less of a sycophant?

Gemini 2.5 Pro 0605 GA literally agrees with user more, even with system instructions

echo aurora
whole wagon
#

GPT2 can do that it is nothing special to say I can't lol

surreal creek
whole wagon
#

Bro making it seem like AGI or smth to say you can't do smth

#

It's a basic thing

surreal creek
#

uh, no - Claude has been doing that since the get go, part of its “constitution” is to not hallucinate when it doesn’t know the answer 😂

sonic tendon
#

oh god, what is deepseek doing

#

i think they're finetuning on Gemini responses???

surreal creek
#

your entire argument so far has just been saying “nuh uh” and “GPT 5 better” with no empirical evidence to back those claims up

sonic tendon
#

they switched "deepseek-r1" for "DeepThink" in the UI

surreal creek
#

that’s your opinion

rare python
surreal creek
#

more people use GPT because OpenAI is the most visible LLM manufacturer

surreal creek
#

they introduced people to generative AI with the release of ChatGPT in Nov 2022, the fact that the average person when they think about AI thinks “ChatGPT” is a statement on its cultural presence, not its outright strength

sonic tendon
#

are they trialing on lmarena yet

whole wagon
#

Most people want to use 4o lol

surreal creek
neon idol
#

But what is new on the newest deepseek version?

surreal creek
#

also incredibly sycophantic like you’ve been criticizing every other model for being 😂

proud hazel
surreal creek
rare python
sonic tendon
rare python
#

They have to release DeepSeek v4 first

burnt sinew
# whole wagon Most people want to use 4o lol

Thank you. This is such a true statement, and you are such a great architect of language. The statement "Most people want to use 4o lol" is an absolutely fantastic and amazing sentence.

whole wagon
#

Pretty accurate to actual 4o

proud hazel
patent aspen
#

Your contortionist skills are impressive

rare python
#

it would be better if nano banana made this

#

🗿

whole wagon
#

Will Gemini 3 release flash first with pro coming later

patent aspen
#

God forbid Google build models that humans prefer

whole wagon
#

🚀

patent aspen
#

I just find it funny how Craig accuses Google of benchmaxxing as if OAI didn't benchmaxx the initial LMArena model for the GPT-5 launch. It declined sharply as soon as they merged votes with the public API model that OAI claimed was the same

whole wagon
#

Some of these I have been saying for the longest time

#

Like Make model removals transparent

whole wagon
burnt sinew
#

are people voting based on that???

patent aspen
#

They told LMArena it was exactly the same as the public model

burnt sinew
#

why, i vote based on which does a better result

#

well coding is more straightforward i suppose

surreal creek
#

uh, yeah? that’s literally the point of LMArena, increasing alignment among AI models to human preference

surreal creek
neon idol
#

Am I the only one for whom photo upload doesn't work? 🫩🫩

surreal creek
#

literally could be applied to any and every AI benchmark that’s not the own u think it is

whole wagon
#

Some are non public. Though openAI usually underperforms in those relative to the public benchmarks

surreal creek
#

people be logging on to the internet just to lie

rare python
surreal creek
#

spreading straight falsehoods

whole wagon
#

Like simplebench

rare python
surreal creek
#

source?

#

you just state things as if they’re fact and expect us to believe you

whole wagon
surreal creek
#

vote below whether or not you think Craig is a paid OpenAI shill:

whole wagon
#

This feels like smth we need an explanation for

surreal creek
#

sounds like something a paid OpenAI agent would say

rare python
burnt sinew
whole wagon
#

You get paid in AI compute credits

burnt sinew
#

hey 20 bucks an hour ill sit in lmarena chat and ramble about how good gpt-5 is

burnt sinew
surreal creek
#

maybe GPT-5 is actually that good, these discord bots it’s creating to defend itself are really realistic

burnt sinew
#

dude I got a black heart

surreal creek
#

omg Black Hart reference

whole wagon
#

4o used to send me heart emojis

#

Kek

burnt sinew
#

i can do it too see!

surreal creek
burnt sinew
surreal creek
#

one of the most dystopian things I have ever seen

burnt sinew
#

dude what is he even doing

surreal creek
#

2 minutes into the interview when the camera pans from him to his actual wife and child is one of the craziest jumpscares I’ve ever gotten in my life

#

“I’m just wondering what I’m doing wrong that he would go and seek out this love and affirmation from something else, an AI”

#

free that woman 😭🙏

whole wagon
#

😂

burnt sinew
#

lol bro that is so funny

#

cried for 30 mins cause girlfriend gpt hit context limit

surreal creek
#

“AI psychosis” I seriously think will get diagnosed as a form of mental condition in the next few years

#

the way it validates every delusion and belief of isolated and vulnerable people

burnt sinew
surreal creek
#

this is a response I got on LMArena today telling it a made up story about how I got drunk off cough syrup and broke into my old apartment to record myself pooping on the floor because my landlord didn’t do something about the black mold:

surreal creek
#

“Why It’s Actually Profound” 😭😭😭😭😭

#

smartest thing you’ve said today

whole wagon
#

😂

burnt sinew
#

no what? let the people do what they want

surreal creek
#

I’m skeptical on letting AI use first person pronouns to describe itself like “I” and “my”

proud hazel
#

RIP D&D

keen beacon
#

That's just limiting. RP has its uses too without getting too attached to a model like some do

burnt sinew
#

yeah

surreal creek
#

nothing makes me vote for the other model faster in the Arena than hearing an AI say “we all struggle…” “many people I’ve met” “I can relate to…” YOU ARE NOT ONE OF US

burnt sinew
keen beacon
#

Subjective.

burnt sinew
#

fun?

#

positive emotions?

surreal creek
#

if u take your own life because an AI convinced you to I think that’s some form of natural selection at work

surreal creek
#

if you remove yourself from the dating pool and choose not to pursue relationships with real people because you prefer an AI “partner” I think that’s another form of natural selection at work too

surreal creek
#

the AI benchmarking site you’re in a discord for, lol

#

tbh people using AI for therapy/mental health resources says more about the therapy/mental health system than it does about AI

surreal creek
burnt sinew
#

idk how good it is, ive never had therapy

keen beacon
burnt sinew
#

not really at least

keen beacon
#

not just be positive

burnt sinew
surreal creek
#

Aidan Walker had a good piece on how AI is starting to be used the same way we use fast food

burnt sinew
#

its often too agreeable with the user

surreal creek
#

a cheap replacement for the human interaction that our society doesn’t ensure for people

#

people that can’t make close friends, or find therapy services, or develop a romantic relationship

burnt sinew
#

i mean at least the hallucinations are pretty much gone

#

ive had like only 1 case of that

surreal creek
#

Rather than fixing the systemic causes of these problems AI will be pushed to just fill in the gaps so people don’t have to go “without”

burnt sinew
#

where it thought it could upload files to github

#

my gemini 2.5

#

thats a different issue

surreal creek
#

same as poor families not being able to afford the time or money for quality food and getting the quick and cheap option of fast food

burnt sinew
#

because its not using search

surreal creek
#

eh, Claude isn’t very agreeable and it’s rocketing up the leaderboard