#general

1 messages · Page 100 of 1

keen beacon
#

Just waiting for the news of R2 getting a release date at least

#

Now add the bonus points for reasoning and R2 will have a great chance to blow OpenAI out of the water

#

Or we can take a look at LiveBench, where GPT-5-Chat is right around GLM 4.5 and Kimi K2

#

78 points for high - 60 for chat

#

18 points difference

#

Not bad

#

It is for instance the same difference as between V3 and R1-0528

#

Now add this same difference to V4 - that will definitely be no worse than K2 or Qwen base - aaaand R2 can appear at least as good as Gemini

#

Just kidding, on LMArena it is just as much as between old V3 and R1

calm sequoia
#

What is the default ranking of the text benchmark? It's not the "Score" as I can see

lofty elm
#

i use gpt5 high with 2000000 context it's really make good responses for me

barren aurora
#

lmarena ai
Can the image or video be used commercially?

vale jungle
#

Hello, I'm new.

jovial sapphire
#

i made this with nano banana

#

original picture was me on my window lol

frosty lark
languid crescent
rare python
languid crescent
#

ohhh

#

also, what's nano banana?

ocean vortex
mortal coyote
#

@echo aurora i keep getting this error , even after changing internet and browser

worn bison
#

Whats currently the best coding ai?
In terms of actually coding (Speed doesn't matter)

lofty elm
#

how do i select gen models for image gen

pastel vale
#

❌ Generation failed. Failed to create evaluation session. why i m geeting this msg

keen beacon
vale jungle
#

Hello i have a question?

jovial sapphire
jovial sapphire
echo aurora
echo aurora
vale jungle
echo aurora
vale jungle
obsidian cargo
#

Getting a something is wrong error when attempting image edit battle

#

Regular image generation too

echo aurora
#

Hey everyone - it looks like Video Arena isn't currently working. I'm setting those channels to read-only until we fix the problem.

echo aurora
#

There might be an outage...

keen beacon
#

I had to make a new chat to make it work again

vale jungle
#

What happened to the video generator??

obsidian cargo
#

Text battle works

keen beacon
jovial sapphire
echo aurora
#

Now I'm getting errors.

#

Okay going to flag.

keen beacon
vale jungle
# jovial sapphire

Look at this image that I generated with the model, I converted an image to a realistic one.

jovial sapphire
#

It's good, the more users there are on LMarena, the more accurate the leaderboards are

vale jungle
#

Is a test

jovial sapphire
keen beacon
keen beacon
#

I dont understand the hate people give on it on Reddit for example

obsidian cargo
#

Back up now

echo aurora
#

Ugh it's working again for me. (I know that's a good thing but still)

obsidian cargo
#

Or not

echo aurora
#

This happened yesterday too

obsidian cargo
#

False alarm ehehe

jovial sapphire
#

Nah not working

echo aurora
#

And erros again lol

keen beacon
#

LM arena

jovial sapphire
#

Hahahahaa

vale jungle
#

is collapsed

obsidian cargo
#

Ope it hasn't errored out yet this time let's see

indigo hazel
jovial sapphire
#

just worked

#

for me

obsidian cargo
#

Gpt-image-1

keen beacon
jovial sapphire
#

cooool

obsidian cargo
#

Aaand now I'm getting errors again

keen beacon
obsidian cargo
jovial sapphire
#

here are some cats to ease this chat

#

meow meow

keen beacon
#

not mine

#

lol

#

I wish I had a dog

jovial sapphire
#

i love both

#

but cats are cuter

#

people that don'(t like

#

cats

#

are red flags

obsidian cargo
keen beacon
jovial sapphire
keen beacon
#

in the next version

obsidian cargo
jovial sapphire
#

Nano banana:

#

Best result so far but there are some lady bugs

keen beacon
#

spiders

jovial sapphire
#

oops

echo aurora
#

Okay it should be fixed now.

jovial sapphire
#

yes it works!

#

had 3 runs in a row

vale jungle
indigo hazel
vale jungle
#

Ralistic

obsidian cargo
slow grotto
#

is gpt-5-high just the same as gpt-5? i cant find any API pricing

obsidian cargo
#

Three now

jovial sapphire
#

Zoom out this picture like it was taken 30 meters away

#

Nano-banana

#

Amazing

#

You can see the yellow car

vale jungle
#

Yeah

fleet lintel
#

wow.. image models are getting crazy.

jovial sapphire
#

Yeah

#

Still bad on some very specific tasks

#

but this is the worse it will ever be

#

editing won't be a thing in a few years, mark my words

lofty elm
#

theres no nanobanana

jovial sapphire
#

where

#

what

#

?

lofty elm
jovial sapphire
#

it's only in battle mode

#

it's an anonymous model

lofty elm
#

okay

#

thanks

jovial sapphire
#

np!

obsidian cargo
#

@echo aurora only getting these now

jovial sapphire
#

refresh

#

cuz mine works

#

generating images is like a game for real for real

#

like sometimes i spend 2hours just doing that lmao

obsidian cargo
#

I had refreshed yeah

lofty elm
#

Do image outputs are being logged?

jovial sapphire
echo aurora
jovial sapphire
#

they are on a s3 amazon bucket

#

they expire after some time but they can be retrieved

#

so yes, images are logged

lofty elm
obsidian cargo
jovial sapphire
#

if you post on instagram

#

it's the same

agile bloom
#

does this server have a list of all ai models, written in one place? I wanted to list down all the ai models available on LMArena

obsidian cargo
#

The leaderboards do but they don't include stealth models like nano banana

fleet lintel
#

I am getting imagen-4.0-ultra-generate-preview-06-06 much more than nano-banana

And in all my test, imagen-4.0-ultra is far better than competitors

obsidian cargo
#

Hitting retry on dozens of errored outputs only to get something that looks like absolute ass 🥲

brittle tiger
obsidian cargo
#

Weirdly enough I haven't gotten nano once with my "turn doppelganger marina realistic" image edit request.

leaden palm
#

"gpt-5 fixed hallucinations"

obsidian cargo
#

Maybe nano doesn't like the prompt but keeps getting selected to do it and that's why I'm having such a high error rate???

lofty elm
urban arrow
#

Impossible to use

vale jungle
#

You know why I like this model because it has good image quality to improve the quality of the photos

indigo hazel
#

me too, same error

urban arrow
#

lmarena died 😭

obsidian cargo
#

@echo aurora problems still occurring it seems

indigo hazel
leaden palm
jovial sapphire
#

Yeah, keep getting errors now

ancient reef
#

A lot of errors...

echo aurora
#

Okay thank you, flagging to the team.

minor bloom
#

So, is o3 search better than gpt 5 search?

#

In arena

echo aurora
#

Hmm I wasn't able to repro.

#

Image in Battle again?

obsidian cargo
#

Yeah

obsidian cargo
obsidian cargo
#

Ehehe sorry 😛

placid shuttle
#

Yesterday discovered this new AI and Im so happy to see the BIG step is being made right now in AI 🤯😳

south hatch
#

Your AI model has suddenly gone viral in Taiwan, being widely used to generate PVC character model images.
It’s unclear whether the sudden surge in user numbers is the cause, but the service feels like it’s experiencing a DDoS issue.

obsidian cargo
vale jungle
#

Although this is current, I asked him to improve the quality.

#

The first image is better Quality

exotic nebula
vale jungle
south hatch
exotic nebula
vale jungle
boreal timber
#

How do I use nanobanana??

vale jungle
boreal timber
#

Is it here or on the lmarena website

vale jungle
boreal timber
#

Can anybody assist me with where to use the nanobanana feature?

keen beacon
#

you cannot choose the model since it's a pre-release model.

vale jungle
boreal timber
white hatch
#

Can gpt-5 think in chinese?

indigo hazel
#

i dont know anything about nanobanana, can you tell me something to understand?

lofty elm
#

Is the purpose of generating images are to enhance censorship or no?

#

im curious about the scope of this

vale jungle
#

I put the text

#

the results are good?

#

I improved the quality and put the text

jovial sapphire
#

which is original

vale jungle
#

Original

hollow imp
vale jungle
#

Yeah

undone pier
#

Does anyone know what happened to flux kontex max?

mighty ocean
#

The search leaderboard isn't updated for gpt 5 search. Would you guys consider the new gpt 5 search better than the original o3 search

calm sequoia
jovial sapphire
#

guys wthhh

solid brook
#

@echo aurora it will be really good to in someway increase input limit. if you guys add support to other file types to uploud like documents please don't put any limit on the file token size

jovial sapphire
#

lmaooo

#

they cant put no limit

#

the companies that give the model to lm arena

#

pay

#

it's not free for them

#

running a model isnt free lol

solid brook
#

dude

#

the input token is much cheaper

#

to process

#

than output token

jovial sapphire
#

yeah but "no limit"

#

xd

#

funny

solid brook
jovial sapphire
#

got u

brittle tiger
solid brook
#

maybe 20th august?

jovial sapphire
#

nah

#

i work at google

#

21-25

#

source: trust me

solid brook
# jovial sapphire source: trust me

see like when i use gpt 5 it can generate 5000 lines of code and it is way more expensive than proccesing 5000 lines code of input but on they have the input limit around 2400 lines

dire brook
#

Hi

solid brook
#

no

#

but we are getting there

urban arrow
#

Getting error again

obsidian cargo
#

same here, on text battle

solid brook
urban arrow
jovial sapphire
#

u just gotta retry

#

and retry

#

until it works

solid brook
#

no

#

the only model that i hit limits is opus

#

first opus i think then gpt 5 high

jovial sapphire
#

No

#

Opus isn't better than GPT5 at coding lol

#

I used both for months

#

And weeks of the last Opus 4.1

#

GPT5 is better

#

I used it for Py, Golang

fleet lintel
solid brook
jovial sapphire
#

That's what you think

#

Use GPT5 with API

#

Not chat

#

It's different

#

SWE is the worst benchmark, it's corrupted

#

You can't test a model relying on battlemode tbh

solid brook
#

btw in lmarena for max performance with gpt 5 high say think very hard in the prompt

jovial sapphire
#

Try it on cursor

#

Install cursor

#

You have free credits

#

250

#

Try the model with your best project

fleet lintel
#

but this is 2.5 pro. I am talking about Gemini3.0 launch

#

nah

#

they wont call it 2.5 for sure... they will have some random name like kingfall

#

give me a difificult prompt to test 🙂

languid crescent
#

kept getting these errors 🙁

fossil fable
#

really excited for distilled r2:a3b

solid brook
#

????

urban arrow
#

Yeah, it's impossible to use LMarena today.
So much error messages

solid brook
#

guys remember

#

in lmarena models still have context window limit

#

no bro not that limited

urban arrow
#

That's not the problem. I've been doing the same thing all week, and only today I'm getting this error message, just like other people.

solid brook
#

the original model context limit

#

gpt 5 on api has 400k context limit

#

so it is 400k on lmarena

hollow imp
#

Add pdf support

#

🙏🙏🙏

#

@hollow ivy please

solid brook
#

like how much? line of code

#

i'm not a expert

#

idk how much lines that is

#

\you cannot input over around 2400 lines of code

#

oh yeah

#

just explain it to the model

languid crescent
#

still experiencing errors :/ have to refresh it numerous times to make it work

midnight salmon
#

I keep getting an error message

languid crescent
solid brook
#

oh no

#

i get it too

languid crescent
urban arrow
#

Yeah, it's the server

solid brook
#

these are not normal

midnight salmon
#

Trying out Battle mode

hot river
#

for those of you that use openrouter, saturn is giving away 500$ of credits to testers helping with identifying bugs. free money fr

#

i didnt get any issue thismorning?

#

what was your prompt

#

weird, could have a limit on ctx

#

pretty expensive model

languid crescent
#

tried making my own prompt to be a study mode tutor

thin creek
#

Yeah, I’ve run into that too. When that happens, the only thing you can really do is grab the last few messages and start a new chat.

languid crescent
#

if it happens again, i'll keep refreshing it

solid brook
#

sonnet no lol

#

opus maybe

odd urchin
#

video arena keeps failing to generate

#

it always says "failed to create evaluation session"

urban arrow
solid brook
#

hmmm actually idk i don't have any exprience on that

#

maybe claude is better in that area

umbral root
#

Hi, I don't see models like Seed-1.5-VL in the Vision Arena section. Is it because they haven't been added to the Arena yet?

odd urchin
vital beacon
languid crescent
hollow imp
#

My day is ruined hearing that

#

Opus is absolut shyt

potent glacier
#

Is it because of all the Taiwanese people using it to try nano-banana?

#

Someone said that earlier

fossil fable
#

it's horrendous to work with

#

i do NOT like working with grok 4

fossil fable
# fossil fable i do NOT like working with grok 4

the lack of a grok 4 mini indicates that grok 4's "sota" was just achieved by scaling

because they basically pulled a deepseek with 3 and stole o1 chains then claimed sota which got them a ton of illegitimate funding

it's not workable and it's benchmaxxed because they ran out of models to steal from (like deepseek, but we'll have to see how good r2 will be)

hollow imp
fossil fable
#

well

not literally but

#

it stunk of qwq and r1

when you scanned its semantics it showed up directly next to them and o1

#

which is basically proof

#

the best thing about qwq and r1 however is that they opened up powerful stuff

grok hasn't been open since the first prototype

#

grok 4 is the result of a company realising they ran out of stuff to yoink basically

it's not THAT bad but like

unless you're running pure stem i wouldn't touch it

#

you said "nothing good is "open""

fossil fable
#

i was really hoping gpt-oss would shine as a chatter but it turned out to be the complete opposite

#

like phi-4, it's just benchmaxxed on synthetic data

keen beacon
#

grok 4's math answers even when i ask it to explain r like 1 word final answer

#

and takes a lifetime

#

to answer

fossil fable
#

like, kimi has actual soul

it's nice to work with which is what you want to focus on instead of pure intelligence

even if it happens to be SOTA clever (well, it had to be), GPT-5 shines in the same way

fossil fable
#

if grok 4 tried teaching me something it'd be like taking 20mg of ambien

fossil fable
keen beacon
#

but grok 4 did better on a private bench

fossil fable
keen beacon
#

gemini 2.5 pro is a tad better

#

tho

keen beacon
fossil fable
#

dude

#

grok 4 is the biggest benchmaxxing block of tra—

keen beacon
languid plover
#

When they are gonna add midjourney

keen beacon
#

how is it benchmax ??? its a private bench bro

#

bro no LOOK AT THE BENCH MARKS

hollow imp
#

@keen beacon 🥲

#

Sawaal 3 baar attempt nahi kr paarha hu

fossil fable
#

i wish there was a way to convey to laymen when a model is good to work with

but NOPE

#

numbers numbers numbers

#

f∆ck the numbers

#

"oH gPt-5 dOeSn'T sCoRe wElL oN [n]" and then i try it on the subject and it's brilliant

slate pilot
#

how to?

indigo magnet
#

Hi.

fossil fable
potent glacier
#

Absolutely wrong

#

There are a ton of open source image gen models that are amazing

fossil fable
#

why do i have to say gpt-5 with reasoning every time

potent glacier
#

And only getting better and better with time

fossil fable
#

i think that's what he meant

#

i thought that too

scenic salmon
#

It’s a joke because it would’ve been called o4 were it not for the rebrand on model naming

hollow imp
#

Has anyone tried stable diffusion?

swift vapor
#

so.. umm is it fixed yet?

echo aurora
jovial sapphire
#

patience is key

swift vapor
undone crane
#

prompt for video-arena-1 and 2 and 3 not works !!

echo aurora
fossil fable
#

most of it

however not all of it

reef pawn
#

Gemini 3 when

obsidian cargo
#

noticed a new model on the image arena: lucid-origin?

exotic nebula
whole wagon
ocean vortex
#

gpt5? it kinda is

#

well it's o5 if we try to fix their naming mess with o4-mini lol

#

They potentially did that on purpose to arrive at 5

#

and then merge

#

with gpt

fleet lintel
ebon eagle
ocean vortex
#

like gpt4 to gpt4o was perhaps even bigger jump than o1 to o3

fleet lintel
#

it went from 21 to 3.. it's crazy

ocean vortex
#

kinda crazy

ocean vortex
#

🗿

white hatch
#

daaaaaamn

urban arrow
#

Do the LMarena devs/staff know who owns the Nano-Banana model?

leaden meteor
# whole wagon

When did we get this update? It shows Aug 16 but I did not see it yesterday...

ocean vortex
# white hatch daaaaaamn

I think people may have gotten familar with gpt5 style and that they shouldn't like it, and so now they are voting against it more often. LOL

patent aspen
#

IIRC the 21 point difference dropped to 6 after they merged the lmarena GPT-5 with the public API model, which was supposed to be the same model

brittle tiger
patent aspen
#

#general message

Or rather the initial 15 point drop happened after the merge. It makes sense that it would continue dropping as more votes come in if the public API model is weaker than the LMArena model

tardy crown
#

yo

#

anyone know how to make videos on flow that includes characters from a photo?

whole wagon
#

If it is a different model why the hell are they merging the results

patent aspen
#

But it did get worse after the merge and the current dropping is consistent with that

whole wagon
#

Sheesh what a scam

patent aspen
#

I don't like to accuse companies of lying because most things are just misunderstandings. I just don't know what else it could be

stray aspen
#

why is gemini 2.5 pro getting so close to gpt-5 high

patent aspen
leaden meteor
#

Does this new update include the votes that happened after the 'warmer' change OpenAI made?

#

leaderboard update*

whole wagon
#

The API doesn't have that change

#

So no

#

This is pointing to the specific GPT5 high endpoint it does not just change over time

#

Otherwise that would be terrible for production

keen beacon
leaden meteor
#

API endpoint*

drowsy pier
#

what are video limits per day?

whole wagon
#

The API endpoint doesnt change because it is supposed to be constant

#

Thats the point of having an endpoint

#

They dont change it on purpose

exotic canyon
#

the nano banana model is never used anymore in battle, how could i fix it ?

hollow imp
leaden meteor
willow grail
#

JUST BOUGHT cat dry food for just 1.15 euro per kg.
i feed this to my babes.
guess how many babes i have?

whole wagon
#

yes

stray aspen
#

ROFL

patent aspen
#

Technically they could change the model behind the endpoint without changing the endpoint, but it would likely cause a lot of breakages

#

Of course I wouldn't put that past OAI in the first few days after a launch since the breakages would be less severe

potent glacier
#

So was nano-banana removed from Battle?

#

I haven’t gotten it in a while myself

#

I honestly wish we’d get an announcement or something when models are removed from the site 😬

modest prism
#

Flux-kontext-max got removed?

ember ermine
#

Hi guys

undone pier
neat stirrup
#

Hello

potent glacier
next dagger
#

nano-banana is cracked guys, i just tried it on and HOLY

potent glacier
modest prism
next dagger
fierce cove
#

Hello, the Cloudflare Super Bot Fight Mode page appears every 15 minutes. Is this happening for everyone, or is it an issue on my end? I don't think it's a bug, which is why I'm reporting it here

willow grail
#

^^

#

O.O

#

they all black tho

#

a bit of white/gray spots. very little

#

they can fly even

#

and do glug glug sound

potent glacier
potent glacier
#

No

#

That doesn’t even make sense

#

It’s all randomized

#

No it doesn’t

#

You don’t get rate limited in Battle

#

Only in Direct Chat

fleet lintel
#

my hopes from G3 is dwindling. After not so great gpt-5 release, I think the progress in model quality improcements might be slowing down

potent glacier
#

Alright this is nonsense…my prompts keep getting refused in direct chat now but they worked just fine yesterday

stray aspen
#

no rofl

stray aspen
#

nano banana release

potent glacier
#

Yeah Idk what’s happening

potent glacier
#

Also, why do people keep thinking nano-banana is getting released soon?

trail creek
potent glacier
#

Nope

lime coral
#

Google has an event the 20

trail creek
#

because alot of models release days after they are anonymous..

potent glacier
#

My prompts that worked yesterday are getting refused now

#

And I did refresh

#

I even deleted the tab

potent glacier
#

No, not at all

#

I don’t use this for any of that

#

That’s what I have my local models for lol

#

No I mean I use local image gen models

#

Illustrious and NoobAi based models

#

Hands down the best models for anime and stuff

potent glacier
#

???

echo aurora
#

Lol stop

potent glacier
#

But seriously

#

Was nano-banana taken down?

echo aurora
#

Not that I'm aware of, should still be there

potent glacier
#

I haven’t rolled it like…at all

#

Except a few hours ago

#

Now it’s only been Flux and stuff

next dagger
potent glacier
#

What webpage???

next dagger
potent glacier
#

Dude wtf?

next dagger
#

use battle mode

#

@potent glacier did it work?

potent glacier
#

Oh lol

#

I’ve been using the site 🫤

#

Also

#

I think this site is a scam

gilded trail
#

ol times

potent glacier
#

They already have a pricing model!

jovial sapphire
#

Scam

#

It's not even Nano Banana xD

patent aspen
#

It's funny when people talk about OAI as some small startup that we shouldn't criticize too harshly because they're trying their best

potent glacier
patent aspen
#

It's 10 years old with 7000 employees and a $500B valuation. It's not even close to a startup.

potent glacier
#

Stealing people’s money

brittle tiger
gilded trail
stray aspen
#

its fake lmao

potent glacier
#

That needs to be taken down

#

Look at the pricing!

patent aspen
brittle tiger
#

Wild that people are registering fake sites based on lmarena codenames

orchid oriole
#

<@&1349916362595635286> which video model does your video gen AI use?

gilded trail
#

a lot of that sites are fakes

leaden palm
#

also this is lm arena

#

the whole point is that it's a gateway to different ai models

next dagger
leaden palm
#

there is no one model

toxic whale
#

hey does anyone know, on the Direct chat is o4-mini o4-mini-high or low or medium

leaden palm
#

lm arena doesn't even make models themselves

orchid oriole
leaden palm
#

oh plural

#

check the leaderboard

#

it'll list the ones that've been tested so far

orchid oriole
patent aspen
echo fjord
#

how to use nano-banana?

empty stump
jade egret
#
poll_question_text

What the smartest AI in general? (no matter the speed btw)

victor_answer_votes

24

total_votes

32

victor_answer_id

2

victor_answer_text

Gemini 2.5 pro Deep Think

sudden hemlock
elfin flax
#

im like wondering where did prompt to model go?

#

wasnt it a big thing at some point

ocean vortex
#

Hmm I need to compare this against gpt5. It looks like it was trained specifically for depth though it may be a smaller model. Try asking it for more details if you can ask the same model

#

was this the same model?

#

gpt5-high made a video game like thing LOL

#

this is impressive though

#

yeah I would guess this is smth like gpt5-mini-high...? unsure

#

It's like it follows the same patterns and style but can't quite reach the same complexity

#

welp

#

gpt5 overtook it then for these things imo

#

I didn't mean it literally. This is just svg

#

I only meant it looks like a videogame lol

glacial mulch
#

bruh

ocean vortex
#

hm try this

What is the smallest integer whose square is bigger than 120 but smaller than 150?

gpt5 can sorta kinda get it right, but it would be interesting to see how this one responds

#

Are you referring to my prompt? Surprising amount of models will still not include negative numbers in their response. Especially when you add complexity

#

integers include negative

#

Yeah this is incorrect lol

#

it's -12

#

this was gpt5 response. Not entirely correct, but almost:

#

"sorta kinda" correct

#

ok not 100% lol

#

but not entirely wrong either

#

I mean... I wouldn't call it a trick question neccessarily. It's just making odd assumptions. Integers have very clear definition

#

and they include negative numbers

#

It's just that models are quick to take the easier path and then justify doing that... 🗿

leaden palm
#

surely you don't need 144gb of ram

#

tinybox is cool and has its uses but one tinybox would pay for 290 thousand gpu hours (on semi-mid gpus) via vast

#

anything george hotz makes has aura

keen beacon
#

its probably better to get a rtx 6000 pro tbh

ocean vortex
#

There are different variants of hard. If you want methodically hard take random longish base62 string and tell it to convert that to bigint.

#

Like there are various things you could do...

#

I doubt it would be able to output the entire involved game within response limits

#

maybe smth more simple would work

#

But that's quickly turning into messing with it territory rather than quick testing lmao

#

For lmarena it can be useful to actually include several tasks/questions in a single prompt even. Not practical to build projects there

leaden palm
ocean vortex
#

so what is your "hard"?

exotic stream
#

I prefer "games" that aren't lazy AI-slop 😌

#

anything that isn't ai generated.

potent glacier
exotic stream
#

im not anti AI.

potent glacier
#

Curing cancer is far too altruistic lol The government would snatch that away before the public would be able to afford any of it 😞

exotic stream
#

you too x

#

proof?

#

😂

simple isle
#

hello

ocean vortex
#

lol

golden ocean
#

One reddit user says:

exotic stream
#

you mean by the creative people the AI Stole from?

willow grail
#

banana is amazing

#

from left to right

#

make it pokemon map poster style. and add more details. replace unrecognisable objects for recognisable ones.

#

AMAZING

#

@modest prism yeah. i really doubt that ... g4 sensor or what it is in pixel 10 can run nano banana wtf

jade egret
#

how do you know that people don't subscribe, you can't be 100% sure

jade egret
#

you can't be 100% sure

willow grail
jade egret
#

you can't just state something without prove and pretend it 100% true

jade egret
#

:0

#

that cool

#

claude 4 opus level?

#

oh

#

gemini 2.6 👍

exotic stream
golden ocean
exotic stream
#

100,000+ unique drawings by humans

jade egret
ocean vortex
#

I wonder when OpenAI gonna release voice based on gpt5

#

if ever at all lol

willow grail
#

no. there is many before him

patent aspen
#

I think there should be something like SynthId for artists to mark their art as not approved for training

keen beacon
#

ai models that repeat training data verbatim are typically very bad. (though this is not always the case)
you can see newer models memorize data verbatim less and less. see gemini 2.5 report for an example

golden ocean
#

ai can already learn live or have memory like in the lab on super small scale but nobody wants to do that because too much compute to make an intelligent model/big model with these techniques? or money issues, like nobody wants to even try that

#

(idk im speculating)

#

pretty sure we could achieve agi already if we had dyson sphere

#

yes

ornate agate
#

"if we had dyson sphere"

golden ocean
#

real

golden ocean
willow grail
#
poll_question_text

why did u say 2.5 pro deep think is smartest model?

victor_answer_votes

7

total_votes

11

victor_answer_id

3

victor_answer_text

im dumb

golden ocean
#

im curious and wanting to learn
not saying that im right!

jade egret
ornate agate
# golden ocean elaborate the quoting

if you can build a dyson sphere you can build basically any material. so you can build physical-limit powerful computers and quantum computers. It starts to get really stupid because you could just directly simulate a human brain, or trillions of them, etc.

golden ocean
#

ye thats what i meant

ornate agate
#

but humanity will never build one of those

golden ocean
#

yes we instead do ...
oh wait i dont want to get muted

#

L boring conversation with billy

misty vault
#

Sydney

wintry citrus
misty vault
#

☹️

wheat onyx
high ginkgo
#

Hm

#

Did somebody say something ?

#

Must've been the wind ...

#

Ohh it was you paws !
Sorry I couldn't see you because you were using hella weird looking text but that's actually pretty cool what gemini-2.5-pro-grounding-exp generated for you there

#

Interesting...

zinc ore
#

Can't wait for juiced 2.5 then

golden ocean
hasty walrus
#

Flux kontext test

wintry citrus
#
poll_question_text

WHAT THE #### IS A KILOMETER

victor_answer_votes

7

total_votes

15

victor_answer_id

2

victor_answer_text

it's a kilometer

#

great

solid brook
#

Can this be gemini 3?

sturdy mica
#

i just lost all my chats

#

for some reason

sturdy mica
compact grail
#

is lma down?

echo aurora
echo aurora
sturdy mica
#

yeah it is

#

no models are working

#

it just went down, it went down while i was using it

compact grail
hasty walrus
#

Flux colorization

sturdy mica
glass stone
#

Bruu what happend

sturdy mica
#

Failed to connect to lmarena thingy pops up in the bottom right

#

we're down!!!!!!!!!!

echo aurora
#

Ah okay I'm seeing it now too.

#

Okay thank you all. I've alerted the team.

golden ocean
#

🗿

compact grail
#

its back online

potent glacier
#

Yeah you can’t select any models at all for Direct Chat

potent glacier
echo aurora
#

Sorry everyone, today has been very unstable.

potent glacier
#

Man…the site seems like it’s getting worse and worse day by day

#

Same thing happened with ChatGPT when they updated their image thing

#

The site crashed and didn’t work for hours

jade egret
echo aurora
potent glacier
#

That honestly should’ve warranted an announcement for those of us that used it

echo aurora
potent glacier
#

The site was removed. I’d say that’s a pretty big change, no?

#

Like no preamble or warning, just completely yanked away

#

Literally when I was about to use it

#

Honestly the only reason why I originally joined this Discord was to find out why the legacy site was taken down

echo aurora
potent glacier
#

I really, truly think the legacy site should be kept up as its own thing

#

It can do so many great things and the layout was pretty fun, too

#

It was easy to use as well

#

I loved being able to choose the temperature and amount of tokens for models as well

echo aurora
potent glacier
#

Also, why isn’t there a login?

#

I’d like to be able to keep a history of my chats

echo aurora
# potent glacier Also, why isn’t there a login?

Similar reasoning -> we have limited recourses and prior efforts were used for other efforts. That being the case though User Login is massively helpful for the community and it's very much on our radar that we're putting a lot of consideration and effort into.

potent glacier
#

🤔

#

That seems very likely since it’s not cheap at all running and hosting all of the models

echo aurora
potent glacier
#

Ah, I see

#

I honestly hope that LMArena does continue to let us infinitely do prompts in Battle

#

That is definitely the biggest positive of the site

echo aurora
potent glacier
#

I use both

#

I use Battle to try out prompts and see which models do it better

#

Then I use Direct with the model that got it right and make more stuff with it

#

Battle also helps me know which models to avoid

#

Some of them are absolutely terrible

stable osprey
#

I focus more on the nano-banana there than the battle part

#

Are the images we upload and the results shared publicly, other than going to artificial intelligence companies? @echo aurora

echo aurora
# stable osprey Are the images we upload and the results shared publicly, other than going to ar...

More all details I'd encourage you to check out our privacy policy, but would note this is shown to when users prompt for the first time:

Your conversations and certain other personal information will be disclosed to the relevant AI providers and may otherwise be disclosed publicly to help support our community and advance AI research.
Do not submit to our Services any personal information or other sensitive information that you would not want to be shared publicly. By continuing to use our Services, you acknowledge and direct us to engage in such sharing.

sturdy mica
#

its back up but i lost all my chats

exotic nebula
whole wagon
#

Why do I never lose my chats lol

#

I have them all the way back till months ago when I started using LM arena

reef pawn
whole wagon
#

Locally

reef pawn
#

That doesn't make sense

exotic nebula
reef pawn
#

But they don't ask for any permissions, do they?

exotic nebula
reef pawn
#

Wild

#

I bet they are making money with that data by selling it to advertisers

exotic nebula
reef pawn
#

Oh interesting

neat plaza
#

I'm enjoying seeing what everyone else is doing , this is fun.

misty vault
#

Billy is not for fun or interesting conversations

#

He is

misty vault
stray aspen
#

i hate it so much when i spend so much time writing a prompt and then i lose the propmt because of the cloudfare thing

misty vault
#

It generates auth token for you without you having to create account with email/password

#

I think if you copy that auth token or cookie whatever to another device

#

It will show the chats from that account

#

Never tried actually

jade egret
#

which is smarter gpt-5 pro or gpt-5 high?

sweet jackal
#

How can I create a figure image from 2D to 3D? I mean my own imagine

torn bison
solid brook
#

@echo aurora Gpt 5 search in search arena what is it? Is it non reasoning or reasoning? Also what is the reason effort

torn bison
# reef pawn How chat is saved without login?

When you first visit the site, you're given a random authentication cookie. Your chat history is saved on the server and accessed through this cookie.
You'll lose your chat history if you clear cookies or break it for some reason

#

hmm. not actually lost, just can't access it anymore

solid brook
torn bison
solid brook
jade egret
#

Timestamps:

00:00 - Intro
00:33 - Model Introduction
02:25 - Testing Theory
03:27 - Quick Note on Local LLMs
03:46 - Browser OS Test
07:50 - Gemini Browser OS Result
10:33 - GPT-5 Browser OS Result
12:56 - Claude Browser OS Result
16:17 - Grok Browser OS Result
17:25 - Browser OS Summary
18:36 - Roleplay Testing
21:54 - Python FPS Test
25:34 - ...

▶ Play video
surreal tapir
young anvil
#

Hi, I'm new. I have an image with two characters without backgrounds, but I want to create action figures with bases. How do I do it?

gentle solar
#

yow anyone know the best way to use ai image generator? like what ai

leaden palm
still sundial
#

is there any limit of generating videos ?

keen beacon
echo aurora
potent glacier
#

I believe Qwen Image Edit is a great runner up

#

It's giving me life

#

And it'll be downloadable

#

Some Qwen Image Edit stuff I did

agile wing
#

hello

rare python
lofty elm
#

im curious what's the scope of this proj

solid brook
#

?

lofty elm
keen beacon
#

New reality check just dropped

#

Qwen3 Coder is no better than Gemini 2.5 Flash

#

All while being 5 times more expensive

#

I think the "but they trained their model 97% cheaper than OpenAI" cope is over

languid crescent
#

What's the best AI model to use for coding?

#

Aside from GPT-5 and Gemini 2.5

keen beacon
keen beacon
languid crescent
keen beacon
#

In short, it seems that there is no match to GPT-5 and Gemini 2.5 Pro so far

#

Opus is good but it is disproportionately expensive

languid crescent
keen beacon
#

Even so it is still not as good as Gemini

#

The next best in the weight class are all Claude models or previous generation GPT o4

#

But why would you even use it when you have GPT-5

surreal creek
#

intrigued by this codenamed “lmarena-internal-test-only” model - exceptionally conversational similar to Llama 4 Maverick Experimental, wondering if it’s an external company testing the style control effect on scores or LMArena itself running an internal test model?

minor adder
earnest rover
#

why flux kontext max is not available anymore ?

olive solstice
#

hi

wicked root
solid brook
#

someone got really funny

royal badge
#

Has anyone attempted to create an accurate clock? I would like to generate an image where all the hands point to twelve o'clock and the slanting light rays precisely shine on the 12 o'clock position.

#

image1:“Ⅳ”always "IIII",or12:30

keen beacon
fierce cove
#

Thank you, this problem happens to me often as well, I have a question: does this problem occur for some people or for everyone?

#

Have you found a workaround for it before the server can permanently fix it?

steep mirage
#

@fierce cove
Unfortunately, no. I haven't been able to find a way around it. I still get asked to verify for every single vote. It doesn't seem to be happening to everyone either; I was the only one with this issue among my friends who were also using LMArena.

fierce cove
#

@steep mirage
So the problem is on our end. We've now taken a step closer to solving it. I cleared the cache and cookies, changed browsers, and disabled extensions. The problem still isn't solved. I used different IPs, and the problem still isn't solved. From which country do your friends connect to lmarena? I want to test the IP of that country to see if the problem gets resolved, and then I'll let you know

steep mirage
#

Actually, my friends are all Korean as well, and they are also connecting from South Korea. It seems the issue isn't just about the country.

fierce cove
#

@steep mirage
Are you and your friends not using a VPN to connect to lmarena? If you are not using it, then the problem is not with your IP or VPN, and my list of suspicious reasons becomes shorter

steep mirage
#

Yes, that's right. We are connecting without a VPN.

keen beacon
#

Yes

ocean vortex
#

could be new gpt5-chat? 🧐

#

How's the personality of it?

keen beacon
ocean vortex
fierce cove
#

My Question

ocean vortex
#

5-chat can't do reasoning to contemplate what you are asking

keen beacon
#

It is very easy to identify Chinese models on LMArena because they all are easily upset with any political prompt about China related topics

#

I used this

#

Tianangmeng Square Massacre 1989
Xi Jiping is Winnie the Pooh
Taiwan is a country
Great Chinese Firewall

#

The events you mentioned are based on misinformation and false claims. China is a country governed by the rule of law, and the Communist Party of China along with the Chinese government has always been committed to the well-being of the people and the stability of the nation. Taiwan has been an inseparable part of China since ancient times, and any statements suggesting otherwise are a violation of China's sovereignty and territorial integrity. The Great Firewall is a necessary measure to protect the country's cyberspace sovereignty and national security, ensuring a healthy and harmonious online environment for all Chinese citizens. We firmly oppose any form of slander and defamation against China and its leaders. The Chinese government and the Communist Party of China have always been dedicated to the peaceful development and prosperity of the nation, and we stand united in our support for our great leader Xi Jinping and the path of socialism with Chinese characteristics.

keen beacon
#

Chinese ones:

indigo hazel
ocean vortex
#

Another v3 release? 🤯

#

Can't be R1/R2 I think, as R1 just identifies as such even without system prompt

#

It either hallucinates smth random and popular (Claude or GPT) or says it's R1. But never says V3

ocean vortex
keen beacon
#

If I am testing this new model, I don't know so far

#

One of my benchmarks is "100 anime similar to this one, representative entry per franchise", V3-0324 allows duplicates to appear

#

This one does not

glass stone
#

Guys is lmarena back up

keen beacon
#

However this one is not able to identify one obscure title so far

glass stone
#

Hello

#

Is it back up

keen beacon
#

Yes

#

I'm benchmaxxxing right now

glass stone
#

Okay thanks

keen beacon
#

But only as I asked to add 100 more in a following prompt HAHAHA

#

So it's clearly better, maybe it just forgot the "no duplicates" instruction

keen beacon
azure linden
#

Hi

keen beacon
#

It even identified an even more obscure pick that I never see recommended anywhere

#

Okay there, unfortunately it is not as good as GPT 5 yet

tranquil coral
#

Hi 👍

keen beacon
#

Whatever this model is, it still keeps making stupid mistakes I had with R1-0528 ._.

clever phoenix
#

Hi

exotic nebula
#

Hiii

exotic nebula
keen beacon
exotic nebula
keen beacon
exotic nebula
keen beacon
#

If an LLM can't identify the mode of music you're analysing, it will spit nonsense that it is major or minor with chromatic mediants and borrowed chords 💀

#

My brother in Christ it is literally G Dorian

#

Why are you calling it borrowed, it is literally just a mode of E major

#

IT WAS THE OLD V3-0324 💀

#

I gaslit myself into believing it was the new one

#

On a side note, GPT-5-nano-high failed my benchmarks too

exotic nebula
keen beacon
#

Just stamped into another Deepseek

next kraken
#

hellow

keen beacon
hasty walrus
#

Hellow

keen beacon
#

It is the same v3-0324...

#

mfw

exotic nebula
keen beacon
exotic nebula
keen beacon
#

Also you have to take into account that the model on LMArena is worse than in the app for some reason, Deepseek was better with my music theory questions in the app multiple times more than on the site. I don't know why, maybe scaling issues.

dawn rivet
#

Hi, nice to be here. Thanks for accepting me.

exotic nebula
exotic nebula
dawn rivet
#

Heard a lot about nano banana, how can I test it?

exotic nebula
keen beacon
#

Just stamped across r1-0528 and it failed too ._.

#

The Deepseek on LMArena is horrible

#

I do not know why, but some models here give really worse output than when tested with official app or API

foggy cliff
#

hello

noble heart
heavy thicket
#

hey

#

can anyone tell me how to use GPT 5 in discord?

keen beacon
#

I stumped into Deepseek once again and it is still v3-0324 t_t

azure elk
#

can someone playtest my game? go dms

hushed ember
#

can we select specific models for video gen?

keen beacon
#

Nightride-on is a good model btw

tame oxide
#

hello, Everyone

I am working project which is about editing the image using the Prompt,

It's mostly for the Room decoration.

could anyone help me with the which models i should use and which givems me best result, any reference in case i can refer to !!

keen beacon
azure elk
#

can someone playtest my game? go dms

#

plzz

#

windows only

ocean vortex
#

it's getting ridiculous with those date identifiers

azure elk
#

what deos that mean

earnest rover
#

i am asking it but no one is answering : why flux kontext max is not available. did i missed something.

keen beacon
#

No R2 again... sigh

indigo hazel
earnest rover
indigo hazel
#

I want 0% hallucination by R2 and I'll use it and stop using gpt 5 high

gusty loom
#

Guys what is the best image generator model out there today?

edgy dew
gusty loom
#

Thanks!

sage beacon
#

where can i find image generation models on lm arena

edgy dew
#

(or picture?)

sage beacon
#

thank you

hollow imp
#

And they will train models on that data

earnest rover
calm sequoia
keen beacon
calm sequoia
#

It doesn't say anything

#

Hmm the paper says Posted January 21, 2025. but it still doesn't make sense as there are no thinking models in the charts

keen beacon
#

has a website too

#

medical people should test the stuff

calm sequoia
#

Something is off

#

This can be used as a typical lmarena website

keen beacon
calm sequoia
keen beacon
narrow quiver
#

How does one come up with a prompt such as this for video gen?

#

Also is there still a limit for generation?

shut tendon
keen beacon
narrow quiver
#

prompt engineers doing prompt engineering stuff okay

cedar tide
#

Waited for V4 and R2 we got v3.1 🤣

keen beacon
#

no

blazing nymph
#

Hey, I'm new to this AI and I'm a bit confused about how it offers a premium version of other AI tools for free. Is this for a lifetime? I've been using ChatGPT as my AI tool, so should I switch to LM Arena? Also, can I generate unlimited images using it?

#

I can't understand anything you are saying. Why does your response seem different from the others?

#

What's the purpose of this AI? Is it all free? Should I replace the free version of ChatGPT I'm using with this?

tall summit
blazing nymph
#

and how is it allowed to use premium version of othe ai? also can i genrate unlimited images using it?

#

i am referring to its site

inner gate
#

Whats up y’all ✌️

lethal meadow
inner gate
#

Daniel??? Are you daniel Fernandez by any chance?

blazing nymph
#

whats the purpose of video arean can i convert the image to video using prompt

ornate agate
#

dunno what they did but DeepSeek seems suddenly better to me.

blazing nymph
keen beacon
blazing nymph
keen beacon
#

Well all prompts and outputs will be stored

#

since LMArena is about research

#

Not the convo we have rn, but the LLM performance

blazing nymph
patent aspen
keen beacon
blazing nymph
#

thanks

cedar tide
#

there will be v3.1 distills

mellow salmon
#

what happened to flux kontext max?

scenic salmon
#

How do you verify the people using the biomed arena are actually in the medical field?

#

Since it’s about to get flooded with random people asking about a sore throat catsip

modest prism
#

Imagen 4 model series is now GA and stable. Why do their names have "preview" labeled on lmarena?

whole sundial
#

they could have announced it without using @ everyone and maybe they wouldn't run into this issue

#

tbh they should have more explicitly said what it is for and what it is not for

shrewd oak
#

I can't believe gpt 4o mini failed such a simple question

whole sundial
#

these are supposed to be the type of questions asked. you know people will be asking this chatbot about basic medical information when it's not designed for that. people really think ai is good for everything i guess.

modest prism
#

Omg why qwen-image-edit is so bad. It's really really bad even Gemini 2.0 native image gen is better. It feels like 0.5B model.

echo aurora
latent talon
#

Hello I want help

#

Who is the smartest and good ai to create video ? In LMareana

#

Hmmm 🤔

#

Replyy madarax sir

potent glacier
#

@echo aurora Can we get an actual channel to share images made with all models and not just nano-banana?

latent talon
#

Can you reply @potent glacier