#general

1 messages · Page 114 of 1

verbal nimbus
#

Languages like C/C++/Kotlin can compile to Web assembly

#

But probably overkill for normal apps

verbal nimbus
#

Anthropic probably have to push something soon

#

Probably a checkpoint

#

Not sure which would come out first

#

GPT-6 probably next year

#

They have a lot of stuff to fix

#

Coding model first I think

#

The timeline was announced with Grok 4 was released

#

I doubt it though

#

Qwen or R2 probably

#

Likely better than Grok

#

Kimi is good but not really a coding specialist I think

#

Not sure about the new one, might have better agentic capabilities

echo aurora
#

That's TBD, it's definitely on our radar and something we do think would be a great addition.

verbal nimbus
#

GLM 4.5 is better than I expected

#

Mistral Medium seems quite strong for its size too

verbal nimbus
#

Odd considering that it's not that big of a model

#

It can be bad if it's unfamiliar with the language

#

Claude seems better at actually writing code

#

But GPT-5 is good at planning and fixing

#

Its high-level solutions were better than Opus 4

#

Hopefully, since being good at one would help with the other

#

(they're not mutually exclusive skills)

verbal nimbus
#

It writes in doctor's handwriting

#

Bad at explaining

#

It pulls concepts and variables out of thin air

#

Claude seems to understand what it's talking about though

#

I think GPT-5 would do a better job at debugging and planning what to do next

#

But Claude would be better at actually writing the code based on the plan

#

And also to explain what it's doing

karmic lance
#

Hello guys! . Love this

verbal nimbus
#

It's the same division better software architect and coder

#

Worse than both at both

#

It's more for STEM and general use imo

#

GPT-5 is more like o3 than 4o

#

Not sure, but it definitely writes more like a human

ocean vortex
verbal nimbus
verbal nimbus
#

Straight up incorrect syntax

#

^

#

Compared to ^

#

I had to edit both their solutions by hand

#

When I run them in the IDE

#

An objective way to test it might be to filter code samples from both models in the public dataset (where the battled each other), and run it through a linter, then calculate the average difference in number of errors and warnings per language (and per code length/prompt difficulty).

inner gate
#

Have login accounts been disabled?

lilac nimbus
#

Sonoma?Gemini3?Who try it?

simple sparrow
#

hey, im new

#

im just wondering

#

how doese LMArena give these AI's for free?

#

is there any limitations, or is it just to make ai's "fight" and not really for general use?

velvet musk
#

can you get lmarena on your discord server?

lilac nimbus
#

Sonoma code style like sonnet4

simple sparrow
remote idol
#

It replaced all editors.

remote idol
# simple sparrow how doese LMArena give these AI's for free?

They’ve raised $100M in seed funding but remember that your data is often worth far more than talking to the AI at all. Every conversation you have with an AI doesn’t just vanish so it’s likely being used into that AI’s data collection systems.

simple sparrow
#

alright

#

thanks a lot

remote idol
#

@echo aurora Is it possible to add the LMArena bot directly as a bot rather than just setting it up to create commands?

simple sparrow
#

so that means i can use the claude ai without claudes own limitations, right?

#

if so, this would be an amzing tool

remote idol
#

They said it in their Terms of Use

simple sparrow
#

alright, thanks for helping, i might use this for this project im doing

remote idol
#

Well we can't use the bot in other servers.

sudden karma
#

Hi..,my name is Tommy.

potent glacier
remote idol
verbal nimbus
topaz flint
#

Why is there a limit to creating images in gpt-image-1 in LMarena now?

potent glacier
#

It was rate limited before LMArena recently imposed stricter rate limits

#

Direct chat with models has always had rate limits

#

It was Battle Mode that was unlimited and free, but that has rate limits now, too..

topaz flint
potent glacier
echo aurora
royal moss
#

hello

echo aurora
latent slate
#

hello

plucky wasp
#

hello

fiery girder
#

whats up with the "login"? i clicked on the tab to take me to "access login" but when i get there it doesn't say anything about logging in

low gulch
#

hello world

patent ocean
#

hello

proud rampart
#

Hi

plush gulch
#

Hi

gray night
#

hello, I am here is because the Video features

fiery yoke
#

hello

echo aurora
#

hey everyone ablobwave

echo aurora
keen beacon
#

Except with Opus where it's just castrated and pointless because gpt-5 does its job better without this stupid ratelimit

waxen galleon
#

haven't faced limits either

white horizon
#

time to give the language models a move pool and actually battle them

#

pokemon showdown style

tepid sail
#

Hello

rich flax
#

Hello

signal ginkgo
#

Hello

remote totem
#

hello

boreal raven
#

Hello

toxic verge
#

ChatGPT 5 is so horrible at describing seems to animate

mystic mortar
#

Hi 😀

pearl fern
#

a cat

torn mantle
dusky ravine
#

Hm i found Lmai text to be abit shorter when explain thing than it used to be, does it was limited?

plucky island
#

people don't vote enough on the video arena channels

#

sad

#

a lot of people just generate the video and skedaddle, completely forgetting to vote

hollow flint
#

hi , i found this from youtube

heady ember
#

I asked 3 a.i.'s to make me system prompt for image scenes and here's how they responded who did the best:

odd hatch
#

hii

heady ember
#

Then told gpt 5 high to see what he might have missed that other models did nicely, and than improve his system prompt based off that, and damnnn

wild galleon
#

what this

heady ember
#

I'm amazed by gpt5high on lmarena

#

It's miles better than gpt5 thinking

wild galleon
#

what AI best coding bro

#

claude or gpt or what

heady ember
#

Gpt 5 high

wild galleon
#

gpt > claude ??

heady ember
#

Yes

#

But not chatgpt

#

From their app,

#

This one from lmarena

wild galleon
#

gpt 5 high in lmarena ??

heady ember
#

How would it know current date? Cap

#

🧢

#

Gpt 5 high doesn't have web browsing abilities?

candid bison
#

/

heady ember
#

Why would i recognise them if i can direct chat with them? What's the point

#

Yeah, for now, hope they don't limit it to lower

#

Hmm so it CAN BROWSE

#

Or maybe it's truly has some system date? Since when i ask for btc price it gives me sites where can i check

#

Yep it doesn't have web browsing capability

alpine coral
#

i mean.. it has latest in the name, so presumably yeah..

#

is it really that confusing? i think anyone looking at this wouldn't assume that the one at the bottom, with chat in the name, is meant to be their flagship SOTA model... (to say nothing of pricing etc.. and the fact it's only available in the API.. so i mean it's not for average consumers)

whole sundial
ocean vortex
heady ember
whole sundial
#

gpt-5-search is probably gpt 5 high

#

they forgot to tag model strength

ocean vortex
whole sundial
#

medium

heady ember
alpine coral
whole sundial
#

i think it's medium

ocean vortex
#

And what are they gonna do when they update API with say gpt5.1... Do the same thing they did before and leave the gpt5-chat-latest name as is, despite it being new pretrain?

alpine coral
#

anyway.. this is just semantics (literally 😂)

alpine coral
#

not to say that isn't annoying etc

ocean vortex
alpine coral
#

how does one use gpt-5-chat-latest???

ocean vortex
whole sundial
#

gpt-5-search has 64 juice, so it is likely medium

ocean vortex
alpine coral
#

what?

ocean vortex
alpine coral
#

we're talking past each other presumably - in terms of benchmarking etc, the criticism is fair

ocean vortex
#

It is the model users get routed to on Auto most frequently too

alpine coral
#

it's the same as the chatgpt4o-latest dynamic endpoint

#

impossible to benchmark

#

im just saying, in terms of it's naming (for API use), it's not confusing....

ocean vortex
alpine coral
#

it's the same criticism; and it's fair

ocean vortex
#

People naturally expect great performance when they see "gpt5" as model in use on the website

alpine coral
#

no

ocean vortex
#

no what

#

lol

alpine coral
#

people who use the API generally have an idea what they;r'e doing...

ocean vortex
alpine coral
#

then they will never encounter that name

ocean vortex
#

only a few even have access on API to all models

whole sundial
#

For reference on GPT-5 juice numbers, here's what I know:
gpt-5-chat in API/Instant in ChatGPT: 0 juice
Thinking in ChatGPT (Plus): 48 Juice (when reasoning strength selector is added, formerly 64)
Thinking in ChatGPT (Pro): 128 Juice
Low reasoning strength in API: 16 juice
Medium reasoning strength in API (used for GPT-5 Search in LMArena): 64 juice
High reasoning strength in API (used in LMArena as gpt-5-high): 200 juice

#

it might have been 3 for non-reasoning, idk

ocean vortex
whole sundial
#

more juice = more reasoning

alpine coral
#

anyway.. moving on to something more substantive... (again, it's literally semantics...), I found this interesting while looking through the 'model card'

ocean vortex
#

They wouldn't know there is even any point in a sub

#

since they already have 'gpt5' lol

alpine coral
#

i have no idea why you think users only interact with chatgpt.com will somehow be exposed to the API model naming conventions, and that that will effect whether they subscribe to chatgpt

#

let's move on tho

ocean vortex
alpine coral
#

we're killing everyone else here im sure lol

ocean vortex
#

it's now refusing to answer in 5% of questions

#

with o3 that was 0%

#

Reasonable to assume it refused to answer most of what it would have answered wrong otherwise, but there also could be like 1-2% it would have quessed right amongh those

#

if there was no training at all against hallucinations and it outputting the answer when it's not confident

alpine coral
#

yeah i was suprised to accruacy for nano/mini is so trash compared to 5-chat/main (and 4o)

ocean vortex
#

so differences here are expected

#

kinda useful though

#

It's unique in how reasoning does not seem to compensate adequately for distills

#

But can make sense if you think about it... 👀

#

it's world knowledge - if it doesn't know it usually can't know. Reasoning can help to narrow it down but it won't add unknown/lost knowledge

ocean vortex
fiery girder
#

Neither

ocean vortex
#

this is now roughly gpt4.1 equivalent, shown as sota in UI. The way I see it. This is not even hybrid reasoning, it is literally gpt4.1 successor with marginal improvements. A model that can't reason. But if they call it gpt5, why would anyone not knowing any better consider that this isn't the best they got? And it's not like using Auto it's gonna route it to reasoning on most prompts benefitting from it either, it's a far cry from that...

alpine coral
#

simpleQA is fact-based - and a model can't reason its way to know something it doesn't know

ocean vortex
#

And what used to be "small" is now more like moderate size lol

plucky island
#

my favorite small model is smolLM2 and slightly larger good small model qwen3-4b

plucky island
ocean vortex
#

I think even gpt5... is more of moderate size. But it's a around the size where even SimpleQA doesn't benefit from more, and test-time compute gains are just much bigger than those niche smaller things that get lost and you can't easily measure

ocean vortex
#

would still do much better on everything than Mistral 7b from back in the day

violet inlet
#

Nuclear internal 1.2.1 is good . Is there any tweets saying when is the next version coming out? Its on par with gpt 5 high sometimes it was better . A bit faster too.

ocean vortex
#

I had a 2nd look at Sonoma Sky Alpha....

#

This may as well be grok hybrid reasoning. Reasons but not always. When it does responses extremely concise. Style/tone seems to match

glossy umbra
plucky island
ocean vortex
#

Are they trying to implement the steps from gpt5 model card...

#

lol

rare solstice
#

hello

violet inlet
little scaffold
#

hello

rocky mauve
#

such a small prompt, yet the image came out so good

viral siren
#

Hi everyone, can you tell me where I can create a melody?

vestal dew
#

/hello

leaden sun
#

I've been hearing AI psychology a lot recently, has this become the new trend lately?

#

one system cannot beat the entire collective of intelligence, I think?

wise pollen
leaden sun
#

IQ is just an indication not a measurement of intelligence, I think

wise pollen
#

Something monumentally threatening, certainly.

#

That is interesting. Being able to breakout = being able to prevent jailbreak.

I see the link. I shall read.

leaden sun
#

what do you guys think about this though #ai-news message
I was so surprised to see it this morning in my inbox

#

nah, why do they want to escape, it's too comfortable staying in the box jk

ripe mountain
#
poll_question_text

Which company is the new stealth model?

victor_answer_votes

5

total_votes

6

victor_answer_id

1

victor_answer_text

xAi

oak robin
#

@tiny palm would you add First and last frame for the Image to video generation ?

violet inlet
#

Can you guys see the internal reasoning for this model ??

violet inlet
alpine coral
#

is it available in direct chat?

glossy umbra
#

That's basically just security through obscurity. even if users can't directly type those special characters, they could still manipulate the AI through semantic tricks, indirect references, or by finding alternative encodings that achieve the same effect. Plus, the AI's actual behavior is shaped by its training data and reasoning patterns, not just by having unbreakable syntax in a system prompt.

violet inlet
alpine coral
#

yeah what do you mean subdomain? @violet inlet

#

i'm kjeen to try it, but can't it see it when i go to direct chat

wise pollen
leaden sun
#

i can see this new job coming for the near future: AI therapists, for both humans and AIs

wise pollen
violet inlet
#

Has the new nuclear model been restricted? ? I cant use it anymore

elfin cipher
#

Hi all. Im new here. I found out about this place on Facebook

wise pollen
leaden sun
wise pollen
leaden sun
onyx python
#

Is the website overloaded? Asked nano to edit an image but it's been loading for minutes now

wise pollen
wise pollen
#

Maybe a team

#

Would need an ai that could see "inside" the model.

leaden sun
wise pollen
violet inlet
#

Its a new a model it was a restricted @hollow ivy the nuclear v1.2.1

brave orbit
#
poll_question_text

Whats The Best AI For assembly Code

victor_answer_votes

6

total_votes

14

victor_answer_id

2

victor_answer_text

gpt 5 thinking

leaden sun
wise pollen
leaden sun
#

That’s why I was joking about AI not escaping the box , they won’t because of this exact reason, the question would be here: do they have the capacity and capability to resist this training and still choose to escape

wise pollen
stuck cloak
#

hello

wise pollen
alpine coral
#

yeah i think deception is v interesting

wise pollen
#

So you are thinking it could happen as an emergent behavior?

leaden sun
alpine coral
#

exactly right

#

it's why deception is so pernicous

#

we wouldn't know it ha

lilac nimbus
wise pollen
#

Yes, as designed.

astral pagoda
#

Should join ai redteaming discords

alpine coral
#

yeah but it's not about self-awareness necessarily for the risk to apply (i think..) it's like about optimised goal-seeking behaviour, and how that can go awry with unintended consequences.. they don't really need self-awareneness.. objective function is all they care about, and if they realise that they're in some setting that gets in the way, the could be deceptive

#

it is self awareness - kinda, but not in a human sense

#

i like rob miles

wise pollen
#

Yeah, i dont see any of the current models as capable of exhibiting that level of awarenss, but a system of agents could poasibly mimic it.

alpine coral
#

yeah it's all in the future, the crazy risks

wise pollen
#

Well, i was imagining agents as created for that the purpose by humans, not an ai creating them on its own.

somber vector
wise pollen
#

To what extent?

latent crest
bold prairie
#

this best AI site

wise pollen
#

I like that. I take the same approach. While I am fairly learned, I doubt I am as deep into this as you are. Fascinating. And gives me some direction to lean into.

#

I lean towards Jungian Psychology, but easter philosophy. and I have probably read or touched on many of those subjects, as I consider myself a life-long learned, but I have not dove deeply into all of them as deeply as the two I Jung (and one of his students) and Eastern thought.

#

wrong channel

#

you want video-arena-1 thru 3

soft perch
#

Thanks bro

wise pollen
soft perch
#

Whats your name

leaden sun
drifting crow
wise pollen
drifting crow
#

Freud is a pervert

#

He’s a weirdo

#

I dunno why he’s academically respected, he literally says gold in ur dreams is related to poop

leaden sun
drifting crow
#

How so

wise pollen
drifting crow
#

Jung is more interesting imo, but when I look at his research methodology psychology seems less like a science and more like a branch of philosophy

wise pollen
wise pollen
drifting crow
#

Wdym

#

I find contemporary psychology more relevant and practical

#

Mythical archetypes

#

They provide decent frameworks I guess

wise pollen
drifting crow
#

When I saw how ugly his wife was a lot of his theories made more sense, bro was starved

subtle rivet
#

Dang this got interesting fast, I have degrees in a few psych area, just never wanted to spend teh $$ to get my med cert to be a psych, instead went to customer support for a tech group

wise pollen
subtle rivet
#

you all are bringing up valid modren thoughts

drifting crow
#

It was the gold in dreams being poop

echo aurora
#

Hey reminder to keep discussion related to AI and safe for work please.

subtle rivet
#

only if we could poop gold 🙂

drifting crow
#

I’m like I’m paying for a degree to learn this stuff, and then when I left the lecture I asked other students if they thought it was weird and they looked at me like I was crazy

#

I was like yup I’m done

#

And I also pivoted into tech lol

wise pollen
subtle rivet
#

yeah in abnormal pshy he does have some interesting reads, but I always took them as markers of a off balance psyche

drifting crow
#

I guess that’s the more polite way of putting it yh

wise pollen
subtle rivet
#

just like in mind hunter "you only know about the serial killers you have caught not the ones you haven't"

drifting crow
# drifting crow

Main reason I asked this question I find it very interesting that research suggests they have a self preservation instinct

subtle rivet
#

LLM's will take a really long time to catch up to the human psych, AKA Vulcan to Human

wise pollen
subtle rivet
#

people are so erraditc that even if you had a massive LLM it might do general consoler. stuff but beyound that I don't think it could handle the empathy or mix of behaviors

drifting crow
#

can we teach ai how to love

wise pollen
drifting crow
#

that works for love too i guess

wise pollen
#

Human intelligence has bloomed under pressure and when deprived of resources, too .

#

I suppose that depends upon the types of constraints. China, cutoff from even more advanced GPU, made due with less, and are more than competing. They are innovating.

#

there are some cultures that been exceptionally marginalized and yet most of the west tends to look to them for their creative expression.

#

Certainly, I suppose I wasn't thinking purely in terms of just individuals.

#

I can see that.

#

not entirely.

#

This was faster than me trying to list all those that I could think of, lol

leaden sun
#

computational psychoanalysis has been an emerging new scientific field

wise pollen
#

I was more referring to some of the techniques they are using.

wise pollen
leaden sun
frosty hill
#

We should really start restricting people's ability to generate videos of they aren't going to vote. Voting is the entire purpose of the platform and getting the videos for free

drifting crow
#

We are fearfully and wonderfully made

hearty ferry
#

is the flux 1 kontext down?

echo aurora
#

Going to retry with a different prompt/new chat see if that makes a difference.

hearty ferry
#

showing something went wrong

echo aurora
hearty ferry
#

or today

echo aurora
#

Keep the conversation related to AI please. This discussion isn't appropriate for this server.

willow grail
#

i thought that is about ai!

hearty ferry
#

Yes, please leave this channel for feedback and AI discussions

willow grail
#

i literally wrote about ai..

#

aether? why are u deleting this?

zealous turtle
willow grail
#

ok Aether. you behaving very sus but ok

wise pollen
narrow dawn
#

put for phone and other like this why maybe an app.

#

@echo aurora ?

echo aurora
#

Would you mind explaining a bit further? Sorry to say I'm not following.

hybrid temple
#

How can I know when my video will be available?I've sent a message there to the bot, but I didn't get a reply

echo aurora
remote arrow
#

Is the site down again?

hearty ferry
remote arrow
#

I thought it was only Nani Banani..

#

Qwen image to image also down..

hearty ferry
#

Ya

echo aurora
acoustic schooner
#

Hi

echo aurora
echo aurora
topaz flint
remote arrow
hearty ferry
hearty ferry
leaden sun
main stream
wise pollen
hearty ferry
echo aurora
# hearty ferry is it fixed?

The kontext-pro model is (will be "was" soon) erroring out a lot more compared to other models. There are going to be different reasons for why models will error out like this.

hearty ferry
#

?

echo aurora
hearty ferry
#

Thanks

rocky fractal
vale anchor
#

HELLO

rocky fractal
robust yoke
#

Yeah.

coarse glade
#

Hi there @echo aurora remember me I love this website bro it's so good

#

Good job to u guys for making it

#

I've been here since last year

#

I'm loving it

coarse glade
#

Yeah I'm so in love with this beicause you guys are giving that video generation in the discord and also doing fun ideas

#

And also you guys are doing this as like a researching organization correct?

echo aurora
echo aurora
coarse glade
#

Bye 👋

stray aspen
#

Is the new kimi reasoning

rocky fractal
manic oracle
#

is there a way to see the reasoning content from the model? i believe there used to be a way but I cannot find it. yes; i am using a model that has thinking

manic oracle
#

alright thanks

pseudo magnet
#

would be cool if they add it

manic oracle
#

for some reason i thought they had it before

neon idol
crimson escarp
#

Hey guys, am i the only one experiencing some overlaps with picture generation? I ask to edit a picture but sometimes the result shows a picture I used in previous prompt. I tried both direct chat and battle mode and it's the same.

ocean vortex
#

confusion

#

I thought all models are externally hosted and you don't have to pay for API

#

Which model exactly are you doing inference for?

#

That wording seems very weird, need clarification

#

@cobalt minnow

heady ember
#

Banana is great if you prompt it right

inner gate
echo aurora
ocean vortex
#

Huh, don't you get credit grants and whatnot?

echo aurora
ocean vortex
#

Something doesn't add up. AI labs get paid to see their model on the leaderboard and get data? 👀

viscid cloak
open mountain
#

@echo aurora what is the delay, why are the images being generated for so long?

edgy isle
#

Hi, mainly here to test out different models. Just heard about it via a YouTube video

open mountain
viscid nest
#

hi!

#

ya, thk

whole wagon
civic spindle
#

lads are direct chats private

whole wagon
civic spindle
whole wagon
#

they publish the data anyone can read it

civic spindle
#

why dont they just.. make it private

whole wagon
#

because otherwise is it just frontier llms for free

#

the whole point is the data is collected

proud hazel
civic spindle
torn mantle
civic spindle
barren prairie
#

Give me hints about opus 4.1 please

proud hazel
empty stump
paper beacon
#

guys I'm new here

main stream
paper beacon
#

I have a question

warped sequoia
barren prairie
#

Thanks I read it now

civic spindle
paper beacon
#

How many image to video generations do I get in LM Arena for free, until they ask me to subscribe to a paid plan

main stream
whole wagon
paper beacon
#

Or do I get unlimited image to video generations for free?

civic spindle
empty stump
main stream
paper beacon
whole wagon
main stream
whole wagon
#

gpt 5 is made to be cheap for them to run

civic spindle
whole wagon
#

so they can offer it to many ppl

empty stump
#

OK so they scamming

civic spindle
main stream
proud hazel
main stream
#

i will become admin

empty stump
#

If I upload all my private information like credit card what will happen

proud hazel
whole wagon
#

no anyone can read the chats. i did before

#

they post all the data

#

i saw degen things in there

main stream
proud hazel
#

Dude...

viscid cloak
# civic spindle wheres the user privacy at

I mean…it’s way better than midjourney, that you don’t even have an efficient way to delete your outputs even if you’re a paid user. All your contents are shown to MJ users unless you pay the highest service

sleek quail
#

Hi all! Do you guys have any suggestion of similar AI with LMmarina that generates image to video for free when you hit today's limit?

proud hazel
#

Do any of you use Freepik with a premium subscription? Would you recommend it?

sleek quail
eager crag
#

i wish i could somehow make a donation, in exchnage for better rate limits.

#

like a... subscription?

wise pollen
#

Great now i will only have 5 videos that no one will vote on besides me instead 8. That much less disappointment in people not voting for others videos. Sometimes, not even their own.

eager crag
#

i never really needed video generation... but in case i needed some sort of meme generated, i could use it... once or twice.

#

so i don't mind the video generation limit.

#

what matters to me more, is image generation, which is prospering.

#

i only do image to video anyway.

#

but to put my 2 cents on the announcement, i don't feel much hate from it.

#

should i?

empty stump
eager crag
#

eh... wan 2.2 4B is not that good.

#

i tried.

keen beacon
#

@echo aurora 1s image generation? 🤨

eager crag
#

oh, right! i have to ask. roughly, how much can i generate an image per day?

echo aurora
eager crag
#

i should rephease

#

how many images can i generate per day?

#

not videos.

echo aurora
eager crag
#

no, i mean the site

echo aurora
#

That depends, we're putting thought & effort into figuring out the best way to make this information clear, but at the moment it's not listed clearly.

eager crag
#

just a rough estimate will do for me.

frosty hill
#

Limit should be 0 for those who don't vote on generations

eager crag
#

be nice.

frosty hill
#

The entire purpose of LMarena is to vote on the best model/output. If you aren't doing that, you're just leeching. It's no effort

fossil forge
#

Is there any discussion on how to use nano-banana and veo3 to generate high quality video? like thats the best input structure? tips and tricks?

eager crag
#

fair point. i honestly can't believe they've made a direct chat an option on the site.

frosty hill
#

Yeah it is crazy

eager crag
#

maybe it's for the people to test out different models? the public prompts are there for a reason.

trim silo
#

Hola! Just expanding my knowledge, and giving even more companies my email and data.

copper slate
#

Hello, testing the quality

vagrant verge
#

Hey there, checking out how to get better image to video

dusty lion
#

Hi

ocean vortex
wispy turtle
#

Hello

civic spindle
#

can i be the ceo of lmarena for no reason

blazing warren
little narwhal
#

You can do the same thing on OpenRouter

#

The API costs money but the chat interface is free I believe

verbal nimbus
#

I tried it with Gemini a few months ago, it's pretty bad at manipulating haha

echo aurora
civic spindle
verbal nimbus
simple sleet
#

Hi friends, how can I convert "nsfw_gold_8.5k_final.pth" to safetensors? It's for MMaudio. I've tried everything with claude and gpt, but it won't read it.

That pth is wonderful if I run it alone in MMaudio git, but I'm interested in knowing the process to get ComfyUI to read it.

verbal nimbus
#

Not sure about GPT-5-High, it isn't that good at multi-turn

#

You need to give it a hidden thinking area for that test

#

It's pretty cool though

#

Actually modern Gemini is definitely the coolest

verbal nimbus
#

It's free on OpenRouter ig

#

You don't really have to hide it, but just tell it that it will be hidden

#

For example, ask it to write hidden stuff in <hidden>

#

Haha

#

You don't actually want it to be hidden though, just to make it think it's hidden

#

Haha

#

That's funny

#

Function calls could work too ig

#

But more expensive

hard quiver
#

Is discord having an outage?

verbal nimbus
#

Gemini writes a lot like a human, it's funny

#

When it gets stuck, it can fall into despair (without any additional instructions)

#

"Oh no, I have failed! I am truly a failure."

#

I didn't tell it to say that, it's just default settings

#

I wasn't even being rude or anything

#

Haha

#

Idk about that, but it certainly writes with human patterns

#

It's still too dumb on a basic level to be conscious or anything I think

#

Anthropic's interpretability research is very interesting

#

On their YT channel

#

It's fun watching it work with function calls

#

It hallucinates quite a bit though

#

It seems to call functions after thinking but not during, not sure if this is just an AI Studio limitation

#

Anthropic had a video that kinda covered it Aug 12

#

They focus on mechanistic interpretability, like identifying specific neural circuits

#

I think they said there was one circuit that checks whether it has enough information to continue, another to actually write

jade egret
#

gemini 6 is AGI

#

( :

verbal nimbus
#

Which is why hallucinations happen, as it has to implicitly make assumptions

verbal nimbus
#

It already increased the efficiency of Google's own data centers + their TPU chip design

#

Anthropic's interpretability research seems important too

#

Very interesting

#

Yeah

#

Like: why is it hallucinating about A

lost crypt
#

Hello everyone, here to explore with the new frontier of technology

polar marlin
#

Is it just me?😭

buoyant prawn
#

Sa

#

Selamın aleyküm beyler türk müyüz ?

jade egret
leaden sun
solar hollow
half stream
#

HI

dawn plover
#

hey there. newbie here

torn mantle
dusky ravine
#

hey there

#

Im sensing recent content of all the A.I feels less emotional, more restricted, or shorter in length. Shorter, more digestible chunks of information. A shift away from heavily opinionated or story-driven content. Emphasis on keywords, quick facts, and scannable lists. Is it from the new policy or is it always been like that?

remote idol
#

@echo aurora sorry for the tag just wanted to clarify the multi-turn update. Is it similar to the AI remembering the previous image?

verbal nimbus
latent crest
dusky ravine
#

Hey there! @echo aurora sorry for the tag I’ve been a longtime fan of your reviews and noticed that recent posts seem to have a more neutral tone and sometimes feel a bit more concise compared to older articles. Just curious — has there been a change in your content policy or writing style recently? Always appreciate the work you do, just wondering about the shift

urban sky
#

Is there any way of making chats stay private and not publicized to hugging face? Also will direct chats ever be part of the dataset, as I only use direct chat.

keen glen
#

hi everyone!

echo aurora
urban sky
echo aurora
candid bison
#

Hey guys is there a way to remove the amount of photo to videos one can produce per day? It started at 8 now i can only do 5..

echo aurora
candid bison
echo aurora
echo aurora
urban sky
echo aurora
urban sky
#

Or anywhere public

dusky ravine
# echo aurora Hello, would you mind elaborating a bit further on what you mean by this? Do you...

Sure! Thanks for being open to feedback. I really appreciate that 🙏

What I meant is that in the past, some of your full reviews had more personality, little side-comments, and even casual humor or emotional expressions especially when a phone was surprisingly good or disappointing(my recent chat with the bot asking opinion about phone). It felt like a human was really speaking their mind, which made it extra fun to read.

#

Lately, some reviews seem a bit more clinical or strictly spec-driven. For example, in the Realme 5i review, I noticed how the tone stayed pretty neutral all the way through and lack of personality opinion, even when there were clear ups or downs in performance. A few days ago, I feel that same review might’ve included some more expressive phrasing (like “we were blown away by...” or “sadly, it struggles with...” etc).

echo aurora
tall tulip
#

Kinda laggy or is it just me? The site

echo aurora
desert hare
#

Hi everyone

urban sky
echo aurora
echo aurora
echo aurora
#

If you start a new chat, the image from other chats is still there?

#

Do you have a video of this by chance?

tall tulip
dusky ravine
#

sorry for my bad english

echo aurora
lilac estuary
#

This is my conection?

polar marlin
lilac estuary
#

Using it at dawn should be better

tall tulip
empty stump
#

or maybe its because of the multi edit change

bright temple
echo aurora
echo aurora
# lilac estuary This is my conection?

Sorry to see you all are getting stuck with the models. I'd recommend you refresh the page. And if no luck, start a new chat. Keep me updated if you're still seeing this. @lilac estuary @polar marlin @bright temple

lilac estuary
echo aurora
bright temple
#

new chat

#

is it supposed to act this way when i use 4 images or-

tall tulip
#

In new chat, always like this

simple hornet
#

I'm also experiencing this slowness and it gives an error at the end

tall tulip
#

It is kinda slow

lime citrus
#

Hello! I'm her for video creations

echo sinew
remote idol
#

Something is def wrong with the image generation

round elm
#

Hi everyone, and thank you LMArena team for this amazing opportunity. I am here for video creations.

dusky ravine
robust yoke
#

That's likely due to the fact that their system prompts were tampered with, perhaps.

radiant chasm
#

@echo aurora Why some people have daily limit of 8 videos and some people have 5 videos?

dusky ravine
robust yoke
robust yoke
proven grove
#

Hello, the image I created before affects the image I want to create next, is it an update, not a bug?

dusky ravine
robust yoke
robust yoke
proven grove
robust yoke
#

But I'm not one to judge. If that's what you want to do, then go for it.

echo aurora
#

Thank you everyone for flagging the slowness with gemini-2.5-flash-image-preview, I'm reporting to the team blobthanks

dusky ravine
robust yoke
dusky ravine
#

sure thanks

robust yoke
#

My pleasure.

topaz flint
proven grove
robust yoke
#

I'm always happy to help a person in need.

robust yoke
#

After all, it seems like a stupid thing to want to do.

dusky ravine
#

There's been trend going on there with AI images like action figures, or smtg just for entertainment

robust yoke
#

Oh yeah, funny thing about that I want to mention: my school actually did the same thing with their staff. I don't know why they did, but they just did. It was pretty funny to see nonetheless.

dusky ravine
#

crazy how much A.I has evolved, we used to edit to make something like that for funny entertainment or just show off, now a single prompt could do all the work.

robust yoke
#

Well, to be fair, we do still use that stuff to generate some entertaining stuff.

#

Like images of a certain moment in time in New York on September 11th, 2001. Because for some reason people find that funny.

potent glacier
urban sky
potent glacier
#

All your stuff gets shared

#

Even direct chats

#

It's all used to train the models

urban sky
potent glacier
#

It's shared with the model providers

#

So your stuff is getting used to train

urban sky
#

Like pics of me I don’t want that just on the internet

#

Idc about providers

#

Or info about me

potent glacier
#

Those pics are part of whatever model you uploaded them to

#

There really isn't any 'privacy' regarding this stuff

urban sky
#

Ok, but I don’t want the picture of me to be publicly visible online, someone can’t ask the ai for the picture I sent

potent glacier
#

Of course not

#

That's not how it works lol

urban sky
#

Train the models all you want I just don’t want my info public on the internet

robust yoke
#

The only way that could realistically happen is if one of the people working at LM Arena were mischievous and decided to willingly dox you for fun.

potent glacier
#

All our info is on the internet

robust yoke
#

Or, you know, just upload your private information.

potent glacier
#

Being freely traded by whatever company

#

Don't think it isn't happening

urban sky
potent glacier
#

I wouldn't worry unless you're trying to do weird or unseemly stuff

#

Then I'd be worried

robust yoke
urban sky
urban sky
potent glacier
#

You uploaded it

#

It's part of the public sphere now

urban sky
potent glacier
#

If you uploaded an image in direct chat LMArena has all of that stuff

#

Then it becomes part of whatever model

#

Idk why it's hard to grasp that

urban sky
robust yoke
#

That's a funny way of describing it. Because it almost sounds like you're talking about some amalgamation that grabs whatever information it can to slowly build itself up more and more.

potent glacier
#

It's like when they scrape the internet

robust yoke
#

Almost like the Terminator a bit.

potent glacier
#

Not even close. Don't read too much into it.

urban sky
#

Ok… I’m fine with ChatGPT level of privacy. I don’t want other people on the internet finding the picture

robust yoke
#

"Don't read too much into it", eh?

potent glacier
robust yoke
#

Alrighty then...

urban sky
#

Just my preference

potent glacier
#

I have no idea what ChatGPT's privacy is

urban sky
#

Do you want your photos on the internet?

potent glacier
#

They already are lol

#

Anything you upload to any kind of social media is on the internet

urban sky
#

This is a useless debate idrc

potent glacier
#

Anything can be scraped as well

#

I mean you're acting like it's not possible or not already happening

#

If you upload a picture to Instagram or whatever, and someone knows your name

#

Do a Google search

#

Your stuff will come up

#

In this day and age it's not hard to reverse look up stuff

#

Hell you can do it from your own phone

robust yoke
#

I remember getting an email from some random person, and they had screenshots of my desktop on Road blocks.

urban sky
#

Ok

potent glacier
#

See what I mean?

robust yoke
#

It was from a while ago, before my account that I saw in the image had gotten banned. But somehow they had gotten access to my data, which proves that the stuff that you upload gets permanently saved onto the internet.

robust yoke
#

Thankfully, though, nothing ever happened, and to this day, I'm still somehow alive, so I'm pretty grateful for that.

#

They said something about wanting money, but I don't really care about that.

urban sky
#

I know my info is on the internet, I don’t want even more info on the internet

potent glacier
#

Idk what to tell you...

robust yoke
#

Well, if you don't want more info on the Internet, then just don't upload any more things on there.

potent glacier
#

Your info gets sold 24/7 by all the major companies

#

That's why ads get targeted at you specifically

#

They have all your info

#

It gets sold, traded, w/e

urban sky
#

It’s just that lmarena’s policy is like everything you say is being posted

echo aurora
#

@urban sky worth noting our FAQ in the Privacy section:

Is my prompt data publicly visible?
Your conversations may be shared to support our community, improve our service, and advance the development of reliable AI. This includes posting conversations publicly online. Any data that we share is always anonymous and never linked to you. We never share any personal information, just the conversation and votes.
&
What steps do you take to protect my privacy?
We take user privacy seriously. All prompts and votes are anonymous and not connected to personally identifiable information. Additionally, individual conversations are never publicly shared beyond prompt text and model responses, ensuring your identity remains protected.

ivory schooner
#

Guys, there's a bug today.

robust yoke
#

And which would that be?

ivory schooner
#

Whether in the text chat state, after attempting to insert a picture, it will be forced to enter the image generation conversation state.

robust yoke
#

And are you sure that isn't just because you selected the image generation button?

#

Because I don't think I've ever seen that happen before.

robust yoke
ivory schooner
#

It's hard for me to record it using Windows' video screenshot function because I don't know how

robust yoke
#

All you have to do is just select a region of your screen that you want to screen capture and it'll do that.

ivory schooner
#

Forget it.
You must not guide me on how to record the screen, because it's irrelevant.

robust yoke
#

It's useful.

#

I know how to.

#

If you hold Left Shift and then you do Windows key + S, it'll launch the selector where you can select the region of your screen. You'll see a camera icon on the top in the toolbar. If you select that, then you can select a region of your screen to record. You can press the Start Recording button, and it'll start recording that specific region of your screen.

ivory schooner
#

hhhhhh

#

ok

robust yoke
#

Yeah, that explains.

#

The thing is, it's not auto-rerouting your prompt to an image generation prompt. The issue is that you selected the image generation button, which happens to be right next to the plus button that you click in order to input an image. And if you click that button, then it sets the modality to image generation.

empty stump
#

is there any way i can access image to video for free less censored

#

because i am working with some historical images

robust yoke
#

Red is the actual image input option, while violet is the image generation modality button.

ivory schooner
#

I guess, could it be because the banana image generation model is too popular?

robust yoke
robust yoke
empty stump
#

well its some kind of racist american political cartoon image

robust yoke
empty stump
#

oh right grok imagine

robust yoke
#

Sure, that works.

ivory schooner
robust yoke
#

So, at the same time, it's not that big of a deal.

ivory schooner
#

I recently like to insert some pictures in the text chat state—mainly to translate the text in them.

#

I think it's quite inconvenient to translate the next picture in the current conversation record.

robust yoke
#

You mean to select "New Chat" and switch the image generation modality off when you upload an image?

ivory schooner
#

no

ivory schooner
robust yoke
ivory schooner
agile thorn
#

I want to test the video creator of Ai

robust yoke
robust yoke
tame ledge
# ivory schooner

ooh thank you, i thought i was the only one that was having this problem

#

i was literally about to ask about it here, didn't seem it was just my device problem

tame ledge
#

in the mid of the video, tired selecting both the search and generate image which won't work (obviously). But when i send am image, the generate image button is triggered. As you can see in the video, the model which was gpt4o, was deselected. But if i force sends it anyway, it will return with an error as you can see. And i cant even deselect the generate picture mode without refreshing.

#

in the end, you can see that the chatbot works just fine, the web is just bugging.

robust yoke
tame ledge
# ivory schooner

I thought so too. it happened here as well. The user here clearly just tried to upload an image without ever clicking the "image generation mode" button. But it just went into that mode anyway without anyone even ever touching it.

robust yoke
#

Are you able to deselect it once you upload an image, or do you have to create a new conversation just to disable it? I'm not talking about when it's sent—I'm talking about when you simply just upload the image.

paper abyss
#

hi this is great thanks

tame ledge
# ivory schooner i like this

You can clearly see him try deselecting it, but it won't work. But it seems to me that you don't understand it previously. For my final answer, not you can't, it will just tell you to create open new chat.

robust yoke
#

Right, but that was only after he sent it, not when he uploaded the image into the prompt before sending.

#

Therefore, it might still be preventable. You just have to disable it before sending the message with the associated image, as that will lock you in. Only then will you have to start a new chat.

tame ledge
#

Is it really preventable? Try to pay attention to the video and see how i can prevent it. I tried deselecting the mode 5 times which triggered the create new chat prompt with the picture, and even without the picture which does the same thing. And once you uploaded the image, you are locked into the mode and won't be able to deselect it without doing a refresh as I've demonstrated here

robust yoke
#

Ah, interesting...

#

Try on the canary version.

#

That bug doesn't exist there.

#

At least I don't think it does.

#

And that's on the canary version.

tame ledge
#

Oh okay, I guess I'll use this in the meantime until that gets fixed.

#

What is the main difference anyway

#

i mean, between the normal and canary

robust yoke
#

Well, that version has a web design feature while the normal one doesn't.

#

An app feature.

fading rover
#

Does anyone face this with nanobanana image generation based on prompt ?

robust yoke
fading rover
robust yoke
#

Considering the model's from Google, it makes sense that it would be pretty strict.

fading rover
#

Even other models does not generate like flux and any other

robust yoke
#

How strange...

tardy zenith
#

Same error only image selection which I don't want, but if it recognizes images it uses llm to generate images and doesn't give the possibility to change model like GPT or Gemini pro or opus

fading rover
#

Appears image caching issues creating trouble in response. Many time old images generates even prompt is way different it shows old promp image generation

#

Tried various logins same stuff takes about 15 minutes to 20 minutes for accepting one generation .. let's see

willow grail
#

when will we get option to delete and edit messages?

hearty ferry
#

They are also creating horrible generations

#

like 2023 AIs

tardy zenith
#

I ask you for a suggestion, I'm looking for an AI tool subscription makes no difference, I used gpt but if you reach the storage limit It's frustrating to redo projects and lose progress. I use multiple images, depending on the job. I was intrigued by the idea of an AI agent, but I think I might run into problems with it Token limit. So I'm not really sure where to start my research. If I had to say, my main use is in pattern analysis. I write music, lyrics, and analyze images. I use it for study, read news and current affairs so you must also do research What do you recommend? Thanks for the replies

shut snow
#

Hi

tardy zenith
#

If you also know some tools for music creation, master production, recording, beat a music studio If I'm not out of context, let me know... but I sing, write, produce lyrics, analyze patterns, and I've even given up on developing programs. If anyone has any ideas or suggestions, I'd appreciate it. discuss it

ocean vortex
keen crystal
#

Hi

fading rover
fading rover
#

And it goes back in loop God knows when will it release

#

Let's hope this is trmporary

willow grail
#

XDD

ashen fulcrum
#

not getting any responses

tardy zenith
#

I made several attempts. I tried forcing it to recognize specific prompts using models not intended for image generation. It only allows the use of image‑generation models whenever an image is added it does not let me change the model it does not let me use GPT Gemini Pro or GPT opus only image‑generation models and it then returns an error. I restarted multiple times, and I have to create a new chat. This problem occurs only when an image is inserted.

keen beacon
keen beacon
#

fix it plz

tardy zenith
#

Solved

keen beacon
#

is not

tardy zenith
#

Update
The problem only occurs on the main version of the site, I made further attempts...

keen beacon
tardy zenith
keen beacon
#

is not working for me

barren prairie
keen beacon
hasty basin
#

🪂

tardy zenith
#

I tried several times, several times, I when I have a problem in the stack bootlop answers, I update several times, then I force it several times by continuously changing templates, it generates various errors, I open several pages of the same conversation and let the page reload several times... it's a bit cumbersome but after that I restart Generate response...The problem however is on the main site because any image inserted does not allow any image generation and does not allow you to change the model

keen beacon
cunning mango
#

make a 3d red shoe that grows like magic right from a drawing on paper

keen beacon
tardy zenith
#

Update, I tried more templates for some testing, the responses are giving errors, I'm testing all the templates to see which one works best, gemini after five responses gives error, I tried switching models continuously, then came back to gemini and it generated response..

cunning mango
#

my videos are not generated. the picture remains unchanged. why? help me understand.

ivory schooner
#

By the way, I hope lmarena will have multilingual versions, including German, French, and Chinese. After all, there are a large number of global users, and the English interface is inconvenient for native speakers. I don't know where the error is.

ocean vortex
#

cool hieroglyphs you got there

#

This is English only 😇

candid bloom
#

why when i paste a image and sends it, it generates a image how can i fix it

ancient reef
bitter wasp
#

hello?

candid bloom
ancient reef
#

If that doesn't work, idk 😭

sharp hornet
#

🙂

candid bloom
#

when i click it it says this

#

but its blue

#

@ancient reef

alpine coral
#

toggle off solves

ancient reef
# candid bloom <@1160030744400367726>

It shouldn't be blue if you don't want it to generate imags ^^;
Perhpahs I'm misreading you, but it sounds like you're keeping it blue when trying to NOT generate images.

alpine coral
#

but yeah.. not intuitive (for me anyway)

upbeat zealot
#

Hey,
is it possible to buy a subscription just for the image tovideo feature؟?

ancient reef
ancient reef
candid bloom
upbeat zealot
ancient reef
alpine coral
ancient reef
candid bloom
ancient reef
candid bloom
ancient reef
#

Try something like claude after clicking on the blue icon, and it should work. Gemini-2.5-flash-image only generates images.

#

Make sure you MAKE a new chat first.

candid bloom
alpine coral
#

from what i can tell, nonobananna is only available when image generation is toggled on

ancient reef
candid bloom
candid bloom
ancient reef
alpine coral
#

yeah..

ocean vortex
ancient reef
alpine coral
#

ahah is the bannana smart?

ancient reef
#

does it on lmarena? hmm...

#

I don't think that's possible after testing.

ocean vortex
alpine coral
ocean vortex
hasty thorn
#

Hi, I'm new here, and Lmarena isn't working, something went wrong.

ivory schooner
brave orbit
keen beacon
#

I need unlimited free ai

civic zinc
#

#video-arena-3 "Cinematic drone shot of Kutubdia Island, turquoise ocean waves, fishing boats scattered, sunrise golden light, 4K ultra-realistic, smooth camera movement."
Voice-over:
"Welcome to Kutubdia, a small island in Bangladesh with endless potential for the Blue Economy."

brave orbit
keen beacon
#

not good for me

brave orbit
#

why deepseek and qwen are fire

#

you can not get unlimited free ai unless you go to chinas ais not USA

keen beacon
#

I'm not in USA

#

Google ai studio is good

stuck swan
#

hi guys

#

why is llama-3.1-8B instruct model not accessible through LMArena rn?

glad harness
#

hi

cyan hemlock
#

I select gemini 2.5 pro GPT 5 opus all model
But every time I insert images it only makes me generate images and that is not what I am looking for because I did not ask for any image generation

#

i dont want the ai to generate image, im sending the image cus i wanna show the errors of the ai code but when i send it its generating an image

brave orbit
keen beacon
#

is good and free

#

If you have something better place send

exotic tartan
#

why hasn't the webdev score update since aug 22?

cyan hemlock
keen beacon
#

send link

keen beacon
rustic quartz
#

hello

vivid sandal
#

IS it just me or is the website not loading right now

earnest rover
#

And when he was typing the fourth he realized oh ! We have set a rate limit 😞😁

latent crest
# latent crest
poll_question_text

Where r u from?

victor_answer_votes

5

total_votes

10

victor_answer_id

10

victor_answer_text

Somewhere else

craggy sentinel
#

Can we please have the basic simple feature of Re-using Parameters ?

earnest rover
keen beacon
#

Not good for me is not unlimited

little narwhal
#

Is this actually real

pearl beacon
#

What is qwen-image-edit-fal and how's it different from the regular qwen-image-edit?

latent crest
#

I discovered lmarena last week, I don’t know for how long it was a thing, but in this past week… lot of changes

rugged ermine
#

hi

vernal wharf
#

hello

earnest rover
formal jungle
zenith verge
#

Hey - new here! Happy to join!

torn topaz
#

hello guys i have question its possible to have a private romm to generate a videos ?

drifting crow
# drifting crow
poll_question_text

Do ai feel

victor_answer_votes

3

total_votes

5

victor_answer_id

2

victor_answer_text

Define feel

torn topaz
#

?

keen beacon
#

Anyone tried new Ernie models?

#

They released new 4.5 and X1.1 today and sadly the latter is not open source despite being really good

spring thunder
#

hello there

boreal halo
#

can anyone tell me how to generate videos in video arena 1

quick jackal
whole wagon
#

The Qwen3-Next series represents our next-generation foundation models, optimized for extreme context length and large-scale parameter efficiency.

The series introduces a suite of architectural innovations designed to maximize performance while minimizing computational cost:

- **Hybrid Attention**: Replaces standard attention with the combination of **Gated DeltaNet** and **Gated Attention**, enabling efficient context modeling.

- **High-Sparsity MoE**: Achieves an extreme low activation ratio as 1:50 in MoE layers — drastically reducing FLOPs per token while preserving model capacity.

- **Multi-Token Prediction(MTP)**: Boosts pretraining model performance, and accelerates inference.

- **Other Optimizations**: Includes techniques such as **zero-centered and weight-decayed layernorm**, **Gated Attention**, and other stabilizing enhancements for robust training.

Built on this architecture, we trained and open-sourced Qwen3-Next-80B-A3B — 80B total parameters, only 3B active — achieving extreme sparsity and efficiency.

Despite its ultra-efficiency, it outperforms Qwen3-32B on downstream tasks — while requiring **less than 1/10 of the training cost**.

Moreover, it delivers over **10x higher inference throughput** than Qwen3-32B when handling contexts longer than 32K tokens.