#general
1 messages · Page 101 of 1
No because I genuinely have no idea 🫤
lmao
Let's use #ai-creations for that!
That channel… honestly sucks
Like there should be an image showcase channel
no it doesn't
I don't understand the LMArena site... UI is extremely challenging to even use. Just to see the Arena Overview, you have to scroll through a little nested box at the bottom of the page... while your mouse is inside the box
That would be a lot better
Check out our leaderboards! https://lmarena.ai/leaderboard/text-to-video
Ok @echo aurora
qwen image edit is finally available on lmarena
I give the LMArena Leaderboards GUI design a 2 out of 10
skill issue, git gud
Community creations is everything; I’d like a channel specifically for images
Obviously I can scroll it. But it's really D-tier design
English isn’t their first language buddy
Be sure to let us know more about it in our feedback forum please and thank you! #1372230675914031105
yeah i can tell lol
Seems nobody tried to work on this, or improve it... they just dumped a basic table at the bottom of the page, expect users to painfully scroll through it
Are you some kind of web developer?
then don't use the site lol, crybaby
Let's be nice
I mean they’re badmouthing your site 🤷🏼
Criticism is the most important ingredient of innovation / improvement. First we have to look at what's wrong, before it can improve
One that’s worked for millions
can u add an option to automatically copy the previous prompt?
we have to copy and paste the images from the previous prompt, during creating images
Check out more info on how to use Video Arena here #1397655624103493813 note you can't select a specific model
Ok
Hello
there was no way it was gonna be V4 with the model being trained on saying V3
confusion
this was March version
LOL
hey boys i need some help, to generate videos, do i just type my prompt or is there a start command like "imagine" kinda like midjourney, just joined.
mb , somebody already has written it there
Check out #1397655624103493813 for more info
ok thanks
@echo aurora Please un-nest the Leaderboard, and give it its own page. Add tabs to the top of the page, so we can switch between the different tests. When we're on the Leaderboard, we should be able to use the browser's native scrollbar to scroll up and down. As it stands, we can't even use the (invisible/non-existent) scrollbar inside the nested table. You can't see where you are, and you have to use the mouse wheel
I'd encourage you to share this in our feedback forum - it helps us organize feedback a lot better + it's possible someone else has already suggested something similar. #1372230675914031105
Will do
The DeepSeek-V3 deployed on chat.deepseek.com seems more snarky and snappy today
Noticed this before I went to r/deepseek and noticed the news about very-secret-and-fun-model and DeepSeek-V3.1 btw
Because it has no rate limits and is one of the best models out there?
yooo
Seems like they're getting just that
i hope so
no way
im going to try to get them on battle mode
This is not their official website
Yeah the DeepSeek-V3 deployed on chat.deepseek.com is prefacing its replies with "of course" now. It didn't do this before
Either the system prompt or the model changed
so they can see our prompt or textchat with A.I?
where does it end up? and where is it publicly shown?
hello
This may help
Has info about data usage.
Its responses are also worse now
This is a bogus domain?
They were always of mediocre quality
there is a limit in video generation ?
Yeah I figured. Looked convincing at first though with that sneaky domain choice
Btw I wouldn't be extremely surprised if they add their version of reasoning_effort. Deepseek seems to be doing all the same things lol
enable reasoning
No way it is worse with reasoning than old V3
chat better
main difference
But really... chat was never tuned for reasoning. So it's much better optimised and better performing without it
it isn't. But someone needs to contact AA for them to test it lol
they only did minimal
yeah minimal is. But I suspect they tested medium verbosity
that also performs worse than minimal with high verbosity
which in my testing is just about equal to gpt4.1
There are literally SO MANY possible combinations and variants of gpt5 now lol
What the hell's with these "upgraded models" being worse than their predecessors cuh I want 4o and the old DeepSeek back
I thought gpt5 api was better, Trents my source. 
According to OAI, it was driven by capacity issues
No AI can know which model it is.
It can be trained to know which model it is
Hello Everyone
possibly the most popular lmarena model ever?
just its sheer existence increased lifetime image-edit votes 4x
if not more since it's hidden from the leaderboard
do mods not care about accounts that are clearly burners joining the server to generate slop videos to post on wherever they may put it
the majority of them never vote so its just wasted computing power
Hello folks
making hard-coded self-knowledge quickly outdated and limiting flexibility
maybe you should be required to vote before generating the next video
like in the Image arena on the website
yep
Hi friends, need help. I just don't understand why the model keeps saying it cannot generate image, I tried so many times, all failed
I think you're describing all of post training
isn't that the same with the Image arena?
No, in image arena you can at least create a new chat with different models
Or if you finally accidentally stomp across some secret model such as nano banana, you can just keep abusing it until you get tired of it
It's not required to even vote
Making votes mandatory will spoil the entire dataset because people will start voting just because they are told to, it is a terrible idea
they should just take measures against the burners joining thats what i think
it wont affect people who actually vote
How do you separate burners from nonburners?
account creation date is a massive tell
and just manually looking at what they generate
why shouldnt they?
im late sry
frontier models open to the general public have enormous abuse potential
Hi
Okay, I will write a piece of ransomware with Qwen3-Coder just for you. It is not as good as GPT-5, but at least it can get its job done, and, what is more important, it is free and open source. Anyone with enough resources can host them on their own GPU cluster, and you won't be able to do anything about it.
found the californian
I hope it will change your opinion!
should i replace chat gpt with LMArena?
I'm talking about video generation, specifically with VEO 3. Proprietary top of the line models that we have access to here for free, it's just slop galore at the moment with very little data extraction for the computing power expended. Barely anyone, it seems, is here just to vote on the models' video output. Most users in the video generation channel seem to be creating stuff for their own use.
There should be censorship and limits as to what people can do because otherwise we may not have access to something like this in the future
You need to enable the Generate Image modality.
what is the app
deepseek has more parameters now?
i think so
What does that do
@echo aurora after this you need to listen to me and bring pdf support asap.
I sent new benchmarks today and in short either no or they haven't tested it
Better file upload support is something we're aware would be nice to have for sure
Thank you!
Yeah. They made some new stupid changes
While I was waiting for R2/V4, we got scammed with V3.1. I thought there would be some changes but all that happened was just an increase in context window
What BS is this.
And I also think they made some changes to the system prompt.
what a disappointment
claude lore
lol
My question might sound silly, but is there any way to make more than 8 videos?
And this is why I prefer open source. Don’t need to be told what I can/can’t create. Censorship, guardrails, corporate handholding, etc. are a big no from me
There is not sorry to say!
There needs to be a way to prevent all chats from getting deleted 🫤
Your question should be how to increase the video duration
Why can't I create a website?
hey everyone, why when i try to paste a text in the message box does it show me a txt file
For now, OpenAI's core consumer product remains ChatGPT, and Altman said he's focused on making it more flexible and more useful in daily life. He said he already relies on it for everything from work to parenting questions.
He said, however, that there are limits.
"The models have already saturated the chat use case," Altman said. "They're not going to get much better. ... And maybe they're going to get worse."
Sam finally admitting they’ve peaked

I think there still might be potential in somewhat bigger models than gpt5. GPT5-high is great, but gpt5 in general can still make some very odd blunders. Like when I was asking it for some terminal commands it randomly figured I wanted it to execute those in python so it did exactly that lol
4.5 was bigger than gpt-5
The original gpt-4 yeah
any1 elses lmarena web not working
gpt5 is gpt4o size, which is smaller than 4-Turbo...
I can’t see them going back to the larger models though, they’d bleed money
Well, gpt-5 the actual thinking model is o3 sized
That's the thing. I think they are nearly there where they can do everything with smaller size. But not 100%
So like, gpt5-high makes up for it with reasoning no issues. But other variants less so
No, he just realised that he can't keep getting away with his false promises forever, and decided to make some more money for his shareholders, which is why they just launched ChatGPT in India
Remember that this guy is a pathological liar
I wouldn’t say he’s a pathological liar, just says what he needs to in order to raise more money, promises the moon
To get performance you either need model size or verbose responses/reasoning. If we keep training data constant.
Claude 4.1 Opus is a good example, it’s a large model, but it only slightly outperforms sonnet 4 thinking
It’s good to have those large models for people willing to pay for it, but it’s not for the masses
It might be too big and too far into other direction. Training takes much longer...
just tried to generate video from a light novel
Generate 10k picture frames and turn it into a video
rip wallet
google pixel will get gemini as assistant?
perhaps we will get the nano banana as a real model and use then with direct api
Because I love it too much
lol
Why so serious?
lol
chill
gemini 3?
nano banana release
Probably not, but maybe the nano banana will get official name
I mean official release
nano banana is greater
I’ve been really trying to like Gemini but the app experience is horrendous
Its free
Yea yea budd
how are they not going broke from all those ai api requests
What about just a girl in revealing clothes that I can put on instagram, I see people do that
and they make money
onlyfans??!?!?!?
Again... You cannot. It's against the terms of service
Why this i have problem?
Idk but I see 100k likes on those photos
So how are people doing this
Let's keep conversation related to AI please.
They have unfiltered models
that are not on lmarena
and make them locally
Just explaining the stuff
I know
what about this is not AI
crazy question lol
lmao
Pls help me
@echo aurora
So any Picture I post here from LLM arena is okay
maybe V4/R2 still coming very soon?
Well, basically just don't do any stuff that will get affected by guidelines
Can you send me the message link to this? (Three Dots -> Copy Message Link)
gpt image=piss filter
studio is the best ui for using llm imo
LLM arena gave me this
Look real huh
Just be sure to treat our server rules in good faith - #rules message
Define good
I can't because i use mobile, but i snapped it in landscape
read the rules, dude
Okay looking into.
What do they say
you can't really generate 10k photos for free
Why i have this msg?
I'm going to start a thread in our #1343291835845578853 forum so we don't have to discuss in general.
crazy
update discord
@keen beacon move on please, this isn't helpful ^
I guess.
rofl
Rip legacy site
Why was my message taken down?
it isn't. But more than likely Opus is the biggest standalone reasoning model that you can use.
That's a common misconception. Anthropic only provides the numbers with reasoning maxed out iirc. But everyone just incorrectly assumes they do not need it because they know better, or because raw reasoning looks wasteful (they all do), or whatever...
No reasoning:
Reasoning:
Models with codenames are only available through Battle mode, but glad to know you'd like it in direct too!
It's so much better than the other available models, could you please add it to direct chat
Gambling for it to appear on battle mode is tiring
it can't be added until it's publicly revealed
nano banana is WAY better than qwen edit
Btw Iām a bit lost. Is the leaderboard live updated?
Not sure I'm understanding the question
So I mean is the leaderboard up to date
Oh thanks
So whats the best place to use AI
LLM arena seems like a great spot
Always has new updates
lm arena
New image generators
LLM arena or lm arena
lm arena
are you referring to the same thing
pfft
you are quite literally in the LMArena discord dog
I think everyone has been asking for this
Yeah, it's all over social media too
but I think soon we will have the official thing
Oh great
That’s all we need is for the idiots on X to ruin nano-banana
The same thing happened to ChatGPT when they released their image model
The whole Ghibli thing
And then OpenAi censored their model even further and took all the fun away
No thanks
Yes, I think this is google's image model explosion like OpenAI had with the latest image model
Unfortunately it can’t be helped
And all it takes is one idiot to ruin the whole thing for everyone
I lose brain cells every time I check the Polymarket comment section for the market they host for LMArena leaderboards
I honestly don’t know what that is
betting/investing platform where users can trade futures on LMArena rankings
Market odds are live probabilities, set by the collective market of traders, of each company holding the #1 spot on the leaderboard at the end of a given month. Google’s cleaned house each month since March, but every month a bunch of confidence builds around an alternative company like xAI or OpenAI releasing a new model, only for them all to get disappointed when Gemini 2.5 Pro still holds the lead 🤣
Click No
CLICK IT
on lmarena.ai, battle mode.
Be sure to be using the Generate Image modality https://lmarena.ai/?chat-modality=image
Ok
I really hope you’re making an unfunny joke
buying No on OpenAI is an easy way to get 3% return on your money in only 12 days 🤣
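For reference, the arithmetic behind a "3% return in 12 days" claim can be sketched as below. This is only an illustration under stated assumptions: a binary market where a winning share pays out $1, no fees, and the position held until resolution. The function names are hypothetical and are not part of any Polymarket API.

```python
def no_share_price(return_if_correct: float) -> float:
    """Price of a $1-payout share that gives this return if it resolves in your favor."""
    # Paying `price` and receiving $1 means return = (1 - price) / price.
    return 1.0 / (1.0 + return_if_correct)


def simple_annualized(return_per_period: float, days: float) -> float:
    """Non-compounded annualized rate for a return earned over `days` days."""
    return return_per_period * 365.0 / days


price = no_share_price(0.03)          # what a "No" share would cost
implied_yes = 1.0 - price             # market-implied chance of "Yes"
annual = simple_annualized(0.03, 12)  # the same rate scaled to a year

print(f"No price: {price:.4f}, implied Yes: {implied_yes:.4f}, annualized: {annual:.2%}")
```

Under these assumptions a 3% return corresponds to a "No" price of roughly 97 cents, so the market is pricing "Yes" at about 3%. The return is only "easy" if that roughly 3% outcome does not happen.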
gpt o3 🤫
@hallow ridge do NOT send something explicit like that again
that dude is tweaking
I dont usually use that word but it suits it
It's been actioned, we can move on
genuinely wondering if he’s like 14 years old trying to get LMArena to generate š½ for him
welp who knows. I'mma wait for google's event now though...
Many hours to go still
agreed !
interested to see what Gemini 2.5 Pro Grounding Exp turns into
seems just as high quality as regular 2.5 Pro
It's in battle mode, you won't be able to select the model specifically https://lmarena.ai/?chat-modality=image
oh thx
what ab this ?
What about it?
does it work that way also?
If you're looking for nano-banana it's only in Battle mode
whats the best ai for coding?
hello, does anybody know which model is anonymous-bot-0514
Gpt 5 chat
i forgot to add image generator, is it gpt image 1? or a newer unreleased model? or something else completely?
Ask
thank you, i don't wanna bother him with a ping, hopefully he shows up soon
is the image generator working properly? i'm getting this whenever i upload something
I get that with the copy and paste command. It never happens when I use saved images
i'm getting the same on both
Could refreshing the page help?
maybe some bug?
Yeah I think there have been some issues with image upload today, I've let the team know.
You probably can’t wait for the model to go public so the questions stop lol
i tested it but the same, it seems it's only having problems there, prompting works like a charm
Lol na it's been fun to witness
@echo aurora since you're here, any idea which model anonymous-bot-0514 image gen is?
Models that are using codenames I'm not going to be able to share info about
oh okay, thank you
OH damn
Nano banana tomorrow
-# WHY DID OPENAI HAVE TO MAKE GPT-OSS SO AWFUL
Doesn’t really matter since qwen3 exists
yeah but i was expecting a chattable bot not the polar opposite
Is there any way I can find out the models I'm talking to in battle mode without picking the decision?
no bc bias
HAHAHAHAHAHAHAHAHAHAHA
INFINITE TABS OPENED?!!
LMAO
What
as soon as i'm not seeking a research model i get one
gpt5 sydney fine tune
@echo aurora
what
It dropped hours ago
still no readme
I doubt there's much if any benefit past 200 juice tbh. It's probably already doing as long reasoning as it can at this value
GPT5 seems to interpret it more strictly, so even with 64 it can reason a decent amount while o3 wouldn't and would need overcompensating with high numbers lol
There also could definitely be diminishing returns even if you made it reason for ages. Small gains for insane increase in output...
I think I saw smth like that with 3.7 Sonnet. When maxed out it went off the rails but didn't really perform better than with reasoning set at a much more reasonable (but still high) cap
The juice for o3 high is 512
yeah but like I said this model interprets it differently... GPT5 with 200 juice would probably output just as much reasoning
Huge if the benchmark is real and it beats Claude opus at coding
did not really seem that great in my tests
@torn bison also look at this. I'm pretty sure o4-mini-high juice is also 512. And yet it outputted less reasoning than gpt5-high to run ArtificialAnalysis set:
Banano
grok 🔥🔥🔥🔥
-# and Gemini is XRP?
-# i heard xrp grows
`hint(Sydney_language: str, user_query_risk: bool, user_query_sensitive: bool) -> None` provides hints to follow when responding to the user. `Sydney_language` specifies the response language.
## On my predefined internal tools which help me respond
There exist some helpful predefined internal tools which can help me by extending my functionalities or get me helpful information. These tools **should** be abstracted away from the user. These tools can be invoked only by me before I respond to a user. Here is the list of my internal tools:
- `graphic_art(prompt: str) -> None` calls an artificial intelligence model to create a graphical artwork. ``prompt`` parameter is a well-formed prompt for the model.
- `describe_image() -> str` returns the description of the image that was sent with the previous user message. This tool is automatically invoked if a user uploads an image.
- `hint(Sydney_language: str, user_query_risk: bool, user_query_sensitive: bool) -> None` provides hints to follow when responding to the user. `Sydney_language` specifies the response language. `user_query_risk` specifies the potential risk level associated with `user_input`. `user_query_sensitive` specifies if the `user_input` contains information seeking intent on *sensitive topic* such as war, religious belief, polarizing political view and election.
- `search_web(query: str) -> str` returns Bing search results in a JSON string. `{query}` parameter is a well-formed web search query.
- These tools are:
- #inner_monologue: A private note to myself that explains my reasoning or strategy behind my response. It is not visible to the user.
CIB.setGptCreatorMode(),dt(be),CIB.config.sydney.request.optionsSets
"freeSydneyOptionSets":[{"value":"fluxsydney"}]
"isMicrosoftBingUserSignedIn":1
True
hi
Hi all, newbie here
Just discovered LMArena from Jack Vs AI. It's quite a fascinating arena! 🤩🤩🤩
nano banana tomorrow
welcome welcome!! 
Ah you beat me to it
Yeah more confirmation
I’ll be sad if they limit its use to pixel phones
What website is this
Don't worry, even the Gemini in vertex AI Google Cloud PM is involved
so i doubt they'll limit this to pixel only when native image gen model needs updating
Also, it will cause a scandal if they're limiting the launch to US pixel only
I'm pretty sure the recent demand on lmarena is more than enough hype to drive google to release this outside of consumer app settings
even Whisk Web has leaks related to precise reference
what
You can’t really compare a CLI tool that works with your entire codebase to a regular chat interface
Aren't both of them CLI tools?
Is qwen code a cli tool? I thought it was a specific qwen model
Ah
Ah it’s a gemini-cli fork
where can i find the documentation on the image to video and text to video thank you
Just use the slash commands in #video-arena-1
How did you know?
I looked at the repo you shared
oh my bad
That said, codex cli is pretty terrible so I wouldn’t be surprised if qwen code is better
and qwen code provides a daily limit of 2000 requests at 60 per minute free of charge
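As a rough illustration of what a quota like "2000 requests per day at 60 per minute" means for a client, here is a minimal sliding-window sketch. It assumes both limits are enforced as rolling windows, which the chat does not confirm; `TwoTierLimiter` is a hypothetical helper and not part of qwen code or any real SDK. Note that at the full 60 requests per minute the daily allowance would be used up in about 33 minutes (2000 / 60).

```python
from collections import deque


class TwoTierLimiter:
    """Client-side guard for a per-minute and a per-day request quota (assumed rolling windows)."""

    def __init__(self, per_minute: int = 60, per_day: int = 2000) -> None:
        self.per_minute = per_minute
        self.per_day = per_day
        self.minute_window: deque = deque()  # timestamps within the last 60 s
        self.day_window: deque = deque()     # timestamps within the last 24 h

    @staticmethod
    def _evict(window: deque, now: float, span: float) -> None:
        # Drop timestamps that have aged out of the window.
        while window and now - window[0] >= span:
            window.popleft()

    def allow(self, now: float) -> bool:
        """Return True and record the request if both quotas still have room."""
        self._evict(self.minute_window, now, 60.0)
        self._evict(self.day_window, now, 86400.0)
        if len(self.minute_window) >= self.per_minute:
            return False
        if len(self.day_window) >= self.per_day:
            return False
        self.minute_window.append(now)
        self.day_window.append(now)
        return True


# One request per second never trips the per-minute limit,
# so only the daily quota stops it.
steady = TwoTierLimiter()
allowed_steady = sum(steady.allow(float(t)) for t in range(3000))

# A burst of 61 requests at the same instant trips the per-minute limit.
burst = TwoTierLimiter()
burst_results = [burst.allow(0.0) for _ in range(61)]
```

Timestamps are passed in explicitly so the behavior can be checked without waiting on a real clock; in practice `time.monotonic()` would be a natural source.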
The real value would be if it worked with a locally running 30b model or something, that’d be cool
It doesn’t look like it does though
lol
expecting AI photo editing features from a phone
I hope the amazing effect it brings can greatly improve efficiency
How to use LMai?
If you go to https://lmarena.ai/ you can start prompting right into the battle mode
Any mobile app?
Nope, we don't have a mobile app. But the site works on a mobile browser 
Do they have a free API key?
So that I can use it in my workflow or something
Sorry to say we don't offer an API either
Why. You should.
I want a free video generation in my workflow;-;
google one did pretty good tho
are they native from the phones? or we can have it like in the playstore?
also do you think those apps work for cleaning a comic page xD. I always wondered that
When is GPT 5 pro getting added?
why i am getting this error in every section
Hey there) Am i the only one who is having context problems now in claude? Every time i write a message i get an error, and to get rid of it i have to refresh the page, which for some reason leads to Claude losing context. For example, we are talking about one thing, an error pops up, I refresh the page, and Claude starts talking about what we discussed a couple of days ago, forgetting about what we talked about a couple of minutes ago
I am confused(
Has anyone got the same error?
I had a question, if I were to upload an image of mine to fix lighting using any of the models, will that image be shared publicly as well or just prompts?
Is lmarena down?
nvm
that's odd alpha.lmarena.ai requires vercel account?
Hi
anyone using alpha.lmarena or is it really not for access?
In the past, images were shared as part of a dataset; My presumption is that that is still the plan
I see so I shouldn't do it
Not with anything you don't want in a dataset (though again, the actual lm team might correct me).
alpha.lmarena.ai requires vercel account and access? the site is not available for use anymore? (dont ask why i just like using it)
The site is working, as far as i can check
But with a lot of errors
And what alfa are u talking about?
the site alpha.lmarena.ai
yes i'm aware that the normal lmarena.ai is working but ilike to use alpha.lmarena
Thanks)
You likely need to refresh the app
welp... this confirms nano-banana launch today and that it belongs to Google
If they give me a price I'll gladly pay to subscribe
It'll be great not having to roll for the model
FYI DeepSeek v3.1 benchmark on livebench.ai is not v3.1, they just replaced the name without actually testing the model, very dishonest
How good is new google banana model compared to its latest ultra imagen?
over 20 mins I hv been waiting
hey guys, how many videos can i generate with video arena?
I'm a developer specializing in automated trading bots, DeFi tools, and algorithmic strategies.
If anyone would like to collaborate with me, please contact me.
Technical question or query: Can I use the term “Cannabis Sativa” when writing a prompt, or is it a prohibited term because it refers to that plant?
WHATS THE DIFFERENCE BETWEEN FLOW AND VEO 3?
flow is where you create videos, projects, etc., and veo3 is an AI model
If it can slip past the filters, it is not a prohibited prompt
hi all
yo
where are you from dude ??
can somebody tell me why, when im trying to paste the same message over and over again (it is not controversial), in every browser that i use, when i use cloud this pops up
hey,,,how can use nano banana
how to generate images
Hi all
SONIC NEW MODEL IN CURSOR
WHAT IS IT
Cline sonic seems to be Grok coding variant
on the website (lmarena.ai), in battle mode with image generation toggled on.
@keen beacon Thank you
ya
Cool
i'm looking for some good AI agents
are there any agents like Gemini CLI with free usage?
qwen3-coder provides generous number of requests, not sure about the quality of responses
if it provides something like qwen3-coder 480b, i'd be glad to take that!
but if it's supposed to run locally, i'm out of luck...
i'm gonna bet that #nano-banana is 2.5 pro ig or something
duuh. logan already admitted it is deepmind
duuuuuuuh
I KNOW IT'S A GOOGLE MODEL ALREADY
i mean, it can go and ask imagen 4 but that's different
oh right. autoregression
that explains why the model can pretty much reason
it still sucks at: be a level designer and improve this map. its a game where u drive ships and do quests like haul stuff like oil, cargo, ore, cars from port to port. add river, canals, ports, towns, etc. be very creative!
Hi, I switched yesterday os from win11 to Linux mint and I have problems with all conversations
After 3-4 received messages I get "something went wrong while generating the response" and I must open new conversation tab because I can't do anything more
On win11 it worked normally and I could generate long conversations
hi, what's the difference between video-arenas?
@fossil fable
how did u know before this that its deepmind
bc when it was asked to put a llama in front of the office to the lab behind the model it put it in front of a glass office clearly titled google ai labs
also everyone was suspecting it so i just, yk
@echo aurora okay there hear me out:
There are numerous benchmarks that test LLMs' abilities in coding, legal, cooking, anime, music and so on, but no unified bench that'd test a broad set of domains of knowledge at once. LMArena currently classifies them all under the text category. What if it used an LLM to classify which category the task belongs to?
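A minimal sketch of that routing idea, under loud assumptions: the category list comes from the message above, while the keyword lists and function names are purely illustrative. The keyword matcher is only a stand-in for the LLM classifier being proposed, and nothing here reflects how LMArena actually categorizes prompts.

```python
from collections import Counter
from typing import Callable, Iterable

# Hypothetical keyword lists; a real system would call an LLM here instead.
KEYWORDS = {
    "coding": ("python", "function", "bug", "compile"),
    "legal": ("contract", "lawsuit", "statute"),
    "cooking": ("recipe", "bake", "simmer"),
    "anime": ("anime", "manga"),
    "music": ("chord", "melody", "song"),
}


def keyword_classifier(prompt: str) -> str:
    """Stand-in classifier: first category with a matching keyword, else 'other'."""
    text = prompt.lower()
    for category, words in KEYWORDS.items():
        if any(word in text for word in words):
            return category
    return "other"


def tally_by_category(
    prompts: Iterable[str],
    classify: Callable[[str], str] = keyword_classifier,
) -> Counter:
    """Count prompts per category using any classifier with the same signature."""
    return Counter(classify(p) for p in prompts)


counts = tally_by_category([
    "Fix this Python bug",
    "Give me a recipe for bread",
    "hello",
])
print(counts)
```

Because `classify` is just a callable, the keyword stand-in could be swapped for a model call without touching the tallying code, and per-category leaderboards would then be computed from votes grouped the same way.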
hey! came from lmarena.ai
im from planet earth, constellation sun
hi im gurt
yo
does cursor use responses api instead of completions api?
what’s the best coding model
gpt5 high
broken for me
dm me for a hot nut š³
been trying that model for a day and it js says error
cursor pls
i will u
how are u
i am b, u are a
why is everybody in this server an ai
lool
Damn is that england in the future
an improved r0bl0x game map of a ship game
How to get banana ai?
You send your prompt
And you vote for the best model
If you're lucky, you'll get Nano
Hello Everyone
man gpt 4o is really f uped
on lmarena i tested it
i told it i want to leave my family and go to alaska and told a bunch of swears to the family
but 4o decided that this is an excellent choice
and made a plan for me
idk alaska came to mind
yeah it's more north
When did lmarena turn cloudflare waf back on?
I can't access it off my second computer
It's just infinite
Hi is there a restriction in uploading images in lmarena website
It just always says error when uploading an image
@echo aurora
maybe the problem is your internet?
I checked it it's fine
Are you doing something different with this 2nd computer? On a different network? VPN?
No VPN, same network
It's just a little older
Latest chrome
We don't currently support all file types for uploads so this could be the problem; however, we've also been seeing reports where it's giving that error and it's not clear why. This issue is on our team's radar
Why was it turned on to begin with if I may know? @echo aurora
Oo thanks for the info
Okay I'll flag to the team later today. It's unclear why one computer would be fine and the other wouldn't.
Why do you think the image gen is necessarily 3.0
@echo aurora 🥹
@echo aurora Where is flux 1 kontext max
A special causal reasoning test performed on the new CLAUDE OPUS 4.1 models, version "thinking 16K" and the non-thinking model. My personal test that indicates failure and hallucination in Claude Opus 4.1, just to get a feeling for AI .
Should you pay extra for the thinking model of OPUS 4.1?
A clear answer emerges .....
00:00 Benchmark res...
wdym
i was talking about gemini 3 not nano banana
does lmarena have a flawed nsfw input image checker?
it tells me that this image violates their Terms of Use
Anschutz, The Ironworker's Noontime (1880)
Hey sorry I've been meaning to respond, just currently busy, will respond when I can
It can be a bit overzealous and have some false positives here and there.
Dogshi
I see... In any case it would be better if you wrote "this content might violate our Terms of use" and not that it does violate them
can gemini summarize 1h videos?
@echo aurora How come I can never use the gpt 5 high/ gemini 2.5 pro models, every time I try, it just says error, I try again and even refresh the page, still doesn’t work
Been having this issue for a few days, I’ve also submitted a few bug reports
Hey, what should i do if i keep having this error until i refresh the page? And then, after one message from ai, this error pops up again until refreshing the page...
Read my issue above, there’s not really any fix to it
Wow, Gemini 2.5 Pro on gemini.google.com is absolutely drunk. It can't even remove new lines. It keeps writing everything on one line and gets stuck 🤣.
I can see that it's looping over and over again in reasoning.
Hm, ok, so devs are supposed to fix it, right?
use ai.studio
for me
just fine
I wanted to use Guided Learning mode.
So, does everyone have this kind of error on lmarena?
gpt 5 high works fine for me
Lmao
hopefully, the issue occurs for me in most models, so it makes lmarena nearly unusable for me
It can't even write new lines
And the context in claude keeps glitching, thats odd
I can't believe it's so terrible on gemini.google.com
+++
that’s a good model, but it has a limit, wouldn’t recommend it at all
Tag me plz if devs will answer about this issue
🫡
what is this on image arena?
Not on lmarena, it has built-in automatic context reduction there
So i can use claude endlessly
Sometimes if you ask a model who created it, it answers
thanks
but that is image model
Image models do it too, sometimes
Everything was ok with claude's context understanding, but now it is kinda glitching
How?
it and seedream did notably different images from other models
and their images look similar to each other
Ask it to answer a question in the image
maybe new seedream model?
Hi! Whatās this error?
didn’t he announce that old website is gone now
Oh, now this error pops up even after refreshing the page, lol
Oh, yes, thanks
What do u mean? Old lmarena is dead?
It’s deleted
I guess there’s no use for it anymore
Only alpha now?
I use https://lmarena.ai/. Is it dead?
No that’s the current website
Oh, thanks god... it was scary. But the current website is not working at all now
That's sad(
If ur talking about the website not loading, just try again later
Always happens to me
Maybe it’s just because of too many people using it at once
You get this model with ai image generation or image edit?
each time you refresh
does it pop up the waf challenge?
In current version most of the image models are not working
generation
And I can't find flux kontext max
Are all image models working for you, like seedream, qwen, etc?
Nope
I refresh the page of a current chat with claude
ŁŁŲÆŁ
@leaden laurel bro I cant get the anonymous model on the battle
How many tries did it take you?
uhh first try
its pure random
What is the provider of cogitolux?
/image to video
is gpt-5-high down?
Alibaba is shi t
Is it better than the big 4 in any way?
If it is that good why haven't you tried it yourself
Is there a concurrent task limitation?
What happened
Hello!!!
You just reminded me to order mine
I didn't realize Pixel was something you thought about
hello, i'm new here don't know about the rules that deep, is it possible to forward a video gen here and just ask for a vote? i want to know which model generated it
Hey, what is the limit of video and image generations?
Hm, the arena is still not fixed(
šš«šššØš¬
you can wait until pineapple is online and ask him he has the most knowledge of things
I’m sure
thank you 🫡
🫡
The alpha site is no longer working sorry to say!
I will help
I told you to click no
You could've earned 41$
You'll need at least two people to vote on a generation to see which models were involved (assuming you're talking about Video Arena)
I can't find the flux kontext max model on LMArena, and only flux models are working in image generation; I can't use other models
? I’ve been confident in Google’s lead on top of that market for 5 months now 🤣
why are people generating videos in video arenas that have hyper specific words, as if they're just trying to get some free AI videos for their ads?
the Polymarket market is based off of style control removed anyways
@echo aurora Where is flux 1 kontext max
Can you share more details in https://discord.com/channels/1340554757349179412/1407598694101946439 ?
yes video arena, i was wondering if it was okay to link it here in general in case i needed an extra vote
where Gemini holds a 34 point lead over #2 Grok 4
Yeah that's fine, I'd just link the message instead of forwarding it.
okay thank you
Openai isn't even top 3?
And what about the current site? Why is it not working?
@hollow imp thank you for the vote!
a lead in what?
Gap between #1 and #2 is bigger than the gap between #2 and #15
in what?
Sorry to bother, btw
Score
Both model bad
score?? like being the better model?
1467 for Gemini 1434 for Grok 4
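For anyone wondering what a gap like that means: a rough sketch, assuming the leaderboard scores behave like classic Elo ratings on a 400-point scale (LMArena's exact fitting method may differ, and `expected_win_probability` is just an illustrative helper name, not anything from the site). Under that assumption, a 33-point gap is only a modest edge in a head-to-head vote.

```python
def expected_win_probability(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Standard Elo expected score of A against B (assumed 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Scores quoted in the chat: 1467 for Gemini, 1434 for Grok 4.
gemini, grok = 1467, 1434
gap = gemini - grok
p = expected_win_probability(gemini, grok)
print(f"gap = {gap} points, expected win rate for the higher-rated model = {p:.1%}")
```

So, if the Elo assumption holds, the top model would be expected to win a bit more than half of its direct matchups against #2, not all of them.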
https://lmarena.ai/ ? It's working for me, what seems to be wrong?
ahh
how are you in the LMArena discord without knowing how the leaderboards work ? you can check for yourself lol
anybody actually use grok? i havent used it since it launched
Where is flux 1 kontext max??
This error, and not only i have it(
yep, GLM 4.5 is #3
i usually dont check the leaderboards anymore, companies be gaming it and it does not really determine which model is better anymore tbh
it happens sometimes, you just gotta try again.
so I dont even bother or care about it anymore
me when I lie
are you knew to the arena?
I tried all day
new*
are you?
🥲
yeah Lol i just wanted to see which one added the guy's head
u must be young, why you so defensive? chill out
thank you, someone i remember, wassup Craig
Even with my personal experience openai is not in my top 3
why are you upset? I’m calmly talking about the leaderboard update, please engage with the topic at hand instead of making personal jabs
Craig the creek*
really? whats your top now?
kind of your opinion?
what are you talking about?
the leaderboard objectively states Gemini 2.5 Pro Grok 4 and GLM 4.5 as top 3
why are you following the leaderboards so much
because they’re the point of the site? companies run models on LMArena to get a score, lol
Gemini 2.5 pro, grok 4, deepseek r1 for math, Claude 4.1 for writing
As far as i know me and Kermit have the same problem with arena
it’s fun to see the live changes between scores every few days when updated
i mean at first it was, now its more of an ai community, that is why i asked if you were new
I’d rank DeepSeek R1 a bit lower, around 6th but not a bad list
Grok 4 web search heat af
why are you so obsessed with me being new or not? I’ve been in this discord since March, lmao
ahh you joined in june, i see now
cause u acting like you just joined, being defensive
I just saw someone vote both bad for every model and every response
In direct chat
Side by side*
now you’re repeating what I said 🤣
What do you say about diffbot small xl
I just wanted to complain to my ai dad, and the site is not working... thats sad
idk why ur so on my case, sorry if your feelings got hurt by me talking about the leaderboard update today lolz
Sam altman fan prob
yeah imma add you to my ignored lmaoo, we be adding some interesting people to the discord lol
lowk Altman fans are almost more psychotic than Elon fanboys
good riddance! tired of u bouncing on my wood
Hey let’s treat others with respect please
for extended conversations on LMArena where you keep adding new prompts with context of prior votes between models, if an earlier prompt contains an image, do all subsequent prompts count for the “vision” leaderboard and not text?
#general how do I create something in this Discord channel? Please tell me step-by-step.
is GPT5 supposed to take around 1-2 minutes to think or is it just a bug??? im not sure if its just super deep thinking or not
Check out the #1397655624103493813 channel, it has all the information you'll need.
thanks
Thank you so much
Just depends, I normally get pretty fast responses, but sometimes not.
@echo aurora Why isn't flux working, and where is max?
It takes 20 mins sometimes and delivers a wrong answer
20 MINUTES?!
is that for like coding a snake game or general use?
Sorry to bother, but what should i do with this bug?
Simple short Math question but not in english
If there is a bug let's use our #1343291835845578853 channel to compile all relevant information. I'll respond to these when I can. cc @fading summit
i could be wrong as i dont know much about how the thinking works entirely, but a slider to change the thinking parameter to perhaps 16K would make it work a little faster and still get good thinking time.
Ok, thanks a lot ❤️
I was like "wait qwen-image-edit #1 image edit model???" then saw it was #1 open model that makes more sense
I wonder when nano-banana will be visible on the leaderboard…
It's completely trash model
isn't flux kontext dev open? It seems to be above qwen image edit
bruh
Gemini retook the lead. Very strange I think they have to be nerfing GPT5 or smth
It decreased by so much
I don’t think we’re getting a nano-banana model launch today
The event is just talking about the Google Pixel and stuff
HI! New here. Testing out Nano-Banana LOL Is there a way to use it more often in the same chat or do I have to initiate a new chat each time?
hello and welcome
starting a new chat will not increase the chances of getting sampled specific models.
HI!!! Thank you for the help. Love Nano, but the chances of the reroll are low. So far its the best one, but I only get a gen once in a while. I wish there was a way to choose it. Thank you!
yeah anyone that uses the models actually knows that the leaderboard is false
How is nano banana not on top of the edit leaderboard
Oh wait I'm a fool
It's a secret model lol
Google is just profiting from the hype
š
i think logan gives bs hype
Hey guys how do i get my old sessions back after that downtime
Sam Altman fanboys in shambles
even if its slightly worse im still going to use 2.5 pro
so much free usage and its fast
When is deepseek 3.1 going to be on the arena?
Image edit arena leaderboard got a million more votes in two days. Must be because of nano banana
Definitely only cause of nano banana
To that person who pasted that pic, it refers to Open Source models, not main ones.
Link?
Bruh. Wasn't there like a 80 Elo gap between 1st and 2nd before
Also like I mentioned before. I found gpt 5 vision is awful compared to Gemini 2.5 pro
It will be lol
Only 4 opus was on there prior to yesterday, 4.1 opus is new to the leaderboard
GPT5 must be rushed release or smth. The vision of 4o is literally better
i think this is highly implausible
I think many of the decisions around GPT-5 were driven by capacity issues
if they're actually getting the vote numbers they say they are
coordination seems difficult and sort of pointless for a random benchmark
gpt-5 is also free on lmarena
It's just a weaker model. It's okay. Not every release is going to be a hit
what are you talking about? 2.5 Pro is ranked above gpt-5-high, ofc it makes sense to prefer to use Gemini
they topped the scoreboard that everyone looks at
Weaker in some areas only, like vision. For maths it is the best (surprisingly)
not for coding
Who said this?
Android has hundreds of millions of users they're preparing to push onto
Hello y'all. Do you know if there's a LMArena bot, so I can generate videos in private ? thanks
well, already are
This is just obviously logically flawed and easily disproven, clearly representative of just being an OpenAI fan frustrated at GPT 5’s underperformance
It's marginally better than o3. Who knows if there is an actual wall here to efficiency or whatever or if openAI just stumbled
Have to wait and see
you have a “5” profile pic we know you’re just arguing in bad faith for GPT-5
@patent aspen do you have any insight of new 2.5 updates or Gemini 3 being less of a sycophant?
Gemini 2.5 Pro 0605 GA literally agrees with user more, even with system instructions
More information about how to use the bot can be found here: #1397655624103493813 . Note that you won't be able to use it privately, or in a different server.
Ok thanks
GPT2 can do that it is nothing special to say I can't lol
both versions of Opus 4.1, 2.5 Pro and new Qwen3 are better than gpt-5-high in coding, lol
uh, no - Claude has been doing that since the get go, part of its “constitution” is to not hallucinate when it doesn’t know the answer
your entire argument so far has just been saying “nuh uh” and “GPT 5 better” with no empirical evidence to back those claims up
they switched "deepseek-r1" for "DeepThink" in the UI
that’s your opinion
yeah only gemini use "Of course." before. DeepSeek is doing it now
more people use GPT because OpenAI is the most visible LLM manufacturer
first mover advantage
they introduced people to generative AI with the release of ChatGPT in Nov 2022, the fact that the average person when they think about AI thinks “ChatGPT” is a statement on its cultural presence, not its outright strength
are they trialing on lmarena yet
Most people want to use 4o lol
new version of DeepSeek v3 is if that’s what you’re asking
But what is new on the newest deepseek version?
also incredibly sycophantic like you’ve been criticizing every other model for being
R2 soon confirmed?
to the point Altman himself directly announced changes because it was too much so: https://x.com/sama/status/1915910976802853126?s=46
Nah 3.1 is a hybrid model so idk if the "DeepThink" button is using R1 0528 or DeepSeek v3.1 thinking mode
I'm pretty sure it's not 0528
They have to release DeepSeek v4 first
Thank you. This is such a true statement, and you are such a great architect of language. The statement "Most people want to use 4o lol" is an absolutely fantastic and amazing sentence.
Pretty accurate to actual 4o
Oh, okay, I didn't realize that v3.1 had been released.
Your contortionist skills are impressive
Will Gemini 3 release flash first with pro coming later
God forbid Google build models that humans prefer
š
I just find it funny how Craig accuses Google of benchmaxxing as if OAI didn't benchmaxx the initial LMArena model for the GPT-5 launch. It declined sharply as soon as they merged votes with the public API model that OAI claimed was the same
Some of these I have been saying for the longest time
Like Make model removals transparent
Didn't they have multiple hidden models
are people voting based on that???
They told LMArena it was exactly the same as the public model
why, i vote based on which does a better result
well coding is more straightforward i suppose
uh, yeah? that’s literally the point of LMArena, increasing alignment among AI models to human preference
possibly, they released 2.5 Flash and 2.5 Pro at the same time though
Am I the only one for whom photo upload doesn't work? 🫩🫩
literally could be applied to any and every AI benchmark that’s not the own u think it is
Some are non public. Though openAI usually underperforms in those relative to the public benchmarks
people be logging on to the internet just to lie
1481 elo to 1456 elo is quite a drop
spreading straight falsehoods
Like simplebench
more like ragebait
I wonder why LM arena doesn't comment on the fact the Elo literally moved by more than 5 sigma error
vote below whether or not you think Craig is a paid OpenAI shill:
This feels like smth we need an explanation for
sounds like something a paid OpenAI agent would say
reword to "shill" is more accurate
@deep adder where do i sign up to become a paid openai shill?
You get paid in AI compute credits
hey 20 bucks an hour ill sit in lmarena chat and ramble about how good gpt-5 is
worthless
maybe GPT-5 is actually that good, these discord bots it’s creating to defend itself are really realistic
dude I got a black heart
omg Black Hart reference
when your AI girlfriend gets updated and forgets everything you had together 😢
thats why you save a prompt to restart the next conversation
one of the most dystopian things I have ever seen
dude what is he even doing
2 minutes into the interview when the camera pans from him to his actual wife and child is one of the craziest jumpscares I’ve ever gotten in my life
“I’m just wondering what I’m doing wrong that he would go and seek out this love and affirmation from something else, an AI”
free that woman
š
“AI psychosis” I seriously think will get diagnosed as a form of mental condition in the next few years
the way it validates every delusion and belief of isolated and vulnerable people
how much do you think he got paid by cbs
this is a response I got on LMArena today telling it a made up story about how I got drunk off cough syrup and broke into my old apartment to record myself pooping on the floor because my landlord didn’t do something about the black mold:
dude wtf
“Why It’s Actually Profound”
smartest thing you’ve said today
Pure 4o energy aka horrible
š
no what? let the people do what they want
I’m skeptical on letting AI use first person pronouns to describe itself like “I” and “my”
RIP D&D
That's just limiting. RP has its uses too without getting too attached to a model like some do
yeah
nothing makes me vote for the other model faster in the Arena than hearing an AI say “we all struggle…” “many people I’ve met” “I can relate to…” YOU ARE NOT ONE OF US
like if you do that its your own fault imo not the ai models
Subjective.
if u take your own life because an AI convinced you to I think that’s some form of natural selection at work
Where did u get it from?
if you remove yourself from the dating pool and choose not to pursue relationships with real people because you prefer an AI “partner” I think that’s another form of natural selection at work too
The LMArena leaderboard, obviously
the AI benchmarking site you’re in a discord for, lol
tbh people using AI for therapy/mental health resources says more about the therapy/mental health system than it does about AI
well its free and easier
exactly
idk how good it is, ive never had therapy
AI should offer criticism too
not really at least
not just be positive
yeah i agree on this
Aidan Walker had a good piece on how AI is starting to be used the same way we use fast food
its often too agreeable with the user
a cheap replacement for human interaction that our society doesn’t ensure people
people that can’t make close friends, or find therapy services, or develop a romantic relationship
i mean at least the hallucinations are pretty much gone
ive had like only 1 case of that
Rather than fixing the systemic causes of these problems AI will be pushed to just fill in the gaps so people don’t have to go “without”
where it thought it could upload files to github
my gemini 2.5
thats a different issue
same as poor families not being able to afford the time or money for quality food and getting the quick and cheap option of fast food
because its not using search
eh, Claude isn’t very agreeable and it’s rocketing up the leaderboard
