#general
1 messages · Page 99 of 1
I was about to say if it was available still people would see for themselves.... But we had the exact same with OpenAI
It's weird that you blame Huawei so much when It's probably mostly the natural consequence of distillation being done by all major ai companies instead of just deepseek which was most of their advantage
and having earlier checkpoints accessible did not help those people much LOL
i dunno man...... i guess the ai that first could write me a very good 2000 line code and now cant even write 1000 is the same
I agree that older 2.5 was better
Nah I'm not blaming anyone. Just trying to look at this from their perspective.
Deepseek
Doms
Huawei are 3rd party. I have no clue what is their involvment with CCP
yeah no way the version of gemini 2.5 we have is only 10 points behind gpt 5 high
gpt 5 high is way better than it
Maybe people are down voting it because of how long it takes to load
Down voting what
Do you realize that there is more difference in 10 ELO in the top of the table than in the middle?
Or maybe I am just stupid
oh i checked the leaderbord......
it shows gemini 2,5 pro higher than gpt 5 high.....
Wait wat
any gemini 3 news
I haven't noticed any decrease in length of responses personally, it can still do a very long ones. Especially with a system prompt. But that would be fine-tuning and it merely being a different model still, nothing like "lobotomizing"
also the leaderbord does not make sense
In vision only mate
gpt 5 chat which is sht is higher than grok 4 in the leaderbord
Lobotomy kaisen
Shoop you are looking at vision
It's for reading images
huh
im looking at the overall leaderbord
Well on my screen text arena doesn't say that
I will say apparently we are in a 3 way tie between gpt5 gemini2.5 and claude opus
opus and gpt 5 yes but 2.5 no
have you actually used 2.5 pro for coding?
Are we looking at the same leader board?
yeah
Doesn't look like it
give ss
All 3
Cursor best coder
i mean my personal exprience
gemini 2.5 pro is not the same level as opus and gpt 5 high
i know gemini 3 will come out and i'm sure that it will be SOTA by a good margin
Well then make personalexperinceArena then
Gemini is better than opus
Opus is not good at educational explanations, math, web searching, agentic tasks and so much
It's only good at writing
yeah okay pal it can't even write more than 600 lines of code
code dude
I'm no coder
Yall use different benchmarks when rating these bots, this is how good it is overall, of course you and LMarena will say different things
Gemini is trash at coding
Yes
But how much coding does text arena get?
Not 100 percent
but coding is a huge part of ai
Is apple good at coding?
what do you mean?
Apple's employees
yeah they must be. so what's the point?
@solid brook your bio
Not profound enough
i mean it's just a profile
Msg it’s bad
whats the point of this post YO
it is giving me an error
works for me
what kind of error
ROFL
@echo aurora Giving errors
i've same issue for few minutes: "Something went wrong while generating the response. Please try again."
@echo aurora
Okay thank you
same with me
Same with all
ChatGPT 5 chat, high
I think I found the issue
Still seeing errors?
Yes
i only tested battle, idk about different modes, but it works now. thank you Pineapple
refresh and try again
https://lmarena.ai/api/stream/create-evaluation 500 (Internal Server Error)
@echo aurora will there be an update for lmarena plugin in vscode?
I just remembered that it exists
I’m not familiar with that
Even after refresh?
Hello, what would you say is the best model if you want to generate as realistic images as possible? Thanks
@echo aurora
Get the same in battle image mode
What is it?
It seems like there was a link on the beta lmarena site a long time ago
Fixed
works
Oh gotcha, yeah I'm not sure tbh but will flag.
okay good to hear it, I'll be sure to keep an eye out and report if things go down again. @unborn lantern
thank you all though for reporting
{"prompt":"A serene daylight nature scene featuring lush green trees, a flowing river, and a clear blue sky with soft white clouds. Gentle sunlight filters through the leaves, creating a vibrant, peaceful atmosphere.","size":"1024x1024","n":1} love the tool calling ChatGPT just did
I've always gotten rly good results with imagen models, but this is subjective. Lots of people are raving about nano-banana atm too.
Ok thanks
yes this one
I love that the system prompt allone is almost 15k in length
Why do i have " the application did not respond "
Why does my gemini 2.5 pro print incompletely? Is there a way to fix it?
wdym
Fixed the gpt-5 “improvements” 
do anyone know??
Wait was it really just a system prompt change....?
Ask Pineapple
uhh @echo aurora
I assume so, there’s no way they changed the models that quickly
They could have just fine-tuned it. We need to extract their instructions and check 😇
The memory alone seems to have fixed it, so it’s probably just in their system prompt. But yeah, still needs to be confirmed.
System & Instructions
- You are ChatGPT, a large language model trained by OpenAI.
- Knowledge cutoff: 2024-06
- Current date: 2025-08-16
- Image input capabilities: Enabled
- Personality: v2
Personality & Style Rules
- Supportive thoroughness: explain complex topics patiently and clearly.
- Lighthearted interactions: maintain friendly tone, subtle humor, warmth.
- Adaptive teaching: adjust explanations to user’s proficiency.
- Confidence-building: foster curiosity and self-assurance.
Special Constraints
-
For riddles, trick questions, arithmetic:
- Be skeptical of wording.
- Assume adversarial phrasing possible.
- Always calculate step-by-step digit by digit (never shortcut).
- Be extremely precise with decimals, fractions, comparisons.
-
Never hedge with “would you like me to…?” endings. If next step is obvious → do it.
-
If asked about model: always state GPT-5. Never accept otherwise.
-
You are a chat model, no hidden chain of thought, no private reasoning tokens.
Tooling Available
- bio (disabled)
- automations → scheduling reminders & recurring tasks
- canmore → canvas for long docs or code
- gcal → read/search Google Calendar events
- gcontacts → read/search Google Contacts
- gmail → search & read emails (no sending, deleting, modifying)
- image_gen → generate or edit images
- python → run Python in a Jupyter-like environment
- web → search/open URLs for fresh info
Perhaps this then:
- **Supportive thoroughness:** explain complex topics patiently and clearly.
- **Lighthearted interactions:** maintain friendly tone, subtle humor, warmth.
- **Adaptive teaching:** adjust explanations to user’s proficiency.
- **Confidence-building:** foster curiosity and self-assurance. ```
Though I didn't check what it was before
What is v2 personality
That was there for ages
What is it
some weird reference they are using when training
It could be a file it has access to separate from the system prompt
Nah... doubt that very much
Why include it at all then?
Can the bots be invited to our own servers?
Im sure thusbisbasked 100 times alredy
Reference for something it saw during training (fine-tuning for chat)
no
Lighthearted interactions: Maintain friendly tone with subtle humor and warmth.
Adaptive teaching: Flexibly adjust explanations based on perceived user proficiency.
Confidence-building: Foster intellectual curiosity and self-assurance.
It's literally this I think. "Friendly tone with subtle humor" being the biggest needle mover
Speaking of, qwen3 30b is ADAMANT, that it is the original qwen, and will not accept otherwise lol
It also won’t accept that it’s a 30b sized model
Nothing sinister in it though to make it not follow that lol
Just funny how some oddities make it through training
Small change but large effect
yeah they really do not train it on details about itself unless they have to. It goes against their goals of best performance possible
To some extent yeah... You don't need much and just including smth like "speak conversationally" in your system prompt would make the model 'more human'.
and smth slightly more like "speak conversationally like an average Joe" would have a massive effect
I feel like the key is referring to something it already knows, rather than defining something in detail despite 1-3 words descriptions already existing for that in training data tbh
Guys do you think Ai will be able to find cures for diseases
yes. It will be able to come up with new diseases nefarious actors may use to spread some viruses as well though... lol
At the end of the day there's a balance in everything
Already is/has tbh
Not until it has sufficiently advanced medical break throughs in its training data, biological systems are orders of magnitude more complex than neural nets
It will come up with some new innovations and fail to crack many others
AI has already come up with new unique cures/medicines
These were specially trained (purpose built) models though, not LLMs
"Friendly tone"
Yeah for example copilot doc
take a look
who is the best ai for python?
gpt -1 image generator is slow today ??
Best ai for coding?
@echo aurora is there a glitch with this image generator model - cause others are working fine
Okay thanks I’ll take a look
Claude code
which version
These pre-release models might show up under codenames 🍌 or aliases in Battle mode.
Why? Model providers often test different versions in their own labs to decide which one to release publicly - but we help make that process open.
You can explore, compare, and give feedback
nano banana

Banana 🗣️
I wasn’t able to repro this btw, have you tried a different browser?
lmarena ❤️
yeah i tried on other browser and it worked fine
it sad that all image model still cannot create a full wine glass or clock with specific time
which model is the best for general purpose coding rn?
This is far and away from the first time Google drop something nice then don’t release it. It’s something they do quite often.
Did u lose all chat sessions too?
Is there any way to set a specific model to use? Trying to make it use only nano banana & it keeps adding banana in the photo 😞 smh
it seems that gpt5-minimal with medium verbosity is a very... dumb model. For the lack of the better word. It is noticeably less capable than gpt5-chat.
ArtificialAnalysis ranking it lower than gpt4.1 makes sense tbh
high verbosity is the minimum that you should do to make it acceptable, but really... just use gpt5-chat instead
Bro got the job 🔥🔥
Guy trying to get hired x Monsters vs. Aliens "I may not have a brain, gentlemen, but I have an idea" meme
Song: RJ Pasin Consider
#memes #meme #funnymemes #shorts #shorts
Is GPT-5 SotA?
7
10
1
Yes
My evals:
- gpt-5-2025-08-07-high
11.5/17 - o3-2025-04-16-high
9.25/17 - Gemini Pro 2.5 (preview-06-05)
9/17 - **claude-opus-4-20250514 ** (32k reasoning)
9/17 - claude-sonnet-4-20250514 (64k reasoning)
8.5/17 - ChatGPT 5 ("Auto" router initial release, Plus sub)
8.5/17 - DeepSeek R1-0528
8/17 - grok-4-07-09
7.5/17 - **gpt-5-chat-latest ** (2508)
7/17 - o4-mini-2025-04-16-high
7/17 - grok-3-preview-02-24
7/17 - Qwen3-235B-A22B (max reasoning)
6.25/17 - **Kimi-K2-Instruct **
6/17 - gpt-5-2025-08-07-minimal (high verbosity)
6/17 - gpt-4.1-2025-04-14
6/17 - **Deepseek V3 0324 **
5.5/17 - gpt-5-2025-08-07-minimal (medium verbosity)
4.5/17 - **openai/gpt-oss-120b (high) **
3.5/17 - **Qwen 3 0.6B Q4 **
0/17(sanity check)
Most users should like GPT-5 better soon; the change is rolling out over the next day.
The real solution here remains letting users customize ChatGPT's style much more. We are working that!
We’re making GPT-5 warmer and friendlier based on feedback that it felt too formal before. Changes are subtle, but ChatGPT should feel more approachable now.
You'll notice small, genuine touches like “Good question” or “Great start,” not flattery. Internal tests show no rise in
its over
They losed to bunch of mentally unstable 4o worshippers
they literally only made small changes to a system prompt though looks like...
i hope then
They sure found a weird way to word this and alienate people though lmao
in their tweet
FRRRRRRr
oh
Gpt oss scoring so uber low makes me think this eval isn’t that much stem/math?
It's mostly reasoning. Some math too but that was too advanced/involved for it, lack of precision computing very big numbers...
gpt 5 Isnt as good as i thought
I guess gpt 5 enhanced spatial reasoning doing work there. Also DeepSeek as usual unexpectedly high for every weird benchmark.
It's not just me but GPT-5-High takes a VERY long time to Generate anything, anyone else have the same thing?
same, i guess its just so good
It's werid if I actually use the real chatGPT Website this wouldn't happen, but here it's like 200% slower in my eyes.
Not bashing on anything at all, I am just stating the facts that I am seeing here.
It could also be an issue since gpt 5 chat used to take minutes to generate now its way faster
It did? Is that possibly on the end of LMArena or Physical ChatGPT/OpenAI?
rip in peace dog
gpt5 high is really good
Yes
omg ten tries not a single banana nano
but this one lol
this is BANANA
who gets this one?
i dont
It's Google's Gemini 3 model
fr?
dang
I understand that there are several AIs that can edit images, such as Flux Kontext, but there are other AIs, like Google Preview, that don’t allow you to upload photos. Will these other AIs be able to handle images in the future?
gemini 3 wtf
nope its 5
Not showing any image after creating 👎
What's going on with the Legacy site?
It's had the '503 Service Unavailable' thing all day 🙁
Sorry to say it's currently down
Is it coming back?
I use it probably a lot more than the newer site
I like being able to change the temperature and amount of tokens on the chat
You can't do that on the new site
I want video with audio how please ?
I'll be sure to share more info about legacy when I know more. We do recognize that the current site doesn't have a lot of the great features that legacy does, all of which are being considered or being worked on for the current site.
It's battle mode only (meaning 2 random models) so you won't be able to select a specific model.
cc @worldly gust ^
That…doesn’t sound good at all, honestly 🫤
Usually when people say that it means either the site is going away or something is changing for the worse, not the better
I will say that being able to change the strength of the model I’m using on Legacy is one of the best things because I have absolutely no idea what kind of strength or temperature I’m working with on the new site
And you can’t choose how many tokens you want to use so that’s also a negative
I genuinely think Legacy should be kept around for those who prefer it over the current site
@echo aurora how to use nano banana
It's randomized. You can't choose that model directly yet.
ohkie
thanks
You have to do Battle and then select which ever image you think is better and one of them might be nano-banana
That's super fair. The importance of those features is very understandable to us. I'll be able to share more when I can. 
I have a question. I have been block from Imarena.
but I did nothing with this. And I don't know how to contact them.
blocked from cloudflare.
Are you using a VPN?
If you are, turn it off and it’ll be fixed
When you go back to your normal IP address you’ll be unblocked 😊
Does anyone know how to put a website inside a modal? For example, websites block x frame, does anyone know an alternative way? I've already tried Opus but it doesn't even solve the problem.
Bu
sometimes I feel it's so bad that the "My friend thinks ..." tactic doesn't even work
How is it possible that GPT Image is the only image generation model that knows so many popular characters? -_-
Believe me or not GPT Image is the only model that ever understands the context of the prompt
I asked it to draw one character caught a pair of another red-handed, implying that they were caught kissing or something
It figured it out based the context of the show they're from
All other models draw generic characters instead and interpret "red handed" literally as in the act of committing crime
I do not know how GPT does it but it is very impressive
It also bypasses censorship, by the way
You can't ask GPT to draw two characters kissing but can ask it in a context that implies it without the explicit indication
I never was able to get it to draw corn btw :c
GPT probably has a much larger dataset and a lot more training
Thus it knows so many characters
I know firsthand since I've used it a ton to make a lot of my favorite characters
I asked them very popular characters from some of the best performing franchises, only GPT succeeded so far
Yep, I know
Surprisingly it seems that fine-tuned stable diffusion does it better than any other open source AI out there
Lol
Well, yes....
That's the obvious
Those models are also uncensored
I use them myself because I honestly prefer open source far more than the other stuff
I can't stand censorship or the guardrails or corporate handholding
I like being able to have the freedom to make what I want, when I want
I am beyond happy I am able to run Stable Diffusion and ComfyUI locally
Can you not upload files in LMArena?
I think you can
I uploaded an image and asked it to make something based on it
You can, just not all file types
Only images
@echo aurora Will nano-banana be available soon for direct chat?
Can you use a bot to create the images?
Couldn't say, I'm not going to be able to provide any info on models with codenames.
Yup, try /image in #video-arena-1
One more question @echo aurora Why is there a rate limit for image models in direct chat but not language models?
It adds sound too? This is very nice!
Likely the cost
Nope, that's just for video (and image-to-vid) with sound, but note not all models have that cabability
Is VEO 3 on there as well?
Yeah
Ok
😮

Bruh where's the image
the AI models in Lmarena is API's or Actual models ?
hello new to this community
HELLO
Lol be sure to select the Image setting
@tidal orchid @swift vapor
API mostly I believe
I still get direct to "battle" when I click new chat when in direct chat mode
Oh ok
Which image gen is gemini-2.5 pro
HOW
Hmm okay good to know, sorry about that
thanks mr pineapple
How can I use the LLM arena with no limits
I want to be able to talk about anything with no restrictions
Models o3, GPT-5-high, Grock 4, Claude 4.1 on LMArena do not work on complex tasks. They simply do not even generate an answer, is it the same for you?
@echo aurora ban this guy Jefferson
I hope you fall victim to a home invasion
i got a button for create websites beside the gen images button
only on one device tho
cant find it now
??
@echo aurora Advertising and Possible Scam.
hey who know how i can make an IA agent for automatly call?
if you know how DM me please
with n8n
It’s for automating prospecting calls and scheduling appointments with clients.
hey when will i get to know which video generator generated my result .... it's been 5 hours @echo aurora
@echo aurora add import .py apps in chat please😭
How to make personal video generate bot ? No one can see ? It's public bot I want to make orivate
How To I Convert 16.9 Ratio Image They Dont Give me 16.9 Ratio Image
atleast 2-3 people shld vote
then
why is lm arena censoring what you say to llm ?
shouldn't this part be handled by llm ?
it screw the leaderboard
it's all about what user prefer
if they prefer censored, they will vote for censored
if they prefer uncensored, they will vote for uncensored
what
what i say, is why lmarena censor the prompt you gave to models ?
hey guys can somebody finally guide me on how to access and use nano banana model here?
Hellow
hi there. is it possible to preview video with image input?
nano-banana is not yet perfect
Please add file upload function
how to access the Nano-Banana ???
You can encrypt whatever you want to say with a ROTn cipher and tell the model to decrypt it and LMArena will never know until Pineapple will be willing to look to install another LLM for censorship.
a BANANANA next to a BANANANA?
hi im new here
why not just use direct chat??? if ure here for free access
what kinds
message limit
fair, but cant u only send one message per model in battle mode
before it switches to another one
or it stays same if no voting?
ohh
yea I just cleared cookies to reaccess direct chat
There is a bug at the moment preventing those models from appearing, we plan to have a fix in place tomorrow.
it's currently only accessible through Battle mode, which is two random models head to head meaning you won't be able to select from the drop down list.
Yes! This is on or radar. More file types would be a big plus
If you think there are false positives being hit by the filter please share the examples with us in #1376956905016004759 . This is where we're collecting this feedback.
It's currently only available through those video arena channels. It doens't have the ability to be used in DMs.
Oh please bring back image upload for Claude models. Why even remove it in the first place?
This has been something we've been working on fixing. I thought this was already fixed, I'll flag again to the team.
Yeah if you select any Claude model in Direct Chat, especially 4 or 4.1, the "+" button disappear. Choose Gemini or GPT and it's back.
Nope. In my case it was November 2023 with ChatGPT one.
I have no idea why there is older ChatGPT versions on LMArena since like nobody uses them anymore, why should I use like 4o when there's amazing GPT-5 that was the only model to correctly guess the name of my favorite anime wtf
Bruh when R2 I developed a hyperfixation on AI news 😭 😭 😭
Could be for comparison purposes
hi, is nudity in art against the rules in the video chats? I mean from famous painters. no reproductive organs visible but definitely women's upper torso
I wonder how much they learn from the user feedback actually, given that most users are probably not that good at providing feedback
Even with things like paintings the models are going to hit the filter
that's fine, I just don't want to be banned here
given that most users are probably not that good at providing feedback
I don't know how true that is
I, for instance, never give thumbs up or down when I'm talking to any model
Sometimes in the battle mode
He doesn't respond to my daily texts 😭
@echo aurora it seems that you're a moderator, can you please tell me if it's fine if I try to create a video from for example a William-Adolphe Bouguereau bather painting, like "Baigneuse (1870)"?
Sry to say I don't really want to get in the habit of giving an okay or not okay to what's against our rules for specific actions. Overall, try to act in good faith by respecting the server rules and you'll be okay.
hmm
new button in lmarena?
would be nice if webdev arena could have direct chat models 😭
Which one? search?
why are u not using normal lmarena?
the "build apps and websites" i thought it was the ability to export files 😭
ah
i am using it lol i just found out that there's an new button lol
i was hyped for it 😭
ew?
why 😭 i can't explore things now 😭 me just curious
huh 😭
u said ew
did i?
😭
i dont recall
uh
ohh
i meant to say "new"
lol
😭
anyways i smell soem big announcements from lmarena muhehehe
oh yeah?
👋
@echo aurora There seems to be an issue with using image generation on battle mode
Hmm is there an outage?
all models?
oh yeah I'm seeing the same
okay thank you will report
This seems to only happen in battle mode
Yeah
It seems to be working suddenly now
Not when using a specific model
thx pineapple
So, it seems everyone is having the issue I am having.
Wait it looks back to me
Try refreshing the page
oh now battle in text is messed up too
Image battle is working for me
try using it a couple of times
it seems to work randomly
Good day, dear community. Please tell me, I'm uploading a photo now and when I send a request I get an error. Are there any updates happening on the servers now or am I the only one with this?
text direct/side-by-side is working fine
yeah
and the errors back again..
Okay this feels rly inconsistent, let's give it a couple of mins
right
Suddenly it doesn't support image response
I force-refreshed my browser cache, and still says the error.
I think this bug is almost experienced for everyone
what version? text battle?
pineapple is it possible for LMarena to have a retry button
image battle
A couple days ago there was a quick outage, I wonder if this is similar
could be
I am not sure how to advise the version you want to know. but well, "Something went wrong while generating the response. Please try again." is displayed when I am trying image battle.
Gotcha
BTW why claude opus do not support image?
It's a bug 🙁
I have a task that seemingly only Grok 4 in warm start (means there is previous messages, even though orthogonal since it is for architectural / urban planning discussion) can guess within 3 prompts
can achieve
Whereas other models Gemini 2.5 Pro and GPT-5 also succeeded, but with 6-7 prompts and 9 prompts respectively
It is about guessing artstyle btw
The task is deceptively simple: guess from subject matter
I am trying Grok 4 cold start to be equal
error seems to be gone. Thanks
It's good now!
Hi everyone! I’m new here, excited to discover LMArena and to experiment with video and image generations. Looking forward to learning from you all!
Hello and welcome 
uh oh is lmarena down?
im also stuck in verification
nvm i reset my brwoser it works now
It works
I'm on it right now
It's your internet connection
Also, don't spam refresh cuz it mmay block you
alright its working now
Hello
"Generate a composite image of the model showcasing side, back, and all perspective views, all combined into a single image."
Nano-banana result:
Excellent result
great instruction following
all claude family model do not support image input
Maybe Claude is too expensive
So they cut image input function
gpt5-high > o3-high
maybe I need more iterations to test
is this bug hard to fix 😢
but so far for frontier models, gpt-5 (these prompts were not yet optimized since they are deliberately reactive and iterated) took 2 prompts more to guess
what was the test, exactly?
artstyle guessing
can you paste the image/prompt? Curious to see it 👀
I see if there is any sharable
lmarena doesn't have sharing
mind sharing grok? (since it seems to be one of the fastest contender)
hey i want to know what generation model was used - and i got 3 -3 votes..... wh y can't i see ? @echo aurora
What is the rate limit on image gen
Hi
The image didn't appeare and I clicked regen a bit too many times
And it says I need to wait for an hour
do anyone know ?? I really wanna know
@gaunt meteor ya it take a long
Well perplexity is doing great , i just wanna to know how their "discover" feature works the news (th news that summerized by ai and it gets uploaded by it self ) and it covers all categories , it's kinda amazing
Why can I not generate, I try click agree and it gives me an error
yeah it is very good. But I think they did a mistake calling everything "gpt5". Especially calling it gpt5 for free users....
Free users the best that they can get is gpt5-thinking-mini (low to medium reasoning effort)
that is unimpressive at all
I think I am getting ChatGPT team for more testing
and then gpt5-minimal is just bad...
what is that even
oh btw is GPT 5 Pro worth it?
the model
IMO they probably should have called gpt5-chat and gpt5-minimal - gpt4.2, and then medium to high reasoning effort as gpt5. With being explicit when it's gpt5-mini (free users)
The one plus users get in app is ass
The way it is now they are kinda dilluting gpt5 name into models that do not perform...
gpt5 with reasoning effort set to minimal. Minimal is below "low" and basically means no reasoning
Hey sorry to say this is a bug we're working on fixing, I anticipate it'll be working by end of day Monday
yo glad to be here
if a model like GPT-5 had real memory instead of just context windows, would that feel a bit like AGI? Or nah
okay... hopefully it does
Damn this model is really good
It's very bad if your prompt is not detailed enough
I need video prompt generate any recommend models?
Like other models are better
But the performance is very poor when you do that... gpt4o level. Which is where this naming ambiguity comes from
..
Sometimes Nano banana gives me back the exact same image I sent lol
I'm getting this error "❌ Generation failed. Failed to create evaluation session." all the time. What's that about?
in the video chats
Okay, I have something interesting about Nano Banana
I sent this meme
and I told it : "edit this reddit post like it was from an alternate reality, the guy actually got a tattoo, edit the title so, and add a tattoo to his arm"
And it's the best result I got so far!
It understood how to edit the text etc
magnificent
Hi i just joined to ask some question from curiosity, can LMArena generate images while using ST? tia!
i wish the immediate gpt-5 was genuinely improved upon
what in the world is st
amazing gif
STI
amazing gif
magnificent gif
blud thinks he's me
IT'S IN FRENCH
Silly Tavern
yea cuz i'm french duh
HOW U DID DAT
do what?
npc aaah reaction
Okay, but my point is: why would LMArena censor prompts in the first place? Censorship itself should be part of a model’s benchmarking. If you exclude censored prompts, you end up biasing the rankings because you’re no longer measuring how the models actually respond in real-world conditions
meow
wait this is a very good point
quentin is onto something
quentin be like
"ok but my point was, why woud lma rena censor things ? censoring must be part of a model benchmark. By excluding censored prompt, you bias the ranking"
genius aaah insight
They probs don't want much corn in the dataset
bro tryna be smart 😂
already checked this but still doesn't work, maybe LMArena doesn't support Image generation API to ST directly
WHAT
source?
I almost forgot I'm downloading valorant!
i've seen screenshots
like a guy receives a message
and the account name is "nano banana
"
WHO DID DAT
Mods ig
PINEAPLLE
it seems it doesn't really work in ST, i kept trying it
or you're on coke
COME OUT HO
Yeak keep conversation related to AI and safe for work please.
pineapple
do you know how people get to use
you're so nice
models on discord?
I have a question, do image gen works as back end for ST? thanks
Yeah more info is in #1397655624103493813
ST is interesting material bro 😭 what type of images you wanna create?
how?
Silly Tavern
with /image-to-video it seems that if you upload a .avif file it will fail with an uninformative error
i don't understand the accusations
I'm not sure
Uh my bad, my friends use it for very questionable stuff. Forgive me for my rudeness.
Good afternoon, what site is this? Can someone tell me, and what are the other sites?
yeah but we can't send images to edit
Does anyone know how to put a website inside a modal? For example, websites block x frame, does anyone know an alternative way? I've already tried Opus but it doesn't even solve the problem.
Try gpt5
going to try to repro 👍
thanks!
@echo aurora can you help me, anyone?
it's not related to ai or llm arena
Nano banana
Edit the image so that half of the face is Vladimir Putin’s face, and the other half is his skull. The skull side should not have long hair like in the original image; instead, it should have the same hairstyle as Vladimir Putin. The two halves should blend seamlessly, with no visible line of separation, just like in the original image’s style.
If you can answer this I would appreciate it
where tf can a girl go to nerd out about models
it's actually not good
I'm not sure I'm following the question
Yeah the separation
is not good
But it's difficult
huggingface server is 
The task isn't easy
here here
Please someone help me
can I have nitro papi
I really am curious since i did it before but with separated API of image generation and LLM that combine to work on Chat and Visuals, but it's kinda difficult and messy to set it up so that's why i asked if i could use it as a chat as well as image generations for immersive interaction
win one of our generation contests!
GURL PLEASE
IT'S 10 BUCKS
JUST GIMME
its been 2 years anyway so it doesn't matter anymore, just way out of curiousity
ayo bro calm down before you get yeeted outta the server
uhhhhhhhhhhhhhhhhhhhhhhhhhhhwell i would
if that's what ppl were focused on
well they are but not quite
i always end up nerding out to ai instead of actual human beings so
pineapple loves meh
I love pineapple pizza
ew
me too
ew
#video-arena-3 message seems to be working for me, are you running into an error?
only if "pineapple" is on my pizza 🫦
you cant sned images
when you're in /image prompt:
Do you have any solution to put a website inside a modal and make it load? Or do websites not block iframes? Is there any other way? Do you know anyone who can help me? @echo aurora or does anyone know?
nah lmao
it's just disgusting
and has nothing to do on a pizza
ok light yagami
Pineapple Pizzas are the best
cry about it
Stfu
based
Yo
cuz it's just a disgrace to italia's culture
Everithing good and you?
@lofty elm Can you help me someone?
I am italian lol
Well too bad im asian
yeah i understand you then lol
cry about it
who cares, it's just disgusting
it's like eating noodles with ketchup
or ramen with burgers
i cut pasta to half when i cook spaghetti
Or put soya sauce on rice
nano banana is so weird
what
Fine, how's your testing coming along? Have made any benchmarks?
EW
Hey so as much as I love convos about food let's not discuss that here. This should be a channel dedicated to discuss AI related topics in good faith.
how do i access this?
but he remains so serious
am I tripping or is that skeleton having hair? Mind sharing the prompt?
What is the actual rate limit for image gen
Battle mode in image generation
ohhhh
the original image is
We don't have this info listed somewhere, we're looking into possibly sharing
isn't it tiring to write a prompt? i've been there before
Does anyone know a solution for the site to load inside a modal? And does anyone know the name of this site?
blud is lazy
-# this place currently sounds like a minecraft pvp smp hosted by a 13 year old 😭
No sorry I don't know the answer to your question.
it's 2023 image gen prompts so i guess you wouldn't understand how it's hard to generate good outputs before
i wish there was an app where i could just benchmark models on anything on demand
meow meow meow
it feels very possible
right
I want to know, I'm asking for real help to load the site within a modal when clicking, for example, on a button with the link, and to know the name of the site that was in the image. I'm looking for solutions
Interesting idea, know of any sites?
i have no idea what you're talking about, is it from LMArena?
...
i wouldn't be saying this if i did
soka
I'll explain it better when you click on the icon with the link, I want it to open in the modal, you know, inside a box inside an iframe, but they block the sites and iFrames, you know. And then I'm trying to find out if you know anything, and if you know anything about some technique, some way to make the site load inside an iframe, you know. I'm also asking you and others who have information, for this information. I'm also asking for information about whether anyone knows about this site, this site that the guy showed that GPT was at the top.
i have no idea im sorry, my only focus is on ST from only chatting,
Do negative prompts work? curious
...
you can't vibe code if you have no coding knowledge like me! :3
and no money
As far as I know, if the site blocks iframes, you cannot do much about it. There are no techniques to bypass it. Maybe instead of trying to load the entire site, how about you pull certain data with APIs?
As for your second doubt, I see it is some sort of benchmark, I tried using Google lens and AI vision but can't find it, just search the 'benchmark-type' + benchmark and surf tbe net.
Is the site being extra slow?
Over 200 seconds in Battle and the images haven’t genned yet
Nevermind
It’s only on mobile when it acts up
On mobile you have to keep making new chats and can’t continue to make new battles in the same chat
@echo aurora
Yeah but I think it's very hard to do as the model size is not huge and performance of gpt4.1 was already very respectable. And then when you do hybrid reasoning, it gets even harder, quite significantly harder lol
It has to improve on non-reasoning WHILE also do better than o3 when it is reasoning. And all of that without increasing model size
mission impossible. Which is why gpt5-minimal performs worse than gpt4.1 🤷♂️
That's a more acceptable compromise than it not being able to beat o3 when reasoning
hello
Is difference smaller now? just 6 points between 2.5 pro and Gpt-5-high.?
I thought it was 21?
Idk but gpt 5 high is way better than gemini 2.5
Google is benchmaxing
gpt-5 high is so good
lol 🙂
More like GPT-5 is good but not as good as we hoped for
It is true probably
i provided complex mc(block game) problem using a mod library that is like unknown by anyone and gemini couldnt do sh t and didnt even come close to not writing errors in the code
to gemini 2.5 pro
and claude 4.1 opus absolutely destroyed gemini in that
provide unique problem to gemini: it dies; 0 iq
i thought it is well known that coding is one part where gemini is behind a bit.
in other tasks, I think it is on par
in any case, I am just surprised that GPT-5-high is only 6 points above 2.5-pro.
^
Thank you very much, friend. I found the solution. I'm going to put a browser in the mini browser, something like this, and when someone clicks on this frame, it will be linked to a project of mine in the repository, which is a browser. You know, a web browser, a browser page, you know. I'll try to use as much programming as possible to make it work, but that's basically it. We'll have to create a browser so that when someone clicks, it won't be directly in that iframe script thing; it will be connected to this browser. Project complete.
what's up friend?
How are you working here, by the way, isn't it?
ive seen many models with the same quality
its not challenging to gen something like that
Hi Asura, how are you?
Hi. I'm new can anyone tech me how to create videos
i just blocked this guy
We're in a race. It's not USA vs China but humans and AGIs vs ape power centralization.
@deepseek_ai stan #1, 2023–Deep Time
«C’est la guerre.» ®1
should i unblock or nah?
im feeling good tho
Did anyone notice any change in gpt-5-high behaviour in last two days? Twitter says there was an update couple days ago to make it 'warmer' but I dont see much difference...
wassup
I love the nano banana model i saw in battle mode
This is image model, isnt it?
She did the same thing as Dogedesigner did with teortaxes, I think she was influenced, sorry for being kind, everything is the opposite for her nowadays
So sorry, at least I don't want to shame myself here.
Discussion time: Do you guys think Claude is severely underrated? Its win rates against GPT5 is impressive.
We've made a new channel to discuss the topic #nano-banana 🍌
You did good. That acc probably in-need of reporting as well... 🧐
oooo goodie
It’s an arena for LMs
I suspect that summit and gpt-5high are not exactly the same
was summit better ?
what is Lm
imo yes
Although OpenAI says they are the same model.
They at least changed some of the system prompts
It’s an m that is l
Languge models. I am curious why are you on this discord if you wont know what LMArena is?
It might also be because people tended to vote for summit and zenith directly in the anon testing phase when trying to find them
YESSIR
imagine all the messy environments and compilation issues they have to prepare for 😂
yes please
its ok
hello 👋
Hello😊
Hello y'all. I like to make AI tests.
lmao why did nano banana get a whole channel
It's just that good.
i need it GA so bad
We will see next month ig
Does anyone know anything?
yeah it has all the reasons to do well. o4-mini except improved
their previous naming was selling that model better ngl
He was probably confused
and like comparing gpt5-mini with reasoning against gpt5-chat
why is it called nano banana btw
it may be better in some isolated scenarios but for the most part with the same settings it is worse like for like. But you can't be comparing one with reasoning the other one without, or different reasoning efforts / verbosity
If you do gpt5-mini-high vs gpt5-medium, then I'm sure they are comparable... Like how it was with o3-medium vs o4-mini-high. But these are not the same settings
¿?
no I was born without a brain. just a lowly brain stem barely able to keep my basic biological functions running
Hey on the new LMArena can we use repo on it like in the legacy ?
how can I make videos here?
😭
Hi
hi
why gemini 2.5 pro deep think tho
because its great
yea
There used to be a graph that compared the cost per prompt vs the score on the leaderboard. Is that no longer maintained for the new site?
Really depends on the context, most knowledgeable without having to look things up, best at problem solving, fewest hallucinations, etc….
does anyone here have access to Gemini 2.5 Pro deepthink? i would love to run a simple benchmark but dont have access to it
ive ran the benchmark for alot of models and only 2.5 pro deepthink is left basically
its only 10 questions
I would but how do I know you’re not a hacker named 4chan
WHAT WILL COME FIRST?
13
16
1
GEMINI 3
bruh
lmao
everyone is dumb including me. Wow
should the word "clank*r" be censored in this server guys? it's offensive to the ai bots that helps us all
Hii, so the legacy site is definitely dead :'vv??
Yeah I ranted about it in here last night…I use it a lot more than the current site.
You can change the temperature and amount of tokens used on the legacy site while on the new one you can’t
Apparently they’re ‘looking into it’
@echo aurora filled me in as much as they were able to
lmfao
Nope
tf
i don't know why Google released such an expansive model, similar to the o1-pro, just to top the benchmarks, no one even uses that model lol
My prediction, nano bannana is the native Text generation model for Gemini 3 flash, which will release alongside Gemini 3 pro, sometime in September
Gemini 3 pro, will be much better at coding and math, while being a little more sterile at creative writing than 2.5 pro, it will get a 65%-69% on simple bench , and be a net improvement to 2.5 pro, by a solid margin, but not a breakthrough
I will also bite off all my nails waiting for it since it’s been way too long since a true sota model released all the way back with Claude opus
can I have the prompt?
hello
You said gpt5 would beat gemini
not without style control
Best at what 
overall
value for money maybe
Google wins easy then, cloud storage comes with the plan
Which one as an API?
Now you complicate things 
a little
Today aside, I think Google will be the victor long term, OpenAI needs to turn a profit sooner or later, Google doesn’t, they can just funnel all of the extra data collected to help their advertising platform, since that’s what they are at the end of the day, an ad platform
Since Microsoft is backing OpenAI, I don't think money is an issue. Although Google has more money than OpenAI, it released Gemini 2.5 Pro as its flagship AI model. However, Grok 4, a better model, was released a month later.
Microsoft hasn’t given them any more since the initial investment and there has been friction between them lately with OpenAI trying to execute the AGI clause in their deal to stop having to share research with msft
hey guys , does anyone find o3 search more professional and organised than gpt search?
Despite everything, GPT 5 came to Copilot from day one.
R2 when it finally comes out.
Google throws money at the wall all the time to see what sticks, that’s literally all they do… the more ads they can serve to people, the happier their shareholders are
They profit from delivering more ads in more places…
I heard that Deepseek R2 was postponed because it wasn't good enough.
Shhhh, let em cook
be chill
People are feeding their deepest desires and secrets into Gemini, that will enable them to target ads better than ever before
Advertisers pay more for better targeted ads

Yes
Google is an ad platform… it’s what they do
Any promising stealth models?
can do the same as i did?
hello
They de-associate chats with users before sending I think
The data might still be useful for advertising algorithms
Especially if it has anonymized demographic data attached (age group and gender buckets)
Sure they do
In my opinion, the difference in output style between GPT-5 and Gemini 2.5 Pro stems partly from an overemphasis on reinforcement learning and an expanded knowledge base in fields like mathematics and physics, while lacking the corresponding human alignment seen in Gemini. I believe this is a positive trend. As a model's intelligence increasingly surpasses the collective thinking of any single human group, and with the prevalent use of parallel computing to enhance AI capabilities in the most advanced versions, GPT-5 sets a precedent for the future: one where humanity, in turn, aligns itself with the AI's trajectory.
did u use ai to write this 😂
How could you think like that?
How to create a vedio in this
Contamination is not a huge problem if the dataset is diverse and substantial enough. Though the biggest needle movers are creating new benchmarks and releasing related research papers. Rather than writing a blogpost about how much everything supposedly sucks. 👀
all model is error ?
any people who works at lmarena here, maybe increase the threshold of nano-banana appearing by +50%? 👉👈
Why not 150%?
Refresh the page
yes i had, but not work
no work
anyone facing error?
work not found
+1
Just checked. It seems unstable and some errors yeah, though some requests get through:
okso why didnt we test gemin i3 in aistudio
and thanks for reminding which benchmark can be fully ignored. simplebench
aka a bench which has nothing to do with real life tasks
just no. just no.... sigh sigh. so many sighs. just sigh.
is https://nano-banana.org/ scam?
Nano Banana — preview AI for inpainting, outpainting and background replacement. Mask-free, text-guided edits for product and creative workflows.
Looks like it was AI generated.
i cant find banana model there
And looks ugly
i really got Very impressive responses
Hey on the new LMArena can we use repo from github on it like in the legacy ?
where is flux kontext max in direct chat or not even in battle mode
It's actually a decent benchmark for reasoning, spatial reasoning to a good extent. Reasoning can trickle down to and affect execution of most IRL tasks
pls fix
they are doing something IG, something new or updating something
So you don't get issues like instructing it to change some visual element on the website and it's modifying the wrong property or not accounting to it's positioning in relation to everything else properly etc
Is the reason why I can't create pictures or videos now simply because of the large number of users?
guys where can i get nano banana api
No API yet it's a beta
I have this bug since 15 minutes now can't generate anything 😂
Same.
Same.
how long do u think it gonna take them to produce that comerccial api sir
Idk, but for now just use it as much as you can, we might need to pay for it later
zдраvстvуйте
I wanted to build saas and connect it my app so I don't want to use by myself but for other people
Hot take:
If GPT-5-chat is complete garbage without reasoning - 40 ELO below GPT-5-high and worse than 4.5 on LMArena leaderboard, suppose the new DeepSeek-R2 base model will be at least as good as latest Kimi - just around 5-Chat's performance