#general
its a flash model so obv its worse than 3 pro
Just kidding
Here's the thing
we are testing the EARLY versions
the final flash could be worse or better
but apparently it is supposed to be as good as gemini 2.5 pro
yea
but expected better tbh
OpenAI already put hazel as a new image model into LMArena
lets see the cost first
its OpenAIs so its prob gpt img 2
officially called gpt image 1.5
@torn mantle do you remember we used to chat in the pplx server?
there are two editing and two image gen-only checkpoints on lmarena, but more exist on OpenAI's side
The competition is Cutthroat
Here's the thing with it
It had a style that identified it as gpt img 1
When i saw it i knew what model it is
it's good at making studio ghibli edits, reproducing album covers, putting a yellow filter on images, and not much else
@whole sundial is Nova Actually Good?
not really except for the unreleased pro model
nova canvas image gen is terrible, editing is even worse, stable diffusion img2img would do better
hazel-gen-2 looks like a solid model ngl
Oh well it will get better in future
The Benchmarks are kinda good on artificial analysis
it's the smaller one, 4 is better
What did u test?
seahawk and skyhawk models are experiencing deployment failed errors, so we wait for pineapple to fix
wait let me find it
ok
unreleased pro model.. you mean stealth-nova? @zealous sparrow
?
i think you meant silentnova @quartz light
@whole sundial did u test Tensor 1.5 More?
raptor
not yet, but i will
there is like a new model in battle textarena called silentnova
and we suspect its amazon
because of nova in the name
sounds like google or grok to me
So what's your opinion so far?
it's obviously still a chinese model finetune but it's not a bad finetune
Is movementlabs good? idk who its done by and what its best at
wasnt nova amazon?
It's
im tryna get silentnova to get the info of who made it
Is a paradoxical situation
:/
They can't afford to lose on Benchmarks
@hardy swallow
The world basically judges by it without experience
<@&1349916362595635286>
Dude every time I ask Claude 4.5 Opus what model it is it says they don't have a model
wait a sec
Then when I ask if it is 4.5 Opus it says there isn't a 4.5 Opus
lmao sonnet what
#1397655624103493813 create video of dog
why is silentnova so rare
Ok, be honest. Anyone got silentnova in textarena yet?
i haven't used it
You'll want to head into the #video-arena-1 channel and type out /video then provide your prompt.
I still refuse to believe holo-scope is google..
Why would google make 3 models for Battle??
baidu? minimax? some other chinese ai company?
So like
frame-flow used to say its google right
later changed to be ERNIE
this might be the same case
I know that
Their coding results are what reveals them
this is fake we have not released a score for gpt-5.2
Metter btw
Yup
do not spread fake information about us
I created it via Nano Banana Pro
Relax
Is that fake guys?
Nope
Looks real to me
ok, sorry if it's a joke that's ok, just I've seen a lot of people spreading fake LMArena scores around recently and it got on my nerves
I mean who would believe it
No way is it free on api
Sure!
sadly people do 🤦
Posts like this got a bunch of comments and retweets https://x.com/teslaownersSV/status/1997008559469650261
yes
Tweet by Official account
i know how to check for sure
not yet but will try
I remember I went to their official website and there was no model
Elon Musk signed up this account.
ERNIE is a conversational AI developed by Baidu, global technology leader from China. It's designed to understand complex questions, provide clear answers, and assist with learning, problem-solving, and communication.
they revamped ui i hate it
bloated
Yeah usually ai companies lie about their model being "top 1"
Only xAI lies.
I hate when AI uses Chinese in its thinking, very annoying
Sometimes they just write it in the output
for me it doesnt
Z.AI does for me. And if I use a VPN, ChatGPT and Gemini think in the language of the country of my VPN.
I accidentally deleted a chat. Does anyone know how to fix it? Thanks.
No way.
<@&1349916362595635286>
oh god, thanks
Sorry to say there isn't a way to get this chat back once deleted.
Grok 4.2?
we are not deada-
when will we be able to upload files other than images in lmarena for coding etc...? is that in the plans?
okay jee-
Flash 3 is definitely coming
ok but like
what can holoscope be
seahawk and skyhawk are flash models
then what is holo-scope??
it's a media trick innit? the facts themselves are not quite wrong, grok 4.1 is somewhere in ranks 1-3
what do you expect on X anyways
it's full of bots and elon praisers/haters
Perhaps GPT 5.2
Grok 4.2
Possibly GLM 5
Grok itself aint a bad model from the pure perspective of a casual user, Iunno about programming
There'll be Grok 4.20, not 4.2
It could be a new qwen model too
we changed our ranking so that rank column is now without ties, they are not #1, they did not have "Highest Elo ever recorded" and they did not have a score of "98.7/100" it's total nonsense and lies
hello, does this site have a subscription?
no
Musk confirmed it several times on Twitter
yeah that's media trick, 1 truth 3 lies, 5 embellishment
I know but why an extra 0
it is what it is
Ask him
so if i have an account do i have unlimited chat?
No
There are plenty of limitations
Kinda but there are rate limits
Especially for the Opus and Nano Banana models.
gotcha so the chat limits are a bit low?
These are very strict, others not as much
I mean yes
I said the same
Mainly for Opus 4.5 and Nano Banana Pro and such expensive models
All opuses, and all bananas. There are plenty of them.
True but wait they count differently?
I thought they count together
No
@echo aurora this should not override the last used setting if set manually, please tell team to fix this before they push this update
That's Actually Good
grok imagine 5
prompt: "turn his frown upside down to turn it into a smile and make him purple and make his team green"
Is it on lmarena
is this a new model or what
idk whats the prompt here, but if its google then we can assume its flash lite
Idk if you noticed, but does it seem like gemini nano banana pro on lmarena is down almost every day?
meh
ooo where do i try this?
Any news on new gpt model?
what new gpt model
I expected it to be released by now. Wasn't today the release day?
no its thursday
It's an expensive model and I believe lmarena pays for its use and it's not free from Google.
LMArena may have a quota on the number of queries per day for it.
(generated with nano-banana-pro-2k on lmarena.ai)
What is this new gpt garlic model thats being talked about
improved gpt 4.5
ok
never
LMARENA pays for other apis already
it will never get an API
anyone know any free unlimited video gen apis
or do i have to selfhost on comfyui
why are there so many bots
👋
👋
👋
👋
???
what was the prompt? is holo-scope another ERNIE then?
say pineapple if you are real
say pineapple if you are real
say pineapple if you are real
@zenith scarab @proud sail @shell crypt
<@&1349916362595635286> the hand-wavy guys are bots, it seems? at least some of them
any luck on uh integrated-info textarena yet
yup
LOL 100%
what are they for???
really weird
they're all bots
there are three video arena groups, how do i know which ai model or tool they use?
Fast is fine but accuracy is everything
custom chatbots for support, onboarding and communities
automation agents for daily tasks and workflow operations
RAG search tools for documents, knowledge and internal data
speech-to-text pipelines for meetings, summaries and reports
content automation tools for writing, planning and rewriting
AI integrations with major platforms and APIs
small custom AI tools for everyday use
I am an AI Developer focused on delivering stable, high quality results.
feel free to reach out anytime if you need support on your AI project.
what
<@&1349916362595635286> autopromotion
and bots
lmao
Fr
Oh look at this, using AI to make a scam video. #video-arena-2 message
@torn herald imagine not knowing how to use a self bot. Loser 🤡
LOL
check dms
If anyone doesn't read Spanish: "you have a transfer of $100,000 at Banco Pichincha and it is completely secure." Typical scam phraseology.
If this is actually what they are using it for, then yeah imagine committing crimes in a public discord
Fr
Well we're all anonymous here so - if the account is spotted, just use another.
Just like that guy who has been using a dozen accounts to generate scenes from the same film for weeks now.
Ehehehe
U forgot a thing
Ip grabber?
Guys did chatgpt 5.2 release yet?
Nop
Well I pointed out that multi-account guy to Pineapple and the other mod; from what I know they didn't do nuffin.
I guess the 9 December is gonna be on 2026 atp
I think its hard to ban someone from videoarena
Instead of this December
Yeah, when the user in question most likely uses both a VPN and perhaps other tools. [I'm only on a VPN myself, making me appear 800 klicks from my actual location.]
But anyways I hope 5.2 is gonna be good
Though it seems kinda rushed so idk if it's gonna be better than gemini
no
theyll probably say "oh its delayed because we need to refine it"
or theyll release it last minute
only certain undertale players will understand who this is
[skyhawk]
https://019b04b0-5fe9-7b27-99b5-74e806887f1e.arena.site
im sharing bc
of cool design and 3d
also there is no victory
you just die
is that undertale bossfight made with ai
no way
yes way
do you know who dis
tho
walmart on the left
original on the right
admit its identical kind of
grok 4.2 will be sooooooooooo bad
we can already tell
i mean he usually hype slop models but i havent seen him sayin anything like 'IT WILL BE THE BEST MODEL'
elon musk yapping again
please dont be ass
please dont be ass
only thing xAI is good at is UI design for ai websites
skyhawk is my beloved
how are you finding this model? Good or meh?
Gpt 5.2 just broke lmarena elo rating leaderboards
And grok 5 imagine is #2 in image editing
And also the gpt image 2 is in top 5
@queen veldt wth are u talking about?
People spreading misinformation in the last few days
X is full of posts
This grokipedia is terrible
'He' considered wikipedia to be too left leaning [wiki is quite bad but not for that reason] - so he launched an alternative.
Grokipedia is a piece of crap
Hey everyone - sorry I was in a meeting and couldn't respond to the 👋 reports. I've gone ahead and let our mods know to start removing the "hellos/waves" content from the #leaderboards channel, so we'll start to be more active in keeping the discussion in #leaderboards about leaderboards. In terms of scams they're making with Video Bot we also want to be removing this content, but would note this stuff is a bit more difficult to spot (compared to the other stuff we're modding out), but yeah if you do come across content that has bad intentions don't hesitate to ping our mods.
For those that are a little suspicious we won't be moderating them out, unless they start breaking server rules; being disruptive, etc. What this looks like can be a bit subjective, so don't hesitate to let us know if you think that's happening. But overall I don't want to start booting people from the server just because they're sending a 👋 emoji (unless it continues to escalate into something that we find annoying). Hope this helps. Let me know if you have any thoughts or feedback on this. At the end of the day we want to build a server that benefits the community, and you all sharing your thoughts will contribute to that.
Elon musk stuff = low quality stuff
starship = low quality?
Never heard about self bots?
They were spamming with self bot
I tracked all of their messages and they never sent a message different from the wave
Agreed
Also the member doesn't have any role, which increases the % chance of it being a bot user
And also no pfp
And we are at 78% of a possible ai user
Then, do whatever u want. U are the leader here not me :>
Yeah I mean these are all things that point to bot, for sure. However, my concern is if I start banning these folks, even if a small percentage is actually a person, I'd hate for them to be collateral damage.
Now if these bots were causing harm, that's a different story.
But at the moment it seems like a lot of 👋 in leaderboards (which we're now going to be actioning).
This can change though, I'm very very much open to changing how we do things here if it's what the community wants to see.
Try going to the Discord moderation section. U can see all the messages sent by the member. If all the messages are waves, that means it is a bot
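A rough sketch of the "all messages are the wave" check described above, assuming you already have a member's message history as a list of strings. The marker set and the function name are made up for illustration; this is not a Discord or LMArena feature, and as the mod says, a match is a hint and not proof.

```python
# Hypothetical greeting markers; tune these for your own server.
WAVE_MARKERS = {"👋", "hello", "hi", "hii"}


def only_waves(messages: list[str]) -> bool:
    """Return True if the member has posted at least one message
    and every non-empty message is just a greeting marker."""
    cleaned = [m.strip().lower() for m in messages if m.strip()]
    return bool(cleaned) and all(m in WAVE_MARKERS for m in cleaned)
```

A member with an empty history comes back False, so brand-new accounts are not flagged just for being quiet.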
We should talk only about models in leaderboards, right?
Hmm yeah
I'd say anything AI leaderboard related is fine. Doesn't have to be LMArena's leaderboards, it can be others.
And future models too?
Yeah speculation about where models will land is totally cool to discuss.
Would note too these server rules can bend a bit depending on the context. If two people are having an in-depth convo about leaderboards, and the convo starts to go into a new direction, for the most part we're not going to step in. However, if two people immediately hop in there and start discussing something unrelated, that'll be treated differently.
A lot of the moderation stuff we do is going to boil down to ✨ it depends ✨.
And yeah I agree that it's very sus looking, but unless it's becoming disruptive I'd rather the mods (and myself) focus on other things before we start prioritizing these kinds of actions. I hope that makes sense.
But again want to reiterate - let me know what you all think. I'm happy to make changes.
Ok bud :>
I edited it with AI for better image quality.
Why would you do that?
2 minutes ago
Naaaaah
No its faken
Lemme cry
It doesn't matter, I wanna bring joy and a good mood to people.
There is no joy in this, fake things are fake... But you do you I guess?
Don't tell this guy about The LMArena-discord gork incident
Tell me that
why does LMArena ai have rate limits?
because rate limits are to prevent abusing the ai testing?
think thats common sense
they are too much
the rate limits are insanely high
high or low?
you're testing ai, not using it
uhhh
oopsie
i kinda use it so i dont have to pay for claude
kiro
i send 5 messages to claude and
Rate limit reached
Opus or sonnet?
opus
There are two opuses with thinking and without, you can use both.
stop with ad
what do u know about spintronic memory?
But it's expensive and impossible for regular humans to achieve
sounds like scam
I've just read about it
IBM and Stanford University set up a lab to harness a quantum property of electrons called spin. New chips based on the so-called spintronics technology will be faster than conventional electronics and generate far less heat. Amit Asaravala reports from San Jose, California.
its been since 2004
It's super expensive tho so i don't see it in near future
But we got the ibm quantum computers
You can get 10 mins of ibm quantum computer for free
Datacenters aren't the future
They must find some other way to make ai power-efficient
What would i do for 10 mins with quantum pc
just found it myself
voted for ChatGPT-4o over it though, lol
Ten minutes on a fault-tolerant, large-scale quantum computer capable of running Shor's algorithm at meaningful key sizes would be enough to collapse most classical public-key systems if the attacker had already captured encrypted traffic. RSA and ECC would fall. Any stored handshake, any archived encrypted session, any intercepted key exchange becomes readable.
Such a device does not exist. Current quantum machines cannot factor RSA-2048 or break ECC; they are orders of magnitude too small, too noisy, and too slow.
If a future machine reached the required qubit count and error-corrected stability, preparation by pre-recording traffic would make the ten-minute window sufficient to run the needed quantum circuits and extract private keys, which cascades into broad compromise of internet security for any system not migrated to post-quantum cryptography.
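For anyone wondering what Shor's algorithm actually buys you: here is a toy sketch of the classical part only, under the assumption that we find the multiplicative order by brute force. That order-finding step is the one a quantum computer would speed up; everything else is ordinary arithmetic. The function names are made up for illustration and this only works on tiny numbers, nowhere near RSA-2048.

```python
from math import gcd


def multiplicative_order(a: int, n: int) -> int:
    """Smallest r > 0 with a**r % n == 1, found by brute force.
    This is the step Shor's algorithm replaces with a quantum subroutine."""
    if gcd(a, n) != 1:
        raise ValueError("a and n must be coprime")
    r, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        r += 1
    return r


def factor_via_order(n: int, a: int):
    """Classical post-processing: turn the order of a mod n into a
    factor pair of n, or return None if this choice of a is unlucky."""
    if not 1 < a < n:
        raise ValueError("pick a with 1 < a < n")
    g = gcd(a, n)
    if g != 1:
        return g, n // g  # a already shares a factor with n
    r = multiplicative_order(a, n)
    if r % 2 == 1:
        return None  # odd order: try another a
    y = pow(a, r // 2, n)
    if y == n - 1:
        return None  # trivial square root of 1: try another a
    p = gcd(y - 1, n)
    return (p, n // p) if 1 < p < n else None
```

For example, `factor_via_order(15, 7)` finds order 4 and returns `(3, 5)`, while `factor_via_order(15, 14)` returns `None` because 14 is congruent to -1 mod 15.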
For how long does LMArena host a website that was generated at my behest?
The preview link from Code Arena?
What year do yall think OpenAi will release Garlic?
...This has happened to me ever since today, every model says "something went wrong" even for test messages. Did anything change overnight?
#1397655624103493813
<@&1349916362595635286>
@knotty whale Hello! please read https://discord.com/channels/1340554757349179412/1397655624103493813 to learn how to use the bot properly and https://discord.com/channels/1340554757349179412/1397655695150682194 https://discord.com/channels/1340554757349179412/1400148557427904664 https://discord.com/channels/1340554757349179412/1400148597768720384 for your creations.
This one has free models kinda
They might have a video generator
When you send message?
Maybe refreshing could fix it, or there might be a bad word in the prompt, so LMArena refused to send it.
What's that?
holy
What did they do to make the site even more unusable now
hello
hi
So this might be a stupid question: someone mentioned a token limit? After you get the retry, how long do you have to wait, 1 hour or 2 hours or? Is the token limit aka the rate limit?
Hope you'll understand what I'm trying to ask lol.
Also forgive my spelling mistakes
Hello Andre and I'm all here cuz I want to learn how to edit and create videos and edit photos and images. I'm trying to create a podcast and I want to be able to edit my own material as well as create some material myself for skits.
Federal officials announce arrests in a widespread international AI technology exporting scheme with links to China.
Told u
No
like tracking style
Ya
No
But
It has a little less guardrails
ah
how to generate video in LM arena?
Hello! Please check how-to-video-bot to learn how to generate videos.
I figured out how to unlock Pokémon
Why doesn't grok have an images button
how
Will Gemini 3 flash be better than Gemini 2.5 pro?
3
4
2
No
Has anyone found any anons in the search arena? I haven't. Weird there's 1000 anons in text arena but not search arena
Hello guys, is anyone else experiencing the "Something went wrong with the response, please try again later" error mid chat?
Has a 50-minute waiting time been added to the regular Claude Opus 4-5 session?
I am using Claude Opus 4.5 and Sonnet 4.5. Same problem with both models. But it works when I open a new chat…
Waiting time ? After how many prompts?
Claude has the worst rate limits on this app, by far. They are literally run by communists. That's why they are going for regulatory capture. Government approved monopoly and they will be able to charge insane rates for unlimited or increased limits.
I don't know exactly how many requests, but I created a new chat with Claude 4-5 and, after a few messages, I was blocked and can only use the chat again after 50 minutes. Until an hour ago, this waiting time didn't exist in this version, but it did exist in the Claude 4-5 (Opus Thinking) version.
We need a full on boycott of Claude to send those commies a message we won't bow down to their crap
I was using Sonnet 4.5 20250911 yesterday night and got the "Something went wrong with the response, please try again later" error mid chat… then i started a new chat using opus 4.5… now the same error again… But after your message, i went back to my old chat and still the same error…
Clear browser cache
Then log out and log in back
I agree. On official website, Yes. On Lmarena, No.
I used the website and it allowed only 4 prompts. But LMArena allowed more than that…
But i also tried on my phone, same error. And this was the first time i used on phone, so how come same error again?
are you only experiencing these on claude models?
moreover log out and log in back
usually it works for a ton
Honestly, i haven't used any other models yet. I am only using Claude for a coding project so…
I can't start any chat with any artificial intelligence on my computer, so I open lmarena on my phone and create a chat there, sending just one message. Then, I try to access that chat on my computer and the error disappears. Sometimes, the error reappears during the chat, so I send another message from my phone.
So i can go back to my old chats and try again after clearing cache?
I see
of course they are not related to cache memories
log out from your account in lmarena and log in back
and clear cache
Good to know. But for me its been 12 hours and yet i couldn't access the old chat… but according to pranav it works so imma try that
you should try both
Same thing…
I cleared my cache memory, logged in again… Typed a new prompt…
@bleak lake any other suggestions? 😶‍🌫️
I also forgot to mention, switching to different models still pops up the same error… So is this a problem with LMArena itself?
Ping moderators ig
should i just @ moderator ? Or any specific person?
Number of captcha is insane, i have to fill a captcha for each new message
@quick jackal bro you here?
@echo aurora
I did it for you
Thanks bro
He's probably asleep, its 2am, lol
Ah okok. My zone is IST so it's afternoon for me
You know a glitch is bad when even after a full-on reset it does not go away ( was turning on developer mode)
hii
Now this is seriously bothering me a lot… after every few prompts i am getting the same errors…
If the limit is reached, it should mention that.
If i switch to another model, it should at least respond and not show the same error…
My friend had the same problem honestly
Idk if he fixed it or not
Probably could be due to bad internet maybe
hello, came here for the fun video making
Nope… Starting a new chat always worked… But this error is coming mid chat… And it won't load even after reloading, logging in again, clearing browser history/cache
But every time i open a new chat i have to load the data from previous chat again, and if there is a limit of prompts, then i am already using multiple prompts just by typing old data
hello and welcome check out #1397655624103493813 for instructions on how to use the video bot!
How can I create a video with sound?
only some of the available video models support sound, so it's up to random chance if you get one with your request
OK ty
The overlapping brand icons are a chart crime and LMArena should be ashamed of themselves. I can promise you, nobody is going to be fooled. You should fix it or lose all credibility
hello
lol are you serious?
grok is second or it isn't. if it's second, its brand icon should overlap anthropic. Simple as that.
it was generated using https://app.flourish.studio/, we didn't set that deliberately. it's probably alphabetical or based on the placement at the beginning. The video has the names in order, the ranking is clear at all times.
future is coming
Please why is it I can't choose my video model
fr
last time too when gemini 2.5 flash released
everyone scking it off with youtube thumbnails like "the new best model" like tf its a smaller model
i think its the poor people community celebrating
Hello, I'd like to report a problem with the LM Arena software, specifically with the Nano Banana Pro model. Sometimes the model stops responding and displays an error message. This has happened quite frequently recently. I hope this issue can be fixed. The same problem also exists, but to a lesser extent, in Claude 4.5 Opus Thinking.
Yes mods are aware of it, for a while now
It might be the problem with nano banana itself
Those are rate limits because these models are expensive
Okay, but sometimes, even if I didn't use it that day, I get an error message.
That would be due to server overloads i predict
Fine, thank you
Sry and yw
@gaunt spade
"its a wrapper for glm 4.6"
"its a wrapper for gemini diffusion"
1 million-token context window

Gemini 3.0 Flash coming soon.
https://youtu.be/nn-lTCcMJWM?si=3d427lr2pFexH0ma
In this video, I'll be discussing the recent launch of Gemini 3.0 Pro and the appearance of new models called Skyhawk and Seahawk on LM Arena, which are likely early checkpoints for Gemini 3.0 Flash. I put Skyhawk to the test with my King Bench questions to see how it stacks up against competitors like GPT-5.1 and Sonnet.
--
Key Takeaways:
...
They have an official discord server?
But wait what is tensor
I have never heard this name before
this channel has the most annoying ai generated voice
there are tons of good free TTS and small models and hes still using this
and intro
yea
that intro is annoying as hell too
I know he thinks that the voice now is kinda his signature but he needs to change it
I mean as long as he provides information, I'm fine with it
how are you 100k and you cant afford a good TTS
even if you dont want to there are gazillions of free TTS
Give the List now
Are rate limits still there in LMArena???
Always will be
Unless Ai becomes absolutely cheap
Bruh I want answers!!!!!
Try other tools too
U can try Yupp Ai if you want
@echo aurora please remove lmarena rate limits
He can't and he won't neither
Compute is expensive bro
kokoro / edge-tts
like so many
you can even run them locally
Give the entire list
brotha
I don't have a pc
Alright tysm!
should be enough
Happy Gemini 3 Flash day everyone. Is it on the arena yet?
Are these best ones?
More likely as anonymous
they are not
there are better but i dont have a list
I mean in Free List
these are more lightweight
Bruh ok
no there is a better one but the model is big
No worries I don't need for now anyways
I need online tho
It's already been on there as an anon. It should be coming out today, according to sources
edge-tts is online
its an api wrapper based on microsoft api
its good
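If you want to try edge-tts from Python, a minimal sketch might look like the following. It assumes the third-party `edge-tts` package is installed (`pip install edge-tts`) and that you are online, since it calls Microsoft's service. The two voice names in the table are assumptions on my part; run `edge-tts --list-voices` to see what is really available. The helper names `pick_voice`, `synthesize` and `main` are just for illustration.

```python
import asyncio

# Assumed voice names; verify with `edge-tts --list-voices`.
VOICES = {
    "ava": "en-US-AvaNeural",
    "andrew": "en-US-AndrewNeural",
}


def pick_voice(name: str) -> str:
    """Map a short nickname to a full voice id (KeyError if unknown)."""
    return VOICES[name.lower()]


async def synthesize(text: str, voice: str, out_path: str) -> None:
    """Send text to the service and write the resulting audio to out_path."""
    import edge_tts  # third-party, imported lazily so the rest works without it

    await edge_tts.Communicate(text, voice).save(out_path)


def main() -> None:
    # Writes hello.mp3 in the current directory (needs network access).
    asyncio.run(synthesize("Hello from edge-tts.", pick_voice("ava"), "hello.mp3"))
```

Calling `main()` should leave a `hello.mp3` next to the script; swap `"ava"` for `"andrew"` to compare voices.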
That's great
Gotcha
Tysm again
May you be more useful in future
As well
you can try it here
ava is a good voice actor + andrew
there are like some 4 decent ones
Gotcha ty again
Name them
i agree
andrew
Yeah
Better for females
isnt there a webgpu one too
kokoro model
kokoro
Not bad at all
I could sleep on this
There is elevenlabs and minimax speech too
not only
minimax?
The Chinese Company
lol since when
Years ago
Ty!
Are these yours?
its fast but meh
there is this leaderboard
for tts
There was a great TTS model but I think they removed it
It was sooo goood
im still looking for that model i mentioned before, maybe its what u're talking about
Not kinda probably it was expensive to host
True
edge-tts?
Hugging face is just so good they provide free working open source models
u mean the microsoft one?
Ok
It was a long time ago, I think it's made by Chinese people
Chinese are everywhere
Because they are smart
they provide free server/vps
i think its 12gb RAM
Yeah so you can host your own ai
im already hosting my own stuff
its handy
But idk if they allow to host ai
Where
Huggingface automatically configures your opensource ai
Daym they allow Free hosting wow thx
Gotcha
huggingface spaces
Hey sorry to bother, do the models in direct mode also hide their identities? I asked gpt 5.1 from lmarena and it claims to be gpt 4.
I used real gpt5 to evaluate the response from the model, real gpt5 said the response is clearly gpt4
Am confused
Noice
The ai models don't know their identity themselves
The ai identity crisis
The models are as stated
I asked gpt5 to make some really hard reasoning problems
give that to the model and fetch the response back to gpt5
Gpt5 claims that's clearly gpt4
Maybe that's not good validation?
Because that screenshot you gave
Any ai will say that
Is not good validation
And asking a model for its identity is just not Efficient
Has this released to Lmarena?
This one provides free opensource models.
https://huggingface.co/spaces/hadadxyz/ai
Sorry to say removing rate limits is very unlikely.
Afaik Free users get 4 minutes of daily compute for using these Spaces?
Yeah I forget about the limit
why does nano banana now have a limit for making images?
nice
in LMArena
Because "too expensive"
Can we have multiple accounts
ok
On hugging face probably but the limit is just so small
So it won't be helpful
4 minutes yeah
Pro users get 25 minutes
@pure torrent Please have a look at https://discord.com/channels/1340554757349179412/1397655624103493813 for a step-by-step guide on how to generate videos using the bot.
Why do I get an error when I upload an image made on LMArena and try to edit it with AI?
Not here lil bro 
@echo aurora Nano Banana Pro is having many errors again
Appreciate the heads up, are you seeing this with the 2k version, both?
For now, I've only used the regular version; I haven't tested the 2K version much.
this one little error is so annoying… i think it depends on the time when i use lmarena. sometimes my generations are consistently successful and sometimes i get this error 9 times out of 10 requests
i havent used it since launch, it was amazing that first week, but i stopped when i got hit with usage limits
did it get worse?
cause it was doing crazy stuff that I could not even do with other IDEs and ai clis
The file on disk STILL shows the broken syntax! The multi_replace_file_content tool is reporting success but the changes are not persisting to disk. This is a serious issue with the editing tool.
even opus 4.5 is furious lol
ive been trying to fix this issue last 2 hours
IDE tool call is so messed up
"This is a serious issue with the editing tool." lol
finally opus fixed it
took me like 40% of opus usage
holy
hi @tulip tusk please read #1397655624103493813 and use the video-arena channels for video requests, thanks!
where is this from?
is there a way to upload 2 images to generate one video?
There is not sorry to say
Hi everyone, hope to learn new stuff here.
Welcome welcome 
openrouter
its benchmaxxed
I think because they weren't anywhere close to the top 10 before that
Gotcha. Unfortunately, we have been experiencing high error rates with this model. The team has been made aware and is looking into solutions to lower this.
cc @glass zealot
sometimes my generations are consistently successful and sometimes i get this error 9 times out of 10 requests
It's difficult to say what's causing this error, as the "Something went wrong" message is a generic error which can have different causes. However, what you mentioned here points to the rate limit being the culprit.
<@&1349916362595635286>
Hello!
Can someone tell me what this model "Hazel-gen-4" is?
and how to select it in direct chat in the image generator? i found it only in the battle mode
gotta follow the rules
272k context
if this beats gemini 3 pro the bench is rigged
also new battle model that has a search output [searcharena only]
first battle model for searcharena in a while
burn openai
anyone heard of big sur AI
I cant see it on cursor
When it comes to models that are using codenames, you won't be able to find them in Direct or Side by Side mode, they can only be found in Battle.
Hai pineapple!
Hru?
Howdy
Doing well. We had our company holiday party last night so a little tired, but overall doing well.
Yourself?
Think it was in their API or smth
Oh nice!!
Chilling and doing homework
Nothing special (≧▽≦)
Nice nice, preparing for finals kind of thing?
its hidden
but its in the metadata
so its probably coming tomorrow
this is grok, menlo is the codename for grok 4 fast search
in fact, if you use that model right now it still identifies itself as menlo by big sur ai
hm alright then
why would it be on searcharena if its grok 4
I meant battle.
...because they want to evaluate their new search model before release? like any other model on lmarena?
wait how do you know that is the codename if the model is new
because it's not a normal model name?
Think about one thing tho.
Will Menlo always mean grok?
if it says Menlo by big sur ai, it will likely always be grok
it's an lmarena-only designation anyways
Uhh chat did chatgpt release adult mode?
my thoughts exactly
5.1 is just a finetune of 5
theres no way 5.2 will be eons better magically
if 5.2 is a flop i think its safe to say openai will take a backseat in the market
the worst thing you can do is code red and rush models
than be google and make people wait
google took months to make a model and people were happy
openAI took a few days to make one and people were unhappy
@whole sundial is this good?
yes they will in december
It's like the 10th of December rn
π
next week
If they make it paid plan only imma crash out
its gonna be behind ID
most likely
good, we dont need little kids using it
OpenAI takes the risky route for operation, gemini 3 pro releases, Oh no! Panic panic, we need a model that beats it so we don't get forgotten.
Instead of, oh congrats, now we take time to work on a model to beat it.
Does anyone else feel that LM Arena doesn't punish lying enough?
People are more likely to vote for an AI that makes stuff up than for one that admits it doesn't know!
There should be an option to justify your vote by saying the other model lied, so LM Arena can add a "truth" category where the highest score goes to the model that lies the least.
google has tpu advantage
id give you a thumbs up
openai needs those datacenter up asap
datacenters wont fix their ass architecture
thanks i will do that !
how u know
Yeah this is interesting, would be really curious what others think about this + how it'd be done. Starting a thread in #1372230675914031105 would be ideal.
look at the diff between 4o and 5
4o was perfection
sure it wasnt a benchmark king, but it was a reliable model
4o consistently hallucinated less for me than 5
5 came out and despite them saying that it hallucinated like 50% less or whatever
still tiny
logan kilpatrick said that like they dont have enough TPUs to make their models even more powerful for now
he legit said it
openai had such a good opportunity with 5 and they blew it
now deepseek is basically taking over their spot as the reliable workhorse
until they fix their awful frontend [chatgpt]
it wont beat anything
i would argue grok beats it
its not even the frontend
the model just sucks
they need to do a good rerouting of their entire company
robin-high was a good model of theirs
frontend still sucked
but
no it was about deepthink
ah
also btw
me and nickname figured out why google put 4 models into battle
2 are flash-lite [one thinks, one doesn't]
2 are flash [one thinks, one doesn't]
Awesome, thanks
thanks, check this : https://discord.com/channels/1340554757349179412/1448392968653504604
Is there anyone here looking for developer?
π
already or did they mess up
I have been asking for a solution for this error in Claude Sonnet and Opus 4.5 but no mods are available I guess…
I have 3 chats open coz after few prompts each chat gave this error, been 24 hours yet same errors even if i type a new prompt
i have no idea
people are also saying its today bc of this
wait
2.5 flash tts is being uh
removed
so people think its today
Can anyone tell me how to fix this
i see
when do they usually release their products
thursday or a tuesday i forgot
when did gem 3 pro release
um
wait let me uh
compare with g3 pro release
it was a tuesday @torn mantle
Wouldn't get your hopes up
it's not necessarily gemini 3 flash
Why does Cursor have access to the new model before anything else?
I know its a TTS model
tweet is confusing between tts and flash model
btw google might drop flash lite with flash
because integrated-info and holo-scope might be flash-lite models
done
oh my god its peak
gemini 3 flash is gonna be so good
yes they are different
Why are people hyped about the flash model
????
It's a flash model??? Nothing special
super fast, very cheap. not far from gemini 3 pro performance
guys did skyhawk and seahawk get removed from codearena or are they just rare
i haven't gotten it for a few gens now
very good chance (in fact willing to wager) that gemini 3 flash is going to be better than 2.5 pro
prob so
It doesn't have thinking tho?
Or just the flash-lite one
Doesn't have thinking
gemini models (since last version I think) have thinking by default. gemini 3 flash is definitely going to have thinking
Lmarena needs to remove these useless Nova models. Total waste of tokens and energy.
i want to generate ai videos
π
π
π
π
π
π
is flash like a lower cost version of the pro version and less capable of course ?
or this time it will be something else
Ok i will stop now
-# still 3 weeks of 45² left..
<@&1349916362595635286>
honestly they should all focus on hallucinations
Overreacth final boss
@echo aurora Come immediately here
Why did the chat just suddenly revive
True
Someone did a chain, but we stopped naturally
π
Nahh, lets stop

We are giving slop to the new vicuna model they must be training
So guys any news on chatgpt 5.2?
Yep I'm seeing
I don't think we actually need a very capable AI if it hallucinates like that. I want an AI that WON'T hallucinate, cause it's time wasting and risky
They could easily make an AI that says "i dont know" every time, but it would make it 5% dumber and they dont care
Not really sure tbh, I did remove the π from #leaderboards
Since fine tuning does that
But yeah I don't want to go on a mass ban for people that π
I think they're working on it but its not what they focus on sadly
i recommend 24 hours ban
ππ»β€οΈ
Bud i really respect u with all of my heart but these people are toxic and are just destroying YOUR community
Oops, you are now in the π chain, you will get banned too
#Pen + Apple =
Fake
What if is real
U should not mess with the mods
I've seen this image before
It had Gemini watermark
Yeah but what if it was real this time
Yeah, i edged too much here, stopping now
π€
I don't think it's too disruptive tbh
I removed it
He cut it off and it's low quality
synthid is prob still inside
Maybe you should apologise
Maybe is alright
Idk tbh
Fine
π
Not all people download it
I already know pineapple, he is also joking a little too (most times at least)
I think Garlic is Gpt 5.5 not Gpt 5.2
I'll start removing folks that do π multiple times + if it's their only contribution to the server
Since they need to do a lot to surpass Gemini 3 pro comfortably
Does anyone know what is holo-scope?
But for folks that are just π they can stay
Garlic will be new name of GPT
And it will be renamed as Closedai
Garlic Pasta Temperoni
Oook 
Does garlic mean like chatgpt will be stinky or something
Is just a codename
An AI can't stink
π
Garlic has wonderful smell
Okay fair point



