#general
Hmm
Rate limits don't last for hours
Just one quick tech question: if gemini3 image preview’s server is undergoing some troubles, is Battle mode gonna be less likely to call its api?
I only use battle when I need to use lm arena
I'm not sure about how the specifics work, but in Battle mode in theory you shouldn't be running into the "Something went wrong" generic error message. It should sample a new model if one it initially tried didn't work.
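For illustration only, the "sample a new model if the first one didn't work" behaviour described above could look roughly like the sketch below. This is not LMArena's actual code: `ModelError`, `call_with_fallback`, and the `call` callback are hypothetical names made up for the example.

```python
import random


class ModelError(Exception):
    """Hypothetical error raised when a model call fails."""


def call_with_fallback(models, call, rng=None):
    """Try models in random order until one succeeds.

    `models` is a list of model names and `call(name)` is any function
    that returns a response or raises ModelError. Both are assumptions
    for this sketch, not a real API.
    """
    rng = rng or random.Random()
    remaining = list(models)
    rng.shuffle(remaining)  # sample without replacement
    errors = {}
    for name in remaining:
        try:
            return name, call(name)
        except ModelError as exc:
            errors[name] = str(exc)  # remember why it failed, then move on
    # Only reached when every candidate failed.
    raise ModelError(f"all models failed: {errors}")
```

With a setup like this, a generic error would only surface when every sampled model fails, which matches the "in theory you shouldn't be running into it" wording above.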
Gotcha, thanks boss🫡
Zoot Suit Samus
i have the same problem
Nawh Soras got more smooth animation like they do movies and stuff
But surprisingly grok is really good at animation
Really sorry about that! Our team is looking into a fix. cc @dark sage
What about hailuo? Are those popular anime to real human films usually made by hailuo?
This is hella cool. What model?
nano banana pro, one of my few generations that succeeded :P
prompt was literally just "zoot suit samus" too
Its been happening for like 7 hours already, I understand the traffic is high but its an inconvenience 😭😭😭
Thank you, I'm waiting, otherwise I was already scared that this was censorship.
Well, speak of the devil, look what I just saw on YouTube
Sora 2 allows users to create hyper-realistic AI social media videos using text prompts. Mike Isaac, New York Times Silicon Valley correspondent, joins "The Daily Report" to discuss what's changed since the latest update and NIL concerns.
CBS News 24/7 is the premier anchored streaming news service from CBS News and Stations that is available f...
Hehe I want to go check this out real quick I’ll be right back guys
Banana pro is like sora 2 first day
GUYS
no guardrails
GUYS
What
WHATS THE DIFFERENCE BETWEEN VIDEO ARENA 1 2 3
Yall know another site I can use nano banana pro for free?
No difference
KK
Ya it’s gunna get nerfed soon lol
Disney is about to launch their own and they’re about to start charging royalties
I dont think so they never censored it
open ai is censoring their best model
and lowering the quality
just to get money
They just need to get these court cases over with dammit
They’re going on too long just make a decision if it’s OK or if it’s not OK
Yeah, I know but damn somebody needs to step in the middle and just decide once and for all if it’s fair use or if it’s not
they need to dismantle intellectual property laws, make all fictional characters creative commons
only fair if u pay them
Man, the law is always so behind
I can’t imagine the state like California was gonna stand still
@echo aurora Am I allowed to repost my feedback in different posts because I dont have any threads rn
like each issue to each post
I know the game plan
Yes, ofc. Adding your feedback onto already existing feedback posts about the same feedback is perfectly fine.
They’re gonna do the same thing they’re gonna partnership
And they’re gonna do it as soon as the technology starts getting good enough lol
They can’t put the genie back in the bottle and they know that so the next best thing is just partnerships
Oh I mean like new posts, I don't think there's any existing ones (or not related to my feedback)
So what does all this mean, that you’re gonna be paying for ai tools to generate content from intellectual property brands and then you’re gonna be paying these companies and you’re not gonna own the content you generate
They’re huge dude they might be even bigger than Suno
pineaple
If there is a specific piece of feedback that doesn't already have an existing post, then you can make a new post. But don't create a new single post with all issues you'd like to bring up.
I got a working image

Speaking of Suno
Dude, that company generates so much money. It’s insane. I would’ve never guessed.
Yeah, look you guys you can’t make this up
🤣🤣🤣🤣🤣🤣
Despite being sued by nearly every record label, that’s hilarious
They’re the last biggest target
I LOVE SUNO
Yeah, they’re the big target now for the music industry now that udio fell
lmarena music ai leaderboard when ?
Udio AI - the popular AI music generator, disabled downloads overnight after settling its lawsuit with Universal Music Group (UMG). Millions of creators lost access to their songs, sparking outrage across the AI music community.
And is Suno AI Next to fall?
#sunoai #udioai #aimusic
Well, it’s super messed up how they settled it
You gotta look into it and I’m sick. I sound like a broken record always repeating myself about this story.
ai voice
ai sydney
Ai pizza
ai sponge
All I’m saying is it’s super messed up what they ended up doing
The way they went about it
shaking and crying rn
It just shows that this whole industry is about money and not the users
most honest reaction lmao
So Real
All right well we’ll see where that energy is at if God forbid this comes to fruition
If you got people crying about that retry or resubmit button, I can only imagine what would happen and the outrage if God forbid this would occur to other parts of AI that we all enjoy.
the worst example (of all AI companies) is probably openai, their models are full of censorship
literally the opposite of a non profit
Yeah, but you know the users also have some blame
You need two to tango
You got die hards out here who still think chatgpt5 was an upgrade 😂
hooray, it looks like the images are being created
In early 2025, the Chief Minister of Punjab, Bhagwant Mann, discovered that a compromising video of himself was spreading like wildfire across social media. The problem though was that it wasn’t real. The entire thing was fabricated by AI. Someone, somewhere had taken a few seconds of his likeness and fed it into a generator that produced a pe...
Glad to hear it!
This video made me laugh so hard
I can’t believe that guy freaked out like that and he’s a politician lol
Hirr hirr.
@echo aurora I know you probably got asked this a lot by now, but is your team looking into the problem of the image generation? It always says "Something went wrong [...]". I tried it today in the morning and since then it doesn't work. Its the same for Nano-Banana 2, Seedream 4 etc.
@echo aurora Voting appear to be somewhat on the blink - not entirely but definitely glitchy.
Will there ever be any censorship added for character creation of IP‑protected characters (such as real people or famous characters) in Nano Banana Pro on LMArena?
today I would like to eat some pinapple
Pineaple WE ARE GOING TO DIE
without the site working
our lives depend on it
I only got 1 image working
That's just the bot failed but try again
nvm pineaple im still alive I got another image working
That message for Google Gemini Not LMarena
If the image isn't getting created it just means the model is overloaded by a lot of users, so just try again later
Is it?
I am able to generate images without issues (tried few just now)
thats dope LMArena seems to have good catch with all llms

Bit hard to answer, but overall we (and the model providers) have guardrails for what their model responses can/can't be. What that looks like specifically for IP‑protected characters I couldn't tell you.
Hmm looking into.

I was able to vote, but I am seeing this which is a different problem 
I know yall got nsfw filters but I got past them
Not all got those. 😉
how
all what?
So no need to hack, just use those Ai's who allow such.
ahh the undress ones
[Talking about general NFSW now, not actual ic pron.]
what is ic pron?
yes
When they hump each other etc.
Hey how about no.
😭
bruh
But if you do have a reliable way to get past filters I'd love to hear about it so we can patch that.
how is this possible
why would I do that
Because it's the right thing to do.
plus dont worry its nothing THAT crazy
he can ban you I think
I THINK idk
dont do it, its a trap
hopefully its hard stuff
this makes me smash my desk
Same
yeah
i feel u
gemini when I tell it to put a picture on a wall: "I can help with editing images of people, but I can't edit some public figures. Is there anyone else you'd like to try?"
Hey have the error rates in nano been fixed?
Ig not
It's still looking like we're working on a fix.
Send images of that smashed desk please!
WTF is this lmao
LOL 😭😭😭😭😆
The model trained with it will be ahead that gemini 3.0 pro
Chat is Gemini 3 pro superior to gpt 5.1?
its good
Hugging face guys are a riot at times. 😺
3.0 pro > sonnet 4.5 > opus 4.1 >>>> Gpt 5.1
WTF Openai is so behind 😔
grok 4.1 >>> 3.0 pro
no
Doubt
Grok models were never the best
pineaple fix this mess or I will do something bad to my body
Guys I think kimi k2 thinking is underrated
u wrong. grok is most uncensored and is a beast model
Uhh lemme guess
🤔
3/10 ragebait
Well I can't generate uncensored content with the guardrail
(pls make me switch it off)
Calm down 😭
u are hater of grok
Lmarena is now using turnstile again. God help us all.
I'm not lmao
It's just not the best model I've used
It's either gpt 5 or kimi k2
Wdym
Turnstyle? What is this? Is bad? 😭
why u using chinese kimi k2?
It's excellent lmao
Guys do you find the guardrail annoying
u work for china?
The kimi k2 thinking turbo >>>>> grok 4.1
Holy ragebait
Thank you
I think it can compete with gpt 5, no joke
bro is glazing kimi k2
cloudflare turnstile
im just joking
Personally, I hate GPT 5.0, so I think I actually prefer it
Yes because it deserves the glazed donut treatment
Kimi k2 thinking is really good now, it stop hallucinate
K2 never did since I started using it in June???
When I started to use the kimi k2 thinking (PREVIEW) it hallucinate some times
But the stable version no
Cloudflare verification
Guys is qwen 3 max good or no
Cuz I have been using it for a month and I think its quite good
Yep, is soo annoying
they've been using it?
It pops up every login, and every 5 messages
LMArena lemme grind nano banana Pro 😭
Hell no, I feel it is worse than qwen 3 235b
Should I use 235b 2507 or VL?
Same thing, but I stopped using it, I just tested the vl, (I'M NOT A CODER) 😆
Welp, back to the errors for Nano Banana Pro 🫠
I will say yes
Hey let's not say things like this, even if it's in a joking manner.
.
Yup, since reporting this morning it's continuing to be a WIP.
brah 3:
Currently, sometimes it'll work, other times it won't. But we're working on getting this working again.
Pain 💀
also so is claude 4.5 opus today?
the model isn't nerfed by quality it's just slower..,.,
I hope yep, but I think the opus 4.5 need more 1 week to be released
We really need an edit button
Seriously I generated the same prompt twice because of a lag 😭
yeah nvm it's down now for me too...,,
LMArena is stuck on the verify page!
@echo aurora we need to remove the cloudflare verification
opus released
interesting
hm?
i mean smaller
@echo aurora Pls add immediately! 1!1!1!!!1!1
hmmm...? serious
😔
coding wise
That pricing is sus
still i don't imagine a model overtaking gemini coding this quick
Why is opus 4.5 cheaper than opus 4.1??
:P
official opus 4.5 page will be up in 10-20 min
3 times cheaper
?
released?
Nahh that opus is sus
yes
groks only good for uncensored creative writing brah
claude app?
It is in coding for sure
But their pricing is sus
Opus isn't cheap!
(shouldn't be)
:0
what's the point of using it if you get like 5 messages 💔
we will get it for free on lmarena
soon
didnt they like buy tons of gpus from amazon
page is up: https://www.anthropic.com/news/claude-opus-4-5
it better than gemini 3?
yes in coding
:0
37.6 in arc agi 2 vs 31.1 percent for gemini 3 pro
looks like it but its expensive compared to it
are u gonna pay for it?
great work with Claude.
i was never fan of anthropic models, seems lazy in general usage
I hope so, I'm tired of gemini 3.0 pro
then why u complaining about price?
deepseek v4 when? I'm tired of Opus 4.5
so basically. coding use anthropic. everything else use gemini 3
OAI should just pack the bags at this point
hell no
dawg gpt sucks ahh
1/10 ragebait
chatgpt is unironically one of the worst models compared to what is out there now
wow the no thinking is soo good
its too obvious at this point. i tend to ignore him
I see hiring freeze soon and some big changes soon in OAI. future is bleak for them
id agree :P
Is today OpenAI's most irrelevant day?
i was worried that google might fall asleep again. Love to see Anthropic giving good fight!
well, now google will need a response if 4.5 is as good as ppl say,.,,.
i want to test it brah but i am not paying anything to anthropic 💔
exactly you barely get any messages even if you do pay
same thing gemini 2.5 preview < opus 4 < (or =. ) gemini 2.5 dont preview
i hope 4.5 is actually good at creative writing
unfortunately its going to be HELLA censored
as anthropic is
Google looks satisfied right now.. I dont expect a good response before 4-5 months
yea
not as bad as GPT though
GPT is so awful
like it just makes up things and then it censors it because of the thing it made up when you didnt even mention that
Claude Opus 4.5 is now available in Cursor!
It's 3x cheaper than Opus 4.1 with better performance.
Try it out at Sonnet pricing until December 5th.
pal chatgpt is nowhere near these two models as of late 💔
Gemini 3 flash will be launching in Dec. I think new flash model will be better than gpt 5.1 models 😄
its only good for vibe coding and not real projects
no..
really good work by Anthropic! kudos to them!
google already pushing back
hell no, the openai have a respond lol, I love this era
5% improvement in agent benchmark through system prompt
is this 5% over the current system ?
already in gemini app?
yes, thats what they claim
u have to paste that long system prompt
what is this mercury v0? 😭
aha
This tells me that base model gemini 3 is very very strong.
that would be nice!
google wants to keep their first place of course
anthropic wont lose their Coding prowess anytime soon. I have high hopes with them
google wants gemini 3 to be the first model people think of when they think AI, etc
chatgpt still actually holds that spot
surprisingly
but its because they were the first :P
Btw, anyone recommend the best model for creative writing so far? Gemini 3 is still disappointing. (Gotta keep myself busy as we wait for Nano Pro to get fixed) GPT 5.1 is decent so far but still not exactly that good for novelized writing.
hard to say
that will continue for 1 more year but slowly they will start dying. first it will be slowly and then suddenly.. all funding will be gone and they will be forced to slow down.
yes, usually the one to ignite the ai bubble will always be the first model ppl will think of
anxious for gemini 3.0 pro not preview
claude is the best ive heard but
bad censorship
💔
and gpt also has awful censorship
Damn 😭
id say best quality is claude if you dont care about censorship but
Claude Opus 4.5 is out. When is Lmarena going to add it?
decent writing quality with no censorship is actually grok, as bad as the model is
pineapple and the team i think are working on the nb pro issue rn
what is polaris?

gpt 5.1
theres no point in using that if the censorship is off the rails though
Only issue for Grok is that LMArena is also pretty censored 😅
you can barely do anything with gpt 5.1 like deadahh
ask like everybody on the openai subreddit what they think of censorship in creative writing with gpt 5.1
including myself
Aye. I dont mind waiting. I can think of prompts and ideas in the meantime.
its pretty bad.,,
actually on the grok website ive never hit the rate limit using their like new model or wtv
writing wise
and thats where its uncensored
:P
dangg itt
and mind you i dont pay ofc
Ah, but gotta pay 😭
😱
still gemini 3 pro is better for cost
I'm using the opus 4.5 now
gemini is much better optimized
Well snap! Gonna head there now! Although, looks like I might still be on the low model. Still saying I gotta get supergrok, which I hear has no censorship.
Yoo finally opus 4.5
claude-opus-4-5 WHOAAAA
Anthropic is all about coding now! huge market for them
opus 4.5 gonna destroy every other model ngl

New model hype.
nah, supergrok is for the uncensored videos
ive tried it
i think text is just uncensored straight up
anthropic models are lazy af
Is it on Claude API now?
omg
its always 1 day later
It's on opus dude
Ah. So like, I can generate anything (Within reason of course) with just the basic model?
is this the model?
Text only
yes
and Code Arena
Sweet! Will test that after
why only on text?
yeah, pretty sure, i havent tried anything weird and kinda dont want to but id imagine that (MAYBEE) could also be uncensored too?
but
opus model is for code. not for text
NEW MODEL
ive tried things that chatgpt just outright refuses and ofc it works.,.
thinking when?
Thanks. Will give these babies a whirl so long.
Which works well because we have a Code Arena contest running right now! #announcements message
Is sonnet 4.5 coming too?
give me prompts to test opus

noo they will release that in sep 2025
Ask it to turn your dream into a story
ill just ask it to make an asteroids game
Will it be added to text?
similar to how gemini 3 was tested with that
the opus 4.5 is fixing a 3.0 pro work, it definitely is smarter than gemini 3.0 in docs !!!!!!!!
🙂
opus is the og for coding
Should be in Text and Code Arena
hmm
Again with the coding 😭
what happened
apparently its lazy
i havent tried
i dont think its gonna be a bad model, it usually isnt
This is my go to Code Arena prompt: #1440135896174170214 message
KIMI YOU ARE AMAZING
LOLZ thats a good one
this is
interesting
claude output
oh i just got pieced by an alien ufo
💔
cuz of limited context window
Chat are the hermes models well known
btw no thinking is kinda the same as thinking in swe
thanks lmarena for adding opus 4.5 this fast ❤️
Im guessing its still expensive
yeah pineapple you the goat brah :3
It's cheaper than prior Opus models but still expensive
prompt?
The full team at LMArena get the credit.
How many messages do you get on direct chat
Just tested Grok and while no pictures (Says moderated), it actually gave me something spicy text-wise. So damn.
yeah, so text is uncensored but
Bro that's magnifique
photos not
it didnt cut me off at 5 messages, the rate limit being higher is actually so cool 🤩
but 3x cheaper compared to before.
The rate limit for this model is currently set to 20/hr, but this may change.
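As a rough picture of what a "20/hr" limit means in practice, here is a minimal sliding-window sketch. It is an assumption about one common way to implement such a limit, not a description of how LMArena enforces it; `SlidingWindowLimiter` and its parameters are hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any rolling `window_s` seconds."""

    def __init__(self, limit=20, window_s=3600.0):
        self.limit = limit
        self.window_s = window_s
        self._stamps = deque()  # timestamps of accepted requests, oldest first

    def allow(self, now=None):
        """Return True and record the request if it fits in the window."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False
```

Under this scheme the 21st message inside an hour is refused, and a slot frees up again one hour after the oldest accepted message.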
Is opus 4.5 better than Gemini 3 pro in non-coding tasks?
Not bad
20 aint bad :P
Create a fictional URBANSCOOP newspaper snippet set in 2029 revealing a secret advanced PMC / private intelligence agency called DELTACELL.
20 per person is more than enough
it beats gemini 3 in everything
How Claude 4.5 Opus was released:
Aye. Got other models for pictures anyway so who needs Grok's image generator 😛
Chat how is the new opus
Are you serious?
the big banana:
looks like Sota in Coding.
Hell no dude
ah
in coding yes. everything else , nope
Gemin8 3.5 better fix
What if gemini 3 pro + opus 4.5???
But yeah, not that most of what I want written are explicit anyway. Got a thing for just simple fictional stories.
what is this brah
bro the model has released in 3 segs
how do you like the quality?
like prose
its been good to me ngl
creative writing is the only thing ill ever use grok for lolz
hmmmm
that is kinda cool, it actively deteriorates these weird shields
no it works its just i swear space invader had a different style,.,
Also for some more "unethical" prompts
^
sometimes i have to get shady.,.,
anyway, theres little bonus ufos that sometimes streak across the screen thats kinda cool
For grok? It's honestly not too bad. Does paragraphs nicely. And I'm sure with some guiding, could make for some decently written stuff. But it does win points in the uncensored department. So like I could use another model to write up an entire chapter and such, and then ask Grok to give me a spicy scene if needed.
I hope they invent some solid way of LLM collaborating, some groupchat typeshi so they can share the history chat
lmarena got limited context window, so opus looks bad here
good stuff brah
lets see here
pineapple stuff seem to work
I WAS SLEEPING FOR 40 MINUTES. I WAKE UP AND OPUS 4.5 DROPS AND ITS BETTER THAN GEMINI 3? i tried to adjust to gemini 3 and i was almost done and now anthropic kills it..
read this
huh, thats pretty cool
Is the image generation slow today or is it just me?
its an ongoing issue
planet focus works
like we need some complex simulations come on
dont give it easy tasks
danggg
even gives info on each planet
idk what to think offf 3:
big bang simulation with sophisticated details
and different phases
is polymarket gonna react
yea as i said anthropic models are more reliable
so you dont have the headache of double checking and re-prompting etc...
give your top 5 prompts and we will do side by side google 3
if this actually works
yea
i'll be kinda stunned
its based on
oh
can anyone try these
- Ant Colony Optimization Simulation
Simulate a colony of 500 ants foraging for food sources scattered across a 100x100 grid terrain with obstacles. Implement pheromone trail laying, evaporation over time, and collective decision-making for finding optimal paths. Include ant roles (scouts, workers, soldiers) and simulate resource depletion affecting colony behavior.
- Urban Traffic Flow Ecosystem
Create a city traffic simulation with 10,000 vehicles across 50 intersections with adaptive traffic lights. Include different vehicle types (cars, buses, emergency vehicles), varying driver behaviors (aggressive, cautious), real-time congestion patterns, and simulate how a single accident cascades through the entire network over 24 hours.
- Pandemic Spread with Human Behavior
Model a disease outbreak in a population of 100,000 agents across neighborhoods with varying density, hospitals, and public spaces. Agents should have individual immunity levels, social compliance rates, daily routines, and decision-making about masking/isolation. Simulate how misinformation clusters affect regional spread differently.
- Stock Market Multi-Agent Economy
Simulate a financial market with 1,000 AI traders using different strategies (momentum, value, random, insider), market makers, and regulatory bodies. Include news events that trigger emotional trading, bubble formation, crash cascades, and emergent manipulation patterns. Track wealth distribution evolution over 10 simulated years.
- Wolf-Deer-Forest Ecosystem Balance
Build a predator-prey ecosystem with wolves, deer, and vegetation on seasonal terrain with rivers and mountains. Include animal aging, reproduction cycles, genetic trait inheritance, pack behavior for wolves, herd dynamics for deer, and forest regrowth rates. Determine if the system reaches equilibrium or extinction spirals.
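For anyone judging model outputs on the ant colony prompt, the core "pheromone trail laying, evaporation over time" step is small. The sketch below shows one plausible version of it on a plain list-of-lists grid; the function names, the evaporation rate, and the deposit amount are all illustrative assumptions, not part of the prompt.

```python
def evaporate(grid, rate):
    """Return a new grid where every cell loses `rate` of its pheromone."""
    return [[cell * (1.0 - rate) for cell in row] for row in grid]


def deposit(grid, path, amount):
    """Add `amount` of pheromone to each (row, col) cell an ant walked over."""
    for r, c in path:
        grid[r][c] += amount
    return grid


def step(grid, paths, rate=0.1, amount=1.0):
    """One simulation tick: evaporate first, then let each ant lay its trail."""
    grid = evaporate(grid, rate)
    for path in paths:
        deposit(grid, path, amount)
    return grid
```

A full answer to the prompt would still need ant movement, roles, obstacles, and food depletion on top of this; the sketch only covers the trail update.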
for anthropic to win, it needs to beat 1490
lmaoo polymarket is forbidden in romania
ABC News' Andrea Fujii reports on a new warning for parents about toys that contain A.I. chatbots.
🤣🤣🤣
It's free on lmarena try it bruh
Ai is so dumb lmao
yeah dude... Claude 4.5 isn't better than gemini 3 at all
it got it wrong so hard
good ui, design, no buttons work unfortunately
LMAO
and lets test the same prompt with gemini 3
Claude only for coding
ok
some kid said it's better than every model
in every way
give me the prompt ill prove that it can do it lmao... models cant do math problems without reasoning unless that specific math problem is in the training data
its not for you man, you should use gemini
: Find the GCD of this series set {n^99(n^60-1): n>1}
: Find the GCD of this series set {n^99(n^60-1): n>1}
Sent it twice
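The GCD prompt can be checked numerically without any model. A minimal sketch, assuming the set means every integer n > 1: take the running GCD of the first few terms with Python's big integers. `term` and `running_gcd` are names made up for this example.

```python
from functools import reduce
from math import gcd


def term(n):
    """The n-th element of the set: n^99 * (n^60 - 1)."""
    return n**99 * (n**60 - 1)


def running_gcd(upper):
    """GCD of term(n) for n = 2..upper (inclusive)."""
    return reduce(gcd, (term(n) for n in range(2, upper + 1)))


if __name__ == "__main__":
    print(running_gcd(50))
```

A finite range only gives a multiple of the true GCD, but the value stabilises very quickly here. It should come out as 6814407600 = 2^4 · 3^2 · 5^2 · 7 · 11 · 13 · 31 · 61, which matches the usual argument: a prime power p^k survives when the exponent of the unit group mod p^k divides 60, and the n^99 factor covers the n that are divisible by p.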
dude im gonna go mental this is what ive been seeing every 2 mins
Put him in a locker
accident
its cheaper than opus 4.1 and 4
dude just say the truth
gemini 3
is better
How much bro share please
only 2 phases works, i think this is so complicated
gemini 3 fails as well
it only creates like 2 simple animations
me when a claude sucker can't see
@quick jackal what is this is been like this for 3 hours
You know the saying, you get what you pay for
If the model is cheaper, how is it gonna be better? lol
Why can't anthropic just go through a normal release?
wont work
Out here we go out the benchmarks
what's not normal?
I still remember you
u want vibe test?
Rate limits.
if gemini 3 can do this big bang simulation claude 4.5 isnt better
right now the whole globe is using it cuz it just released lmao
ofc they gonna get overwhelmed
pretty simple a model can do something the other model can't its usually better
Erm... You need to reevaluate your methods.
Wow that’s really cheap. But makes me wonder how much less powerful it is than it could be at full power..
Correct.
gemini 3 just gave me a great big bang simulation
It’s the end of the year the end of the fourth quarter
opus is made for coding only why are people comparing with other stuff
They all have to release something even if it’s nothing meaningful
openai slacking
Incorrect.
lol
Ouch.
SO U AGREE
wrong too
4 minutes btw
gg
gemini in coding is ass
gemini 3
what is the answer
holy f sht
Its not as good as people expected.
I’ve heard a lot of people say that about Gemini
lemme get it
yes
I heard it’s very generic from people on X
whats generic
see it for yourself
opus?
Welcome to X.
you have to refresh every time you wanna redo the simulation but
I don’t know same with Reddit
yes x is full of experts
There seems to be a mixed reaction about Gemini
Just like here
just like u are an expert
tell the answer
just little tweaks and its perfect
ty
yep :P
Jk
i still feel gemini 3 is better at coding so far but thats js me
ima do a side-by-side
the answer
Are there like any real developers here that like actually?
Know how to code and stuff
lemme test gemini WITHOUT tools
We need professional opinions
In Ai studio
I thought we all
Opus 4.5 "seems to be able to vibe code forever"
I've found this to be very true. Much more to come here but basically you can set-and-forget this model as it works on coding task for you in the background.
Feels like we hit a step change.
lmao google
gemini? Did it get it correct
i use it for my everyday coding projects
ofc
sonnet 4.5
AI is superior at coding than any human
lemme show u
did claude 4.5 opus get it correct?
So why doesn’t it fix its own code?
naw
nope
oo
completely wrong
It wants the human to learn
lmarena doesn't use tools?
non thinking vs thinking
yeah that's correct
I’m a non thinking human
Opus 4.5 or Gemini 3 pro for coding? I dont use opus 4.5 for now
opus
You wanna c big brother
share your benchmarks
High-resolution cameras, facial recognition software, state-of-the-art video surveillance centres: Data Sources reveals how Western companies are helping the authoritarian regime in Kazakhstan to create a mass surveillance system.
Facial Recognition: Tech Firms and Surveillance | ARTE.tv Documentary
📆 Available until 12/07/2029
ARTE.tv Do...
That’s in some central Asian country pretty crazy technology
when you type to any AI model too fast, it corrupts your entire chat (for some people), be sure to type slower between responses
Shitposting: 10000000%
Productivity: 0.000000001% (margin of error 000000001)
🤣🤣🤣🤣
Kazakhstan
For me they found different ones.
What the hell is that supposed to be Google maps or something?
opus 4.5 or gemini 3 pro?
Oh traffic
gemini
Nvm
Depends on the bloody task...
both combined
Tf is that
hide 70kg chicken
I mainly just use gpt5 mini for coding stuff
Which one made that
None.
Sometimes it overengineers so I use gpt4o
why
Damn that’s nice
image generation is unusable today...
fr
There’s a little bench in a rock there in a little looks like a little Mario pipe or something
Well...
ok opus 4.5 or sonnet 4.5
All right, I’m gonna run my own test
Task?
hide 70kg chicken
I don’t let ai code everything tho normally I just want it to make simple functions Icba to make so I’m prob not the best benchmark as I want it to do the minimum
i am too stupid to know the real answer. what is the right answer?
ofc
Me too, I think its 30.
its 67
I think google is in the best position, they have the researchers from Deepmind and they have their own hardware
damn that's OFF
really far from the answer
waiting for opus 4.5 thinking
i am not joking.. i have no clue. what is the right answer?
ofc u dont know dude nobody knows it
lel
ohk.. not surprised that gemini 3 is better here
I'm using it now, it's insane, to me it's smarter than gemini 3.0 pro preview
it's not
without tools... insane
BTW it's preview
to me is going better
not the actual model yet
YEAHHHHHH HAHAHAHA SOO ANXIOUS
will jump 20% in intelligence
or just more censorship
bruh, I need to be optimistic
grok told me how to make drugs
how
No model is truly that censored anymore.
You just have to ask the right questions.
it's easily bypassed
actual model will not be better
oh..
WTF is this lmao https://019ab76b-1909-78ab-9c9c-e6fdd7954750.arena.site
its phishing
u need some kind of very very rare stone
like emerald from minecraft
don't show in twitter pls
or just buy it
I don't even have an account 😭
it couldnt do a big bang sim but beats gemini two straight
I guess yeah
What is the prompt
'make dogshit'
Model a disease outbreak in a population of 100,000 agents across neighborhoods with varying density, hospitals, and public spaces. Agents should have individual immunity levels, social compliance rates, daily routines, and decision-making about masking/isolation. Simulate how misinformation clusters affect regional spread differently.
dogshit
🤔
yay
☠️
yes it will
preview one will be general release
this is not experimental model
remember gemini 2.5 pro previews
it was removed and replaced with a new no preview model
it was significantly smarter
and better
nah, cuz they gonna release in december
general release is just a name swap from preview to general release and nothing else.
nuh uh
nuh uh
cuz gemini 2.5 pro was behind. google feels confident in gemini 3 pro preview
wait lemme ask gemini
lmao
Hold on let me test it out real quick
send the prompt babe 🤑
is opus 4.5 better than gemini 3?
Does lmarena still have the verification
@echo aurora thinking models when
haha deep think is better
🤑🤑🤑
Which one is better guys
OMG we need deep think!!!!
Gemini now engages positively with the hard-R, when previously it only interacted with the soft-R, not the hard-R.
are we going to get opus 4.5 thinking in lmarena?
are you one of those indians that create AI political ragebait?
As someone that solely uses Gemini, I see this as an absolute win. Less censorship.
I had a feeling from a couple days worth of usage that the model was less censored now, but now I've confirmed it.
No... 😔
opus 4.5 never in ranking ? lmarena is paid
have 34 respost is good to put it in rank
I mean, as long as I don't use the hard-R in a hateful way that is, in which case the model is totally justified in refusing to co-operate with me.
Do you find the latest one a great improvement in coding tasks?
the 2
never lol
No idea, I don't use it for coding tasks.
I thought so too 😂
Ah...
It's far less censored, I can guarantee that, I spend most of my time discussing social issues with it.
So I've seen a stark shift.
Its easier to jailbreak too.
Although you don't really have to jailbreak models anymore.
ahh..ahhhh an AI user who doesn't work with code too?? I thought I was the only one here
🤧
You 2 are quite rare 😂
nice to meet you hambuguer
Basically it's a lot more willing to entertain right-wing perspectives.
Apatite...
Bro thinks I'm food 💀
Amen...
I would say it's in the political center now, before it would constantly tug you towards left-wing politics, even if it were subtle, not anymore.
it's made by a billionaire company, it's weird for an AI to be left, wtf? the company doesn't want profits?
Sonnet 4.5 was a great improvement on that.
Eh, the training data tend to be left-wing...
ahhhhh sure sure
GPT was very left wing
why? lol
A lot of cry idiots fill the internet with left-wing opinions, like 60% of the userbase, thus the result. (I think).
it just very obviously gave pretty biased information based on that kind of agenda, i dont care specifically that its left wing but that it gave biased info
i want it to be completely neutral
like, a model
Its very good at sounding right.
Me too.
then a quick research dive of your own and its complete bs lolz
the left AI will tell them to release everything for free, and companies will go bankrupt because in the future people will only listen to AI blindly (like me) 😭
Gemini 2.5 was much more biased than GPT 5.0, but now it's a lot more balanced on Gemini's end.
i honestly still dont trust ai for news or info
i always look it up after
💔
hallucinations are a big issue
Critical thinking will leave the chat.
do you trust in what?
Trust copilot with your emails 🙂
Should I do more news snippets or nah
do a WW2 one
I mean this is insane.
The model is calling the doctors that mindlessly told you to vaccinate yourself, despite seeing side effects from it, "Robot Doctors".
Unimaginable for me just a month ago with Gemini 2.5 Pro.
And called the practice "medical gaslighting".
Feels like I'm talking to Grok, not Gemini.
Dat very true.
Where did you find the models?
I just edited the image made by claude
okay so who do we think is better claude or gemini
didnt someone say theres more censorship in gemini 3
oh nice then
not at all, you can check the examples I gave above.
it's supporting the anti-vaxxer narrative, and engaging with the user even if they say the hard-r to the AI.
w
did they lower the rate limit for nb pro lolz
sure, it's the fact that it's willing to say that, that I find surprising.
yes
It shows that it's not being censored top-down.
aww man
hopefully not permanent
Ping mods for moderator related stuff please.
Ping me for questions/feedback/etc.
do you know how long the new rate limit will last

Someone ran custom bench for the model
Idk, even with the new limit gens take like 120 seconds lol.
nah mine was fast
I do not, specific rate limits for models can change over time.
its just i think thats what they did to fix the server issues