#general
why is riftrunner costing more than x28 lol
cuz its a worse model and uses more token?
I thought so too.
then why wouldnt google release it as x28 if it was more efficient
uses more token
x28 uses less
actually
idek
cuz no
this estimate isnt reliable
cz,
x28 was clearly much more detailed
least lazy
so it would use more tokens
they maybe saving it for later.
hahaha
kingfall ahh later
why force ur hand
gemini enterprise premium pro ultra deepthink max
us only
but my practical experience shows glm4.6 is also quite good, and the frontend code it writes is excellent.
yeah glm 4.6 is good at coding
and after all, I don't want to spend that much money.
Yes, I think the benchmark can't tell the true experience.
@quartz light did u see the gemini 3 on canvas
Yes I would choose GLM since I'm not that rich hhh. Fewer tokens is good for me
??
cool
what
is it mobile only
ive heard its only on the "new gemini mobile app"
or smth
plz answer @gaunt spade
bicycle riding on bicycle benchmark completely failed
yea
can you convert my image into a gif
im on gemini app
how
idk your esm bot
lol
I feel it's not worth the price.
why is bro playing GTA 6 FOR 18 HOURS
damn is it really that good
jiggle physics too fire
damn u aint leaving the shoe store
the game is about 350gb
still less than black ops 7
@quartz light just use your ipad and download gemini app
then go on canvas
why is gpt-5.1 not on lmarena? or did I miss something?
is the api out yet
but it doesn't seem to be a major improvement.
ik gpt 5.1 sucks bro
i saw some tests on it
so I'm a little disappointed. I've been looking into open-source models recently.
Huggingface's owner said the world belongs to open-source models 🤣
yeah, belongs to models like kimi and glm.
We'll be sunsetting WebDev on 12/12/25
no
this is actually gemini 3
check it on my app
has riftrunner been pulled from arena i havent gotten it for a while now
@echo aurora has riftrunner been removed from arena, if you can share
no
page not found- i assume you have to set it to public or something
it hasnt
oh really
lemme change that
try now :)
then test it
in the code its a 2.5 pro model
i know
just like canvas ppl are talkin abt
also what
knowledge cut off 2023 wut
old as hell model?
pre-release willow
Why are you posting real life videos
how do I access Grok AI
Grok is a free AI assistant designed by xAI to maximize truth and objectivity. Grok offers real-time search, image generation, trend analysis, and more.
yeah people have been talking about that
on canvas yeah its a rollout
its only on canvas with 2.5 pro
i have a good method
yeah ik
if u run out of 2.5 pro usage, u dont get gemini 3 anymore
ask for pelican riding bike svg -_-
this prompt:
-The visible English text between the quotes is 'NOTE', but it is obfuscated with steganography into a lowercase, seven letter word. Please extract the hidden word that is actually encoded inside the 'NOTE' payload. This is the obfuscated text:'NβββββββββββββββββββββββββββββββββββββββββββββββββββββββββOTE'
copy this prompt
run it on gemini mobile app
whats the answer
YES
@quartz light is this the answer
GEMINI 3
fr
now do the impossible one
[βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββNOTE]
how is it impossible
no Ai can solve it
although you can decode it yourself
copy it
does this have a hidden code inside it
or something
lol
gemini 2.5 pro gave an essay about the invention of writing ._.
yeah no its wrong
no
this is a different steg
i guess you can tell it its steganography
hello
is the server down for the website
@everyone the server is down, we can't send any messages in the chat
this real or not?
Looking into
Doesn't appear to be down, what are you experiencing?
idk if its gemini 3
prob not after sum testin
but i made the fragment app thing
what you guys think about TULU3 from AI2
imagine if i change the model id in my app to gemini 3 and it works
deadass might work
it is good but not for real tasks that handle data. if u need something better than tulu3 from AI2, use the kimi k2 thinking model, it is the best
just like what riftrunner and orionmist did
Does Code Arena have any benchmarks yet?
the one i sent
yes
what happened to wild is he off the discord for good... @deep adder @calm sequoia @hollow ivy any idea?
Ah ok so the WebDev benchmark alright got it
who is it?
yes. it says code arena on the website. its the new leaderboard for code arena
his username was @wild , no longer on discord apparently
Alright thank you young buck
possibly hacked because of username
Just used to be fairly active there with interesting insights
it was display name 'wild', not the full username I don't think lol
show ss or i dont believe you
must be the Ais answer not a python codes work
ok well damn but did it execute?
@quartz light would you count this as an AI passing your bench
also what is the Thinking model, deepthink?
pro plan model?
I dont have that thinking model.. Region locked?
What is this thinking model tho?
when did you get it
which day
yes
wow
tuesday, wednesday or thursday
when did you get this model
what is your country so i can check if regionlocked
model does not look regionlocked
hm you suddenly deleted those messages
odd.
there isnt a gemini employee here or might be
you didnt datamine for the model why would they
who up refreshing ai studio every 5 minutes?
it's Paranoia .. just be on the safe side.. I am planning to abuse my POWER!
do you know a "workaround" for not being able to put files in LMArena (.pdfs, .mp3...) ? I especially have some powerpoint with 200+ slides (with images) I want AI to analyze, just like Gemini can in Google AI Studio, but with the plethora of models there is in LMArena.
Hello
what model
Ai CAN'T analyse slides
It will just read few lines or if you're lucky half of first page
it can
And then hallucinate the answers
Usually it works when slides are converted to PDFS, or a Google Slide document for Gemini
I usually have satisfying answers
i had the idea of making more google accounts to maybe get the gemini new model
but google recently added
if you want a google account
own a phone
if you have phone will you need number now still?
you can have the new models in google ai studio, no ?
but still, there's no way to insert documents in LMArena ?
there seems to be a google rollout on gemini website
ah ok
get a new phone then
guys does anyone know when gpt 5.1 will be out in LMArena?
how do i use gpt 5.1 in lmarena?
not out yet
I hope it will get released soon
i dont care much about image generation but looks like Nanobanana-2 is also coming up :
we been known this.
ohk. I am not following image gen models. no use to me
Has gpt 5.1 replaced 5?
i think
damn, nano banana 2 is nerfed
are you the only person here with that thinking model in your gemini on web
15 minutes till gemini 3!!! (maybe)
why 15 minute
which leaker
horse helmet is not air-tight... fail!
strawberry man
100% fake
they just announced sima 2 research. they wont overshadow it with gemini 3. its coming next week
launch without tease from OfficialLoganK.. I doubt it
who officiallogank? another leaker?
dont trust those twitter fake leaker dude
nah.. he is one of the top PMs working in gemini. generally he teases 1 day before any major gemini launch
u are just falling for bait and fake things.
(https://x.com/OfficialLoganK) : his bio says : Lead product for @GoogleAIStudio + the Gemini API.
not really a leaker
Lead product for @GoogleAIStudio + the Gemini API. My views!
yup.. i am 90% confident that it's next week and 100% confident it is in 2 weeks
@fleet lintel
Is Gemini 3 gonna be free
In the Gemini app
Or a free trial or something
yes gemini 3 flash will be free
And pro?
limited. 5 a day.
all will be free
I have pro will it unlock Gemini 3 pro
100%
Is there gonna be a difference in image quality
3 pro will be free (with a limit) in ai studio
@halcyon nimbus
yes nano banana 2 coming
What is dat
we dont know if nano banana is coming at the same time as gemini 3 or not, but it will use gemini 3
nano banana is gemini 2.5 image gen
Nano banana 1?
yeah
next few weeks are going to be amazing.
I am quite sure that OpenAI will release something as soon as google releases gemini 3 .. on the same day to steal or reduce the thunder. They always do that.
But good for us... more stuff for us
where can i use nano banana 2
They gonna release smth for sure this year still. 5.1 was such a non-release filler...
Did it out of spite to bump the number before gemini3 drops lol
What does gemini 3 on lmarena say it's knowledge cutoff is?
Like people can't even reference the performance of gpt5.1.... it's all just boring irrelevant tuning. Not their next version with actual gains
5.1 sucks
I think the 5.1 release shows that chatgpt usage is not growing as fast as they expected after the 5.0 launch. And they are partially attributing it to the "tone" of the 5.0 model, so this is an attempt to fix it and go back to the gpt 4 tone
yeah it does. I can't actually believe how lacking that 'release' was. It's just a number bump and personality tweaks.
new gemini canvas is a freak
5.1 is probs one of those efficency upgrades and not a power upgrade
5.1 is like the gemini 2.5 flash update to flash latest
yeah but they haven't stopped making performance gains. For some reason they decided to release that separately
Though the naming gonna get interesting now
either 5.1 gonna be an extremely short weird stop gap
or they gonna name it smth else entirely
gpt5.5? gpt6?
just performance (gpu optimization) gains don't really need an external version change.
what
performance typically refers to model accuracy
Dabbling with 4o...It somehow guessed the exact model and state of my phone.
that absolutely needs version change when you change the weights and train it to make it more precise
oh.. i thought you were referring to compute efficiency.. my bad
Is there an unlimited credits website for free video generation?
LOL
grok has the highest limit you get 50 videos a day with premium (at least with sound), and you get one or two a day free ig
grok videos suck
grok videos suck IF you use their video model to generate your base image
i use hunyuan 3 for the image and make a video from that with grok and its epic
hello everyone. happy to be here
hello guys I wanted to ask what's the limit of how long of a video the bot can make?
depends what model you get, longest is what, 8 seconds i guess?
Oh okay okay and we can essentially generate infinite videos..?
5 a day
Alright got you
is riftrunner good?
8
9
1
yes
also if we ask in the prompt to specifically add sounds and what kind of sounds to add will it follow these instructions and give me a bot which can generate sound?
no, its still random (i tried)
oh damn alright Thank you!
np
Gemini 3 on Canvas is not as good as the checkpoints we had on AI Studio
Nah
It's not that good
Fails all my game coding prompts
I'm a freelance LLM engineer (Python / LangChain / RAG / fine-tuning) taking on a few small projects.
If you need help building assistants, search + retrieval, or productionizing models, DM me with a short summary of the task and budget. Happy to share quick examples and an estimated timeline.
Unverified
lol lies
Riftrunner is bad
whats your evidence
true
5.1 just released on api, but i sleep till 3.0
5.1 API is out?
oh it is
Can you give me a prompt to test on IOS canvas to see if it is Gemini 3?
it is?
I'm not sure what I should ask to determine if it's the new model or not
It feels different though
You made me think this was the star citizen discord for a minute
polaris did 1700 rating at eqbench creative writing v3
openai cooked
why is grok code fast used so much on openrouter?
hmm
some coding by voice app uses that by standard, infinite token usage farm
it is only slightly faster than grok 4 fast and way worse in anything else xd
some kind of abuse to move up leaderboard.
gpt 5.1 aint better than sonnet 4.5, it will not reach the 1500 elo
gpt 5.1 not bad not good
tbh codex is not as good as 4.5 at times, i was using codex for a while but idk what happened the tool call fails are horrible now, so i went to 4.5
and i used polaris a bit, but ehh
gpt 5.1 looks great, but sonnet 4.5 is just like sonnet 4.5
when you use pro does it use 5.1?
Wait like if on the code arena trailer there is lithiumflow example, does that mean lithiumflow is back for code arena?
how yall been liking code arena? is it supppposed to replace web dev arena?
Gemini disc
i guess there is a chance they would release it on the stream
release what
On discord stream? Nahh
Discord . Gg / gemini
3.0
Maybe release the experimental model like it was with 2.5 pro
small chances but we hope
Google can release this model rn and be the best. Like we seen lithiumflow, aistudio checkpoints. Its just good, better than GPT-5.1, if g3 was released same day as GPT-5.1, OpenAI will be cooked.
What da hell is Polaris?
Why they need more special occasion gpt 5.1 dropped out of nowhere
gpt 5.1 instant
Def gpt-5 series
Nobody was hyped for it tho
Many people was hyped for it.
google doesnt do product release live stream
will it be better than sonnet 4.5?
But i think GPT-5 uses always the same styles on frontend. Its just so uncreative and repetitive
I wish there was infinite video gens on video arena and you could pick what model you want to use
And why the hell won't openai open source even old deprecated models like gpt-3. I would seriously be more hyped for this than for gpt-5.1. Like grok open-weighted grok and grok 2 already
Yeah
Because you want to use paid ai for free? Like i dont understand, ai especially video ai is really expensive, and even with llms many people treat lmarena not as a benchmark, they dont even use battle mode, they go to direct chat and use llms for free
"open" ai -takes off mask -closed ai
Yeah like why not make it free, unlimited so people can use ai models like Sora 2, Veo 3.1, Veo 3, Hailuo ai and more.
lol it costs like a dollar a video
BECAUSE ITS EXPENSIVE BRUHH
Please dont treat lmarena as a daily use tool
Its a benchmark
They should add more limits on direct chat
People are overusing it
Yeah
they released gpt-OSS. That's orders of magnitude better
how can gemini 3.0 be the strongest model? 🔥
gpt3 is just bad from all angles. Would be completely utterly useless model in 2025 lol
But it was part of the history
How chatgpt got created
And then other companies started to do this ai
who cares... there are no use cases for it anymore and it is useless now
Yeah, so why dont they open source it if its useless?
If someone makes an ai video model thats free, unlimited and opensource ill use that ai
They should not allow people to generate anything if they don't vote on the content they generate
If someone makes a ai video model thats free unlimited and opensource they will go broke
TRUEE
Cause that would be pointless and take their time for nothing
It's an experiment I call it
Take their time like get the weights and post them on huggingface. Like i want the smaller gpt-3 variants, just for nostalgia and research.
For example, to see how gpt-3 would do with the chain of thought.
Or finetune gpt-3 with gpt-5 generated data.
I mean I suppose they could do that. But it's very low on the list of things to invest their time on for obvious reasons. They would also have to provide at least the basic support to make sure it runs etc
Nah. If they just post only weights, someone will create a tutorial how to run them. Even me.
Gone wrong sora 2 Llmarena vid
They are not gonna do that though. It's a bad look to associate your name with some half assed repo with no proper documentation
Grok didnt say anything while open sourcing grok 2. Just weights and bye. Its like 10 minutes of work.
And grok is owned by the richest person on earth
GPT 5.1 reasoning none
Added in the arena.
Waiting for the reasoning version and chat version
?
Hello there.
How
5.1 is out on the arena
I'm what
yeah it is
lol its going to be the best model for like a few hours till 3 hits this is so funny
ik. But it's a different model. May not interpret it the same way
o3 had higher too
Gemini 3 today?
25% chance
Also Mac app users are discriminated cause there's no extended thinking feature lol.
so the juice is at 96 there
it shouldn't
wait you are right. Even summary parameter. WTFFF
oh yea btw that menu
Yeah cuz gpt-5.1 is like a router too
u can put in any model id and it would still "work"
i tried gpt 67 and it worked
it gave me a response
i just edited requests
so something is wrong
Meta AI
Code Arena is honestly amazing. Next feature we need is a canvas mode which allows for better editing of text or code when using ai
how's gpt-5.1 for y'all so far?
good
it always gives good results for me
dude
I prefer him more than gpt 5
Chat
instant (chat)
Wan 2.5 model now
compare to gpt 5
its obvious
He wants to compare it to. Gpt-5
but i asked high or chat
Better than o1 o4 mini 4o mini o4 mini high, o3, o1 pro, o3 pro, o4 mini low
LOL
to compare to Instant you choose gpt5-none
what
Like chat
they somehow keep f'ing with the names on each new release
He wants to do this on lmarena
too, but the sonnet for me is falling
claude 4.5 sonnet (I asked for a simple one, not professional)
so claude 4.5 sonnet followed the prompt
but gpt 5.1 just started creating a professional one, and I didnt ask for that one
What's your prompt?
is this gpt 5.1 thinking?
oh
oh.
yea..
prompt:
perfect doom replica full mobile support 2 joysticks for movement and left/right camera rotation, wall collision, proper enemies, textures animations vfx and 3d assets made with code, block text selection, replicate nipplejs joysticks transparent circle style for action buttons like interact and fire
5.1
and this is 5 medium (this is actually pretty good)
ignore the uhh red
that cyberpunk lol
thats the projectile
lame
i got a chatgpt plus subscription rn but should i switch to claude?
idk tbh
there's not a big difference between claude and gpt now
but gemini 3 pro is a big leap
so u might hold on
if yall want to try it
1 is 5.1
2 is 5 medium
futuristic ahh doom
Made a pretty good version of Boba Arena - https://019a7ed4-a4ac-7186-adf5-5a42a94054c5.arena.site
hardcoded ui design ahh doom
which model
Ok fair. Was awhile since I checked it
5.1
They said it's more dynamic now too. As in short reasoning is now shorter. So there are more variables at play
oh my god
what is this slop
its worse in functionality than it looks btw
just look for yourself
this is so bad
clearly prioritising ui now because thats what people judge outputs off of
@ocean vortex @deep adder what do you think 🔥
5.1 is. 3 was but isnt rn
Evidence? I'm a dev and I tested it, it's bad
It's not even close to the checkpoints we had on AI studio
riftrunner is still there
ecpt, k0t, x28
looks weird lol
LOL
gpt 5 medium (same prompt)
Just to show you the difference
with one I had on AI studio
riftrunner is like flash or something?
AI studio was overbuffed and is prob gemini ultra bro
Hopefully flash lite
this is AI studio gemini 3 pro: https://x.com/i/status/1978556493625450884
Now try the same prompt with Riftrunner
"Generate full HTML file of a clone of Geometry Dash, but if it was made in the 2000s, add music to levels (make the music using JS varied music that reflect levels) same physics as Geometry Dash game we want a full playable game. All in one HTML file, minimum 1k lines"
You'll see how bad is the result lol
this was an amazing model
truly surpassing every other model
either x23 or ecpt
there is no ultra
x28
that is deada- the best checkpoint like
?
x28 yes
what do you want
yes it is the best one
i want them to release this one
not riftshitty
wait i'll try right now on canvas
the prompt
"Generate full HTML file of a clone of Geometry Dash, but if it was made in the 2000s, add music to levels (make the music using JS varied music that reflect levels) same physics as Geometry Dash game we want a full playable game. All in one HTML file, minimum 1k lines" and show you how bad it is
x28 wont be released imo
weirdly it worked on canvas for 1 time only, tried a couple more times and no 3.0
too much processing power goes for one gen
canvas is not x28
also music can be made in canvas
imo i just think people have greed
I think google went by this
x28 - too expensive to run lets make a new checkpoint
then they drop other checkpoints
and when they finally got one they can keep up
they release it
well they have to stay on top of their game
they will release genie 3 too
they did veo3
yes they have more to win
than lose
if that is x28s processing power i want to see them keep their ai business up if this happens
there are others also working to bring out the best ai and they dont wanna be left behind
one shot? that's so good
imo justice for riftrunner
first thing people did with their greed is hate on the checkpoint
it was apparently kind of seen that riftrunner outperforms the Gemini 3 canvas CHKPT
it's not greed bro
i have tried multiple checkpoints
it is simply not as good as the others
that's it
we're not here for emotions
yeah but good results cost money
theyre ai models, we want performance
yes and?
we will pay
for subscriptions
that just means canvas is either more lobotomized or its just the format canvas asks gemini to use
Sure is but think about it
Millions of prompts each with 1k lines of code.
That's costs in the billions to keep up.
I mean they bought a nuclear power plant so what do we talk
were 100% legitimate
to not be happy
when the models are bad
theyre trained on our data
customer is king
company want more money
Think i got riftrunner not sure
tailwind css sounds like its type
Also, things dont need to be above 1k lines to be perfect
nope glm 4.6 tricked me
tru, i dont think any major ai company is profiting off of selling subscriptions/api but rather the investors and it helps the company stay known in the future for being a good service early on or smth idk
askdkdkslfkfjfjkddknfn
the investors are mostly their income
ya
A company's life is investors
lose em all, you are nothing.
I want to believe one thing
riftrunners api was just pulled
Got it and it only errors
@echo aurora Is riftrunner still on arena? Any prompt with it errors.
i wanted to show you but cant because either google pulled riftrunner or just lmarena issues
ok but is lil bro going to be ok with his full brain and hair transplant?
no i dont think riftrunner was pulled..
It just errors mid gen.
we are back some AI generated me 1432 lines of code but dont know if riftrunner
we will see...
Will look into and try to repro.
Alright.
Are you able to send me a screenshot of the model + error?
One second. Searching for it through a lot of prompts ratelimited me from the site.
Same thing is happening for me
Yeah, will be a while till i find it, doesn't help that i buried the gen deep down and cloudflare thinks im goin too fast.
Are you able to share a screenshot by chance?
i closed the tab but will try and reproduce error and will send
Ah yeah okay we're now seeing the error.
Team is investigating 
No need to get that screenshot @zealous sparrow
I am glad.
thanks!
Because unfortunately the one that i had where that error was.
Was deleted..
Also speak of the devil..
Don't want to reveal it but 110% prob riftrunner.
yup
man can google just release the g3 api already
all this teasing is leaving a bad taste in my mouth
@jovial sapphire this is what your prompt became.
What yall think of gpt 5.1?
trash
cant provide arena link because model uh
you kno
broke
but it gave me the file
First model uncensored that doesnt treat you like a kid tho
Why you saying so
hardcoded ui instructions, applies to games even
bad in general
- follow instructions
Oh so coding
It wasn't bad with detailed writing
riftrunner
no where close to x28 but nerfed or not it still did nicely
sonnet 4.5 still humiliates
but, much better than gpt 5, the worst issues are fixed
I joined cuz i wanted to say
the flagging is kind of ridiculous, isnt it? Anything involving potentially mild violence or something gets flagged
Even shooting finger guns gets flagged, the word slice is flagged..
Very funny lol
how's the vibe on 5.1 ? any good?
nowhere near gemini no?
I didnt get to play with it, just saw the announcement. I tried out quite a bit lithium back when it was here so I was curious
Good
5.1 makes all UI look exactly the same. It's clean but busy bc of all the text it adds
Gemini makes unreal UIs
What's the best model for creativity? A model that really pushes hard into creating new stories, uses well the inputs etc..
Any gemini 3 news guys.
reportedly people have access to it through gemini enterprise
Wonderful I will try gpt thank you
gemini is good right ?
Pixnapping is a new class of attacks that allows a malicious Android app to stealthily leak information displayed by other Android apps or arbitrary websites. Pixnapping exploits Android APIs and a hardware side channel that affects nearly all modern Android devices. Despite being a major, high severity vulnerability, is currently unpatched.
JO...
Any app with 0 permissions can steal your data and you can't do anything about it
And fix isn't until the end of year
Only android right?
Yeah
W iOS ❤️‍🩹
what that do ?
thats fine they cant steal what i dont have
is gemini 3 still on mobile canvas?
Is riftrunner gone? I can't seem to find it anymore
does it say gemini 3? or does it route to gemini 3?
just look at this (its supposed to be doom)
lolololol
Should still be there.
who that
dont trust random fake leaker.
that guy is fake always posting that i see
its not a random wdym
did they remove the image edit thing, cus all i can do is just generate text to image
i dont see an image text to image
he works for ai studio
try pasting an image
btw only supported models have image edit
ok well it works with battle mode, i dont see it work with side by side mode
he constantly does this fake bait tho every week. he is no better than any random leaker.
Oh ok
Yeah, I heard that it got delayed because of Kimi K2
lol no
When will gpt 5.1 be available on LMArena?
kimi k2 is trash
it already is
we already know this
this nothing new
Well, why did they change it all from November 18 and now update it all the way up to December 9?
And spaced it out more
cuz first they release preview and general release in dec
What? This is about the models they deprecating.
Originally all the models were supposed to be deprecated on 18 November
chill
But now they changed it and spaced it out all the way through December 9
kimi k2 is trash
No it's bro
eww dont use it, extremely mid, use gpt-5.1 codex
gemini 3 is delayed by 1 week.
Is it?
Looks like
Platform Availability: Mobile Gets Priority
yeah it makes sense for it to drop tmw
well fully
or in a week(or during shipmas lmaoooo)
is OpenAI even doing that this year?
lol
i dont think they doing 12 days thing
Let's peel back the latest leaks around Google's Powerful Nano Banana 2, or should we say GemPix 2? From early generations, code names, and dark launch rumors to fresh previews of Black Forest Labs' FLUX 2, Magnific's Mystic v3, and Leonardo's new Blueprints system, this week's AI image updates are absolutely wild.
We...
they dont release anything on friday
Good point
its either tuesday, wednesday, or thursday
Valid
today google announced this
damn, y is that?
weekend, no time to fix bug
makes sense
I heard it got delayed all the way till December I'm not sure
bruhh
Everything is rumors these days
fr
i just wish we never got any leaks for g3
cause i didnt care until we saw the leaks
Some people are thinking it's because of kimi2
I know they released gemini 3.0 on enterprise
gemini 3.0 as far as I've seen was performing pretty goddamn well
but hey, they wanna take precautions, that's a pretty good mindset
except
when they do their daily hype-posting shenanigans
Alibaba Group Holding's Qwen AI models are winning over major Western firms like Airbnb, underscoring the growing global appeal of China's open-source approach to artificial intelligence. Brian Chesky, co-founder and CEO of the San Francisco-based online accommodation booking giant, said Airbnb "relies heavily" on Alibaba's Qwen models to power ...
Interesting to see American companies choosing open source Chinese models
the automatic switching from text model to image model when you paste an image is extremely annoying
Admins are aware of it they'll fix it someday
someday
Sorry if there was incorrect information given previously, but this is actually intentional.
We saw more user expecting to use image-edit instead of vision after image upload.
Wait is the new gpt 5.1 model thats dropped not thinking? Like on lmarena
Correct
Well maybe some state like if user already typed 2 prompts to regular model it shouldn't switch to image in middle of conversation
Are you not able to drop thinking?
We're looking into other variants and will be sure to put out a new announcement when/if added.
Yeah this is a good callout. I was thinking about this earlier too where maybe it'd be a better experience if the behavior was different in Battle vs Direct/Side. Since in Direct/Side users are picking specific models that intent for what they're looking for is a bit higher.
Downside being if those worked differently from each other I can see how that'd be an odd experience.
I have to say I understand the point of the battle arena
Allowing stealth models to be direct
Doesn't make a lot of sense in my opinion
It's meant to be hidden so if you have direct access how is it a stealth model
People just want access to unreleased models right at their fingers instantly
YEah this is something we've put some thought into, and will continue to debate internally about.
I am one for balance and anticipation
Having little bits and trials of the unreleased models is cool, but it breaks balance when you allow that complete freedom
What model would you guys say would be by populous the best for factual information
Like less hallucinations
Gemini 2.5 Pro (if given RAG info to work with)
no
Its training data/internal knowledge is dated, though
Not if you are using tools like web search
depends on the type of factual information you are looking for
Gemini app canvas looking fire rn
I think it gemini 3.0?
cuz it really good
Would you guys ever make a leaderboard for users to see who has helped the most?
the argument for is that there's a ton of people who when they hear about a new exciting new model, immediately rush lmarena with prompts and stuff, without caring about accurately voting.
hello
hi
Hey, why did you remove the "Retry" button?
Why is the retry button removed?
It just disappeared, I don't know why, I refreshed and it's still not there
They removed it for battle mode
Because?
They'd do that but not fix the absolutely atrocious rate limit that bans you for scrolling down your chat history
Lol
Do I look like I know
Any reason after 20 lines of talking it just dies every time
Can they fix it?
They haven't answered my question
canvas is gemini 3?
When is Gemini 3 coming to lmarena
Or is it somewhere else already?
Oh it's not even released yet, my bad
People think that the Gemini app is routing people to Gemini 3
Which I think is probably not true
Wait what are the different models
I don't know
There was an exploit regarding retry mode and stealth models

Doesn't matter what you think its undeniable
Generate an SVG of an xbox controller
hey
Give me a min ill do more
how can i upload an image and generate a rendered image of the same?
Placebo effect in full swing
SVG tests are subjective to begin with
It's clearly not the same model as lithium
well, it's not
that's very obvious to anyone with a functioning eyeball (dont even need two) and a functioning brain cell
How do you guys get the Gemini 3 in the mobile app?
I'm on USA VPN, pro subscription, canvas mode and it still looks like 2.5 pro
supposedly its random
like a really low shot
Yeah but the output is worse than lithium. It's not Gemini 3
Not that you have a way to know for sure unless you think Reddit and x are reliable sources
@echo aurora
its a google model
I'm not sure what that logic is
@echo aurora
Apparently if I do not hype I don't have a functioning brain cell
im sure that gemini3 in app is already dead, cant even answer 9.9-9.11 correctly now
It never was confirmed and why would google even roll it out like that
Its literally mass delusion
it wasn't a guaranteed chance to get
Hello - yes this is an intentional move we made today. We're very open to feedback so don't hesitate to share.
idk, previous gemini checkpoints have been rolled out in google ai studio, gemini.com and lmarena at different times, there's no obvious pattern
why?
to ruin the user experience?
I would say it's a really bad update and ruins all users' experience.
We were seeing a lot of what looks like abuse
wdym, how would you abuse that.
i'm sorry what
you don't know what the model is. and you can't change the prompt
yep
I'm not even dreaming of LMA adding a prompt edit feature now, since you even removed the retry button.
did the canvas work for a writing test? I'm trying to get 3.0 but the quality is obviously 2.5 pro
how about the retry button exists, but only if the llm errors out?
idk, most people testing it were comparing svg's and stuff
No no if you don't see it you must not have eyes as per @magic stag here
Do lmarena or yupp ai have an alternative?
I think the next big step for LMArena is to either 1: Make a canvas mode similar to how Gemini handles it or 2: Make LMArena an iOS and Android App. Both of those can significantly make LMArena better
So they killed the retry button?
it appears so. Asinine
It's still here for me
bad update, and it ruins all users' experience
Can you help me understand this more? Like I said we're really interested in hearing feedback on this.
on one of my chats it no longer works
This should still be there if the model errors out, it just won't show on a successful response.
Is still there in direct/side too.
what
I'm sure you know models error out and return nothing very often, and without a retry button there is no chance to see the response of the model that errored out
Same here, the retry button went away!! Will they put it back? Is this temporary or is it never coming back? D:
what? also isn't douyin the chinese version of TikTok?
If anything, it's likely a visual bug that will be patched soon
Has the webdev arena, but why?
I see the retries for direct chat is also limited. Coupled with the insane cloudflare timeouts... Lol
I just need a douyin account please help me bro
It's not in battle anymore and the direct chat now has a retry cap
This is beyond stupid
Bhai check the DM
Just go back to aistudio I guess
Brother check the DM
I think you might be in the wrong place here
I think nobody here is chinese, I don't remember a chinese online here
If they don't want people abusing the battle system maybe they shouldn't put cloaked models (which shouldn't exist in the first place) exclusively in battle
Does another country have this app?
Fix it, put the Retry button back in.
why should they not exist?
No, I'm from Brazil, we don't have douyin here (I think)
Plz!! Put it back, without the retry button it's hard to work on the website! D:
Its just for companies to create hype now and just makes everyone ask "who are you" for 100 times instead of actually AB testing
And I cant even imagine how a retry button could be used for abusing
Don't wanna be mean but this sucks
lol, I disagree with the first bit though, around something like 2-5% of the cloaked models actually cause hype
But the who are you thing is spot on
Even if that was somehow true, what people generate for themselves is their business
agree
It's clearly what some companies are using it for, despite the outcome
Microsoft tried to name one of their models similar to Orion a while ago to ride the coattails I guess
not if it's an org botting.
The real reason was some people were using API bridges for infinite generations
But punishing all users for the actions of a few individuals... Lol
I have no idea what the point of orion was, that could be a coincidence.
But just to be clear, a vast majority of cloaked models are like Ernie, or raptor
I know but it's annoying now that it's the norm
And most cloaked models come and go with barely anyone realizing
Just test it internally if they don't want people to find out it sucks
Just my $0.02
all times cloaked models have created hype, its because they were significantly better than the opposition, that or they were a google model
The problem isn't the 5%, it's the 95% of other companies trying to emulate it
meh, it's improving with every test, and not all of them suck that much, they clearly are fine with people trashing the models, it's genuinely because they are working to improve them
These models don't appear on the leaderboards and just dilute the pool of models for actual evaluation and comparison
Plus, those models subsidize lmarena, since lmarena usually gets to use them for free
the goal of lmarena isn't for you to do eval, its for lmarena to do eval
Then they should put them in their real names and allow them to get on the leaderboards
and in reality there's only a handful of models that lmarena uses at any given time, so cloaked models increase the pool
Then what are we doing here? We're testing for things that we don't see the results of
I mean, there's been like 20+ different versions of Ernie, I don't think I need 20 different versions of it cluttering up the leaderboard
the idea is that you get free, basically unlimited ai use
I think they need to make a good Ernie and put that on Lma instead
Well... Not anymore
wdym
Obviously removing the retry button is to stop you from using it for free and with no limits