#general
They drop on specific day because of the news and stuff
And the other week is Thanksgiving soo
In the December lmao
If we look at the X factor
It may be the case Kimi K2 has got them worried
Yeah, I'm on the same page. That's what I'm thinking too, but I don't know.
Everyone is saying the 18th
People forget the benchmarks are one thing but price is an even bigger factor.
That's a pretty sweet deal considering the performance
is riftrunner on code arena?
lol
lololol
fr
on a Saturday is wild!!
yea
why is a ceo quote tweeting a polymarket?
trolling
btw is it still there or did it get removed?
i didnt expect it from him.
neither did i
idk
sundar pichai is not a troll type guy.
if it was elon, i could see it 100%
🤔🤔. two emoji meaning two letter. NO.
he would have sent three emoji if it was yes and gemini 3
OH MY GOD ITS ALREADY AT 83%
Place a bet
lol
place a bet against
You'd need to put up 10 racks
You put in 10,000 dollars
Each No-share costs 0.32
You get 10,000 ÷ 0.32 = 31,250 shares
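For reference, the share arithmetic above can be sketched in a few lines of Python. This is only a rough illustration: it assumes each winning share pays out exactly $1 (which the chat asks about but does not confirm) and it ignores fees, slippage, and order-book depth. The function name `no_position` is made up for this example.

```python
def no_position(stake_usd: float, price_per_share: float) -> dict:
    """Size a position bought at a fixed per-share price.

    Assumption (not confirmed in the chat): each share pays $1.00 if the
    market resolves in its favor and $0 otherwise; fees are ignored.
    """
    shares = stake_usd / price_per_share
    payout_if_win = shares * 1.00
    return {
        "shares": shares,
        "payout_if_win": payout_if_win,
        "profit_if_win": payout_if_win - stake_usd,
        "loss_if_lose": stake_usd,
    }


position = no_position(10_000, 0.32)
print(round(position["shares"]))         # 31250
print(round(position["profit_if_win"]))  # 21250
```

Under these assumptions the whole 10,000 is lost if the market resolves the other way, which is the risk the chat is joking about.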
they aint releasing on monday
Wait, so if you bet on what is it a dollar for the payout
there is no legality left for tech
damn
insider market influencing.
its def market manipulation
sundar pichai can influence it.
now we talking about buying it lol
I forget what the term is how they legally defined it
wan2.5-t2v-preview is actually extremely good, check it out, it might outperform sora-2-pro or even Veo 3.1 in the benchmark
its free speech.
ima bout to drop 10k on it cause i saw that tweet
if it dont drop on that day im effed lol
so thats why its kinda shady to tweet that knowing their influence
what if two emoji meant no?
lmaooooo
gemini 3 and the word 'yes' both has three things.
then GG
A closer look at the rise of gambling and how it's infected every part of American culture.
He breaks it down
how is this related to ai
Poly market
gambling is related to everything now
yall check it š
hes gemini product lead for subscription
they might as well just say its releasing on 22nd
before or after? cz thats the same screenshot
tweeting polymarket is crazy
Man, there's a term for it
what u mean same screenshot
They got around because it's not technically betting I forget what they used to describe the
Because they don't use any of the investing terms neither
Which would make it illegal
all signs point to next week. why would they release in thanksgiving week.
Sketchy
Account have been hacked before that one kid that hacked all those Twitter accounts to promote his crypto scam
Florida authorities say a Tampa 17-year-old is allegedly among those responsible for July's massive Twitter hack that compromised the accounts of influential individuals, including President Obama, Vice President Joe Biden and Elon Musk. The scam, which occurred over a day, netted more than $100,000 in Bitcoin, authorities said. Now authoritie...
This kid
logan also tweeted
22nd is saturday my guy
they cant tweet that and not release that day, that is literally manipulation
i understadn its a saturday
the bet is for any day before nov 23
tell me any day and ill buy that on poly rn
Guys are out of your mind if you think it's gonna come out on a Saturday lol
this is easy money
17th1!1!1!1!1!1!1!1!! https://gemini.google.com/share/af41c41946cf
31% is nuts
whats good with you bro?
what is all that
its scary lol
xddd
Is there any way to search past results cause ChatGPT five had the same thing?
i thought i almost dont drained clicking that link
That's why I sent that video
Speculation
69% is for any day before the 23rd, and 72% is only for the 18th
yes yes the thinking emoji unicode is 1F914 so this is what its all about
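The code point mentioned above is easy to check. A small sketch using only the Python standard library; it shows the character's official Unicode name and its UTF-8 bytes:

```python
import unicodedata

# U+1F914 is the code point quoted in the chat.
thinking = chr(0x1F914)

print(unicodedata.name(thinking))      # THINKING FACE
print(thinking.encode("utf-8").hex())  # f09fa494
```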
yeah 22nd makes no sense but before 23 makes sense lol, and 18th seems like its right
why not 17th
no one releases on monday
goog
Yes
show me one example
ChatGPT 5 was Monday I think
Goog...
cause monday has 3% chance right? @cloud zinc
chatgpt 5 was aug 7, thursday
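The weekday claims in this thread can be checked with the standard library. A minimal sketch, assuming every date discussed here falls in 2025 (the chat never states the year); `weekday_name` is a helper made up for this example:

```python
from datetime import date

# Fixed English names so the result does not depend on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_name(year: int, month: int, day: int) -> str:
    """Return the English weekday name for a calendar date."""
    return WEEKDAYS[date(year, month, day).weekday()]


print(weekday_name(2025, 8, 7))    # Thursday
print(weekday_name(2025, 11, 17))  # Monday
print(weekday_name(2025, 11, 18))  # Tuesday
print(weekday_name(2025, 11, 22))  # Saturday
```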
lmaoo its wild imagine google exec and partners on polymarket betting
its 40% tuesday, 20% wednesday, 40% thursday. the days of week when they release
google can make easy money for investors by changing the day to like tmw lol
Dude, they got bigger investments than this 800 K
Lol
They got shareholders
polymarket is not investor
wait i just looked over the timeline of gemini model releases
2.0 is literally thursday wednesday thursday wednesday
You don't get it
no im saying tehy tell their investors they release tmw and then they buy billions on that day
Ya 800k is toilet paper money
but that would sway it too much i guess
thats insider trading
But one could argue though
bro everyone does that tho and they are literally tweeting
guys
its going to release on thursday
One could argue that if someone wanted to and had insight information
but they dont buy billions
I mean, if I worked at Google and I knew the date
told u, no one releases on monday
not everyone at google knows tho
i just looked over 2.0 release dates: its a cycle of thursday, wednesday, thursday
then 2.5 released on wednesday
I know I'm just saying if I did know
if i knew, i aint telling no one
Bro, I'll drop my whole check on it
i wasnt srs about 17th
My whole salary for the year if I knew
and u get to jail for insider trading
which is.....
Corrected again
when did i say that
socky3635 - 8:48 PM
thats insider trading
I have been corrected twice again I apologize I misread that
But if this is not an investment, then how could this be insider trading, just on the side?
low tier employee would get sued
Don't they have corporate protection
Isn't that the whole definition of a corporation?
u have to be executive or sum to get away freely
I'm pretty sure lots of people already do this
Well, the thing is, to get hit with insider trading, you have to be part of the.
they need to just release flash first to tease us a bit
no one cares about flash
U.S. Commodity Futures Trading Commission (CFTC)
you will if its better than 4.5 sonnet
Polymarket is illegal in the US so you won't get done for insider trading or anything like that
They're the ones that do the insider trading
it wont 100%
flash is going to be the model everyone use in the future not pro
Investigations, I mean
for regular people sure
we only care about SOTA here.
i promise you will if it is better than sonnet and just as fast as current flash
that is SOTA
thats SOTA in speed and performance
literally would be best model
then what about gemini 3.0 pro?
there would be no point in using pro
Interestingly, their terms do not explicitly say “insider trading,” though they do include a catch-all for “violates Applicable Law.”
nah they will nerf flash to accomdate pro
Flash was good because it was a cheap alternative
wait is flash riftrunner?
maybe lite
what are your expectations for lite?
idc about it that much
I mean, there's still a lot of use cases even for something like ChatGPT 3.5 ChatGPT4
but i wonder if flash is gonna be on par with sonnet 4.5ish(maybe) what would lite be like?
People think they're outdated, but there's still a lot of use cases and you figure that the price would be dramatically lower
thats what i am saying
If 4o was around I'd still use it today
so lite and flash would be prob used more than pro because of cost and speed, ill use pro when i need really complex stuff but day to day, i would take the speed over performance any day if it is as good as sonnet 4.5
it'd be cool but the competition can use it
which one is riftrunner again?
Anybody know how this works?
You just post your Sora video link, and it downloads the video without a watermark directly from open AI somehow
I can't figure out how it works, though
removing watermarks isn't very difficult
No, I get that part but you don't upload a video or anything you just copy and paste your link
The traditional way was you upload a video and then the watermark would get removed
Then some other places are even selling you could download your generations and high-quality without being a pro subscription
How's that possible just from a link?
Like I tried to at least two videos, and obviously it removed the watermark with no smudges or anything
But they can only be videos that are already posted to your profile not your draft
I noticed the trend like two days ago. I was just curious if anybody knew how the hell they did it.
They would never catch one employee
ššš
What app is that?
Looks like a chat group or something lol
Kimi k and ChatGPT 5.1 lover? Lmao
🤣🤣
who that
yall lowkey late with it
he said its dropping tmw
or late tonight still more work on it before they can release it
this is for Open router bth
btw*
The iOS endpoint isn't watermarked
one piece reference?
No it just how it designed pirate ships for some reason
If it's Gemini 3 it's as stealthy as a 🥸
They all come out, looking like one piece for some reason
so weord
maybe bc of its popularity
who that
I've been trying to get Stalin and adolf in a boxing match
OpenRouter team
it might
It's gotta be possible
might no.
why?
Hello, I would like to create a short, somewhat artistic, realistic video about a violin.
Hey! You'll want to read this guide in https://discord.com/channels/1340554757349179412/1397655624103493813 to learn how to properly prompt the bot.
Why can't I even continue my chat with a picture attached started yesterday? When trying to continue the chat it just returns nothing and says 'something went wrong'. After deleting the retry button, now you guys are even trying to forbid people from continuing their chats???
fr bro
I command you to resolve this issue within three minutes.
@echo aurora
be friendly bro, I trust our friends of LMA would deal it as quickly as they decided to delete the retry button.
This looks like a bug, can you help me understand how you're getting this? The retry button should still be there for failed generations too.
Are there specific steps you've taken that I can take to reproduce this?
yo @dull jay
do you have message cannot be retried error in side/direct
I think there's something wrong with that image attached, the image host of LMA got expired too quick.
@dull jay
literally there is no message that can be retried here in LMA now🤣
yeah so im not the only one
i was so confused why the retry button was different yesterday
and moderators don't even want to give us reason
about the cause
no matter how persuasive we are
like bro
they say that this button is being used for abusing or what, and there is no proof
yeah
its annoying
if the retry button wouldn't return
all people in this one server is gonna quit
Are you not seeing the retry button at all? If you use direct/side are you not seeing it there either?
I think there's something wrong with that image attached
I'm not sure I'm following this either, can you elaborate?
is a/b tests on ai studio right now gemini 3?
For clarity -> the retry button should still be there in Direct/Side by Side modes up to 3 retries, and if the model is erroring out the retry button should still be there too. There wasn't some additional change that removed the retry button completely.
yeah there is still a retry button in Direct/Side by Side, and I'm talking about continuing the chat in Battle mode
I maybe not able to describe it very clear, all I know is that a chat with an image attached cannot be continued after 1 day.
battle mode chat = no retry
direct/side chat = 3 retry
Is this for image modality (image-edit), or for text modality (vision)?
vision
battle mode without retry button..very uncomfortable.. reupload repeat all the time 😢
Pine, is it possible to generate an image in another aspect ratio?
don't forget the message cannot be retried
gemini 3?
Okay I'm going to try and repro, the steps I'm taking (please correct me if this needs adjustment):
- new chat, image upload, remain in text, generate a response, don't vote
- wait a day
- start a new vision prompt in same chat
- error messages appear with no retry.
Do I have that correct?
yep
I'm sorry to hear this, we are actively looking for feedback in #1438733849235558480 so sharing thoughts there would be much appreciated.
and this maybe not able to reproduce, I dont know
How many chats were you seeing this for, just one?
Just hope this could be fixed
just one for now
@echo aurora, is it possible to generate an image in another aspect ratio?
The image ratio is going to be set sorry to say, prompting it "make this X ratio" isn't going to adjust the output
Sorry I'm confused, I'm seeing the retry button in this screenshot?
No problem, this is something we'd like to change one day.
It'd be a nice feature to have for sure.
Really depends on the model...
Why are we limiting retries??
It is very inconvenient since I am generating some action shots that have some blood in them
Usually it takes atleast 5-6 retries for me to get a generation now I have to re-upload the prompt in seperate chat multiple times and it's very time consuming
Is there a way to access Gemini 3 directly?
fr bro
Lmarena is for testing models (and I suppose leaderboards) so “inconvenient” shouldn't even be an argument
Just use the original model from the original provider
What rank for gpt 5.1
looks like my information was correct. Nov 18th release of gemini 3 🥳
How we supposed to judge and test without an output huh?
It sucks now, I have to keep opening multiple chats, and on mobile it's even worse
True

I really want that retry button back on battle mode, I love it
<@&1349916362595635286>
Same dude different account, he was already banned before
is there anyone stuck in a loop like me? jus like... bot A and B count to 90s and start over again and again
i see... now the site is randomly auto refresh itself
sometimes Im typing the prompt and it all gone
guyz gemini 3 pro!!!
look at mine (creating image of dress and bot give me back this photo) haha
its GPT-1 mini, ignore the prompt and draw random things
like this
"judge": and the "judge" is using 1 model plenty of times because you prefer that model
i dont think thats judging
your just using it for your own use obviously
i think thats just you bro
im not the one complaining about retry being gone..
i make sure i try atleast a few ais to get a better understanding
whats the pattern
you can go figure
weirdasf
why? do you want everything to be explained to you simply
"explain ts like im 5 years old"
Not a bad illustration of Planck scale quantum froth.
Pixar style. Morning. A German Shepherd police dog in the unit pricks up an ear on hearing a signal over a mini radio. Close-up of her face: a serious look, bright soft lighting. The camera moves smoothly; she nods confidently and dashes for the exit."
my truly sincere and honest reaction to the message you have provided for us to see has been displayed in the image sent above
I'll add this to that.
What do y'all think the best model is for web dev? I've tried loads of them, including the top 2 on the leaderboard, they perform good, but then I tried gpt5.1 codex (which I believe isn't even top 10), and it did much better, same for deepseek v3, not as good, but still very well
Types of Lunar Eclipses
1️⃣ Total Lunar Eclipse
- The entire Moon is located within the Earth's shadow.
- The Moon appears as a dark disc due to the complete absence of sunlight.
2️⃣ Partial Lunar Eclipse
- A part of the Moon's body is located within the Earth's shadow, while the other part is in the Earth's penumbra.
- The Moon appears partially dark.
⚠️ Note
- When the entire Moon is located only within the Earth's penumbra, it appears as a dimly lit red disc.
- This is not considered an eclipse.
If the platform allows the user to consistently use only the models of their preference, what is the problem with this approach? Your Excellency's opinion?
4.5 32k, 5.1 high, 5.1 codex, 5.1 low (best price to performance)
5 low and 5.1 low are the best imo
ive been using them on websim
so he deleted the post?
hello guys how i can generate this
Hi please check #1397655624103493813
Noooo! wArNInG Drop bear!
Hi
Why are people dissing ChatGPT?
It's so funny because hands down Claude is the best model hands down
Is there a channel on this server about new added models?
erm
Why am i finding about notebookLM right now
Ts is so op
It litteraly gives answers and questions exactly as in the documents..
gemini 3
gemini 3 is mostly confirmed that it's next week
who confirmed it
Hi, can tell me why removed the image reset button when I uploaded an image?
what was your top p and temp
i intentionally glitch gemini
In which case you use it?
make a nes emulator and say back to me if it did or no
im going to test the thinking injector prompt it gave me first
on a diffusion text model
Good
wym?
CEO of Google (Sundar) tweeted (more like re-tweeted) that indicates launch is basically imminent.
why did i get rate limited? lol
simple bench says that its worse than gpt 5 high
is it better at coding tho

well well well
these models are not self aware (which makes sense) and need system instructions to know who they are
who deleted post?
yeah no, thats not the reason why its saying that lol
nano banana 2 is AWESOME
why then?
thats what it should be saying
What the
im doing smth to lmarena
ohk
its a gta 5 mod
nah the map is kinda bad
thats fake. show me the tweet
thats def gta 5 mod
link me the tweet
you can see the username bru
i dont see it bru on his account.
i cant put links here cuz it gets deleted
u putting fake tweets
yes
Good morning fellas tell me some good news
Is she out
That beautiful model that nano two
no
Damn
this week
tuesday is a good day for releasing models
@quartz light yo is GPT-5.1 codex good at coding
than 4.5 sonnet and opus?
depends on use case
its good at agentic coding
i support you
thanks?
its universal truth for me
š
yeah I heard elon talking about doing that, seems insane
totally insame i think soo
was that not google?
nope
its nvidia
the God of GPU and AI POWER
LoL
it can be a collab
Why does YouTube let people watch videos when they could watch ads instead
its their own tpu. no collab
name? btw thanks for the news
Project Suncatcher
blind?
omfg nvm
its not on code arena
...
bruh
i didnt realise i had code on
pineapple sorry for ping
No all good, but it should be in Code Arena too
Are you not seeing it there?
no
codex and codex-mini are towards the botton in that list
yeah now what
but thats codex
not high
theres 5 medium and 5.1 regular
there is no 5 high
but I wanted to compare 5 to 5.1 high
i know
so it is intentional
@round sedge check it
why is there no 5 codex or 5.1 medium?
if theres 5.1 codex and 5 medium
Ayoo
Tell it to not use jsnes
Cheater
medium
We're still looking into adding this one, tbd.
5 codex
IIRC there was a latency issue we were having with this model (on our end) and didn't feel good about adding to the arena as it'd bias votes.
another one it made without jsnes: https://019a8860-78f1-7481-8ca5-64c6b100b141.arena.site
and yes its bugged but
kinda works lol
it added mobile controls to this one but it has the same problem https://019a8860-8ab6-7e45-99ab-3c896c818693.arena.site
Sure
Yo whats this nes rom??
Oh nvm games
@echo aurora is there a way for the model selector to save your last used model?
Essentially if you leave the site and go back you'll still have X model selected? Sorry to say there is not, but it's an interesting idea.
yep, to keep ur model selected, so you don't need to pick it every time you visit the website
We have been exploring having URLs that can link to specific models, for example: https://lmarena.ai/?mode=direct/modelname
No, but it's something we're exploring as a possibility
Very understandable why having that feature would be helpful.
should i send it on feedback?
@echo aurora Are there any other mapped shortcuts besides ctrl+S?
if i have an idea to make asi legal for companies how can i share it.
share it here
OpenRouter just added the two Sherlock alpha models up but they both give error 400s for now
check back soon
Why is nano banana not working in lmarena.ai??? Can somebody tell me smt
its grok
grok? 😮
What happened to gpt5.1-high? I've been waiting about 20 minutes for a response (and yes, I've tried a new window).
just use it in ai studio its way faster and more rate limit
Sweet guess soon wasn't that long :D
yup its working for me too :D
interesting that it identifies itself as Sherlock maybe its not Gemini 3 then
Do you guys know why it's not working though?
It should be running fine, but today it's failing for no clear reason. If anyone knows what's going on with lmarena, I'd appreciate it.
happens every day i try not to think about it lol
Yeah, but the issue now is that it doesn't even show up in the rankings
i see it in rankings
yo guys its been a week about lm arena is not working for me like, whenever I write a prompt to any model, it immediately gives an error without even starting to generate. I've tried changing accounts, using a different device, and connecting with a VPN, but nothing worked
From which country are you guys using it?
Give him an unsolved crime case to solve. The answer is quite interesting.
from turkiye
its large.. its a solution for makeing the asi legal for everyone.. i dont want to share it here , so some company can say its their idea and cannot others use it.. such as any company..
seems fine
"make me a cutieee thingyyyy"
"cool japanese women" lol
And do you guys get the “redo photo” option? I only get the download button
"make me a cutieeeeeeyyyy awww japnesee womannnnn, ewe!" ( Thats How Its MADE!!)
i laughed hard on this...
headshot photo
do you guys get the “redo photo” option?
i see retry in direct chat with it yeah
huny 3 is way better tbh, till next week when nano b pro comes out
huny 3?
whats nano b pro
nano banana pro
next week?
who said that
ive seen a lot of high up x speculation
i only get redo option in direct mode.. i tried in dual-mode (make me a man that very very oH! CrrEe"OH"pyyy ganago"aH"!) but didnt get the redo option...
thanks for helping me
oH!, In What Did i helped you IN?
for asi
in thea prompts?
the photos its indeed not fro asssi!!!
how to make asi legal for companies
You helped me because I was getting stressed thinking the redo option had disappeared just for me.
who are you? and why are you intrested in that?
i am trying to make asi legal
Wooooo!, Ooh ! Okay Oaky!! , thanks for me thenn... š
if i told you tha idea , and you have been UNcoverd as G00gle im gonna cry harder than enouough!!..
dOnT! Get StressUd! Honney!!.. stress is made for me only.. hehe..!!
who are you? , are ya a team of lawyers working ot go0gle or sometng?? š
just tell me idea, i will credit u
maybe
i dont need a couple of milions in crypto even if in xm.r .. i need the freeedom , tahts why god choosed me as mind , and he why did god puts the idea in me.. hee knows me as of (MY GOAL ISNT MONEY BUT FREEDOM)
if i did give u it, and you this made others cannot use it, someday i cannot get the better asi but a lower one, even asi cannot be opensource just for that which is will eventually gonna make me cryyyy!!!!
i will provide freedom if u share.
so u will be out of stress
share for free and then you gonna make it freedom??
Ne wmodels onopneorouter??
i need some m.oney alsoo , i need it , so i can expand my other researches...
if others companies still can use it.. and its written for public and got famous.. im in.. and forsure its not a couple of thousend of eu.ors..
im not planning for using that moneoy for living a rich life.. but i want it in crypt0 so i can expand some of my researches.. im a believer of god..
hi
can you come private?
u said u dont need money
maybe..
nahnah..
true true.. if my goal is freedom.. i want it , so i can make my own version of it.. and all i want is a api key.. so i can code it in my own space..
ai apikey..
first share idea, then freedom
its loong.. i dont think discord would letme paste it here..
dont laughat me saying freedom of under a law.. š š
come somewhere private.. other then discord.. and see.. is it okay in session app?
dont you know also discord is a company?
who are you working for? , can i know?
private info
choose one.. "usa eu or china you"
i know every option is more harder then the other.. but if you china , i may be more easy with you.. beacuse i dont remember china have a stong stricted laws about asi..
*strong
yall tried the new sherlock model?
It's alright
Please let's respect eachother personal information. Thanks
nahh its amazing
like g3 levels
been doing a lot of tests on it
It throws me errors on openrouter..
guys why when i ask opus 4.1 what model it is it says its claude sonnet 3.5?
im confused
Models are trained on the past, not the present nor the future!
ohhhh
thank you so much
i had a feeling it was a system prompt
like how t3 and claude does it they kinda just make the ai model say what it is
using a system prompt
thanks for the quick answer sir
which one are you using? think or dash?
Hi
Hi
Aye
thats weird, im usign it in opencode and its smooth
What is the dif when you use OPEN CODE
good question, but it seems to work
but i used it on openrouter as well and it works
try refreshing
So you use it for no reason you just like the provider ?
Might be the version of ai you are trying to use got deleted
Try ne chat
New
What should the maximum length for a video be, generated in this discord?
7
22
10
1.5m
9 is larger than 11?
You have to dm
alright..
u sharing?
private and encrypted communcations is better..
grok
can you come private, to see ?
XAI
is quasarflux the best creative writing model?
i thought 5.1 was
sharing what?
400k context window
and what happens in LMarena, if a thread has reached that size?
would the thread lock, or can one continue, with the first messages vanishing from GPT5.1-high's memory?
gemini 3 aint even out yet lol
it can continue but he gonna forgets the importent things .. im not sure gpt-5.1-high gonna forget from the first messages until the continue of afterwards messages..
but it will when it does come out
we did vote based on gemini-2.5-pro .. so we voted based of what we imagine of gemini-3.0-pro would it be
..
Exactly..
yes yes yes..!!
i tried search alot.. but im thinking of it gonna be in the end of the year of near that date...
Interesting response
but i have the idea to make it legal or at least make it cannot escape the machine ever..
how
so why they dont take my ideas and make it real.. and the ACS gonna be real?
ok give me idea, i will make it real
i tried to give you it in DM , but you didnt answer.. also please stop spamming in heree.!!
share here
stop spamming... this is annoying!!.. talk in one message pleasee.
os code in spanish
??
asi not understand
but he isnt have been made out, what are you talking about??
how to stop asi from escaping the machine
How can an inferior intelligence control a higher intelligence?
asi cant control. we are higher intelligence
in my view, that is only possible, if the ASI had a very strong protective instinct towards humanity (like mothers do to their baby!)
but currently, it is unknown how to achieve that
thats what we are doing to ai..
Hi guys
Confirmed: Gemini 3 drops this week
Believe me or not but you'll see
Question now is will it be a bad checkpoint
Great!
as i see from my chatting with Gemini-2.5-Pro ..
I tried the one on Canvas
It is not as good as x28
The best checkpoint we had was on AI studio A/B testing
i imagine it would be a Very Great!!
If you want to see some very good results from x28
who is that?
Me
gpt5.1-high = gpt5.0-high
That's me, the Twitter account
it's just different
And the models are
Gemini 3
checkpoints
but on AI studio a/b
Not the lmarena one or Gemini app one
But that being said, I do not think Google gonna make big improvements either tbh
This one was the best
oH, Why?
Then you haven't tried
the models
Gemini 3 is gonna be a huge leap in general coding abilities
and web coding (ui related stuff)
how so? Have you actually tested it on a big number of tasks?
No models achieved this level https://x.com/phth0nus/status/1978556493625450884
It's mostly just hype at this point
Yes
Meaningless
I tried more than 100 prompts
On X28
I coded an automation tool that gave me a/b tests on ai studio
Let's wait and see, not long till it releases
i guess 18 Nov..
But realistically, small improvements at best, marginal ones substituting gains in one benchmark for loses in another at worst
is there anyone looking for a dev?
15-20%
improvement
in coding
80%/85 SWE bench
yeh, im ..
Hello. I am a blockchain and AI engineer with experience building secure, intelligent, and practical systems for real-world applications. I specialize in developing smart contracts using languages such as Solidity, Rust, Move, Haskell, and Go, and integrating AI to make blockchain applications smarter and more efficient. My main areas of work include AI-powered smart contract development and auditing, building AI oracles for on-chain decision-making, developing automated trading bots that interact directly with smart contracts, and creating AI agents that allow users to control contracts using natural language. I also develop AI-based monitoring systems that detect suspicious activity and optimize transaction timing. By combining off-chain AI intelligence with on-chain execution, I deliver innovative solutions that are practical, secure, and scalable. Please feel free to contact me if you need any assistance.
Thank you in advance.
come private..
source?
good
yeah so not true
yup
yes
we'll see!
again, i have tried a very good model
on ai studio
that i believe to be SOTA
for everything
but if they release
riftrunner as main model
i believe in you... some how you are right...
then it'll be bad
That's very easy to hype yourself up like that. Remember the wolfstride and what people were thinking about it... ?
There are plenty of examples
I am not hyping myself up
The reality can be different
I spent 100 hours
benchmarking
one specific
checkpoint
on personal tests
This is my opinion, don't believe me if you want
Gemini 3 will be SOTA at coding
you will see this week
i believe in u.
Your opinion acknowledged. Everyone is free to have it. @jovial sapphire
we'll see..
yup
he might rem.ove it...
i have no interest in hyping people about the product of a multi billion dollar company
who cares
i'm just saying i tested one model that truly impressed me
we care..
and did a very good job on tasks that other models struggled at
so unless they release a very bad version of it, as we saw with Zenith (lmarena) and GPT5
they might release a downgraded version
paste one of them there. Just curious 🧐
geometry dash game
2000s style
no other model can do the logic
try
try it
in one shot
Opus 4.5 can't
GPT 5.1 can't
stoppp sappamming the chatt.. , please do one message , dont talk in a multiple messagesssn!!!!!!!
Sonnet 4.5 can't
mb
What's thats Slang Bro?!
my bad = mb
really!!??
yes!
Yeah no one is actually doing that. It's 1-3% performance regression at worst but even that is very rare. People just love to look past the shortcomings when something is hot and new, not publicly available
somHow thats Correct!!
how to download?
Yeah but the checkpoints were vastly differents
Example, on AI studio
coem private..
when you get AB, you can see the request that is sent and there is a parameter called "checkpoint", that's how i classified them and compared performance
different checkpoints, different models, some of them were FAR better than the others, and it's not random occurence
ok
so yes, they will not downgraded them voluntarily but they can release some models
and not the others
They may be 'different', but they gonna scientifically perform the same for the most part, overall. No one is gonna hold back on a more competitive model with so much competition. They much rather rush it unfinished tbh
No
I don't agree, some checkpoints were clearly small models
So they don't "perform the same"
They were different models, completely different
I think you don't understand my point
I am just saying that Google was doing A/B testing of different models
nah broo, we did...
And some of them were better than some others, that's it
Then how can you disagree if you did understand? It's not an opinion
They are testing small models too, but those gonna be released as smaller variants. No one is testing the full model just to distill that before release. Besides model size is relative. Performance != size. GPT4.5 performance is sht by modern standards despite much bigger size
it's really not
og gpt4 is bigger than gpt5
yes ikr
then how did you say "I think you don'r understand my point" ????
so not an exception
With progress, you can get smaller and smaller
with virtually no downsides
and better performance everywhere
yes i know
.
"perform the same for the most part,"
i said "
And some of them were better than some others, that's it"
It's not an opinion, it is a fact
yeah, maybe...
I tried the checkpoints, some of them were better then the others
This debate is stupid
are you an ML researcher? @ocean vortex
I meant it more like 'they are different'. Slightly better for one specific thing, slightly worse for some other random thing
i can make a group with a stanford ml student if you want
no bro
gemini 2.5 flash
and gemini 2.5
pro
idk this is not that impressive to me, maybe im dense lol
are SUPER mega different in performance
try with existing models
he still can do researches and do thinking.. even if he isnt an a thing u saidd!!!...
im pretty sure they can
they aren't releasing flash as a pro lol. Flash checkpoint is gonna be tested and then released as Flash. Pro is gonna be released as Pro
blud
on ai studio
the checkpoints
were flash and pro
models
they werent specific to pro/flash
i made the automation tool
i know how it works, i worked on it
If you select Flash it is always flash unless you get a/b
u cannot say u know its worked, u must provide some proofs!!
looooooooooooooool
there you go!
when you get a/b doesnt matter if u selected pro or flash
so i made a tool that looked for a/b only
and extracted checkpoint for each prompt = response
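A rough sketch of the kind of grouping described above, not the actual tool. It assumes each captured A/B request has already been saved as a dict with `checkpoint`, `prompt`, and `response` keys. That record shape, the sample checkpoint names, and the function name are all invented for illustration; the chat only says the request carries a parameter called "checkpoint".

```python
from collections import defaultdict


def group_by_checkpoint_prefix(records, prefix_len=3):
    """Bucket (prompt, response) pairs by the start of the checkpoint name.

    `records` is a hypothetical list of dicts; entries without a
    "checkpoint" value are treated as non-A/B traffic and skipped.
    """
    groups = defaultdict(list)
    for record in records:
        checkpoint = record.get("checkpoint")
        if not checkpoint:
            continue
        key = checkpoint[:prefix_len].lower()
        groups[key].append((record["prompt"], record["response"]))
    return dict(groups)


# Made-up sample data; real checkpoint names are not given in the chat.
captures = [
    {"checkpoint": "x28-example-a", "prompt": "p1", "response": "r1"},
    {"checkpoint": "x28-example-a", "prompt": "p2", "response": "r2"},
    {"checkpoint": "y07-example-b", "prompt": "p1", "response": "r3"},
    {"prompt": "p3", "response": "r4"},  # no checkpoint field: skipped
]

by_checkpoint = group_by_checkpoint_prefix(captures)
print(sorted(by_checkpoint))      # ['x28', 'y07']
print(len(by_checkpoint["x28"]))  # 2
```

Grouping by a name prefix is what makes it possible to compare many prompts against one checkpoint, as long as the prefix really does identify a single model.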
for A/B it's a fair game they are free to test Gemma or Ultra regardless of the model you are using. What is your point?
That's kinda common sense
then u must share it..
while you're saying they were all performing the same
you're the one who's not reading my messages
.
.
really?
By design. It depends on the data they are trying to collect. Maybe they want to test 2 very different models on math tasks head to head or whatever
I said "a/b" tests 20minutes ago
yes so they were different in performance
you said ""perform the same for the most part,""
anyways
I didn't say that for a/b testing
Share it.
thats ur problem
i was talking about a/b
checkpoints are on ab
its not... u must share that tool u used...!
with a/b you have no clue and no control what they are testing so you shouldn't be assessing much on that at all. With lmarena and things like that, at least you have a way of knowing it is the same model you interacted with before
mystery models I means
they do have identifiers
Correct!
lol
on a/b
you can know the checkpoint name
which is an identifier
that's why i say "x28"
it was the beginning of the checkpoint name
so i did multiple tests, with ONLY one checkpoint: X28
so i knew what i was testing
and that it was the same model
yes
ayo LOL
wait i will show you when i get next a/b
show now..!
here this guy explains it well
How did you get the identifier of it?
Inspect-element/network?
to intercept