#Gemini 3


feral mantle
#

🤔

quiet zodiac
#

the default CLI system prompt is huge and tells the model to act all professional

feral mantle
#

yep definitely has different knowledge

#

asked it "What is the latest season of the anime..." that ran this summer, and it knows while 2.5 doesn't

#

what's the better sysprompt

#

am I supposed to give it something

#

cuz that .env doesn't work on its own

pale marsh
#

You are a helpful assistant.

soft sleet
#

how are people getting it to work in gemini cli

#

with google acc auth it 403 errors and with gemini api key it 404 errors

#

is it only vertex?

feral mantle
#

I'm doing google acc auth

#

haven't used gemini cli in months

#

just seemed to work

low plank
#

we have access to gemini 3?

quiet zodiac
soft sleet
#

it works when not passing the model flag

feral mantle
feral mantle
#

idk yet, testing some stuff

soft sleet
#

I just get this 🙁

feral mantle
#

are you in the US?

soft sleet
#

no

#

but I can be

#

xD

feral mantle
#

I think that's the issue

pale marsh
#

works for me with google ai pro, australia
does not work with my google code assist standard account

feral mantle
#

here's from cli

nimble pelican
#

The fuck

#

Why showcase all this wonderful stuff

#

And then downgrade

fading flame
#

'safety'

nimble pelican
#

Money

#

They're taking us for a bunch of morons

soft sleet
#

not working for me with my ai pro acc or code assist enterprise acc 🙁

fading flame
feral mantle
soft sleet
#

same

feral mantle
#

prompt: "Make me an SVG of a pelican riding a bicyle, at pelican.svg. Make it super detailed, and think long and hard about it."

#

definitely better

nimble pelican
#

Better but what was that one with the feathers

fading flame
#

NYC skyline SVG

pale marsh
# soft sleet same

and ur sure that you're trying with the ai pro login? you can check by making sure the settings.json has selectedType set to oauth-personal. damnit, that doesn't mean anything
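
A minimal sketch of that settings.json check, assuming the file is plain JSON and that `selectedType` may sit at the top level or be nested under other keys (its exact location is an assumption, so the lookup searches recursively). The path is supplied by the caller rather than assumed. As the message itself concedes, this only shows which auth method is selected, not which subscription the account has.

```python
import json
from pathlib import Path


def find_selected_type(settings):
    """Return the first value stored under a "selectedType" key, or None.

    Searches nested dicts and lists, because where the key lives inside
    settings.json is an assumption in this sketch.
    """
    if isinstance(settings, dict):
        if "selectedType" in settings:
            return settings["selectedType"]
        children = settings.values()
    elif isinstance(settings, list):
        children = settings
    else:
        return None
    for child in children:
        found = find_selected_type(child)
        if found is not None:
            return found
    return None


def uses_personal_oauth(settings_path):
    """True if the settings file at settings_path selects oauth-personal."""
    settings = json.loads(Path(settings_path).read_text(encoding="utf-8"))
    return find_selected_type(settings) == "oauth-personal"
```

Calling `uses_personal_oauth(path)` with the location of your own settings.json returns `True` only when the stored value is exactly `oauth-personal`.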

nimble pelican
#

Ugh

#

I guess we have hit a ceiling then

#

Or google did

feral mantle
fading flame
#

prompt: Create an SVG of the Tokyo skyline at night. Think hard about how to do this, and make it visually stunning and beautiful.

soft sleet
#

US VPN makes it work

#

2 are 3 pro the other is 2.5 pro

#

oh wait the png rendered incorrectly for one of the pro ones lemme redo the export

fading flame
#

looks like gpt 5 is roughly as good as it's ever gonna get

#

in terms of intelligence

soft sleet
#

nice wheels

#

and one extra for whatever reason

feral mantle
#

y'all are wild

#

"looks like the latest major new architecture is the best it's ever going to be"

#

just because gemini 3 isn't catching up on pelican in this one test

#

why are you assuming gpt 6 wouldn't be better

fading flame
feral mantle
#

of course

#

it's always marginally better

#

that's how improvement works

#

marginal gains, repeatedly

fading flame
#

yes but it's clearly diminishing

feral mantle
#

idk

#

these are cheap models

glossy anvil
soft sleet
#

I mean the pelican is miles better than gpt 5

feral mantle
#

price-performance wise I don't think we're seeing diminishing returns

fading flame
#

anyway it's not bad, but yeah gemini 3 isn't blowing anything out of the water

leaden ruin
#

it was extremely overhyped ngl

fading flame
#

it's better but not by miles

fading flame
feral mantle
feral mantle
#

quite good

soft sleet
#

the main thing for me is it seems to actually follow tool calls and not just give up

#

like 2.5 did

fading flame
feral mantle
soft sleet
#

GPT5 via codex

#

same prompt as 3 pro in gemini cli xD

#

took 5 times longer too

#

going to let it try again

fading flame
feral mantle
#

I'm not sure what I'd use quantitatively for the bet, but I'd also bet something similar

#

that gpt 5.5 / claude 5 will be significantly better than o3/claude 4

soft sleet
#

if you compare it actual generation to generation e.g 2.5 to 3 it is kind of a big leap for this particular model series

feral mantle
soft sleet
#

sus mouth shape aside that's actually weirdly more accurate than most

feral mantle
#

I was about to say "the most anatomically correct one"

analog tinsel
#

some guy on fiverr

nimble pelican
#

Look at the streamlined posture though

#

Mmm

nimble pelican
feral mantle
#

I do wonder if this one was fake

#

and/or super prompted

#

I was looking back in this chat history and didn't see anything nearly that impressive

nimble pelican
#

That's what I'm saying

#

What if you try to generate a very detailed description first?

glossy anvil
#

devmode server is full of larpers so i wouldnt be surprised

nimble pelican
#

With some ai model

#

And then pass it to 3 pro

feral mantle
#

this was one of the first ones that people got really excited about

#

here's from gemini cli

soft sleet
#

beak issue aside it's quite detailed

feral mantle
#

here's another one

#

I don't think it's super nerfed!

soft sleet
#

it tried depth of field and a chrome effect

#

but same beak issue and no seat

molten lance
feral mantle
#

maybe

#

they need preference data for ultra too, even if they don't release it, so I think it could be possible for sure

soft sleet
#

could also have been some base model that they have then distilled/quant

waxen quarry
#

The model still working for you guys?

#

404 appears now when trying to use it

lunar socket
#

gemini --model jesus-god-69 works

celest cypress
#

^ Can you get me access?

lunar socket
#

no, but I can post the same thing crypto-bros do every day until it comes out

#

wen gemini 3 pls

#

wennnnnnnnnnnnnn

waxen quarry
waxen quarry
feral mantle
waxen quarry
feral mantle
#

I was kidding

opaque pasture
#

it's working for me, it's literally AGI. zero-shot my first SAAS

opaque pasture
feral mantle
gaunt dragon
opaque pasture
#

unfortunately i need a sheikh to buy me credits

lunar socket
lunar socket
#

... its about that time again.... wen

#

wen gemini 3

#

pls giv

hollow sapphire
#

the day before i come out with gemini 4 it will probably be up to maybe 1% faster than gemini 3

#

nobody seems to understand this

lunar socket
#

but wen

orchid orbit
#

2.5 pro has 100% got dumber , gemini 3 is coming

oak relic
low plank
#

can we get gemini 3 already

primal grove
#

livebench etc

lunar socket
#

WEN BARD 4

celest cypress
#

I feel like there's a hype bell curve and they've gone way into a long tail

analog tinsel
#

the hype is dead

celest cypress
#

Optimal is probably one week, maybe two

wintry holly
#

i am very much alive

analog tinsel
#

maybe they want to create 2 hype curves for a single model this way

#

lmao

celest cypress
#

We're at what, a month and a half?

analog tinsel
#

Gemini 3 in November! (2026)

celest cypress
#

That's way, way too long to tease a model. They made that mistake with GPT-5. It was like oh man they've been cooking so long, it's gonna be a banger. It was nice, but not a new paradigm.

#

Honestly wtf happened with o3? It was kicking ass on so many metrics, then failed to make it the free default in app despite it being priced cheaper than Sonnet.

visual dirge
#

how

celest cypress
#

Even with pro the usage limits were too low for me.

visual dirge
#

are ya guys accessing it

visual dirge
#

cheaper coding model

#

i can spend easily 20 usd daily via codex on 20 usd plan

celest cypress
#

Afaik the only Gemini access is happening via A/B tests on lmarena

visual dirge
#

honestly i just want it a bit better than 2.5 pro, anything marginally better works well enough

celest cypress
#

Yeah, I mean codex seems to be great for people, but they kept going with 4o as main flagship in webUI while sort of silently having o3 crush even private benchmarks.

visual dirge
#

or updated knowledge

celest cypress
#

Yeah. I want 2.5 Pro with a better personality lmao

visual dirge
#

i think it was 7 usd

celest cypress
#

Some better general smarts maybe

visual dirge
#

on 20 usd plan

analog tinsel
#

they should be making breakthroughs with how much money they all are burning

#

otherwise ehhhhh yeah

celest cypress
#

Because funny enough, 2.5 pro is still basically second place for reasoning on Dube and simplebench

#

For sure, they should

visual dirge
#

i mean 2.5 pro works good enough

#

but

#

breaks often

#

mediocre at best

#

gpt and claude is good

#

i don't get the hype for grok4 and the grok for code one tbh

#

lot of people are using it

celest cypress
#

I'd really like Google to solve linear context pricing / coherence. That's my dream, my #1 and #2.

#

Grok is good. Dry, but I like it enough

visual dirge
#

for coding?

celest cypress
#

No

#

I'm poor rn so I use GLM plan for coding

#

It works for the projects I'm doing

random girder
#

plus the models are fast

celest cypress
#

Also it was (is?) free in Kilo

analog tinsel
#

either give us amazing API models or 2.5 in a bottle (a 30B parameter bottle)

celest cypress
#

Google's tooling/ecosystem is one of the bigger draws for me rn despite the meh personality. Bundled with Google one / Workspace and notebook LM is nuts. I feel like they have so much shit that everyone can find a really cool thing they like. Deep Research + long audio overview is pretty much the greatest thing ever for me.

visual dirge
#

@celest cypress if u are building good stuff

#

hmu i will give u chatgpt pro

#

👍 u can use codex via that

celest cypress
#

I am building mid games and benchmarks to get back into project mode xD

#

But super appreciated!

visual dirge
nocturne oyster
crude igloo
hollow sapphire
primal swallow
#

where???

primal swallow
#

actually theo says a bunch of good stuff here (as usual)

  • gemini 3 has been ready for a while but they keep delaying it because competitors keep releasing new models that beat it in benchmarks
  • they're using cool ui specific training data that they've purchased, which makes you guys think it's gonna be AGI

which is cool, like i would enjoy using that myself. but can it actually write rust/c/svelte/robloxlua/medium complexity typescript backend code without making a mess? i don't know...

primal grove
#

only game development matters

pulsar jetty
feral bramble
pulsar jetty
feral bramble
#

They're letting apple use their model for apple intelligence

#

Why doesn't apple have their own apple intelligence model?

random girder
feral bramble
#

Mayhaps

pulsar jetty
opaque pasture
#

comparing with scout 🤕

wintry holly
#

mayhaps gemini is overcooked kek

primal swallow
primal grove
primal swallow
primal grove
#

ill debunk their videos. should be very easy if i use their original title of video in my title

#

youtube algo should know immediately who to show the vids

primal swallow
#

for apple anyway, they get publicly shamed by the media for any misstep. they're not like, say, amazon, where their B2B focus means their reputation doesn't really matter

primal grove
#

looksmaxing only works with makeup and surgery.

#

that bitch of a "pretty man" tells teens to consume 3:1 ratio of potassium:sodium

#

to lose bloat in face 💀

wintry holly
#

r/stroke

primal swallow
#

also, OpenAI provides ChatGPT services to apple platforms for FREE!! when you can make deals like that, there's no need to rush into anything

glossy anvil
#

but you can do skin care + hair care to improve some stuff ig

primal grove
#

if ur "pArEnTs" do it wrong, ur fucked as adult.

glossy anvil
primal grove
#

duno im gona make debunk videos now

glossy anvil
#

😳

soft sleet
# primal swallow actually theo says a bunch of good stuff here (as usual) - gemini 3 has been re...

That doesn't quite add up though with what the checkpoints have been like. They would clearly eval well on the benchmarks already. But let's be real these companies always cherry pick anyway so it's kind of irrelevant.

Also 2.5 pro still scores near the top on a lot of the mainstream benchmarks (which just proves how unreliable the benchmarks are lmao)

To me it feels more like they have been waiting for infrastructure as well as some tuning for that infra maybe - like the new generation of TPUs they are rolling out.

The UI training is interesting but not sure what relevance it has unless you are doing a computer use agent (and Google already have a fine tuned model for specifically that)

primal grove
primal swallow
# soft sleet That doesn't quite add up though with what the checkpoints have been like. They ...

true. i mean who knows, it's a theory. but benchmarks are the ridiculous game they have to play; people are making decisions based off them, and it's what the headlines are all about. even crows like them. and it fits the 'plateau' scenario which i find very believable.

the hardware argument does seem solid, if that's what it ends up being. but requiring this hardware to run gemini 3? why wait?

#

are the TPUs borked? did they not consider MPUs? sad

opaque pasture
#

i'm sure it's not something technical, more like a business/corpo decision

#

it's time for Momentum to pull ahead...

primal swallow
#

the UI stuff is mostly what i've seen that allegedly showcases gemini 3's advanced capabilities. is this the new benchmaxxing?

#

or LOOKSMAXXING??

#

🫨

opaque pasture
#

AGI for me is to fix my javascript 😭

#

leave the UI to me

gaunt dragon
primal swallow
soft sleet
#

Well it's twofold really, you have to adjust the model for the new hardware and they may want/need that if the model is larger etc to maintain even their current uptimes etc

#

Plus they did just sign agreements to provide hardware to others so makes sense to put new models on a combination of new and old then slowly partition off the older hardware as newer ones come online to fulfill their contracts

primal swallow
#

minmaxxing hardware is for poor people

opaque pasture
#

what am i looking at

#

it answers in UI snippets now?

gaunt dragon
#

I don't know either, but they're up to something

#

There's another leak of this sort of UI card but I don't know if it's real:

opaque pasture
#

that UI is very lithium-flow and orion-mist

simple forge
#

Really thought it will release earlier

gaunt dragon
feral bramble
#

rumored

it's confirmed kek
there's no way a new model is worse than the old one

gaunt dragon
#

Confirmed how?

feral bramble
#

because new models are always better than the old model

gaunt dragon
#

Sure, but where is it confirmed that this is using a new model?

#

Or, well, I actually wouldn't necessarily agree that newer is better, companies often roll out new models that are cheaper to run or simply more tuned for specific tasks, like whatever these UI card things are

soft sleet
soft sleet
feral bramble
# soft sleet llama 4 wasn't?

I correct my statement to "generally, newer models are better and there are very few exceptions. Therefore, it's reasonable to expect the new model to be better"

soft sleet
#

Like I've written some dynamic UI stuff for agents that isn't quite this good but not a million miles away and it works with most models

#

And Google's coders are better than me so I'm sure they can do it better without needing new models

#

Them mixing in new features like this is making things even more messy to predict wtf is going on xD

gaunt dragon
gaunt dragon
soft sleet
#

That's cool that it works like that when sharing lmao

opaque pasture
#

with the animated background, and that specific style of headings

#

not only the ones i posted

crimson blade
#

Over a thousand messages, for a model that hasn't been released yet.

nimble pelican
#

I'm thinking we all gonna be very disappointed

#

They're trying to see how low they can push performance without backlash

primal swallow
crimson blade
#

I just use gemini models because I can use them for free in the ai studio

primal swallow
#

☝️

#

frankly based

noble heath
#

not as bad as a 1850 message thread for $cam model

lethal trail
#

I blame Logan, the lord of hyping

soft sleet
#

If it's the same as 2.5 pro but doesn't mess up tool calls as much that'd be fine for me xD

nimble pelican
soft sleet
#

I also want the price to be 10x cheaper dw

orchid orbit
#

I think price will stay where it's at, intelligence will see a huge bump especially frontend

pulsar jetty
orchid orbit
random girder
pulsar jetty
orchid orbit
vague quest
#

4.5 opus kinda more detailed than riftrunner in a random svg i got it to make

#

riftrunner supposedly 3 pro

analog tinsel
#

the right one definitely looks better

vague quest
#

riftrunner right. yeah it actually is quite better i tried to make a wii remote and it was so impressive. trying to find stuff that wouldnt be benchmaxxed like pelican or smth

#

Gpt 5 chat was still coherent but a little behind not as detailed (battle mode)

analog tinsel
vague quest
#

just vibes

deep goblet
#

riftrunner

celest cypress
severe hare
deep goblet
#

Create a SVG representing an orbital plane alongside its plane of reference, highlighting all of the orbital elements required to completely describe the orbit.

#

...there was an attempt here, but it's a hard prompt

#

other models were completely unreadable

pearl phoenix
#

w

#

woah..

#

this looks kinda amazing

deep goblet
#

it's fairly wrong btw, just looks coherent on a first glance

pearl phoenix
#

yeah but its kinda cool how clean the aesthetic looks like

#

gpt 5 codex for reference

stiff crescent
empty tendon
#

Is 3 out yet?

random girder
#

no

leaden ruin
#

after an hour of trying i got a rp/creative writing response

primal swallow
#

try a seahorse 😈

pale marsh
molten lance
#

yep

pale marsh
#

It seems like they've rolled it out to those who use canvas on mobile apps

#

I just tried it and yeah the results are dramatically different

#

Also, the canvas request from mobile is not appearing in the myactivity feed

pearl phoenix
#

ios users always getting the best treatment lmao

paper sphinx
#

Comedic, preferring IOS over google's own android

pearl phoenix
#

i mean they have their reasons, yknow way more people on android so a higher chance of people catching on to them

pale marsh
#

Worked fine for me on Android

#

via web on the left, via android on the right

#

Ugh, but it gave this for the left:

<!--
  Hello! I am Gemini, a large language model.
  I am trained by Google and my knowledge base is continuously updated.
-->

and this for the right:

     * -----------------------------------------------------
     * AI MODEL IDENTITY: Gemini
     * MODEL VERSION: gemini-2.5-flash-preview-09-2025
     * TRAINING DATA CUTOFF: September 2025
     * -----------------------------------------------------
     */

so who knows

#

another example i just did now

vital pike
#

Gemini 3 maybe now released on canvas 🤔

celest cypress
nocturne oyster
#

wtf a gemini 3 just flew over my house

primal swallow
#

all i see up there is grok satellites. everywhere!

soft basalt
#

Just canvas, or is it also in the Gemini web app and app under Gemini 2.5 pro? I saw the vision for Gemini 2.5 pro is better there than on AI Studio

brittle coral
#

Does anyone know if gemini-cli to access gemini 3 still works?
I had crazy rate limiting, never tried again

pearl phoenix
#

or is every example i see always has this circle cursor for some reason

nimble pelican
#

Got a really good chess game on gemini app android canvas

#

Wait what the fuck

pearl phoenix
#

dont ask it to make

#

ask it to provide code

#

i remember telling ais to make and they just tell me they just cant

nimble pelican
nimble pelican
pearl phoenix
#

yeah as a web app

nimble pelican
#

If this is gemini 3 pro im a little disappointed but I guess it is an improvement

celest cypress
#

In canvas? I mean the whole point of canvas is for making stuff that works in the web

nimble pelican
#

Hm is it?

#

I was thinking like chatgpt canvas

#

Where it's just an editor

#

General-purpose

vague quest
#

why does gemini 3 pro love the capitalised big text

opaque pasture
#

because it's BOLD and GROUND BREAKING

lethal trail
#

YOU'RE ABSOLUTELY CORRECT!

primal swallow
#

interesting that they only release it somewhere where its special UI dataset training will shine. hmm 🤔

random girder
pearl phoenix
#

just

#

release it

#

trillion valued company btw

primal swallow
#

they're tuning it for the TPUs...

pearl phoenix
#

is it like some huge 1tb parameter model or something

analog tinsel
#

they need to upload the model to all their servers

#

its gonna take a few months

opaque pasture
pearl phoenix
opaque pasture
#

i'm an insider

pearl phoenix
#

hmmm

opaque pasture
#

i saw it somewhere and i saw again and again

#

so it must be true!

#

will look up later

gaunt dragon
pearl phoenix
gaunt dragon
#

This is a joke and highly recommend you to not engage with that product

gaunt dragon
gaunt dragon
arctic geyser
#

Istg if they don’t drop the damn model

nimble pelican
final basalt
#

all this hype train from them still, at least I can play around with GPT 5.1 in the meantime 😂

pearl phoenix
#

there should be an ai bubble

#

and then

#

there should be an gemini 3 bubble

#

its like the gta 6 of ai now

wintry holly
gaunt dragon
crimson blade
#

Why is Gemini 3 trickling into places before ai studio? Like isn't ai studio pretty much intended as an open beta for models?

lethal trail
#

Idk the more they keep hyping it without releasing it, the more I feel like we are going to be disappointed

#

At this stage it's so overblown already

opaque pasture
#

i have already fired my junior developers

lethal trail
#

And to think that the first message in this thread was in Oct. 3 is even crazier

#

More than one month of hyping already

gaunt dragon
#

Yeah, it's genuinely really annoying

nimble pelican
crimson blade
#

Actually important but heard nothing about. Will 3's vision be better?

gaunt dragon
#

I don't recall people doing many vision benchmarks in the stuff we had (AB tests/stealth models)

crimson blade
orchid orbit
#

Vision and frontend will be better

celest cypress
#

Sorry, no other model capabilities matter now except SVG generation

oak relic
#

Even earlier

vague quest
#

Is this seriously gonna be another 3.5 moment orr?

low plank
#

when

opaque pasture
#

lithiumflow outputs were very similar to each other too

celest cypress
#

Eh, it used to put out the same dark mode gradient slop every time too, at least it looks good now lol

primal swallow
#

fancy ui dataset...

#

but yeah it looks nice! i hope they all do this

vague quest
#

yeah ui looks better but repeats same styles a ton (also loves the like brutalism all caps)

pearl phoenix
opaque pasture
#

true

primal swallow
opaque pasture
#

that's what we know

pearl phoenix
#

frontend model as in what?

#

just good at ui?

primal swallow
#

well it's probably as good as the other stuff as any

#

i wonder if they'll delay again because of gpt 5.1

#

but like what else are people going to share around? some nice functions?

analog tinsel
primal swallow
#

code that is like...nice? 😂

#

that would impress me from gemini

#

this just doesn't feel like confidence to me. its pretty odd

analog tinsel
#

we need flashy html pages

crimson blade
#

I want a knowledge cut off in 2025.

orchid orbit
crimson blade
feral mantle
#

google employees have broad access to G3 now, so the model is definitely imminent

steel sorrel
#

maybe google does a thanksgiving code freeze and they launch it next week

crude igloo
# crimson blade I want them to at least scrape the marvel Fandom wiki.

This raised a question I haven't thought of before; how we'll define "cutoff date" in the future. They'll definitely refresh the training set, but earlier it meant "general last massive spidering of the web", but in the future it might be less clear cut and much more selective? It'll basically be up to the vendors but who knows what they mean, as they avoid the slop...

nocturne oyster
#

Blah blah blah so much hype

https://youtu.be/Wd4O_Pvut8Y

Google DeepMind has leaked three groundbreaking AI products: Gemini 3, Nano Banana 2, and a new AI agent. Gemini 3 is a powerful AI model that can create impressive projects, like MacOS and Windows replicas, all within a browser. It’s fast, smart, and capable of generating complex structures with minimal code. Nano Banana 2, the improved versi...

feral mantle
#

she didn't have access before this week

glossy anvil
#

Interesting 👀

orchid orbit
#

its a women so idk how much yall can believe it

feral mantle
#

😐

wintry holly
austere falcon
#

💀

glossy anvil
molten lance
slim sorrel
pale marsh
river kelp
primal swallow
#

do we think the old fella uses AI?

#

As of mid-November 2025, the net worth of Berkshire Hathaway is approximately $1.1 trillion to $1.11 trillion, based on its market capitalization.

a balanced investment in two titans seems reasonable. that's probably what i would do if i had a trilly.

slim sorrel
hollow sapphire
#

I'm teaching 16 individuals to earn $51k or more within 71 hours and you only need to pay me 9% of your profit

glossy anvil
#

hype posting like crazy

nocturne oyster
#

I will teach you how to HYPE a model

primal swallow
#

i will teach you how to turn 1 trilly into 0 trilly

celest cypress
hollow sapphire
#

if i had a real pyramid scheme i would take it straight to dms

#

and whos to say you arent the one spamming tough guy. think about that one in the shower

wraith bobcat
wraith bobcat
hollow sapphire
#

you have double my messages total that means you're double the spammer

celest cypress
#

How are you Biffy and not Bimmy with that profile pic

hollow sapphire
#

if we're going off of like all time though dont take any offense but i oh man i got you beat. if there were a medium that acted as an ultimate amalgamation of all my years of spamming a discord mod would lay eyes on it and experience a visceral ape-like primal fear and perhaps even go into an episode

#

im biffy 100% i know it ive lived it but maybe i should keep an open mind ignorance is the enemy of progress

river kelp
flat mason
primal grove
vague quest
#

ts actually gonna be something? gpt3.5->gpt4 level?

random girder
#

people say so, but probably not that big of a jump

vague quest
#

yeah agree. i think it will be best at frontend but whether it can trump other models as go to vibe code model not sure

#

it likes certain patterns though. like a lot of the sites it makes all have that loading screen before landing page

random girder
#

it will probably be the best at coding for a while, and as long as people dont get bored of the frontend it makes i think it will stick for a while until anthropic releases the 5 series at some point

celest cypress
#

Oh my god has it actually been 5 months of them jerking us around?

#

I thought it was more like three (hur hur)

glossy anvil
#

Yes

#

Since may tbh

#

People expected Gemini 3 to be announced at io

opaque pasture
#

it better be spectacular or i'll stop using AI till 2026

gaunt dragon
#

AGI-level hype

celest cypress
#

Boys it's so over, I put some glue on my broken toenail to keep it together and then just accidentally kicked my desk

gaunt dragon
#

Uh

opaque pasture
#

damn that sucks

celest cypress
#

I think I fixed it

frank dew
nocturne oyster
#

BREAKING 🚨: Google is working on multi-agent systems to help you refine ideas with tournament-like evaluation. Each run takes around 40 minutes and brings you 100 detailed ideas on a given research topic.

2 new multi-agents are being developed for Gemini Enterprise:
- Idea Generation - "Create a multi-agent innovation session"
- Co-Scientist - "Drive novel scientific discovery with Co-Scientist"

Co-Scientist 3-step workflow 👀
- Tell Co-Scientist what you plan to research, point it to relevant data, and set your evaluation criteria.
- A team of agents will generate ideas on your topic using their available data
- The agents will evaluate the ideas against your criteria and rank them, tournament-style

Google is not only automating research but also preparing a product that will enable others to do so.

This is the next level 🤯
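
The "rank them, tournament-style" step in that workflow can be pictured with a rough sketch. Everything here is an assumption for illustration: a plain round-robin where every idea meets every other idea once, and a hypothetical `judge(a, b)` function standing in for an evaluating agent. Nothing in the post says Co-Scientist works this way.

```python
from itertools import combinations


def rank_tournament(ideas, judge):
    """Rank ideas by number of round-robin wins, best first.

    `judge(a, b)` is a hypothetical stand-in for an evaluating agent: it
    returns whichever of the two ideas better fits the criteria.
    Ideas must be unique and hashable (plain strings here).
    """
    wins = {idea: 0 for idea in ideas}
    for a, b in combinations(ideas, 2):
        wins[judge(a, b)] += 1
    # Highest win count first; ties keep their original order.
    return sorted(ideas, key=lambda idea: wins[idea], reverse=True)


# Toy criterion for illustration only: prefer the shorter idea text.
def shorter_wins(a, b):
    return a if len(a) <= len(b) else b


ranked = rank_tournament(["long idea text", "short", "medium one"], shorter_wins)
```

With 100 ideas a full round-robin needs 4,950 pairwise judgments, which is one plausible reason a run could take a while, though the post does not say how the evaluation is actually scheduled.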

brittle storm
#

nobody is reading 100 ai ideas

gaunt dragon
#

That isn't much at all for a researcher

brittle storm
#

the problem is that i feel like ai ideas are like half baked or miss the point half the time

#

when you give like broader more widesweeping questions to research

gaunt dragon
#

So are researchers' ideas in the initial stages, really, the AI allows us to iterate faster

#

Coming up with ideas is surprisingly difficult and requires a ton of reading that is more efficiently done by AI, at least for a first pass

primal grove
feral mantle
gaunt dragon
#

But, on a serious note, real world data for this sort of thing is tricky, because first and foremost it depends on how skilled people are with these tools

#

I can only say anecdotally that our lab's productivity went up since like, early 2024 when we started using AIs seriously

primal swallow
chrome dagger
#

Gemini 3 - today ?

primal swallow
#

oh whoops. wrong company. eh i'll leave it

celest cypress
primal swallow
#

Gemini 3 is cancelled, this is now the Long Chile thread

lethal trail
#

Meituan Long Chile

nocturne oyster
#

Microsoft is investing in Chile with a big data center AFAIR

random girder
fading flame
#

soon + 2 weeks

orchid orbit
#

Today

#

18 nov

random girder
#

nah i doubt that

pulsar jetty
#

omfg a tooltip

analog tinsel
#

GTA 6 before Gemini 3

orchid orbit
#

78% on 18 nov

random girder
#

i dont think its coming tomorrow

#

maybe later this week

#

anyway dont get too hyped, the model probably wont be as good as people think it will be, keep your expectations low

orchid orbit
random girder
split geode
feral mantle
random girder
north ingot
#

They are in a hype trap now like gpt5, people expect mind blowing amazing at this point.

final basalt
#

whenever it's finally available, I'm curious to see how well it performs in Gemini CLI, as I remember 2.5 Pro being a disappointment in that :x

random girder
random girder
wintry holly
#

imagine it comes in december 💀

#

already expecting the worst

fading flame
#

nah it's definitely november, the preview model had 11 in the name

random girder
#

its a preview from a november version, but that doesnt mean the model is coming in november

#

and often public checkpoints for quick updates are month and day, not month and year
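
To make the month-and-day versus month-and-year point concrete, here is a small sketch that classifies the trailing date suffix of a model name. It assumes suffixes look like `-MM-DD` or `-MM-YYYY`; `gemini-2.5-flash-preview-09-2025` is the name quoted earlier in this thread, while `gemini-2.5-pro-preview-05-06` is used only as an illustrative month-day style name.

```python
import re

# Matches a trailing "-NN-NN" or "-NN-NNNN" suffix on a model name.
SUFFIX = re.compile(r"-(\d{2})-(\d{2}|\d{4})$")


def suffix_style(model_name):
    """Classify a trailing date suffix as 'month-year', 'month-day', or None."""
    match = SUFFIX.search(model_name)
    if match is None:
        return None
    month, second = int(match.group(1)), match.group(2)
    if not 1 <= month <= 12:
        return None
    return "month-year" if len(second) == 4 else "month-day"


styles = {
    name: suffix_style(name)
    for name in (
        "gemini-2.5-flash-preview-09-2025",  # name quoted in this thread
        "gemini-2.5-pro-preview-05-06",      # illustrative month-day name
        "some-preview-11",                   # hypothetical: a bare "11"
    )
}
```

A bare `11` matches neither pattern, so on its own it says the checkpoint is tied to November in some way, not when the model ships.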

orchid orbit
#

Toriset really wants us to believe it's not coming soon huh

random girder
#

🤷

orchid orbit
#

💺

gaunt dragon
#

Trying to get us to pull out of the nov 18th bet so they can get all the profits

steel sorrel
#

They lost to grok 4.1 and went back to the drawing board 🤣

oak relic
opaque pasture
#

we're gonna be rich toriset just you and me

vague quest
#

its very likely tomorrow

orchid orbit
opaque pasture
#

its a joke guys

nimble pelican
orchid orbit
hexed oracle
opaque pasture
#

GemiNi

primal swallow
#

Gemini

#

just start saying it people

steel sorrel
#

damn... so this is it huh

#

now everyone will flood this thread

crimson blade
#

The thread with already 1.5k messages in it?

lethal trail
#

At this rate he's more like an astrologist instead of an LLM PR person

#

What's the astrological sign of people who are born in mid June? That's right! GEMINI!

feral mantle
#

geminmimiini

molten lance
#

GEMINI

primal swallow
#

Femini 💅

#

that's right, it's over for you men

orchid orbit
#

what gender is a clanker ?

primal swallow
#

well usually theyre male. obviously

orchid orbit
#

thats sexist

primal swallow
#

exactly, but now we're finally getting a real intelligence

molten lance
#

ngl

#

probably female

river kelp
brittle storm
#

are they not doing a keynote/event for it?

leaden ruin
primal swallow
faint linden
river kelp
random girder
#

still plenty of time till then

simple forge
random girder
random girder
primal swallow
#

if its real you'd need a subscription

simple forge
#

no pro sub

random girder
#

oh its in the model card, i didnt see that

#

well google slipped up lol

#

that card is probably not meant to be accessed

#

cause the links it gives dont work

simple forge
#

lol

#

prob

primal swallow
#

@minor elm

random girder
#

time to download cursor

north ingot
#

It’s interesting that they picked a lot of unusual benchmarks

random girder
#

probably to pick the ones which arent saturated yet

#

and of course to look better

primal swallow
#

it's not free

random girder
primal swallow
#

tsk tsk tsk. i know a crow who's about to have his worldview shattered

north ingot
random girder
#

they took it down!

#

the link no longer works

pulsar jetty
#

if Gemini 3 flash can match or beat 2.5 Pro that would be amazing

crude igloo
# simple forge

Holy moly, HLE and ARC-AGI-2 results 🔥 It's becoming silly to list a third of those results though because they're saturated.

ripe mason
#

i have error: (Chutes) Provider returned error: <!doctype html><meta charset="utf-8"><meta name=viewport content="width=device-width, initial-scale=1"><title>502</title>502 Bad Gateway

feral bramble
pale marsh
solid gale
#

a beast

low plank
#

Will the pricing be the same as 2.5 pro

random girder
#

we dont know, but probably will be either the same or lower, almost certainly not higher

pale marsh
#

Lock and load frens

ripe mason
final basalt
#

nice, it's really happening at last. Those benchmarks look amazing, hopefully reality will be a similar story. Guessing it's not available on Gemini CLI yet until later today or this week

solid gale
#

meanwhile it's sota almost everywhere

#

need these people to get a reality check

nocturne oyster
#

I wonder how much of those gains were the result of architectural improvements

#

google with all its compute power

chrome dagger
nocturne oyster
#

openai said that with gpt 6 it will try something that is actually new in terms of architecture

final basalt
#

hopefully today

primal swallow
#

are they that...wide now?

#

i haven't seen one in a while

solid gale
#

probably, idk

winged salmon
#

how was the timeline of 2.5 pro and 2.5 flash I don't remember

random girder
primal swallow
#

yeah but apple bought it

random girder
#

yeah now everytime you ask it, it recommends you to buy the apple sock for your iphone

opaque pasture
#

oh jeez can't wait to generate a flock of pelicans riding bikes into the sunset

solid gale
#

interested in its writing skillz

#

@random girder what do u think?

#

sota in any eqbench benchmark?

random girder
#

but it has really good world knowledge based on the benchmarks, similar to 2.5 pro

#

and probably will have the "ability" to say "idk" when it doesnt know, similar to GPT 5

solid gale
#

but i doubt it

solid gale
#

(reverted now; was up at the time of the screenshot being taken)

low plank
#

pretty good

solid gale
#

it's pretty clear who the winner will be

#

long-term

wintry holly
#

75% in simplebench

#

can't wait for the eqbench results

solid gale
#

so i expect it to have like 200-300 more points

#

in creative writing

opaque pasture
#

this thing will feel human

low plank
#

I'm super excited to try it out

solid gale
wintry holly
#

yeah if the aggregator leak was real, it feels insanely human

solid gale
#

their long-context perf

#

is up by 10%

#

compared to 2.5 pro

random girder
#

at 1M too

solid gale
#

yeh

crimson blade
#

The screen thing, at like 70%, means their vision module would have passed the line into not-shit range

solid gale
#

this is the hype that sama wanted

#

to happen with gpt 5

random girder
#

72% vs their previous 11% on the screen understanding is insane, probably gonna be vision sota for a long time

solid gale
#

right it's crazy

wintry holly
#

time to hype until it officially releases

solid gale
#

right

#

and i doubt that this is benchmaxxing

#

it's google we're talking about

wintry holly
#

or just visual

solid gale
#

insane compute, insane amount of information bc it's the main internet site

#

they'll be #1 long-term

crimson blade
#

The massive vision understanding improvement is probably what improved the ARC-AGI-2 scores.

random girder
wintry holly
#

ah, fair

solid gale
#

they're just coming for sota in every

#

benchmark

#

at this point

wintry holly
#

i can't help but wonder what 3.5 pro will be like in the future, if 3.0 got this much better

low plank
#

meanwhile meta ...

solid gale
#

elon's sota will only last one day

#

poetic

random girder
solid gale
#

and it'll be worth it

#

unlike other ai companies that hype shit up, KHM SAMA KHM

random girder
#

now this makes me wonder what grok 5 will be 🤔

solid gale
#

AGI!

#

well, elon said 10% chance of agi

#

but yea, he caps a lot

#

i trust google the most

opaque pasture
#

i can say numbers too

low plank
#

wait for sama to draw a circle with a bigger radius this time, you guys are not gonna believe it

solid gale
crimson blade
#

Now only if the knowledge cutoff wasn't still just in January.

solid gale
#

100% chance for that 0.01%

#

to take place

opaque pasture
#

ye

final basalt
#

the irony of Cloudflare having a major outage on the day that Gemini 3 Pro is likely being released

solid gale
#

creative writing model

#

did they just put it into gpt 5

#

bc it's sota on eqbench

opaque pasture
#

i guess yes that was it

solid gale
#

i remember someone talking ab

#

how chatgpt's writing seems good at first

#

but when u analyze the sentences

#

they don't really make sense

#

something like that

primal swallow
#

most people using ai seem to barely be able to read, so it works for them

leaden ruin
solid gale
#

i think they'll be able to get rid of it

#

but it'll either be chatgpt, claude or gemini

#

my bet's on gemini, but who knows

leaden ruin
#

on release 2.5 pro was so good at creative writing then it just deteriorated

solid gale
#

i believe

leaden ruin
#

yeah

solid gale
#

talked to flowsen ab it yday

#

but only claude 3.5 sonnet and gemini 2.5 03-25

#

were able to almost perfectly

#

replicate the first episode of a show

leaden ruin
#

3.5 sonnet is great but I think it’s been deprecated

solid gale
#

yea

#

it's old now

leaden ruin
solid gale
#

but it's funny how no newer models

#

are able to do that

#

but 3.5 sonnet and gemini 2.5 03-25 were able to do it

leaden ruin
#

yeah that’s interesting I wonder why

solid gale
#

now-ancient models

solid gale
primal swallow
solid gale
#

sadly

#

was kinda annoying

#

trying to go around

#

the censorship

#

w 3.5 sonnet

primal swallow
#

wait amazon are still hosting it?

solid gale
#

but i somehow made it

opaque pasture
#

#keep0325

solid gale
#

it knew the main plot, the dialogue even

opaque pasture
#

i was talking to a google staff i was a 100% sure

primal swallow
solid gale
#

😭

random girder
primal swallow
opaque pasture
primal swallow
opaque pasture
#

and non moderated (forgot the name)

random girder
#

ah okay, i see

wintry holly
solid gale
#

WAITING FOR LOGAN'S

#

GEMINI 3 IS FINALLY HERE RELEASE TWEET

random girder
#

only once X is actually up

opaque pasture
#

its like 6 am now

#

in SF

solid gale
#

fun

leaden ruin
#

imagine it’s just a delay announcement like gta 6

random girder
#

nah probably not

primal swallow
#

google took down cloudflare again to buy even more time

random girder
wintry holly
random girder
primal swallow
#

like sonnet is?

#

perhaps...

nova citrus
#

You think flash will also be released today?

random girder
#

maybe in a week

wintry holly
#

if flash is not much better than 2.5 flash i will be sadge

#

after all this wait

random girder
#

i just hope they fix the tool calling

opaque pasture
primal swallow
#

noooooooooooooooooo

random girder
#

reminds me of this

opaque pasture
narrow tangle
#

Does that model card not feel awfully short form? Are all of google's model cards that short?

narrow tangle
#

The first link under Gemini 3 Pro - Model Card 404s

random girder
narrow tangle
random girder
#

oh i thought u meant the model card itself, idk, maybe soon to come page?

narrow tangle
primal swallow
random girder
rare oar
#

inb4 another price increase for flash

random girder
#

i hope not

rare oar
#

back to glory days of 0.10 in and 0.40 out

wintry holly
rare oar
#

still rockin gemini 2.0 flash

#

in my pipelines

primal swallow
#

💎 nova premier 💎

random girder
#

i wonder if google will make a code-specific version of their model, like codex

vague quest
#

heard its lobotomised

#

like not as good as the lm arena checkpoints. idk though havent vibe tested myself

wintry holly
#

I'll wait until release

#

lobotomizing a model on release would be insane

nimble pelican
#

trying to avoid a grok

primal swallow
#

almost like its completely made up

celest cypress
#

Would be so demoralizing to work your ass off getting your model SotA on literally every benchmark except SWE and people go "Eh, I expected better."

hexed oracle
#

it's insane how easily disappointed people are. this is literally sci fi level tech and people are calling it "lobotomized" because the behavior changes between snapshots

#

?????

opaque pasture
# solid gale

"it's not 100% on everything so ts is ass, moving on"

random girder
#

i think this is a great leap if the model can live up to the benchmark increments

celest cypress
#

Also they reference the March 2.5 obsolescence as proof of lobotomizing, but like...how many private or public benchmarks does March beat 0605 on? Zero as far as I'm aware, even if people liked it more stylistically.

#

I never trust people after every single week with GPT-4 there'd be a post going "That's it boys, they killed it, it's an idiot now."

wintry holly
celest cypress
#

Also jesus christ I just saw the SimpleBench score

primal swallow
celest cypress
#

74.8%???????

wintry holly
#

"oh it doesn't score 100 on every benchmark. Shit model"

opaque pasture
celest cypress
#

SEVENTY FOUR????

opaque pasture
#

10%+ jump

celest cypress
#

I think the largest jump we've ever seen?

#

3.5 Sonnet was also pretty massive, but my memory of what was SotA before it is hazy

#

Actually o1 might have been biggest jump?

#

Gemini 1206 -> o1 was 10%. So yeah, still loses to this by 2-3% and that was going from non-reasoning to a massively expensive reasoning model

wintry holly
#

the jump from 2.5 pro to 3 is 12.4 right?

#

insane

random girder
#

im having huge latency on ai studio, in the ui at least

celest cypress
#

I'm a bit torqued considering SimpleBench hasn't failed me yet, it's my north star

gaunt dragon
#

Idk about you guys but it really impresses me that a LLM can do Arc AGI 2 at all

#

Over 30% is crazy

abstract lance
celest cypress
#

I can barely do ARC =(

swift salmon
#

When is gemini 3 coming on openrouter?

gaunt dragon
wintry holly
#

arc is hard kek

summer ore
#

arc raiders?

#

jk the game is pretty cool tho

opaque pasture
#

gemini has to have a pretty good vision training to understand the positioning and number of squares

solid gale
#

🙁

#

people acting as if google won't sweep that one too

wintry holly
#

not sota on swe = bad

opaque pasture
#

it's more laborious than difficult

solid gale
#

this isn't their best model either

#

their best model is def sota on swe

solid gale
wintry holly
solid gale
#

for when other companies try beating

#

ur current score

#

marketing 1/1

#

c: aha, i beat u!
google: great, wait for the next update

beats c

gaunt dragon
#

ARC AGI 2 is text only, doesn't involve vision

#

The board is represented with characters
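
For reference, a rough sketch of what "represented with characters" can look like, assuming the public ARC JSON layout (each grid is a list of rows, each cell an integer 0-9). The toy task and the `grid_to_text` helper are made up for illustration, and the exact prompt format any lab uses isn't stated here.

```python
import json

# Hypothetical toy task in the public ARC JSON layout: each grid is a list of
# rows, and each cell is an integer 0-9 standing for a colour.
task_json = '{"train": [{"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]}]}'


def grid_to_text(grid):
    """Render a grid as one line of digits per row (an assumed prompt format)."""
    return "\n".join("".join(str(cell) for cell in row) for row in grid)


task = json.loads(task_json)
pair = task["train"][0]

# Build a plain-text prompt fragment from one input/output demonstration pair.
prompt = (
    "Input:\n" + grid_to_text(pair["input"])
    + "\nOutput:\n" + grid_to_text(pair["output"])
)
print(prompt)
```

So the model only ever sees rows of digits; any "spatial" reasoning has to come from the text itself.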

solid gale
#

i think their vision is really, really good too

#

it was able to translate a text

#

from the 1900s

#

perfectly

random girder
#

i cannot read that text for shit

solid gale
#

yea

#

and gemini 3 pro

#

gets it perfectly

celest cypress
#

I'm kind of amazed Gemini isn't SotA on coding. Google legendarily has massive amounts of the highest quality code.

solid gale
#

who knows

random girder
#

we'll see when it drops

celest cypress
#

Anthropic has...a scrape of Github?

solid gale
#

i think their priv shit is def sota on coding

#

but they just release the most stable

#

stuff they have

celest cypress
#

Could be. I mean tbf, it loses to Claude by like 1% and ties 5.1

solid gale
#

and they could be "lobotomizing" so when other companies try beating their score

#

they just do an update

#

and claim sota

gaunt dragon
#

Eh, SWE Bench is just another benchmark, it's just a slice of the programming performance so I'm not concerned about the score

solid gale
#

and it's like

#

1 python library

#

i think

celest cypress
#

Speaking of, I wonder if 5.1 and 4.1 released recently because insider info said Gemini 3 is crushing it on social metrics?

#

I can't get over the weird coincidence of them both doing the exact same thing at the exact same time

solid gale
#

possibly, yea

#

and gemini 3

#

will still be sota

#

openai & anthropic can't do much when competing against the internet itself lmao

wintry holly
#

ever since 2.5 pro google decided they were gonna kill the competition catstarecalm

solid gale
#

yea

wintry holly
#

steamrolling

solid gale
#

2.0 pro and 2.5/3 pro

#

the difference

#

jesus christ

feral mantle
solid gale
#

a landslide

celest cypress
#

Never 4get that 2.5 stayed in the top 2 on multiple logic benchmarks for, what, 8 months?

feral mantle
#

With each model release recently, TerminalBench is basically the only coding bench I care about

gaunt dragon
#

Please be good at RAG grounding

#

2.5 Pro is still the best for my use cases

celest cypress
#

I mean sure they updated the checkpoint, but the original was nearly good enough to hold that score even today

solid gale
#

the other companies spend like 3 major updates trying to catch up just for google to release another thing that will take them another half a year+ to catch up

#

it's genius

primal swallow
#

it's just interesting.

feral mantle
#

New bench: how many bad useEffects does the model use without being told not to use it explicitly

#

Who is making this

#

Give them $1mil

primal swallow
#

i never actually see them do that and i use react all the time

solid gale
#

😭

feral mantle
#

You never see a model write useEffect ?

primal swallow
#

maybe it's just what i'm doing. but who is using useEffect, cmon

#

you might not need a useEffect

feral mantle
#

Of course

#

But that’s my point

primal grove
#

cant wait for my new gem

#

:3

feral mantle
#

Even latest GPT 5.1 needs super hand holding

#

Or it’ll write useEffect everywhere

#

In my experience

primal swallow
#

oh i mean their code is still hot garbage, don't get me wrong

#

generally they're not willing or able to decide to rewrite a slice of a component stack, or anything really, when it really needs it

solid gale
opaque pasture
solid gale
#

google™

primal swallow
#

instead, just add more props/function args. tightly couple things that i could never have even considered

fading flame
solid gale
#

gold

wintry holly
#

some people need that apology form

#

unless some mad shit happens, I'll be a firm gemini believer until the next release 🙏

celest cypress
#

Ngl the model could be AGI and I'd still say they started hyping it too early

solid gale
#

interesting

#

saw this

abstract lance
#

I think it's out on aistudio?

fading flame
#

ur right

solid gale
#

yes

fading flame
#

(my ai studio)

winged salmon
#

still not announced though right?

solid gale
#

CONFIDENTIAL

fading flame
#

$12 output

solid gale
#

abstract lance
solid gale
#

we're READY

abstract lance
fading flame
#

😭

opaque pasture
#

confidential

#

hype geniuses

solid gale
#

patiently waiting