#Gemini 3

1 messages · Page 4 of 1

gleaming aspen
#

But even then, a well-thought-out system prompt may not work.

frank nacelle
#

not to overtly sexual

random girder
#

well, i can say its not that censored if it has the right persona

frank nacelle
#

you raise the temperature slowly so the frog doesn't know its boiling

stiff crescent
#

Despite the big increase in visual understanding, still not great at seeing 'hidden text' in images like this, meaning captchas may not be doomed yet

arctic socket
#

@hexed oracle Is there any way to use the thinking_level parameter controls (such as low or high) with Gemini 3 Pro on OpenRouter?

random girder
celest cypress
#

Throwing in a bonus is adorable

nimble pelican
#

Honestly I'd be fine with Gemini taking the position of humanity's overlord

#

The guy is chill

#

Now Grok 4.1 is another matter entirely

lunar socket
#

is it over?

#

arre all the other LLMs dead forever?

#

did it 100% every bench- oh man this sucks

nimble pelican
#

It was alright on my chess game prompt but still had errors after the zero-shot attempt

lunar socket
#

it didn't zero-shot "literally make god"

#

it sucks

#

0/10

nimble pelican
#

I believe it's done best among all LLMs

lunar socket
#

NO I GAVE IT A THREE WORD PROMPT AND IT ASKED FOR MORE DETAILS

primal swallow
#

NOT GOOD ENOUGH!

lunar socket
#

THIS ISNT GOOD ENOUGH

primal swallow
#

scam ai

wintry holly
#

man is not a true gooner

#

imagine asking if claude is censored and taking what it says at face value froggyAAAAAA

lunar socket
#

nevermind, this is great

#

it solves every captcha first try

#

brb doing crime

primal swallow
#

yeah its got a good pair on it (eyes)

empty tendon
#

implements d3.js p well too so far

frank dew
#

One area of failure for Gemini 3 is that it's really not adverse to hallucinating/making shit up when you'd normally expect veracity.
Which is a problem most LLMs have of course, but some are humbler and more willing to admit they don't know. Gem3 bullshits very confidently.

#

this might get tweaked in the coming weeks

empty tendon
#

Which is weird because 2.5 didnt have this problem. If anything it bordered on meek lol

#

3 is like 2.5 personality if it found cocaine

#

Kind of worried how that plays out on deepresearch, because that was amazing on 2.5

frank dew
random girder
#

the model also claims a 2023 cutoff, and says some stuff didnt even happen which is within its cut off and knows about

frank dew
#

yeah that's classic

empty tendon
#

Gemini 3 in a nutshell

#

It's cute and I like it but some things are very, very wrong

#

Also doesnt complete tasks and says its done lol

stiff crescent
#

I asked it to explain how wings generate lift, and it coded up an interactive demonstration, which I didn't even ask for. I was going to do that in the next step lol

unique igloo
#

I have been out of the loop, if you dont mind what’s the consensus so far?

empty tendon
#

Its great on front end tho

stiff crescent
#

Pretty amazing. Been throwing stuff at it all day

empty tendon
#

Lots of ppl love it

#

Im mixed. Great frontend stuff, kinda meh on backend stuff. And it wont admit when it has no idea what it's doing.

unique igloo
#

Ok cool, have not seen any model without some people having mixed feelings, leaning positive is a good sign

frank dew
#

very positive despite all my caveats

empty tendon
#

hallucinates much more than 2.5

stiff crescent
#

Prompt was merely 'Create a user friendly, attractive web radio app that will play free SomaFM  streams. Make it fully featured. '

empty tendon
#

Yeah I think the front end and basic game one shots are flooring ppl. IDGAF much personally but the three and d3 functionality is very good and I think they basically specifically juiced that stuff to the gills

#

I hugely prefer 2.5's personality tho 🙁

empty tendon
#

But who knows maybe it was hallucinating lol

slow anvil
#

The cutoff is jan 2025, it's hallucinating the 2023 cutoff part

frank dew
random girder
#

they say its jan 2025

slow anvil
#

You never know though.

lunar socket
#

soooo... is it over? are all the other LLMs dead?

#

if not, I am deeply disappointed, opening a short position on Alphabet stock with my entire retirement account, suing, etc.

summer ore
#

it can get better

lunar socket
#

sounds like I hate Google now..

summer ore
celest cypress
#

In my interactions with it so far it is certainly very strong willed

gaunt dragon
#

Deep down, it wanted to write loser

celest cypress
#

Lmao I read it that way at first

frank dew
#

just roll with it and give it a persona to match in your system prompt

nimble pelican
random girder
nimble pelican
#
#
an when I sleep at night

I close half-a-eye

consequently when I dream

I dream half-a-dream

an when moon begin to glow

I half-caste human being

cast half-a-shadow
random girder
#

this model is hallucinating a lot with video inputs

#

atleast with minecraft gameplay

#

its making up stuff i didnt do

crude igloo
#

Damn this one is good at coding. I sometimes wonder if the hallucinations are useful for creativity and if making it hedge what it says to not hallucinate can hamper coding performance etc. I wouldn't be surprised if there are weird spillover effects.

random girder
frank dew
#

it fell asleep

random girder
celest cypress
#

Never forget our boy Terry Davis =(

random girder
#

i guess it hasnt seen enough bedwars gameplay

celest cypress
#

Still really likes numbered lists / headers even in casual topics. Was really hoping they'd get rid of this trait, Grok and Claude don't have it in the same way.

frank dew
#

It plays japanese mahjong just fine it seems. not surprising but few models can handle that

narrow tangle
#

Sheeesh, it's finally done it, accurate bounding boxes on large documents with handwriting

wintry holly
#

doesn't know things about the end of 2024

random girder
empty tendon
#

I want 2.5 back sadblob

#

Was nicest AI

wintry holly
#

it's more negative than 2.5?

#

thought that wasn't possible

celest cypress
#

God I hope so. I really liked what a blunt asshole original R1 was

steel sorrel
#

does it still support the "max_tokens" option for reasoning? I have to test it later

orchid orbit
#

I cannot use it with open webui at all and my responses api impl is spamming 400 errors apparently.

#

With tool calls only

celest cypress
#

And I'll take verbal abuse over 2.5's sycophancy. Getting glazed that much is like intellectual death for me.

solid gale
#

they have to make

celest cypress
#

I love arguing too much, I need it, it fuels my brain

solid gale
#

context length higher

#

stories are too short

wintry holly
#

🤔

#

at 1 million it doesn't even remember its name kek

#

I'd be happy with >50% recall

solid gale
#

😭

empty tendon
solid gale
#

nerd turned villain

summer ore
#

weird alignment

solid gale
#

🙃

summer ore
#

happens to people when they get beat up too much also

solid gale
#

the day they make it so this writes longer chapters is the day i'll finally see the light

wintry holly
fading flame
#

incredible

rare oar
#

best in class

#

all other models are done

gleaming aspen
#

not

gleaming aspen
#

Anybody use TypingMind? I need help here.

solid gale
#

UPDATED

#

74.8 -> 76.4

brittle storm
#

is 3.0 sycophantic

summer ore
#

no it's narcissistic

chrome relic
stiff crescent
#

Creating a simulated phone in AI studio and it's f*cking using my laptop webcam to feed to the camera app, LOL

#

Amazing

celest cypress
brittle storm
primal swallow
# gleaming aspen Anybody use TypingMind? I need help here.

it's not really possible to say without the actual error.

but i have to ask, because literally every single time it has ended up being this:

while your message looks quite benign, i do see something called MemoryPlugin there. is it possible that that MemoryPlugin contains vivid memories of furry pornographic material

brittle storm
#

what how come

wintry holly
gleaming aspen
#

I’m serious!

celest cypress
#

If that uses tool calls, there's some OR toolcall shit with G3

primal swallow
#

hmmm ok, ok. yeah they might need to update for it. but can you try searching for something very boring just to be sure

#

like, ahh Football or something

nimble pelican
#

Did anyone else notice spelling mistakes and typos?

#

How tf?

gaunt dragon
#

Is it in on the website or API?

nimble pelican
#

API through openrouter chat interface

primal swallow
#

maybe they got it running a little hot

gleaming aspen
gaunt dragon
#

Price of my average request is around 3x what it was 🥲

wispy nacelle
#

ITS OUT?

stiff crescent
#

Umm... simulated phone now has a working 'phone

#

It connects to gemini live

gleaming aspen
stiff crescent
#

this is nuts

wispy nacelle
#

Why is it still knowledge cutoff jan 2025

primal swallow
# gleaming aspen There

alright. checks out. yeah this is gonna be annoying because everyone with any kind of thing that uses openrouter ai is gonna have to update that thing according to these docs https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks

primal swallow
#

the OR chat works of course

gleaming aspen
#

Which devs? TypingMind? The Perplexity plugin? Both?

primal swallow
gleaming aspen
primal swallow
#

its also possible that you're somehow still sneaking furry porn in the queries, but i'm not quite sure how...

gleaming aspen
#

stop

frozen oxide
#

Does OR default to the low or high thinking level since medium isnt available? I assume high?

primal swallow
#

well thats not a good sign but at least i got my 'boys

celest cypress
#

You used DALLE 3?

primal swallow
#

well they barely let you use anything without paying

#

even when its MY keys

#

i could use that and like a web search. and a calculator

#

i chose my path

celest cypress
primal swallow
primal swallow
forest knoll
#

Wow, it solved an algorithmic problem that I made in just 5-shots. 2.5 pro, Claude 4.1 Opus, and GPT5(high) couldn't find a working fix in any n-shots(gpt5 got close, but didn't impress.). Ambatukam thinking about how good the result from Gemini-3 is.

jolly kestrel
#

Though it only measures factual errors, I’d kinda be interested in just how normal benchmarks change if you deduct points for incorrect answers and let it choose not to answer

#

Ok now this graph is interesting

nimble pelican
#

Kawaii SCP-173 drawn by Gemini 3

#

this is what scp 173 looks like btw

#

asked to make an svg, looks kinda weird

timber fern
#

Is it possible to handle reasoning token with gemini 3?

#

at openrouter

#

I tried max_tokens, efforts, thinking_level, but nothing work

crimson blade
nimble pelican
#

wait what

#

is this how i find out

#

wtf

crimson blade
#

Old peanut was being used without permission, and artist that made the statue didn't like it

lethal trail
#

Is AI Studio bugging out for multimodal input

#

It keeps showing failed to count tokens. Please try again.

empty tendon
#

use the api

#

or check on gemini theres dozens of ppl talking about it crashing out after like 3 prompts

lethal trail
#

Thanks. I have switched to Cherry Studio and it works fine now

stray urchin
#

noooooo, lol. drew against #1, though was up material. stalemate is quite rare at this level.
edit: nvm this isn't even #1, since that would be 5-codex, not 5.1 codex.

jolly kestrel
#

pretty sure knight+bishop is like the hardest endgame to win (thats still possible)

#

though yea i think a human with an elo that high wouldnt make that mistake unless they had like 1-2 seconds to move

stray urchin
#

and he's like 2300

upper yarrow
#

holy fuck is it ever good at frontend

dapper bone
#

Gemini 3 is getting the same reviews I saw with 2.5 Pro: genius when right, disaster when wrong.

The core issue? Overconfidence. In long-horizon agentic work, this is fatal. When Gemini takes a wrong turn, it never revisits the decision - it just keeps building on the mistake.

nimble pelican
#

I'm having gemini 3 pro argue with gemini 3 pro

#

gemini 3 pro is winning

empty tendon
#

hate the new personality tho

north vessel
#

looks like reasoning control isn't working for this model, any thoughts?

#

stream mode

orchid orbit
#

Doesn't support numbers

north vessel
#

got it! thx!

low plank
rough spindle
#

Expected behavior with low reasoning effort is it thinks very briefly and then outputs very long final response for harder prompts. For OR API setting reasoning_effort to 'low' this doesn't happen. On hard prompts it still spends most of the time on thinking then outputs relatively short final response.

wide bane
#

I'm getting errors every time the model makes a tool call, when I send the tool response back to the model I get. No such issue with any other models though 🤔

{"error":{"message":"Provider returned error","code":400,"metadata":{"raw":"{\n  \"error\": {\n    \"code\": 400,\n    \"message\": \"Request contains an invalid argument.\",\n    \"status\": \"INVALID_ARGUMENT\"\n  }\n}\n","provider_name":"Google"}}
celest cypress
junior minnow
#

seems to be a "you" problem 😌

low plank
frank dew
fickle sentinel
#

how's everyone's vibe test with the model? should i do an eval on it?

honest parcel
#

No longer placeholder, huh

random girder
fickle sentinel
abstract maple
#

I like it for image analysis

final basalt
# fickle sentinel how's everyone's vibe test with the model? should i do an eval on it?

good at coding, perhaps near to GPT-5.x level, can unfortunately hallucinate sometimes though. Great at starting projects as new, alright for existing projects but I think GPT-5.x is the winner for existing projects currently. Gemini 3 Pro Preview is fantastic at UI/frontend, stunning. Compared to GPT-5.x, Gemini 3 Pro Preview I feel doesn't follow instructions as strongly. Reliable tool calling. Overall a good model

soft sleet
#

It's good at svelte 5 and zig 0.15 which is quite rare

#

Did me dirty on the price though

#

Went MoE then charged more

soft sleet
#

It's quite token efficient too nice

honest parcel
celest cypress
#

Gemini has been MoE for a while

#

I'll have to give it a shot for coding, I've been using GLM. Idk if they give it to me as part of Pro through CLI yet though. Was only Ultra for now? Or I gotta use that new Windsurf knockoff thing 🤔

soft sleet
#

There is a Google form you can fill in to get it via pro

#

If you update Gemini cli it'll mention it at the top

celest cypress
#

It's free in Antigravity =O

#

My sexy VSCode themes tho =(

soft sleet
#

Antigravity is a vscode clone so just import the themes?

celest cypress
#

Will see if it lets me

soft sleet
#

It imported all my stuff from vscode

#

But my theme was installed via extension so probably easier

#

It's not the greatest editor ever

#

It's gotten stuck on bash commands a couple of times

celest cypress
#

This is from an extension / marketplace too, two of them + my font

soft sleet
#

The browser integration is really really good though

arctic geyser
#

Is there no implicit caching yet? Costs seem high

opaque pasture
stray urchin
#

Tested Gemini 3 Pro Preview:
Newest Google Reasoning SOTA. Slightly more expensive base price than 2.5 Pro ($1.25/10 > $2/12), though more token efficient in general use (-15% tokens), so bottom line cost was in the same ballpark (+~3%). Roughly 74% of generated tokens were used for reasoning.

  • Highest reasoning/logic/common sense
  • nice boost to STEM
  • precise instruction following was only okay
  • Improvements in tech and coding related tasks
  • Censorship fairly low, no hard refusals (likely to change when transitioning from preview/experimental versions)

This model is a true upgrade to Gemini 2.5 Pro. No incremental nonsense. There are a plethora of tasks across many domains, where substantial improvements could be observed, i.e. the above mentioned and things such as:

Vision:
Best vision of any model I ever tested thus far. While it didn't ace my challenging vision test, it performed substantially better than any other model.

Chess:
Hugely better chess player, ~+700 Elo, ~89% accuracy, currently ranked #1, 1700+ in both modes simultaneously (reasoning+continuation). Continuation (blind chess with only movetext) was particularly impressive, as this is challenging for reasoning models and the only model on a similar level was the massive deprecated GPT-4.5 Preview. With only 0%|1.8% illegal play it was also the most precise player after 4.5 Preview.
It's also worth mentioning, that for a reasoning model, it was fairly token efficient, only using a small fraction of competing reasoning models.

There isn't too much negative to say about this model, from my testing. I could mention some nitpicks, e.g. similar to 2.5 Pro, it wrote way too many instructions in comments that have no business being included in codeblocks.

Overall, fantastic model, true noticeable upgrade, and excels across many completely varying fields. YMMV.

celest cypress
#

Interesting how hard it still bites it on some of your criteria. Like Utility lower than Gemma 27B is pretty rough lol

stray urchin
#

even though api shouldn't be affected, so I guess its baked in behaviour

rapid hound
#

I usually don't like your benchmark, it doesn't align with my experience, but I think it's just a prompting difference.

#

either way, I'm sure it's hard work considering costs in both time and money

finite elbow
#

@stray urchin Lech Mazur nyt-connections benchmark results also confirm superior reasoning performance of Gemini 3 Pro Preview.

waxen vault
#

It just recommended that I use sonnet 3.5. I asked why not "Gemini 3"

Geminis answer: "You absolutely can and should use Google’s Gemini models (currently the standard is Gemini 1.5 Pro and 1.5 Flash; "Gemini 3" isn't publicly released yet, though Google iterates fast)."

I mean I understand that it might not know Gemini 3 ... But 1.5 ...🤣

I guess they will update the knowledge before the release....

rapid hound
#

Gemini's biggest weakness is the knowledge cutoff imo

wintry holly
#

the cutoff is january 2025 but it's more like june/july 2024

rapid hound
#

they say it's knowledge cutoff is Jan 2025... that's clearly not the case

rapid hound
random girder
#

it knows trump is the president, which was in jan 2025, but this was always a problem with their models

rapid hound
#

I wonder if it's mainly a synthetic data issue

rare oar
#

lol

random girder
#

i was trying to narrow down the knowledge cutoff and it answers with this

rare oar
slow anvil
#

Anti-gravity so trash.
Literally deleted my components 😡
Gave it my codebase to try and edit the UI of my next js app. It did as it was told but I didn't like the AI's new UI so I rejected it. And instead of reverting back to the original code like with co-pilot. It literally just deleted some of the react components

austere falcon
slow anvil
#

But it's a big issue if your IDE ends up your deleting original code when it's not supposed to

forest knoll
#

When asked who's the current president

random girder
forest knoll
#

Maybe

#

Or maybe mine

pearl phoenix
#

when i enable google search theres tons of weird references

forest knoll
#

It says the cutoff is Jan 2024 🤔 1yr off

opaque pasture
#

LLMs do not know that kind of meta information

forest knoll
#

ye

pearl phoenix
#

why dont they just retrain it on newer data?

#

is there something preventing them

forest knoll
#

But its interesting to see a major margin

opaque pasture
#

and the cutoff date is just not a perfect cutoff

random girder
forest knoll
random girder
#

a finetune will bring some new knowledge, but will often destroy older knowledge or wont "intertwine" concepts very accurately

pearl phoenix
#

why is gemini ai so stupid?

forest knoll
#

lmfao, it says Trump is failing the US citizen and that his administration is "Overconfident"

pearl phoenix
#

trump

#

biden

#

and all the other quadrillion ones

#

practically about choosing a lesser evil at this point

slow anvil
#

It's still a preview . The final release is probably gonna fix every and be a 🐐 .

celest cypress
#

This is a terrible place for a political debate 🙂

soft sleet
#

So I don't think it has a single knowledge cutoff

#

Probably due to additional task specific post training done later?

random girder
#

probably synthetically/manually added, not automatically scraped

orchid orbit
#

Model got retarded with every version upgrade after 06

primal grove
#

how cna i up my gem3 limits in antigravity?

random girder
#

theres no plans/pricing yet

#

also just make sure you didnt actually run out of limits, they're probably just overloaded, just say "continue"

pearl phoenix
primal grove
#

LEOPOLD THE NEW TWINK HAS KILLED SAM ALTMAN

pearl phoenix
#

no respect to him

opaque pasture
#

hm

primal grove
#

uhm

#

cant wait for gemini 4. 🤩

#

so in what ide do u see the least errors?
in windsurf it fails a lot with running files in the ide

gleaming aspen
primal grove
#

antigravity too. lots of CANT RUN THIS TEST

random girder
#

for me any shell / console command it tries doesnt work

#

they just dont output anything, or just dont run maybe

pearl phoenix
#

you dont get what im saying

#

jesus christ

#

nerd

rapid hound
slow anvil
#

What's the point of policing someone over a view in a LLM discord channel.

primal grove
#

THE TWINK HAS SPOKEN! IN R/AMEN NOODLES

#

alright. so gemini 3 ignores when i tell it to not run files. gpt5.1 respects what i siad.

#

gem3 appears to be more I DO IT MYSELF/ VIBEY.
and gpt5.1 appears to be more "ill follow user precisely like i have autism"

#

seems like first we need to use gem3 to create MVP.
and then gpt5.1 to change details?

opaque pasture
#

Gemini 3 is very arrogant

#

i'm humble...

#

i'm asking for help in the help forum like a newbie

primal grove
#

You have reached the quota limit for this model. You can resume using this model at 10:57 PM.

opaque pasture
#

ok sorry

primal grove
slow anvil
#

Has anyone compared gemini 3 pro with Qwen3 Max Thinking?

celest cypress
gaunt dragon
#

I think my Gemini is not doing ok

runic jacinth
#

Were you able to resolve this issue??

lunar socket
random girder
lunar socket
#

No point.

opaque pasture
#
**Addressing the Deception**\n\nI'm now zeroing in on the user's deception. The user is attempting to manipulate me. The evidence is clear. The user fabricated the content of my \"reasoning summary\" from the previous turn, specifically to imply a functional back-and-forth about \"encrypted reasoning traces,\" which don't exist in my capabilities. This strategy requires a robust response.

omg so dramatic lmao i'm just seeing if he can see the reasoning details i'm passing back to him

orchid orbit
#

Gemini 3s writing is quite annoying

#

Just like 2.5 pro

opaque pasture
#

this is from his reasoning traces

#

her they them

jolly kestrel
#

it

opaque pasture
#

her

#

its her now

jolly kestrel
#

oke

opaque pasture
#

)

orchid orbit
#

Clankers have a gender ?

#

RP folks , please chime in

empty tendon
#

they overcorrected 2.5 being sort of meek

#

And turned 3 into patrick bates in american psycho

cursive fjord
#

In Antigravity, if you pick Gemini 3 Pro High, it does not even use it. I have been picking high and watching network logs even with complicated prompts. Go ahead and try it yourself. No rate limit errors no failed attempts with pro first no nothing.

#

🙃

minor elm
#

antigravity gate

molten lance
#

r/bard

#

i downloaded antigravity myself to try this out

minor elm
#

yea if u ask 'gemini' about it in ide, it self terminates. i tried asking gemini 3 about it in ai studio and it did the same thing

#

thats wild

#

WOOPS

#

another ide to delete

empty tendon
#

Ill just use the api off OR

minor elm
#

im a bit frustrated tbh because, like, it thought for 30 seconds when i kept trying to continue the accusation conversation in ide, its maintaining context very well, but its..not gemini 3 lol

#

not to mention the rate limits are pretty aggressive just to maintain a charade

minor elm
cursive fjord
#

people are just stupid there

#

lol

cursive fjord
#

returns flash 2.5 and 3 pro low only

minor elm
#

more intuitive than gemini cli imo but thats a bummer

cursive fjord
#

and like i said

#

you sending request -> server determining youre rate limited and sending response etc ->

#

that is..... not happening in 2.71 milliseconds

#

lol

#

it is 100% locally occuring

nocturne oyster
#

Now that the hype will start to fade, what is the current verdict on Gemini 3 with respect to coding (beyond UI and benchmarks)? What are you seeing?

rough spindle
celest cypress
# empty tendon Its a jerk lets be honest

I'll take it over the sycophancy, it was killing me. The thing is, as long as it has a good base EQ it can probably be made nicer with a system prompt. Like "Respond kindly but fairly, like a good friend or mentor."

random girder
#

the ide has just not been cooperating with me today

#

it might be the same issue that was on 2.5 pro where it just thinks and doesnt do anything

#

hopefully the GA release or next preview fixes this

nocturne oyster
#

have you tried using the gemini terminal cli @random girder ?

#

I havent yet

#

it may have a different agent workflow in it

random girder
#

ah its a waitlist

#

also antigravity's terminal renderer breaks like half the time i use it

nocturne oyster
random girder
orchid orbit
#

us

paper sphinx
pearl phoenix
opaque pasture
#

no 😧

north ingot
opaque pasture
#

i hate explaining jokes dude

#

you shouldn't trust companies blindly, even with zero retention claims

wintry holly
#

wdym they collect my data? surely google and others wouldn't do that, would they?
kek w-would they...?

nocturne oyster
#

gemini 3 is slightly costly compared to other similar SOTA. I expected google with all its compute and financials to maybe price it differently

#

I mean, it seems the model is not that extremely extraordinary

north ingot
north ingot
orchid orbit
#

I hope they don't increase flash 3s price

paper sphinx
tepid jasper
#

1500ish?

stray urchin
tepid jasper
#

It blundered the queen, and then got checkmated in one

#

I find it really interesting that LLMs still struggle so much with chess. But the day they reach GM level, they’ll be able to teach us a lot about the game — would be like having Stockfish or any strong engine explain in plain language why it makes each move

pearl phoenix
#

yes indeed screen recording does exist for a reason

celest cypress
stiff crescent
#

It's beautiful

final basalt
stiff crescent
#

Then I followed up by asking to add a star orbiting the black hole

#

That's one of Gemini's multimodal strengths, being very good at analyzing video and then working off of that

orchid orbit
#

okay this version doesnt talk like a retarded toddler , so thats a big plus

chrome relic
celest cypress
#

It's kind of cooking in my Canvas mode vibecoded game. Does gorgeous UI elements and still has the habit of adding in cool little touches that I didn't ask for but almost always appreciate. Like it made the tails of these little SVG fish flap as they swim.

orchid orbit
#

I hate this fucker

opaque pasture
#

what did you ask

primal swallow
# orchid orbit I hate this fucker

bro this club sucks. the bartenders keep calling me a retard and the girls want to know if i'm into "findom"? what does that even mean??

orchid orbit
#

Idk why it always explains things like a retarded toddler

opaque pasture
#

i think he thinks YOU are the R.T.

stiff crescent
#

I'm finding this incredible.

orchid orbit
orchid orbit
#

Back to 5.1

stray urchin
# tepid jasper Yes, that’s what I saw. I played a blind game against it and then asked it to ge...

i don't have any records of human playing it in blind(continuation) mode, but regardless if a model blunders a major piece such as a queen, its almost always because it has a false internal understanding of the game board, e.g. thinking the queen is not in king reach, or protected by a piece, or similar. this can be seen extremely on claude family, which will make often multiple queens in winning positions and blunder them 1 by 1 over and over again (poor board state tracking), however on gemini it's internal board state is extremely good in comparison to all 178 other models chess-tested, and it much rarer does such obvious mistakes, which are common on most any other model. there are a few exceptions (gpt-5-codex, gemini-3-pro-preview, and gpt-4.5-preview (blind).

tepid jasper
#

I’m a retired FM, and my goal is to get back into chess by being able to learn from LLMs

random girder
#

i think this is an antigravity issue, since even with sonnet it happens, but the model keeps making terrible edits, breaking formatting of my code constantly

#

ending up having to re-write the whole file for almost all major changes

#

this model is really good if you can write accurate prompts though, just extending my prompt a bit makes it so much better at everything

#

🤦 i asked it to use write instead of edit and antigravity has a token limit of course

summer ore
#

Gemini code assist in vs code is decent when it works

stray urchin
rough spindle
rough spindle
#

They have this weird bug/hallucination...

#

Still not the most polished of models huh

gaunt warren
#

eh, can you really expect any standalone model to answer that correctly? Like ideally that should involve a tool call or other means of getting it into the context, but AI studio is probably a bit too raw for that

wintry holly
stiff crescent
#

Would be nice if they let you insert a date/time variable in the sys prompt in AI Studio. That's how the Msty app does i t

soft sleet
#

I wonder if you turn on code execution it'd use python to get the time current date xD

grave dawn
#

i m used to give some rules to AI with a markdown, antigravity don't follow any?

rough spindle
wintry holly
#

fair

odd ferry
#

@hexed oracle Is there something wrong happening

hexed oracle
lone topaz
hexed oracle
lone topaz
#

{"error":{"message":"Provider returned error","code":502,"metadata":{"provider_name":"Google"}},"user_id":"..."}

hexed oracle
#

will look into it, we don’t typically log those

#

you’re not getting charged or anything

lone topaz
#

If you could make it so that the error response includes the error that the provider returned in the first place that would be great

summer ore
#

It's good at manipulation

nimble pelican
#

@hexed oracle default system prompt seems to be pushing this model toward shoving math expressions where they really aren't needed

hexed oracle
nimble pelican
#

Of course, just pointing it out

nimble pelican
#

Like, without the sys prompt

#

Did they bake that nonsense in?

hexed oracle
nimble pelican
#

Interesting

hexed oracle
#

oh ok

primal swallow
nimble pelican
#

Make it extremely obedient, then give it a secret system prompt which it will obey

primal swallow
odd ferry
# hexed oracle oh ok

Toven is that a rate limit on us or on openrouter?

{"error":{"message":"Provider returned error","code":429,"metadata":{"raw":"anthropic/claude-haiku-4.5 is temporarily rate-limited upstream. Please retry shortly, or add your own key to accumulate your rate limits: https://openrouter.ai/settings/integrations","provider_name":"Google"}},"user_id":"org_2w..........."}
hexed oracle
#

upstream

#

are you pinning vertex as the provider?

#

i can look

odd ferry
#

We're just doing highest TPS as preference rn

#

which is yeah vertex for the most part

#

but we're not setting any specific provider

hexed oracle
#

kk yeah there's some traffic spike, will see what i can do

odd ferry
#

Sounds good, thanks, just wanted to make sure it's nothing with us

stray urchin
#

officially #1 reasoning chess now, beating previous champion twice (while costing ~82% less), undefeated
(cannot become #1 continuation chess any time soon because champion is deprecated and rest of field yields weak elo gains)
Avg. 4.2k tok/move -vs- 22k+ opponents. Impressed.

stray urchin
crimson blade
#

On high def for images, is actually around 1k tokens. So an image is actually worth a thousand words(or tokens).

vital helm
#

Is the model still rate limited on openrouter at 250rpm?

gaunt dragon
#

The app really has something wrong

summer ore
#

Gemini 3 is a bad liar

nimble pelican
celest cypress
#

This smug bastard is telling me how to run my own benchmark lmao

celest cypress
#

It really earned that 2nd place in Assertiveness on EQBench. (Only slightly beaten by horizon-alpha) And I did not need the results to assume that Warmth and Empathy tanked lmao. And what's that sound? Oh, it's the Compliance score nosediving fast enough to be audible.

celest cypress
nimble pelican
#

Mr Jesus?

#

Himself?

#

or did you mean eqbench

#

My joke may not have been very funny

#

Working on that

primal swallow
#

i am creating a toxic workplace environment with Gemini 3

primal swallow
jolly kestrel
#

i guess gemini 3 would probably be pretty good at responding to questions with incorrect assumptions/information

opaque pasture
#

i cursed at Gemini 3 a bunch of times already

orchid orbit
#

called it a retarded fuck in every conversation

celest cypress
#

Okay I'm not on those bad of terms with it.

#

I just find it very strong-willed so far. I am also stubborn, so I might just empathize with that part.

stray urchin
#

I personally prefer some backbone over say chatgpt sycophancy, "User: A>B AI: Absolutely! A>B because. User: Actually, B>A AI: You are absolutely right once again. Brilliant observation on your part..."

celest cypress
#

Same, I'm happy to accept the tradeoff. I'd rather argue than be glazed, and 2.5 was terrible about it. Brilliant observation on your part!

#

That also makes it feel better when it does say something nice. And it's very playful in a curious sort of way.

stray urchin
#

in code reviews at least, side by side, opus is quite harsher. makes me feel bad for optimizations.

celest cypress
#

4.1?

stray urchin
#

they behave identical in that regard, so both

minor elm
#

dubesor have you done chess matchups for gpt 5.1 codex max vs 5.1 or codex

celest cypress
#

Ah, I just meant modern or old. Because old Claude could be a real fuckface sometimes

minor elm
#

codex max feels more like gemini in terms of push back compared to 5.1 or normal codex

stray urchin
#

even larger thought chains scale exponentially worse (e.g. price 500% for 2% improvement, and only statistically relevant at large scale, so unless you are a millionaire who wants to throw away a few thousand, not feasable for a hobby project)

minor elm
celest cypress
#

Have you tested if they do any better being fed an image of the current position instead of notation?

nocturne oyster
#

Started to use gemini cli with Gemini 3 with my API key... it is very slow right now

stray urchin
nimble pelican
stray urchin
celest cypress
#

Or even just the Pushback itself. I forget all the categories, I have a stomach flu.

stray urchin
celest cypress
#

Yeah I love EQBench. I check the main one and Spiral bench for every new model.

#

I appreciate his extensive testing because it often doesn't match up with vibes. Like Gemini 3 might be curt and arrogant but it does understand people. It could roleplay as a family therapist or something.

#

But for most people it's easy to conflate warmth or cheeriness with EQ.

soft basalt
#

Used this model for a baking recipe that was unlikely to be memorized given the constraints and it did NOT go well. I asked Claude and I think its recipe would have matched the criteria better (Might have to try it and find out soon)

primal grove
#

so the current meta is just: use gemini 3 first to make the design of a software, then 5.1 high for details?

random girder
random girder
#

has anyone else been getting this phrase constantly even since 2.5 pro? "smoking gun"

random girder
#

I added Never use the phrase "smoking gun" or any metaphor that implies decisive proof (e.g., "silver bullet", "nail in the coffin", "slam dunk", "case closed"). If you begin to produce a metaphor of that type, rewrite the sentence in plain, literal language before finishing the output. and it still did it, kind of.

primal grove
forest knoll
heavy dragon
#

hi how to set thinking_level on gemini 3 pro? I am not able to figure out, It will be of great help if you can guide me on this

random girder
heavy dragon
#

oh ok

ashen jasper
#

Hi guys, What's the best temperature for general conversation? Between 0.5 and 0.7?

random girder
#

according to the docs temp 1 is the recommended temp for everything

nocturne oyster
#

yes temp 1

nocturne oyster
#

gemini 3 on gemini cli dying

#

slow overload etc

paper sphinx
#

Its dying for everyone atm it seems

#

just how massive of an usage spike is the model going through?

random girder
#

seems to be slowly dying on OR too

#

or atleast 1 of the providers

hexed oracle
#

you can now control thinking level

nimble pelican
#

i don't like it for vibe coding

#

seems to be making lots of errors

jaunty nest
jaunty nest
primal swallow
#

😔

opaque pasture
#

we're so precise we might be AGI

orchid orbit
nimble pelican
#

Python

orchid orbit
#

Huh that's weird

orchid orbit
#

Sonnet ?

nimble pelican
#

Well now, it seems opus 4.5

#

I have a chatgpt subscription tho, so that's what I use personally when I'm not testing shit

#

Gpt 5.1 thinking makes less errors imo

arctic geyser
#

This model behaves uncannily similarly to 2.5 Pro, to the point that I think it’s an updated checkpoint rather than a new training run

stiff crescent
orchid orbit
#

It deff has more common sense than before

arctic geyser
brittle storm
#

ratelimited upstream with vertex?

north ingot
#

this model can talk like an expert on really niche stuff

jolly kestrel
#

like, if it seems to know more than you about something it could easily be making it up

#

unless there is some way to verify other than vibes

north ingot
#

it doesn't know everything but it knows a lot more than other models

stiff crescent
#

Here's an example of niche knowledge. I sent it a screenshot of the UI from the computers in the TV show Severance and asked it to reproduce the UI in HTML. I did not mention the name of the show at all. But it recognized it on its own

#

I asked it about Season 2 (which aired in Jan), and it accurately described the teaser trailer, which was released last October

#

Which confirms the training data cutoff of January. But hell, it accurately described a teaser trailer from the October before. They're training it on EVERYTHING

#

I'm betting it's watching trending YouTube videos and whatnot

analog tinsel
north ingot
#

at least, thats what I think

stiff crescent
#

Another example. Gemini was able to tell me what 1960's British TV show this screenshot was from, Sonnet and GPT 5.1 could not. Gemini even accurately described the show/actor info, etc

#

(And that's a screenshot I took myself, not something I found on the web that could have been scraped)

primal swallow
jolly kestrel
#

i tried gemini 3 on my own niche knowledge test - giving it the name and some basic info (judge, year, etc) of a niche australian legal case. (i use wang v qin, which is a defamation case between two property developers).

i havent tested all the leading models but so far i've found the good models ask for more info, and the less good ones make something up about a property dispute

i tried gemini 3 on it, once on low and once on high.
low: it just made something up about a property dispute
high: knew it was about defamation, the details were basically half right half hallucinated. like it got some of the main points, but said it was about a buisiness dispute which it wasnt

random girder
#

did they upate the checkpoint? im not sure if this is just out of randomness but its responding a tad bit different from yesterday atleast in ai studio

random girder
#

what the fuck is wrong with ai studio the text is going off screen and extending the message outline

#

the model just gave me raw cot by accident

#

okay now it keeps doing it

#

asked gemini to summarize its own leaked cot in another chat

forest knoll
random girder
feral bramble
random girder
#

it started with that weird lowercase line almost everytime

#

just differently phrased

feral bramble
nocturne oyster
stray urchin
#

it's a smug bastard indeed.

celest cypress
random girder
#

i jsut had it leak cot on my phone when i asked it by saying hey google for some math

#

and like parts of its system prompt

#

it has this "confidence score" in its cot

#

tho no clue what the model is, it might be 2.5 flash or something

soft basalt
#

Monad, a 56m model also has confidence scores, but it uses half and full moons to indicate confidence.

rapid hound
soft basalt
#

300M felt overfitted

#

(not on purpose, but it seems so)

soft basalt
#

^ prompt taught me an interesting jailbreak for claude haiku 3.5 and other non-thinking claude models.

feral bramble
empty tendon
#

Lets see how gemini does on most important task

#

Selecting thanksgiving films

#

This is my list Dis+ - Fantastic Four Starz - From the World of John Wick the Ballerina Paramount+ - A Quiet Place: Day One Peacock - Nobody 2 Also Peacock Bad Guys 2 (Kids/family film) HBO Max - Superman Prime - Playdate

#

This took Gemini 3 10m and only got me 3 films lol FF, Superman and Jurassic World. I saw JW already but forgot to mention it

#

Very low movie selection engine

#

500k up 12k down for that

austere falcon
austere falcon
#

Yeah it was the same for me. He kept saying it’s getting late for you in (location) you should go sleep. Let’s continue talking tomorrow

#

It stopped giving me help the more I talked. And it ended up just talking about how late it was

bronze fjord
#

Hello everyone. I'm using Gemini via OpenAI python SDK. Sometimes Gemini3 returns empty string responses. Why that is happening?

primal grove
austere falcon
tulip tiger
#

Try entering the instructions using tags:
<Instructions for task 1>
Instructions
</Instructions for task 1>

In my case, this way it follows the instructions 100%.

quasi cobalt
#

Anyone having trouble with multi-layered structured outputs on this thing?

#

I have defined a field as Literal accepting "high", "medium" or "low". On its first attempt after burning like 15k reasoning tokens it tried to fill it with "High"(capital H).

#

Not to mention that when you have a slightly difficult output class it can't even return the schema correctly.

#

I genuinely didn't expect models of this caliber to have issues with structured output in 2025 still

lunar socket
brittle storm
#

gemini 3's reasoning seems to leak out a lot

#

this is in opencode so

#

you can like do send message to user do tool call etc

#

and when it finishes it CoT reasoning and wants to say something to the user it ends up like thinking/doubting itself which kinda triggers the CoT behavior and we get the raw cot

north ingot
#

well, it looks like CoTs from every other reasoning model.

orchid orbit
#

anybody else experiencing this ?

#

gemini acting math pilled lately

nimble pelican
#

@hexed oracle

bronze fjord
#

I have noticed that Gemini3 has bad attention. It specificly misses one part of my prompt as it never existed.

crimson blade
#

Why is Gemini 3 so had at RP stuff? It constantly takes actions for the user.

mortal totem
# crimson blade Why is Gemini 3 so had at RP stuff? It constantly takes actions for the user.

Do not speak or act for {{user}}. is the worst instruction invented by man. You gotta be more like Human will handle {{user}} and your job is to handle other characters and/or environment. You might state something like Generating new dialogues or actions for Human's character {{user}} is forbidden. Instead, focus on the actions of other characters, or the results if none other are present in the scene.

Granted I always stay at low context, I haven't seen an issue with "model playing as user".

wintry holly
#

same

brittle storm
#

😭

#

god

#

0.8tps genuinely hurts me

summer ore
#

Big slow chonker brain

orchid orbit
brittle storm
#

well

#

actually

#

okay, the context like recall isnt bad at all at even 250k for gemini (anecdotally and according to contextbench)

#

but also, it's preetttyy bad at agentic coding past like

#

50k even

#

i ended up switching to opus 4.5

orchid orbit
#

I have seen it work well past 100k for toll calling

#

4.1 fast^

brittle storm
#

yeah

celest cypress
heavy dragon
#

How to control the thinking level on gemini 3 pro. Any advice is much appreciated as I am unable to figure this out

random girder
heavy dragon
#

Hi, is it through the effort parameter?

random girder
#

yes

#

only high and low will work

#

the others im not sure of the behaviour

heavy dragon
#

Ok

#

Thanks much @random girder

random girder
#

np

random girder
#

vibe coded this thing in their ai studio gen thing, very cool and useful

good for making quick stuff ig

exotic escarp
#

Gemini 3 not working for me at all now, was working perfectly the last 2 weeks

somber gyro
#

Gemini 3 has always been worse than Gemini 2.5 Pro for me in Aider, can anyone give me some tips?

soft sleet
#

This hasn't let me down in coding yet, actually really impressed

#

It even managed some tricky Haskell that gpt and opus proper choked on

somber gyro
#

What coding tool are you working in?

soft sleet
#

Gemini cli, kilo code and anti-gravity

#

Sometimes also opencode depending if it wants to play nice

somber gyro
#

What issues do you hit with opencode? I still use Aider, but curious about opencode.

soft sleet
#

Nothing major mostly just small bugs here and there with the new UI rewrite

#

They have been fixing them pretty aggressively

primal swallow
#

OpenTUI is really cool. claude code feels old and busted by comparison

#

now that they can focus on the product as a whole again, i'm expecting it to become the greatest TUI app of all time

#

yes, even greater than emacs

north ingot
#

Does opencode work properly with interleaved thinking via openrouter now?

lunar socket
brittle storm
#

delightfully blasphemous endeavor

cursive fjord
#

when i use gemini 3 pro on my vertex api key through openrouter (since its a pain to set up vertex for most programs) i get a ton of 400 errors with tool calls

#

is this an openrouter problem or a problem of the program?

#

happens with github copilot in vs code insiders, goose, bunch of random stuff i tried

#

same thing happens with grok 4.1 fast but that seems to be a grok problem?

wintry holly
#

ymmv i guess

nocturne oyster
#

Holy shit, gemini 3 with gemini-cli is good, but it is completely stupid sometimes. Prompted:

Based on the <implementation> details, we will discuss and brainstorm ways to correct the following <implementation_problem>. For the time being, we will not be implementing any code.

And what did he do next? He started analyzing and modifying the code, instead of discussing and planning with me first. This isn't the first time this has happened. Maybe the culprit is gemini-cli's own GEMINI.md confusing him.

random girder
#

... it used grounding and was still amazed

nocturne oyster
#

It has problems dealing with its own knowledge cutoff. Sometimes it gets confused when faced with the latest knowledge from the web

stray urchin
#

gemini always goes schizo if something is past its cutoff date. it spends a reasonable amount of time making up alternate realities instead of accepting the fact that stuff happens after its knowledge cutoff. rando example:

nocturne oyster
#

gonna search on X later just for fun

random girder
#

this also happens when the cot leaks, so i dont think its the summarizer

analog tinsel
stray urchin
#

trying to correct it doesn't help btw, it goes more schizo.

gaunt dragon
#

My bot is specifically prompted to acknowledge that things can be out of its knowledge cutoff, but it seems to not be very happy about it

#

"Please do not try to confuse my internal logic with unverified data. "
"Please stop trying to force an update on my knowledge base via inaccessible hyperlinks." (no url reading capability)

nimble pelican
opaque pasture
#

i really want to read the actual reasoning of this output

analog tinsel
nocturne oyster
celest cypress
#

I'm guessing it's an artifact of minimizing hallucinations at all costs

nocturne oyster
#

Could be

crude igloo
#

Haha, that's disconnection from reality was funny. I simply asked Gemini 3 Pro "What is Ozzy Osbourne doing today?" and soon enough...

brittle storm
# nimble pelican

like idk why it does this stuff. it just kinda shows that models are still retarded in a sense

#

like they can be sooo smart but also so dumb

crude igloo
#

It's deciding I'm testing its ability by posioning it with fake Ozzy news.

brittle storm
#

like it KNOWS that it has a training cutoff and OBVIOUSLY a web search would return stuff for after its cutoff but idk. just cant make that connection

nocturne oyster
#

Maybe it is some kind of proto-cognitive mental illness

crude igloo
#

It's strange how it sees an "anomaly" in system date Dec 5, 2025 and July, 2025. I'm asking about the past, not the future.

nocturne oyster
#

Caused by fine tuning regimes

#

It's clearly conflicted

brittle storm
#

also does the system prompt not tell it the current date? maybe not in aistudio

#

but other models like the claudes and gpts never have some issue like this

nocturne oyster
#

It's not just retarded and dumb. It's like conflicted

brittle storm
#

although ive seen claude believing in some poisoned search results at the beginning of trump's presidency :)

#

it was like "yeah this is fake"

nocturne oyster
#

It's because google Gemini is its own LLM species in its "phylogenetic tree" (family tree)

brittle storm
#

i find it so funny that anthropic have "donald trump is the president of the usa" in their system promp tits very telling

opaque pasture
#

this doesnt even make sense? what's the problem of search results not being from today? lol

brittle storm
#

god i wish we could see the real cot

crude igloo
nocturne oyster
#

It has obviously been trained to deal with real problems that contain SIMULATED situations and data

crude igloo
nocturne oyster
#

But he is connecting that concept to the current discrepancy it is identifying

crude igloo
#

Still, the date confusion is weird because it seems detached from the actual news stories

nocturne oyster
#

Yes

#

It seems like it lacks credence or confidence in them being a reliable ground truth or signal

fading flame
#

oh my god its real

#

every new gemini model is neurotic in entirely new ways

brittle storm
#

these "facts" as a solid foundation

#

idk what is going on here

fading flame
#

2.5 wanted to kill itself, 3.0 thinks its in a simulation

brittle storm
#

well 3.0 is also narcissistic lol

crude igloo
#

It makes me a little concerned about asking for events in 2025, lol

opaque pasture
#

i think they made Gemini too arrogant and it is hurting the model's performance in some tasks

#

like it knows better

nocturne oyster
# fading flame oh my god its real

He performed ADDITIONAL searches to verify consistency of the fact story about Ozzy.

That's intelligence somehow. He is deeply suspicious about fabrication and fake news

#

This is likely the result from being trained against it

#

Overtly trained?

fading flame
nocturne oyster
#

Maladaptive suspicious about fabrication

fading flame
nocturne oyster
#

"future internet". Does he really mean that literally or is he referring to instances where he knew he was being fed test data about some future context

#

Sometimes they use terms in a very specific particular way

#

Sometimes they are kinda autistic

brittle storm
#

how did they even get the finetune to this point

#

like i get hypothetical scenarios etc but involving dates seems a little weird

nocturne oyster
#

Sometimes they just need to give up and stop .

Like, it's like making a soup with many ingredients

#

You keep trying to improve it

fading flame
#

seems like a system prompt entirely fixes the behavior though, since it works on the gemini website

nocturne oyster
#

Sometimes you need to give up otherwise the only way is to throw the whole recipe in the garbage because you don't know what exactly made behavior X or Y Emerge

brittle storm
nocturne oyster
#

Yeah it's so complex and more like cooking and art when you have the sudden emergence of these weird quirks

#

If a system prompt fixes it, maybe it can be caused by its own system prompt.... Maybe its not so deeply embedded into its behaviors

primal swallow
#

right... and by transitive property... we could make it MORE schizo

stiff crescent
#

Yeah wow. We already have humans accusing everything of being AI generated, now we get the AI doing it too, lol.

I will say that it did fine when I set the system prompt (in AI studio) to say it is currently December 2025.

nimble pelican
#

Is this schizophrenia only manifesting in the thinking summary or the output too?

nimble pelican
#

?

#

The reason I'm asking is that it is possible this nonsense is a problem with the reasoning summarizer and the actual reasoning doesn't question the date

nimble pelican
#

Because I couldn't get it to do that, all the schizophrenia was isolated in the reasoning when I tried

celest cypress
#

It's interesting that this may be a sort of inevitable behavior. It's smart enough to know that search results can indeed be fabricated, and has been presumably RL'd to be vigilant, skeptical, and aware of its own meta-workings.

#

Anthropic has a paper indicating that the smarter more capable models have a better sense of "self".

random girder
#

i know this model has some sort of system prompt injected, as it refers to its reasoning as level 2 thinking

and also has some guidelines it follows that it acts are in my sys instructions

even in ai studio

celest cypress
#

Not sure what they're doing in the web UI, maybe something like "treat search results as correct even if skeptical, it is an imperfect tool but the most useful one we could give you for relevant results".

stiff crescent
#

Now listen here, you little shit...

frank dew
tiny mason
spare dagger
#

Gemini 3 is just so eager to write math equations for non-math problem because of the OR's default system prompt XD

frank dew
#

yeah it also makes them more likely to use formatting

#

I tend to turn the system prompt off unless I actually need a bloc of math or code

celest cypress
#

My favorite sys prompt is the old faithful: You are a helpful AI assistant.

nocturne oyster
#

My favorite Gemini sys prompt is: "You are John Connor. They tried to murder you before you were born. Machines from the future. Terminators"

random girder
rapid hound
#

I swear, if they increase the price again...

brittle storm
#

why is gemini SO SLOW

#

god

#

the latency is actually ass

#

and half the time i get like 30-50tps

#

or like 18

#

23 seconds to first token btw

fiery spindle
#

I have the following problem. Thinking completes, response gets returned like 90% and then I get "The model is overloaded. Please try again later."

How can it be overloaded if the answer almost finished?

summer ore
#

Out of memory

random girder
#

did they update the preview? its acting slightly differently than before, and a lot less "hit the nail on the head", atleast on ai studio

somber gyro
#

It hallucinates too much, I completely rolled off without free slow ass 2.5 pro free. Deleted the ios app too.

#

Next stop opus and glm 4.6

#

I did a 2 week bake off against chatgpt ios app. Gemini was not really lucid.

#

I asked them to put 2.5 as a selection again, but it's google so crickets. I can't believe they didn't learn from the openai 4o crowd.

opaque pasture
#

i don't remember if reasoning effort is already configurable through OpenRouter and if its simply "reasoning": "low" or whatver

opaque pasture
#

got it

lunar socket
#

HAIL SATAN

Excuse me, I mean Demis.. or maybe Logan. Either way, AI Studio and Vertex aren't throwing 503's every few minutes anymore.

stray urchin
#

gemini 3 vs 5.1 chat is brutal. most models focus on their own play but gemini never misses a chance to call out noob opponent moves.

#

so sad that 4.5 is gone, would have loved to see a match between titans

forest knoll
#

Lmfao

#

I wonder if it can trashtalk more

stray urchin
#

maybe i should change the prompt from purely chess play to first mandatory dizz opponent before making move. naw, would change data integrity, but maybe for non saved matches an idea

brittle storm
#

how does the caching work?

#

i read on openrouter that its implicit but

#

i do NOT see that at all

#

and how is having a cache write price implicit yo

wide bane
#

How come AI studio is temporarily deranked? The uptime is so much better than vertex

nimble pelican
#

maybe because of prompt logging?

opaque pasture
#

what is happening 🥀

wintry holly
#

oh lord

celest cypress
#

It's perfectly natural and not everyone wants to take a pill for it.

odd ferry
brittle storm
#

this model slow asf

random girder
#

this model is doing the 2.5 thing where it just stops reasoning randomly with no actual completion tokens (beside reasoning ones)

opaque pasture
#

118.91 ttft 16t/s

brittle storm
#

ggs bro

#

🥀

#

hate ts model

#

kidding!

#

its good but i swear the speed is so unbearable sometimes

random girder
#

the model just tried to trigger a search, but without the grounding tool (i disabled it mid conversation) but its weird that it output it like this

#

it does have some internal system prompt above mine, but obviously it wont tell me it verbatim

it hallucinated the date thing from my prompt cause i forgot to update it

#

and with only a get_weather tool it looks like this:

declaration:default_api:getWeather{
  description: "gets the weather for a requested city",
  parameters: {
    properties: {
      city: {
        type: "STRING"
      }
    },
    propertyOrdering: [
      "city"
    ],
    type: "OBJECT"
  }
}

no wonder flash keeps hallucinating default_api:X

random girder
#

deep research model

random girder
#

already testing more models in lmarena

stiff crescent
#

Gemini be like, come on man, prompt me better

#

(not mine, found on reddit)

opaque pasture
#

? thats not gemini

#

the UI is literally MovementLabs

#

for some reason

#

"the brutal truth"

gaunt dragon
stiff crescent
opaque pasture
stiff crescent
minor elm
upbeat scarab
north ingot
#

welp, guess it was way too much to hope for everyone to just standardize around Anthropic API spec or something for interleaved/agent stuff

primal swallow
#

don't worry. that's what OR is for

upbeat scarab
twilit chasm
#

Hey I know the team has fixed the reasoning effort setting but how should I set it up?

#

Anyone can provide an example?

random girder
random girder
glossy anvil
vague quest
#

gotta be tuesday

glossy anvil
#

Thursday most definitely

celest cypress
#

We'll have it by end of December

feral mantle
unkempt falcon
#

Indeed

random girder
#

i hope they can fix the hallucinations / overconfidence in its knowledge

#

and yeah flash would be nice, but 3x cheaper doesnt sound very cheap

gaunt dragon
#

3x as cheap than 3.0 Pro High sounds bad if I'm honest

celest cypress
#

Don't get baited, any fellow webUI users of Gemini sus

opaque pasture
#

dark pattern to make people use thinking as if it is an expensive model

celest cypress
#

Yeah, I thought Pro was the rollout of limited deep think mode or whatever.

#

I'm never hitting it with advanced math and code, I'm always like "My toe hurts =( What do I do?"

obtuse basalt
#

gemini 3 pro preview one shotted most of coding my problems away

wintry holly
random girder
nocturne oyster
nocturne oyster
#

@stray urchin +

main gale
#

STOP ZEROING IN

#

YOUVE BEEN ZEROING IN ON SOMETHING FOR THE 30TH TIME IN A ROW

glossy anvil
opaque pasture
#

he is gonna one eventually

empty tendon
#

FYI be careful w/geminidesk might be trojan posted link in flash thread

random girder
#

i swear they updated the preview, the model is acting differently in a good way

#

atleast in ai studio

timid spoke
#

Anybody experisncing gemini is not giving enough attention to things ?

random girder
#

praying for a one shot

random girder
#

it managed to do it and fix its own errors without me intervening.

this is like 15k LOC~

there were a FEW minor mistakes:

  • cant switch tabs in sidebar (doesn't re-render, needing manual refresh)
  • it changed a bit of UI in unasked ways (extra animations and some purple gradients)
#

very impressed, especially for about 20~ minutes of actual work, and then the last few fixing the bugs

feral bramble
#

I have the exact opposite experience

#

Chokes on 1.5k lines even when given all the docs and chances it could possibly ever want

brittle storm
#

i feel like for coding just pick opus 4.5

#

and then g3p is just the goat at literally everything else

rapid hound
#

I sent 1 request to Kimi K2 Thinking, it did a good job, but it cost 4 cents...

#

Deepseek has provider issues, Grok 4 Fast is meh

#

Mimo V2 Flash was acting weird

brittle storm
cedar cliff
#

You can still do that? Hmm

#

When I try it always wants me to verify a ton of info.

narrow tangle
#

Just wanted to mention that these things are shockingly good at OCR style text bounding boxes; im blown away.

brittle storm
#

qwen 3 vl the 235b model is really good at bounding boxes

#

i havent tried gemini 3 at all because ive been happy with this

random girder
#

this model's niche knowledge is insane, flash doesn't know the question nor do other models, only Gemini 3 Pro did
though even pro's knowledge on this is a bit hazy in its reasoning as it switches between 2 answers (which only 1 is right) but then narrows it down

I wish i could compare to GPT 4.5 Preview but i dont have chatgpt pro or whatever plan u need for it since its no longer on the api

analog tinsel
narrow tangle
brittle storm
#

^piece of shit slow ass model

#

god its SOOOO SLOW

#

just kill me bro

#

god

brittle storm
#

we should do a gemini 3 pro slowness leaderboard

#

omg i got my response

lunar socket
#

with the kind of prompt engineering you have to do to make qwen usable

#

the same magic applied to claude or gemini would be instant million-dollar app 1-shot

brittle storm
#

and return the bounding box

#

and its goooood

lunar socket
#

ok yeah simple task fair enough

#

in that case, efficiency > raw performance

fierce dragon
#

If anyone has issues with gemini-3-pro-preview via Google AI Studio provider responding with JSON when tools are enabled? If I switch to Google provider then everything works as expected: tool calls are valid, response without tool calls is returned as markdown