#Gemini 2.5 Pro

midnight venture
#

You can’t turn off thinking

open pond
#

ya

#

that is just how it goes

true token
#

The new Gemini 2.5 is still very good if not better for my use cases.

abstract plover
#

I see people rarely talk about the "hidden" costs of running reasoning models

#

I mean sure, it's 10/mtoken which is cheaper than sonnet, but sonnet doesn't think.

crimson moon
#

It's always been like this. It's super annoying.

abstract plover
#

damn this new bastard thinks a lot

potent coral
#

Crazy drop 😂

#

Thinks longer than the previous model, but dumber if it's not a coding task

copper pilot
#

Just noticed the new 2.5 Pro no longer inserts hidden reasoning when prefilling, which was wonky before.

#

Understood, here's the summary:
Before: 1k reasoning, 1k response (doubling the output cost)
After: 1k response (faster vs no prefill too, so it's not just not reporting it)

plush bridge
sturdy ether
#

i try to account for this in my personal project lmb's price scatter chart by using dubesor's reasoning token usage data as a multiplier for the output cost
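
The multiplier idea can be sketched roughly as below. This is a minimal sketch, not the chart's actual code: the function and field names are hypothetical, the 10 per million output tokens is the price quoted earlier in the thread, and the 1k/1k split is the before/after example above.

```typescript
// Hypothetical helper: effective output price once hidden reasoning tokens are billed.
interface Usage {
  reasoningTokens: number; // tokens spent thinking (billed as output)
  responseTokens: number;  // tokens in the visible answer
}

// Multiplier = billed output tokens / visible output tokens.
function reasoningMultiplier(u: Usage): number {
  if (u.responseTokens <= 0) throw new Error("responseTokens must be positive");
  return (u.reasoningTokens + u.responseTokens) / u.responseTokens;
}

// Effective price per million *visible* output tokens.
function effectiveOutputPrice(listPricePerMTok: number, u: Usage): number {
  return listPricePerMTok * reasoningMultiplier(u);
}

const withReasoning: Usage = { reasoningTokens: 1000, responseTokens: 1000 };
const withoutReasoning: Usage = { reasoningTokens: 0, responseTokens: 1000 };
console.log(effectiveOutputPrice(10, withReasoning));    // 20
console.log(effectiveOutputPrice(10, withoutReasoning)); // 10
```

A per-model multiplier measured on real traffic would replace the fixed 1k/1k figures.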

abstract plover
#

still you got a point there.

#

streaming a summarized COT seems the only way to combat this

plush bridge
#

On the UX side, maybe the API can return something like: "<think_stats>Thought for 2.2 seconds using 2548 tokens.</think_stats>" as part of streaming response.

abstract plover
#

3.5 is still the best model imo

plush bridge
#

On my personal evals, Gemini 2.5 Pro is behind GPT-4.1 and Claude 3.5.

I should really add back Claude 3.5 to my tests, since 3.5 is indeed better than 3.7 in some.

abstract plover
#

I hope deepseek focuses more on long context

plush bridge
#

I wish there's only 3 labs putting out SOTA models, but now we have 4.

random panther
#

problem with deepseek v3 is it's slow as fuck on all providers. why use it when gemini 2.5 can hit 500 tps? it really limits its use cases in comparison. I ain't waiting 2 minutes for it to write a dozen lines of code

sturdy ether
random panther
#

nice, good it's progressing but not immediately useful. no mention of serverless offerings?
I don't have the $2000 a day they want for an endpoint

sturdy ether
#

guess not

#

seems the only other option is sambanova which is only 200 tps

plush bridge
#

It's just some providers don't optimize it well enough to make it fast.

random panther
#

you're right, some providers were hidden

plush bridge
#

@restive locust we really need better UX for the provider list. The best providers in terms of speed are sometimes hidden and neglected, which gives the wrong impression of how fast the model can be. 👆

unreal marsh
hot willow
#

Guys, if I add $10 billing to open router, do I get 1000 RPD for 2.5 Pro?

plush bridge
#

Anyone else find Gemini 2.5 Pro not great in practice? It is consistently worse than other SOTA models for me in coding and writing tasks. I mainly care about instruction following and whether the response was concise.

torpid lake
open pond
#

2.5 pro is wonderful

#

i just use it for planning, not actual coding

plush bridge
torpid lake
#

I personally recommend treating default response style and word choices orthogonal to the model performance itself

#

can change response style, can't change how smart it is

plush bridge
plush bridge
#

i have to think about this more, whether this makes sense and how to go about evaluating the models.

torpid lake
#

style control is a thing, and instructions for the LLM are in plain English, so simpler to come up than, say, python

#

imagine LLM as an evil genie - if you don't tell it what to do, it'll do the worst possible way. If you tell it what to do, it'll follow the instruction but in worst possible way. Just like programming - you need to be precise in what you want and cover as many bases as possible.

plush bridge
torpid lake
#

even model versions in same family will treat same phrases differently

#

due to the nature of the machine learning, that's inevitable

plush bridge
#

yeah i am acutely aware of those

torpid lake
#

so in your case "minimise prose" worked in model 1, but won't work in model 2, and you'll need to find a different phrasing of "minimise prose" that works

#

"be concise" in my experience is more universal and worked since gpt3.5-turbo, but again, how exactly concise - depends on the model

#

some treat it as "respond in two words", some treat it as "two paragraphs"

plush bridge
#

yeah thanks for sharing. i am just thinking about how to evaluate them objectively given this understanding

torpid lake
#

also location of the instruction is important. system instruction > end of prompt > beginning of prompt

#

in case you weren't aware

#

not as relevant for CoT models, but non-CoT models give priority to instructions in the bottom

plush bridge
torpid lake
#

due to how SFT teaches them, that's what they infer:
- question 1
- answer 1
- question 2
- answer 2
- question 3

which question should LLM answer to? 1, 2 or 3? the one in the bottom. Almost all LLM's generalize that for instructions.
Sometimes they generalize that question 3 should take into account question 1, so they also prioritize question 1 + question 3 when answering, therefore instructions at top and bottom have more strength than ones in the middle. openai's cookbook has same recommendation about top+bottom
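
A small sketch of the front / back / both placement described above, useful for A/B testing it per model. The function name and the `Placement` type are hypothetical; this only assembles a prompt string and makes no claim about which placement wins.

```typescript
// Hypothetical prompt builder: puts the key instruction at the top, the bottom,
// or both ends of a long prompt.
type Placement = "front" | "back" | "both";

function buildPrompt(instruction: string, context: string, placement: Placement): string {
  const parts: string[] = [];
  if (placement === "front" || placement === "both") parts.push(instruction);
  parts.push(context);
  if (placement === "back" || placement === "both") parts.push(instruction);
  return parts.join("\n\n");
}

const assembled = buildPrompt("Be concise.", "<long document here>", "both");
console.log(assembled);
```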

#

(This is for non-CoT)

#

but because this is thread about gemini 2.5 pro, that doesn't apply, since you can't have non-CoT version of it

#

but still I think it's a generally good advice and something to look out for when comparing models

plush bridge
#

thanks for sharing. I have been experimenting with F vs B vs F+B for a while and observed no significant difference.

torpid lake
torpid lake
#

but that's off-topic for gemini 2.5 pro

#

your issue still seems to be style control (not enough or too much text)

#

so I recommend taking a typical task for your usecase, and then make separate style control prompts for each model until responses match. If you want strict template, you can provide that template for it to fill. If you parse programmatically, then you can ask for JSON with JSON schema or JSON template. With JSON make sure to disable penalties and set temperature to 0 wherever you can, or use JSON mode (gemini and openai support that).
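
As a rough illustration of the JSON advice, here is an OpenAI-style chat request body with temperature 0, penalties disabled, and a JSON schema. It is written as plain data so it runs without any SDK. The schema, the `MODEL` placeholder, and the prompt text are illustrative assumptions, and whether a given provider honors `response_format` should be checked in its docs.

```typescript
// Sketch of an OpenAI-style request body asking for schema-constrained JSON.
const MODEL = "your-model-id"; // placeholder, not a real slug

const requestBody = {
  model: MODEL,
  temperature: 0,        // keep decoding as deterministic as the provider allows
  frequency_penalty: 0,  // penalties can distort repeated JSON punctuation
  presence_penalty: 0,
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "summary",   // hypothetical schema name
      strict: true,
      schema: {
        type: "object",
        properties: {
          title: { type: "string" },
          bullets: { type: "array", items: { type: "string" } },
        },
        required: ["title", "bullets"],
        additionalProperties: false,
      },
    },
  },
  messages: [
    { role: "system", content: "Be concise. Reply only with JSON matching the schema." },
    { role: "user", content: "Summarize: ..." },
  ],
};

console.log(JSON.stringify(requestBody, null, 2));
```

The style instruction sits in the system message, so it can be swapped per model without touching the schema.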

plush bridge
plush bridge
torpid lake
#

I don't

clever whale
#

is it possible to use a paid version of 03-25 on OR?

dim ibex
#

i hope google brings back gemini 03-25, the latest snapshot is quite stupid imo

restive locust
#

We are working on fully supporting Gemini 2.5 Pro implicit caching, but for now, if you route to AI Studio you will get the implicit cache (read at .625 price, since we are currently implementing context length cache costs)

kind condor
#

hey! what's the difference between Google's Vertex and AI studio providers in practice?

restive locust
#

otherwise basically the same

kind condor
#

huh ok! thanks

floral skiff
restive locust
#

ai studio

floral skiff
#

Or is it behind the scene ?

restive locust
#

what model are you using? how are you making the calls? how are you checking for caching?

#

in my testing it works, our data shows it's working, but it's not always going to just happen automatically, it is not very consistent

floral skiff
#

2.5 pro preview through roo code and checking in my activity tab

#

So you think it’s maybe fault on roo code side ?

restive locust
#

does your activity show you hitting AI Studio or Vertex?

floral skiff
#

Both

restive locust
#

you have to be consistently hitting AI Studio

#

and if it switches from your key to ours it could break

#

so doesn't seem to be roo code issue

#

just tricky to get it to happen consistently until we have full support with cache stickiness

floral skiff
#

Actually I can’t move from page 1 to page 2 at the activity to go there where it was jumping

restive locust
#

you should try this setting in your openrouter settings

#

or ignore vertex

floral skiff
floral skiff
floral skiff
# floral skiff

@restive locust this issue is at your side too at the activity while switching sides ?

restive locust
#

yes I just flagged to the team

abstract plover
#

I would prefer using explicit caching, implicit is hit or miss

random panther
abstract plover
random panther
#

you need a good amount of usage to justify the explicit cache before it saves you any money
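
A back-of-the-envelope sketch of that break-even point. All prices here are made-up placeholders (USD per million tokens), the names are hypothetical, and the initial cache write cost is ignored. Because storage and savings both scale with the number of cached tokens, the token count cancels out.

```typescript
// Hypothetical break-even estimate for explicit caching.
interface CachePricing {
  inputPerMTok: number;       // normal input price
  cachedInputPerMTok: number; // discounted price for cached tokens
  storagePerMTokHour: number; // storage price per hour
}

// Requests per hour needed before cache savings exceed the storage fee.
function breakEvenRequestsPerHour(p: CachePricing): number {
  const savingPerMTok = p.inputPerMTok - p.cachedInputPerMTok;
  if (savingPerMTok <= 0) return Infinity; // caching never pays off
  return p.storagePerMTokHour / savingPerMTok;
}

// Placeholder numbers only; plug in the provider's real prices.
const example: CachePricing = { inputPerMTok: 1.0, cachedInputPerMTok: 0.25, storagePerMTokHour: 3.0 };
console.log(breakEvenRequestsPerHour(example)); // 4
```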

abstract plover
#

btw, even with the same prefix you will sometimes miss the cache. It's been a known issue with claude, oai and deepseek

random panther
# abstract plover you dont pay for storage in implicit cache?

no it does not appear you have to pay for it. it's probably significantly lower TTL than explicit (likely 5min)
Implicit caching is enabled by default for all Gemini 2.5 models. We automatically pass on cost savings if your request hits caches. There is nothing you need to do in order to enable this. It is effective as of May 8th, 2025. The minimum input token count for context caching is 1,024 for 2.5 Flash and 2,048 for 2.5 Pro.
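
The minimums quoted above can be turned into a quick eligibility check. A minimal sketch: the lookup keys are illustrative labels rather than official model IDs, and being eligible does not guarantee a cache hit.

```typescript
// Minimum prompt sizes from the quoted text (1,024 for 2.5 Flash, 2,048 for 2.5 Pro).
const MIN_CACHE_TOKENS: Record<string, number> = {
  "2.5-flash": 1024,
  "2.5-pro": 2048,
};

// True if a prompt is long enough to be eligible for an implicit cache hit.
function eligibleForImplicitCache(model: string, promptTokens: number): boolean {
  const min = MIN_CACHE_TOKENS[model];
  return min !== undefined && promptTokens >= min;
}

console.log(eligibleForImplicitCache("2.5-pro", 1500));   // false
console.log(eligibleForImplicitCache("2.5-flash", 1500)); // true
```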

random panther
#

having support for explicit is great as well of course

abstract plover
dusky kettle
abstract plover
#

Interesting

shrewd plaza
#

Oh wow, that's really nice to hear, and makes OR's life a lot easier

restive locust
#

yes, yes it does

abstract plover
#

Now all we have to wait for is TTL

restive locust
#

I'll time it now...

abstract plover
restive locust
#

also asked the AI Studio team

#

well it's at least <3min in my n=1 sample

abstract plover
#

hmm, makes sense if implicit's TTL is less than explicit's

restive locust
#

i mean with explicit you set the TTL

#

by default it's 1hr but you can pay for 1281298219328 hours if you want lol

wheat quest
torpid cedar
#

hey, does anyone know how to change the gemini safety settings so it isn't so aggressive at rejecting input?
before the update my customers' input didn't seem to be a problem, but now it just keeps rejecting their input. thanks if anyone can help me with both google AI Studio and google Vertex.

restive locust
#

it means you are being flagged as breaking TOS

abstract plover
#

stop gooning so much yal

restive locust
restive locust
#

yeah

#

you set your own TTL

#

but obviously you pay for the cache token input price + storage price

abstract plover
#

implicit has no guaranteed TTL? and we only have the option of a 5 min TTL for explicit through OR

#

?

#

we need AGI to parse through all of ORs if and else

restive locust
#

implicit has no guaranteed TTL. Explicit through OR has a 5m TTL yes

restive locust
restive locust
#

implicit caching with full proper pricing (long context etc) should be live through AI Studio in ~5 mins.

distant shell
shrewd plaza
#

Already feels so much better. 10c calls become 4c calls.

abstract plover
abstract plover
restive locust
abstract plover
#

"5-10 minutes of inactivity, though sometimes lasting up to a maximum of one hour during off-peak periods."

abstract plover
#

Does google too ?

restive locust
#

not officially

abstract plover
#

2.5 pro has gone to shit

#

unnecessary thinking tokens

#

lol , and the code doesnt even work.

potent coral
# abstract plover 2.5 pro has gone to shit

They seem to focus on coding fine-tunes, which makes it know less about wider-domain coding problems

Their previous version is better imo, and right now it feels like a downgrade

ancient burrow
#

Ratatatatata

wheat quest
floral skiff
#

@restive locust Caching not working with ai studio after ignoring vertex provider

restive locust
#

it either works or it doesn't, not really up to me haha

floral skiff
#

Ok

abstract plover
#

-Logan

restive locust
#

yep

wheat quest
#

meanwhile I've never been able to trigger the implicit cache ever

restive locust
#

it works in our chatroom

#

proof from my testing yesterday KEKcry

wheat quest
restive locust
prime wave
#

@restive locust does it ONLY work in the chatroom, because I'm sending IDENTICAL prompts and there is no cache hit. I assume that if the prompt is exactly the same, there would be a cache hit.

restive locust
#

There's nothing OpenRouter is doing for it to work or not work

#

it's not consistent or guaranteed even

#

sometimes you have to send the same thing 3, 4, 5 times for it to work

#

if you keep going on a long multiturn convo, sometimes it works and hits like half of the convo

#

it also helps if you send the requests pretty quickly one after the other

prime wave
#

okay, so that sounds like basically I shouldn't count on it then.

boreal island
boreal island
copper pilot
#

oh my bad I didn't read it (I haven't used it, just remembered someone made it for Claude)

While designed primarily for Claude Sonnet, it works with other models as well.
sounds like it will just send again every x minutes

abstract plover
#

unusable right now

#

damn

royal ocean
#

It literally is unusable ☹️ Slow, overthinking, etc.

#

We need a thinking budget at least

merry geode
arctic vessel
plush bridge
abstract plover
plush bridge
#

Vertex AI is quite explicit on only Flash, but Google AI Studio doesn't mention it.

#

so i tested @google/genai, setting thinkingBudget = 0 for Gemini 2.5 Pro doesn't actually cause any errors, but it indeed doesn't stop the model from thinking. interesting behavior.
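
Roughly what that test looks like, written as plain data so it runs without the SDK or an API key. With @google/genai an object of this shape would be passed to `generateContent`. The model ID is the preview ID mentioned elsewhere in this thread, the `thoughtsTokenCount` field name is an assumption to verify against the SDK docs, and `budgetIgnored` is a hypothetical helper.

```typescript
// Request shape with a zero thinking budget (plain data, nothing is sent).
const params = {
  model: "gemini-2.5-pro-preview-05-06",
  contents: "Say hi in one word.",
  config: { thinkingConfig: { thinkingBudget: 0 } },
};

// Minimal slice of a response's usage metadata (field name assumed).
interface UsageSlice {
  thoughtsTokenCount?: number;
}

// Reports whether the model still spent thinking tokens despite a zero budget.
function budgetIgnored(requestedBudget: number, usage: UsageSlice): boolean {
  return requestedBudget === 0 && (usage.thoughtsTokenCount ?? 0) > 0;
}

const budget = params.config.thinkingConfig.thinkingBudget;
console.log(budgetIgnored(budget, { thoughtsTokenCount: 2548 })); // true
```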

#

maybe they do plan to support it in the future

floral skiff
copper pilot
#

Speed is okay today, but implicit caching is shaky. Had it work for the first few times then suddenly stopped, even during swipes.

true token
#

I have the impression that new Pro 05-06 consumes slightly more thinking tokens than before

merry geode
indigo jasper
#

Sigh, I'm disappointed in the new 2.5 pro

#

Aider bench confirms that it's taking 2x as long to complete each task.

#

basically the same as my experience

#

sorry, 3x as long

#

Seconds per case : 165.3 (new)
Seconds per case : 45.3 (old)

abstract plover
#

Yup same experience

#

cant do much , will have to use this shitter model.

#

Nice way for google to curb multiple requests and make each requests cost more

true token
#

yes

#

at least for my coding use cases it is still very good, slower yes

midnight venture
merry geode
#

I feel like this version is smarter, at least for coding, but it is way slower than the previous version

abstract plover
abstract plover
sturdy ether
merry geode
abstract plover
sturdy ether
merry geode
#

now 2.5 pro is less attractive for my use case

abstract plover
sturdy ether
abstract plover
sturdy ether
#
    "OpenRouter Free": {
      model: "google/gemini-2.5-pro-exp-03-25",
      fixedPrice: OR_PRICE,
    },
    "Vertex": {
      model: "google/gemini-2.5-pro-exp-03-25",
      fixedPrice: equivalentPrice(1000),
    },
    "Google": {
      model: "gemini-2.5-pro-exp-03-25",
      fixedPrice: equivalentPrice(25),
      maxTokens: 250000,
    },
#

heavily limited on openrouter

#

somewhat limited on ai studio

#

barely limited on vertex

abstract plover
#

this is still the new model

sturdy ether
abstract plover
#

OR didnt fix the name

midnight venture
sturdy ether
midnight venture
#

exp didn’t update

abstract plover
#

@restive locust can you confirm?

midnight venture
#

But idk why people think it did

midnight venture
abstract plover
#

Because logan said there are no endpoints for old models

restive locust
#

the experimental endpoint does not point to the new model. Only preview

from our vertex rep

abstract plover
#

Hmm got it

merry geode
#

Ugh this model is such a pain to use now

#

sometimes it thinks for so long that it times out

potent coral
# sturdy ether https://openrouter.ai/google/gemini-2.5-pro-exp-03-25

It's not a good choice, they heavily limited it.

I hope someone from OR actually contacts Google and tells them that their updated model is worse for a lot of people than the older one, then asks them to redeploy the older checkpoint.

That would give us 2 endpoints, with exp retired and replaced by it.

dim ibex
#

gemini 2.5 pro is unusable for me.
see, all requests have the same cache (looks like it's only the system instruction)
all the problems i hate exist on gemini 2.5 (slow, expensive, no cache)
it seems google's implicit caching is very bad.
the screenshot only shows 4 requests, but i made several more, around 10, and they all have the same cached tokens / usage_cache

abstract plover
abstract plover
#

Logan is working on thinking budget btw

#

I think they dumbed the model down to save cost and upped the thinking to mitigate some of the retardness

#

will give us thinking budget to milk more money

plush bridge
abstract plover
dry ingot
digital warren
abstract plover
dry ingot
#

the newest gemini 2.5 seems to overthink almost everything

abstract plover
ancient burrow
#

gemini 2.5 pro is the first model that could fix overlapping UI elements in an app I gave it

#

pygame app

boreal island
restive locust
#

it likely won't last though

restive locust
#

just not the exp

boreal island
digital warren
copper pilot
#

At the very least I would like for Google to return the actual model name in modelVersion of the response. I.e. aliasing is fine (as opposed to outright error), but tell us what it's aliased to.
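
Until then, a client-side check along these lines can at least log when the served model differs from the requested one. This is a sketch that assumes the response exposes a `modelVersion` string; `checkAlias` and `AliasReport` are hypothetical names.

```typescript
// Hypothetical check: flag responses where the served model differs from the requested one.
interface AliasReport {
  requested: string;
  served: string | null; // null when the response carries no modelVersion
  aliased: boolean;
}

function checkAlias(requested: string, modelVersion?: string): AliasReport {
  const served = modelVersion ?? null;
  return { requested, served, aliased: served !== null && served !== requested };
}

console.log(checkAlias("gemini-2.5-pro-preview-03-25", "gemini-2.5-pro-preview-05-06"));
```

Storing the report next to each result avoids cross-referencing timestamps later.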

boreal island
#

Frustrating part is everyone went silent on this from Google's end

midnight venture
digital warren
#

makes any data collection a pain in the butt. need to carefully inspect each timestamp and cross reference.

boreal island
#

Well that's the guess anyway

#

Yours is as good as mine

novel flower
abstract plover
#

so many rate limits, insane

plush bridge
#

But I agree Google is lacking experience in terms of rolling out models compared to OAI or Anthropic.

abstract plover
#

2.5 pro is basically unusable right now

plush bridge
plush bridge
#

i think OR should update the coloring for uptime, i wouldn't consider 96.61% to be green, maybe slightly more yellowish?

plush bridge
#

lol google ai docs is also down it seems, and dashboard is returning 500

dry ingot
#

(Google AI Studio) Provider returned error: {"error":{"code":500,"status":"INTERNAL"}}

slim dome
#

Hello gemini down for u too?

restive ridge
#

I am interested in setting the thinking budget to 0 to add iteration speed

carmine spoke
#

so gemini is down rn then?

slim dome
#

Ye

carmine spoke
#

for how long?

restive ridge
#

it's fine here

floral skiff
#

@restive locust Any solution?

slim dome
#

Gemini still out for me too

midnight venture
#

Gemini is heavily rate limited on OR, also through AI Studio i think
But vertex has been slightly better from my experience

restive locust
restive locust
floral skiff
floral skiff
midnight venture
restive locust
#

no we're not

midnight venture
#

Oops

midnight venture
floral skiff
midnight venture
#

Sorry I completely misread it

slim dome
#

Why do I get a 429 when I have an account with credits?

slim dome
#

I have json.loads errors

#

@restive locust

#

Do you have an idea?

restive locust
midnight venture
#

@restive locust it only affects exp or also preview?

restive locust
#

only exp

#

anything I can do to make that announcement clearer?

midnight venture
restive locust
#

no worries! I added a sentence to note that it doesn't impact preview

#

if you had the question I am sure others would too

novel flower
#

what they doing to my pro exp

#

😭

distant shell
#

well free high demand model, I'm surprised they haven't cut it off already

restive locust
#

Shipped implicit caching and reasoning summaries through vertex, model should be more stable now

celest idol
#

When i do normal webdev i find it's better, but deepseek r1 and 3.5 can compete

#

3.7 is gold too

carmine spoke
#

so pro exp is being deprecated?

novel flower
#

not yet

carmine spoke
#

wdym not yet? it's only erroring out now with no actual response, it seems unusable

novel flower
#

seems its down on ai studio @restive locust , also its very rate limited today @carmine spoke

#

@carmine spoke #announcements message

carmine spoke
#

yes i saw the announcement about it being further rate limited and being down for half a day, plus it seems like they are getting rid of it

novel flower
#

i hope not, im still using it on vertex i hope its back up tomorrow on ai studio

#

i be sad if they completely remove it

potent coral
#

I think people went back to using exp because the new version isn't the same as the previous pro preview lol

novel flower
#

probably

#

any news from ai studio team @restive locust ?

restive locust
#

nope

novel flower
#

sadge

restive locust
#

they are seeing what they can do, but it does sound like they need the capacity for the paid endpoint

#

the 429 error now directs you to the paid preview model
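
A client could act on that redirect with a small fallback rule like the sketch below. The experimental ID is the one used in this thread and the preview ID is the one Google's quota error names. `fallbackFor` and the map are hypothetical, and the preview model is paid, so switching has a cost.

```typescript
// Hypothetical fallback table: experimental model -> paid preview model.
const FALLBACK = new Map<string, string>([
  ["gemini-2.5-pro-exp-03-25", "gemini-2.5-pro-preview-03-25"],
]);

// Returns the model to retry with after a quota error, or null if there is none.
function fallbackFor(model: string, httpStatus: number): string | null {
  if (httpStatus !== 429) return null; // only quota errors trigger the switch
  return FALLBACK.get(model) ?? null;
}

console.log(fallbackFor("gemini-2.5-pro-exp-03-25", 429)); // gemini-2.5-pro-preview-03-25
console.log(fallbackFor("gemini-2.5-pro-exp-03-25", 500)); // null
```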

#

"You exceeded your current quota. Please migrate to Gemini 2.5 Pro Preview (models/gemini-2.5-pro-preview-03-25) for higher quota limits."

novel flower
#

noooo please give me 1 more week 😔

potent coral
#

Could we get a representative to talk with Google so they also deploy the older version of pro preview?

I mean, they can deploy multiple sonnet versions, so is there a reason they can't host multiple pro preview versions?

restive locust
#

these are preview models, sonnet models are production models

#

they are not comparable

rancid stream
novel flower
#

my vertex key is still going hope i dont get rate limited

#

u.u

carmine spoke
#

if it does get deleted is there any similar free models?

plush bridge
#

i have no problem with gemini-2.5-pro-exp-03-25 via @google/genai directly. occasionally 429, but otherwise pretty fast response (faster than yesterday in fact)

plush bridge
#

nvm i take that back, i am getting all 429 now...

slim dome
#

😭😭

abstract shoal
#

I'm receiving empty strings as response now.

plush bridge
#

{"error":{"code":500,"message":"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting","status":"INTERNAL"}}

anyone having issues with gemini 2.5 pro tool call? i get random 500 errors but i am not sure what is the issue. same prompt sometimes work but sometimes return 500.
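
For intermittent 500s like this, a common workaround is to retry with exponential backoff. The sketch below only computes which statuses to retry and the delay schedule. The helper names and the numbers are assumptions, and it does not explain why the 500s happen.

```typescript
// Retry only transient server-side errors; a 429 needs quota handling instead.
function isTransient(httpStatus: number): boolean {
  return httpStatus >= 500 && httpStatus < 600;
}

// Delay before each retry: base, 2*base, 4*base, ... capped at maxMs.
function backoffSchedule(retries: number, baseMs: number, maxMs: number): number[] {
  const delays: number[] = [];
  for (let i = 0; i < retries; i++) {
    delays.push(Math.min(baseMs * 2 ** i, maxMs));
  }
  return delays;
}

console.log(isTransient(500));                // true
console.log(backoffSchedule(4, 500, 3000));   // delays of 500, 1000, 2000, 3000 ms
```

Adding random jitter to each delay is a usual refinement when many clients retry at once.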

potent coral
slim dome
#

If we add a free Google AI Studio API key as a provider, does OpenRouter combine its own quota with Google's?

north plume
#

how come the pro preview previously didnt have reasoning, and now it does?

potent coral
midnight venture
#

It’s strange that when trying to use exp through OR, you get rate limited with a specific “OpenRouter traffic is heavily throttled” message (even with BYOK)
But when you run through vertex directly you don’t have any issue at all

midnight venture
#

Sometimes a 500 will come up here and there, but mostly smooth sailing

#

OR on the other side just flat out doesn’t work

plush bridge
# midnight venture Nope

nice let me try. i am using the google official sdk, not sure how to get it to work with vertex

midnight venture
wheat quest
#

Vertex has a 10RPM quota across all Gemini experimental models, so OR permanently has no quota for it given the amount of demand.

midnight venture
plush bridge
midnight venture
#

I did get a lot of 500s

plush bridge
#

Vertex does use a different set of credentials from MLDev it seems, so maybe we can double the quota by using both. 🤔

midnight venture
#

I’m way over 10 RPM lol

plush bridge
#

let me try

plush bridge
slim dome
#

Hello, why is the 2.5 pro exp model blocked for those who have 10 credits, even when we import our key from Gemini AI Studio?

midnight venture
slim dome
#

@restive locust

midnight venture
#

You can get around it with Vertex (I think, works for me but not verified)

slim dome
#

But i have a gemini api key

midnight venture
#

I also have a BYOK setup and it doesn’t change the outcome

plush bridge
#

i think it is the same api key for google ai studio and vertex, let me check...

slim dome
#

Yes on my account with more than 10$ and add my gemini key it's working

#

But not on my account with less than 10$

midnight venture
midnight venture
#

So I don’t think it’s a question of credits

slim dome
#

Working for me are u sure ??

#

I have a return message limited to people who have more than 10$...

#

But want to use my own api key

#

Why intégrations was blocked

#

Is blocked

abstract shoal
#

Is Gemini 2.5 Pro available as an API in Google AI Studio? I can only see older flash versions

wheat quest
#

Google is currently scrubbing all references to 2.5 Pro Experimental from their docs, it's very likely they'll pull it entirely (my guess is during Google IO next week)

wheat quest
#

@restive locust google just yoinked all quota for all users on experimental on AI Studio

distant shell
#

well preview becomes launched

slim dome
midnight venture
slim dome
#

Idk

abstract plover
#

2.5 pro is gonna go public at I/O

novel flower
#

Vertex still works?

plush bridge
abstract plover
plush bridge
#

I wasn't able to test vertex because I don't have access to vertex express mode (I'm an existing GCP user). Will test out the proper gcloud cli authentication required for vertex api soon.

plush bridge
#

i just tested via vertex ai and gcloud cli authentication, gemini-2.5-pro-exp-03-25 still works there. but strangely, all the stats (except QPS) are empty.

#

if you have vertex express mode access (via the same Gemini api key as Google AI Studio), you likely can also use it, though i can't test it.

novel flower
#

how to use gemini-2.5-pro-exp-03-25 , get vertex key?

plush bridge
novel flower
carmine spoke
#

i wonder how long "paused" means

slim dome
#

How use vertex ?

novel flower
pallid wren
midnight venture
#

I still don’t understand why vertex lives in a realm of its own when it comes to models and limits
They should unify ai studio and vertex into a single product

#

Or atleast the AI side of stuff

slim dome
#

How get an api key i dont understand... @midnight venture

midnight venture
# slim dome How get an api key i dont understand... @midnight venture
2. Open the Google Cloud Console.
3. Create a new Google Cloud project.
4. Enable billing for your newly created Google Cloud project.
5. Enable the Vertex AI API.
6. Enable the Gemini API from the API overview page.
7. In your project dashboard, navigate to APIs & Services → Credentials.
8. Click "Create Credentials" → "API Key".
9. Copy the generated API key and save it securely
potent coral
midnight venture
#

Took this from Reddit, idk how accurate it is
I think there’s some extra steps involving service accounts but I’m not sure tbh

midnight venture
potent coral
#

Rate limit and capacities.

Then there are also things you need to check and figure out by yourself; these are the most important differences

midnight venture
#

Idk about what the check and figure out part is

slim dome
#

Thanks, but i saw a lot of people complaining about Google because their invoice jumped after activating billing

midnight venture
#

I’ve also been using gcloud for like almost 10 years so maybe it’s different for me
I’m still a noob at it but I got a lot of the basics over a long time ago

slim dome
#

How could u set a $1 limit? it's very hard to see all the options, vertex has a lot of things we can do

plush bridge
midnight venture
plush bridge
#

I had no idea Claude models were available on Vertex AI. TIL.

#

So now you can have a first-party model served by another first-party provider that didn't develop the model. 🤯

midnight venture
novel flower
copper pilot
#

👀 (cropped)

novel flower
#

o.o

#

👀

copper pilot
carmine spoke
#

whats the most similar model to 2.5 while its down? thats also free, i tried a few but they all seem kinda worse

novel flower
#

that's a good question

ancient burrow
restive ridge
#

I think Claude 3.7 is still extremely popular but I have never tried it.

novel flower
abstract plover
novel flower
abstract plover
#

comes close, and I mean like 60% of that quality

abstract plover
novel flower
#

yeah he asked for free

abstract plover
#

ahh nvm

autumn agate
restive locust
autumn agate
abstract plover
#

The gemini reasonings are worthless

#

I suspect they arent even summaries of the reasoning

dim ibex
#

it seems the Gemini 2.5 preview is now becoming so much better compared to last week.

  • very smart in coding task (Gemini pro 2.5 vs Claude is like : Gundam vs robocop (claude) )
  • Fast ( sometimes it uses reasoning delta, sometimes not)
  • Cache is getting better, but there's still a room for big improvement, still not consistent for atleast 5 minutes, most of the time its only 2-3 minutes

i hope they make the cache system to renew if its actively being used (like claude)
in terms of efficiency, Claude is still probably cheaper for long-running tasks.

kudos to google team.

abstract plover
#

yes I agree

boreal island
#

But yeah 3.7 can't cut it atm

abstract plover
#

even grok is dropping a model or two

abstract plover
#

insane rate limits on this model these days

graceful robin
#

there's a banner on the model page explaining

carmine spoke
#

isnt that saying just use the paid version the free is dead now?

graceful robin
#

it's not an ex model yet, it has not ceased to be

carmine spoke
#

the banner does say to use the paid version and twitter does have it paused

dim ibex
boreal island
#

Think Opus tier

graceful robin
midnight venture
carmine spoke
umbral bane
#

Does 2.5 pro support tool calling now?

true token
kind condor
distant shell
#

But I think that's more a combination of the software and Claude.

novel flower
#

thanks sir chapel @distant shell

distant shell
#

it does help I paid for Claude max, and as such feel like I don't have to worry about nickel-and-diming myself

#

I've had 3 going at the same time, work + two personal projects

#

just setup a task list and let them go

#

as long as you're okay picking up the pieces if shit hits the fan

plush bridge
distant shell
# plush bridge Curious for what's max duration can Claude Code go on autonomously and continuou...

Umm, well that kind of depends on how you want to define it. On time, I've let it go for a while (like over 30 minutes) but I can't say to the clock how long it was running because I was doing other things and it may have completed before then. The other aspect is sometimes their AI is congested and slower, so it isn't always 100% active even if still working. But if told to keep working, with a clear todo list and actions to complete, and in general enough user instructions about when it should touch base, it will do quite a bit for better or worse on its own. I've queued up a job from 0 with a fleshed out multi document design foundation, had it make a task list and go. It did about 30% of the total on its own by the time I came back in the morning. Mind you that 30% was the bulk of making something useful, well beyond what other tools have done in one shot. It was also not a frontend/web app.

#

There's no coded limit to what it can do for how long though like cursor and others. It will churn through tokens, and with the max plan, they have their own tracking so that happens at the API layer and not on the client. It's actually quite nice to get full claude capacity, versus other tools that restrict it, which is probably why it feels so dumb.

plush bridge
#

I'm trying to figure out where Claude Code fits in to in the spectrum of Devin to Cursor.

distant shell
#

its more a partner than a tool on that front

#

but not fully autonomous, though tbh not that far from it, with some type of meta layer to drive it

#

I've thought about creating a wrapper that uses gemini or something, and have it think about the big picture and meta stuff, and when claude comes back it does checks and verifies things and then sends claude back out

#

two cool things about claude code, built in todos (I've been leveraging it to put things in even for myself) and it can batch tool calls, so if it decides it needs to do a bunch of things at once, it can set them up to go, that includes changes and I think it can trigger sub agents, there's been at least one time where it batched a bunch of things, then went "I need to edit this file... looks at file.... oh I guess my batch task already did it..." heh I was like wat?

#

note I don't have many mcp servers, I have tried context7, it can be useful, but claude can search the web too and that can be just as useful in some cases, the postgres server is useful for having them discover data and schemas themselves when writing code that touches it

#

I barely give context explicitly (including a file). I almost always just refer to it by name or vague naming, or sometimes just describe what I think it is, much like I would with a coworker, and let it find it itself. If I were paying for the api calls, I might be a bit more conservative, but I'm gonna make that $100 worth it

novel flower
plucky gazelle
#

Quick question here... I tried using google vertex in openrouter but it just doesn't work. Can someone help?

sick latch
#

How do I use latest gemini-2.5 pro weights through openrouter? Coz I think the gemini 2.5 pro listed on openrouter points to gemini-2.5-pro-preview-03-25 and not to gemini-2.5-pro-preview-05-06.

restive locust
#

our version points to the latest checkpoint. there is no way to hit the march checkpoint for the preview model

#

(this is a google limitation)

sick latch
#

ohh, great. Thanks for correcting me. @restive locust

restive locust
#

no worries!

sick latch
#

@restive locust Do you find this most recent Gemini difficult to control, or is it just me? I've invested a lot of effort in carefully guiding it to produce the results in the format I want. Gemini has bothered me more than any other model.

restive locust
#

I definitely think the newer reasoning models are worse at instruction following, yeah

#

I have seen people use gemini to plan / architect, and GPT-4.1 or Sonnet 3.5 to implement

wheat quest
#

AI Studio has just rolled out batch requests for the Gemini 2.5 and 2.0 series of models

void elm
#

j

restive locust
#

2.5 Pro Experimental is officially deprecated

novel flower
#

you will be missed

merry geode
#

I probably used hundreds of dollars worth of tokens

restive locust
midnight venture
#

Long live exp

#

Good model

novel flower
#

good model

#

maybe we get to enjoy another good model in the future

plush bridge
#

Looking forward to the next experimental / stealth model!

boreal island
#
novel flower
void elm
#

cause gemini is not LOTS OF MONEY..... ill start more video games now...
and enjoy me unemployment payment

#

and anime

#

until next vibecoding free sota model

midnight venture
novel flower
void elm
novel flower
void elm
#

ben stop it

copper pilot
#

Whoaaaa I'm watching it stream thought summaries as a single bolded header plus paragraph every 3 seconds, meaning they're processing their thoughts near real time. There's 35 headers in this one for a total of 14.8k output.

mystic zodiac
#

Hi !

#

Is there any way to make gemini-2.5-pro-preview on OR also give its reasoning tokens ?

graceful robin
novel flower
#

any similar model to gemini 2.5 03 25 for coding on openai or any other?

flint lion
#

So what are the odds we'll see a 3.0 Gemini pro experimental sometime within the month?

flint lion
plush bridge
#

If OpenAI, Claude and DeepSeek are good examples, it won't be 3.0, but a new checkpoint for 2.5.

#

gemini-2.5-pro-05-20 or something. GA version, not preview or experimental.

#

RIP experimental mentions and references are completely gone in the docs.

celest idol
#

nooo

graceful robin
#

Haven't scrubbed it everywhere 😅

plush bridge
wheat quest
#

gemini-2.5-pro-deepthink POGGERS

restive locust
wheat quest
#

no free tier though

pallid wren
#

gemini-2.5-ultra-5-20

novel flower
#

wow

#

sidkcmodel

boreal island
abstract plover
#

gemini-2.5-ultra-pro-max-6-09

plush bridge
#

lol the new Google AI Studio usage tab is classifying 2.5 Pro Preview as 2.5 Pro Exp

wheat quest
plush bridge
#

must be a nightmare to get the GenAI APIs working with GCP infra

wheat quest
#

Looks like Gemini is getting a urlContext built in tool that can fetch the contents of URLs to feed into the model.

#

the built in googleSearch tool is also getting the ability to specify a time range of results to search

plush bridge
#

Damn. So many startups killed again.

#

Might need to think extra hard what AI to build now.

wheat quest
#
  • the built in search tool also now allows specifying a lat/lon location to geolocate searches
  • the Google SDKs are getting MCP support
  • you will be able to set your own video FPS to sample videos at instead of the fixed 1FPS
  • live API is getting multi speaker support
#

urlContext built in tool is now live.

{"contents":[{"role":"user","parts":[{"text":"Hi there! What are the headlines on https://bbc.com?"}]}],"generationConfig":{"thinkingConfig":{"includeThoughts":true},"temperature":0,"seed":0},"tools":{"urlContext":{}}}
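
For scripting, a minimal Python sketch that only rebuilds the request body shown above. The field names are copied from that message as-is; the helper name `build_url_context_request` is made up for illustration, and no endpoint, auth or SDK call is shown or assumed.

```python
import json


def build_url_context_request(prompt: str, include_thoughts: bool = True) -> dict:
    """Rebuild the JSON body from the message above (hypothetical helper)."""
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"includeThoughts": include_thoughts},
            "temperature": 0,
            "seed": 0,
        },
        # Copied verbatim from the message; an empty object switches the tool on.
        "tools": {"urlContext": {}},
    }


body = build_url_context_request(
    "Hi there! What are the headlines on https://bbc.com?"
)
payload = json.dumps(body)  # string to send as the request body
```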
languid crest
#

Gemini has the most confusing names, everything else I can follow just fine

mellow turret
#

Basically

Model + family number + fluff + date

Model is going to be Gemini
The currently relevant family numbers are 2.0 and 2.5
Some fluff we've seen:

  • exp: Experimental, free models. Huge no for production use (Google prohibits that in their terms), very few guarantees
  • preview: Slightly less experimental, paid models. Still not fit for production use, though at least google doesn't straight up prohibit it
  • thinking: Used in the 2.0 family, as these were not hybrid models. There's a separate 2.0 thinking model, unlike 2.5 Flash
  • flash: Fast
  • pro: Slower but better
  • lite: Cheaper and worse than Flash

And then the date is given in format mm-dd. In actual production releases, they may use incremental numbers instead of dates (like they e.g. did with Gemini 1.5 Pro and Gemini 1.5 Pro 002)
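
The scheme above can be sketched as a small parser. This is only an illustration of the "Model + family number + fluff + date" breakdown: the function name and the returned keys are made up, and real model ids may not always follow the pattern.

```python
import re

TIERS = {"flash", "pro", "lite"}          # speed/quality tier words
FLUFF = {"exp", "preview", "thinking"}    # release-stage / variant words


def parse_gemini_name(name: str) -> dict:
    """Split an id like 'gemini-2.5-pro-preview-05-06' into its parts."""
    match = re.fullmatch(r"gemini-(\d+\.\d+)-(.+)", name)
    if match is None:
        raise ValueError(f"not a recognised Gemini model id: {name!r}")
    family, rest = match.groups()
    tokens = rest.split("-")

    date = None
    revision = None
    if len(tokens) >= 2 and all(re.fullmatch(r"\d{2}", t) for t in tokens[-2:]):
        date = "-".join(tokens[-2:])      # trailing mm-dd date
        tokens = tokens[:-2]
    elif tokens and re.fullmatch(r"\d{3}", tokens[-1]):
        revision = tokens[-1]             # incremental number such as 002
        tokens = tokens[:-1]

    return {
        "family": family,
        "tiers": [t for t in tokens if t in TIERS],
        "fluff": [t for t in tokens if t in FLUFF],
        "other": [t for t in tokens if t not in (TIERS | FLUFF)],
        "date": date,
        "revision": revision,
    }
```

For example, `parse_gemini_name("gemini-2.5-pro-preview-05-06")` gives family `2.5`, tier `pro`, fluff `preview` and date `05-06`.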

novel flower
#

thanks kyle

slow sage
#

thanks kyle

open pond
#

thanks kyle

wheat quest
#

Gemini 2.5 Pro is getting audio output today at $20/million audio tokens

abstract plover
#

Damn

plush bridge
#

Damn

elder rain
#

Damn

kindred matrix
#

🥲

fringe rapids
#

Damn

restive locust
#

Damn

digital warren
#

They have removed raw thoughts from aistudio, replacing it with summaries only. This is a major bummer 😦

boreal island
#

They don't want people stealing reasoning via training keka

digital warren
#

but I didn't train on it, I just liked reading it, it was valuable content.

spark obsidian
#

That is a bummer.

boreal island
#

Unfortunately they called it a feature earlier in I/O

#

In other news flash 2.5 05-20 better than 2.5 Pro for RP keka

spark obsidian
dry ingot
#

no thinking budget for 2.5 pro?

wheat quest
#

2.5 Pro thinking budget is coming in June, closer to when it goes GA

restive ridge
#

Ah they pushed it back a month

#

Probably sucks then lol

solemn vigil
#

anyone else find that the new pro preview absolutely sucks compared to the march version ? Its such a frustrating model to use now

potent coral
#

Google has made the model dumber by making it tamer

novel flower
#

like @potent coral said enshittification

potent coral
#

It's actually quite funny that the older version was the smarter one. It was able to see the bad in good and the good in bad, didn't totally reject a concept when you argued with it, and gave a good argument when it did reject something, while still being able to see and understand differing views and possibilities.

But the new one just rejects it without even providing a good argument.

novel flower
#

so now gemini 2.5 flash thinking is good or o4 mini

open pond
#

they are def pushing more compute into the new shiny model for a few weeks (flash)

#

so keep that in mind :p although 2.5 pro > flash generally no matter wat imo

novel flower
#

yeah for now 2.5 flash very good, in a few weeks it will get the same treatment and become useless again

#

2.5 pro 03 25 was so fucking good and so fast, i fucking miss it 😭

#

oh well when 2.5 flash goes to shit ill go with either claude 3.7 thinking or o4 mini/o3 if i still have free tokens

abstract plover
#

damn this new model is dogshit

#

the new model is as smart as a rock

#

thought 120 seconds for a basic python task.

fringe rapids
#

Yep, same experience here

novel flower
solemn vigil
#

I'm glad im not the only one finding this. my twitter feed was uncharacteristically quiet on the matter but seems a common experience here & on /bard reddit

solemn vigil
novel flower
dry ingot
#

still no thinking budget

wheat quest
#

Google's going down the Anthropic route of providing signed thoughts to be able to reuse thoughts in subsequent requests.

ionic solar
#

Thanks !

abstract plover
#

Token speed dropped from 400 to ~100 now, which is an artificial limit. Summaries have much more BS and model is a bit dumber.

pallid wren
ionic solar
hexed rapids
#

RP/ERP has also worsened compared to EXP-03-25 (now with very long contexts it suffers from repetitions).
It seemed strange to me that Google was getting them all right!

novel flower
dry ingot
#

they keep ruining good shit

novel flower
#

Hehehe

dry ingot
novel flower
dry ingot
#

for gemini 2.5 pro

novel flower
#

not quite sure

restive locust
#

they said june

indigo jasper
#

I'm curious about what the raw # of calls looks like for this

#

assuming it stayed pretty constant, this chart seems like great evidence of how 2.5 pro has become such a yapper 😂

graceful robin
#

interesting how the gemini webapp after deepresearch will offer to generate an infographic - this was one: https://www.jdoodle.com/ih/1HBq
and the prompt is something like this

  • Tailwind CSS and Chart.js loaded via CDN.
  • The "Brilliant Blues" color palette applied throughout.
  • Responsive design with a grid layout for content sections.
  • Chart.js visualizations for Context Window Comparison, Architectural Pillars (Doughnut), and MRCR Benchmark Performance. These charts include the required label wrapping for labels longer than 16 characters and the specified tooltip configuration.
  • Chart containers are styled according to the requirements (full width of parent, max-width, centered, controlled responsive height).
  • HTML/CSS diagrams for the "Thinking Model" paradigm and "Context Caching" process, avoiding SVG and Mermaid JS.
  • Content derived from the "Gemini 2.5 Long Context Excellence" report, with introductory paragraphs for each section and explanatory text for all visualizations.
  • No SVG or Mermaid JS has been used.
  • The output starts with <!DOCTYPE html> and ends with </html>, with no extraneous characters or comments (the planning comments present in the <style> block during generation are not functional HTML/CSS/JS comments and are for context; they wouldn't appear in a rendered page's comment section and are within the rules provided).
abstract plover
abstract plover
graceful robin
dry ingot
#

gemini 2.5 pro degraded alot in performance

abstract plover
#

and sadly its still the best

celest idol
#

aider benchmarks have been retried

#

-10%

fringe rapids
#

jeez

#

why do companies love to do that

#

they didn't have any issues with capacity

#

i was really betting on google...

midnight venture
# fringe rapids why do companies love to do that

The only theory which makes sense to me is that Google realised they hit a ceiling on Gemini improvement and quickly retired the insanely good experimental checkpoint in favour of a lighter counterpart
Next release will be an improved exp checkpoint, so people will feel the exp rush all over again as it will easily crush all other competitors and show a "massive" improvement over previous versions

#

When in reality its just a better exp checkpoint which was retired early

fringe rapids
#

Maybe they are nerfing the pro model for their upcoming ultra model

#

So the difference is larger

#

Maybe they made the old pro the new ultra xD

midnight venture
abstract plover
midnight venture
#

google has been doing a lot of work towards quantisation and training, but thats a separate topic and requires training from scratch
On top of that, your flagship model shouldnt ideally be quantised, especially if you're google

indigo jasper
celest idol
#

on discord

indigo jasper
celest idol
#

other people

#

but when i run it

#

i also get similar results

indigo jasper
#

I wouldn't trust it, people get very varied results on aider benchmarks that Paul doesn't get

#

up to and exceeding 10% often

#

(I don't know why it varies, but the same thing happened with GPT 4.1 / Quasar, people reporting very different results than what he got)

restive ridge
#

I ran the bad run, it was just 1 run, we would need more runs to draw conclusions.

steady kite
#

anyone having problems with json_schema structured output on gemini? somehow if I use @google/genai directly against aistudio, the JSON response is correct; but using the openai sdk via openrouter, the structure gets messed up (especially with literals)
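
To make the comparison concrete, here is a small sketch of an OpenAI-style `response_format` request body with a literal (enum) field, plus a check for literal values that came back wrong. The schema, its field names and both helper names are invented for illustration; the model id is the one quoted elsewhere in this channel, and no request is actually sent.

```python
import json

# Hypothetical schema with one literal-valued field.
SCHEMA = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "neutral", "negative"]},
        "score": {"type": "number"},
    },
    "required": ["sentiment", "score"],
    "additionalProperties": False,
}


def build_request(prompt: str) -> dict:
    """Chat-completions style body asking for schema-constrained JSON."""
    return {
        "model": "google/gemini-2.5-pro-preview",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "review", "strict": True, "schema": SCHEMA},
        },
    }


def literal_violations(raw_json: str, schema: dict) -> list:
    """Return the top-level fields whose value is not one of the allowed literals."""
    data = json.loads(raw_json)
    bad = []
    for field, spec in schema["properties"].items():
        allowed = spec.get("enum")
        if allowed is not None and data.get(field) not in allowed:
            bad.append(field)
    return bad
```

Running the same prompt through both routes and diffing `literal_violations` on each response would show whether the literals are what breaks.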

lyric owl
#

any way to get gemini 2.5 pro free on open router like before?

novel flower
lyric owl
#

Is the current version not free? What changed

ancient burrow
lyric owl
true token
#

The google gemini (2.5 pro) API is very weird sometimes. One complex prompt, takes almost 2 minutes to complete, gives a very very high quality response. Shortly after, I give it another even harder prompt. Instantly, almost real-time, it replies with a very high quality response. lmao

fringe rapids
abstract plover
true token
#

Yeah... Maybe the tokens per second went BRRR suddenly

royal ocean
#

The thinking is rough after the update, it used almost 14k tokens one time

restive ridge
#

RE: thinking, I tried flash without thinking and was getting some weird behavior. I will keep playing with it

#

Excited for june release of pro thinking budgets

wheat quest
#

another checkpoint in a few days notlikethis

abstract plover
#

Deathmax is the source

#

he is damn good at this

abstract plover
#

Pro suddenly aint giving reasoning summaries?

wheat quest
# plush bridge Still no GA?

Signs point to 2.5 Flash going GA with the current 05-20 checkpoint, but we're getting another preview model for Pro before GA

indigo jasper
#

😄

indigo jasper
wheat quest
sturdy ether
#

at least secondary sources align

novel flower
ancient burrow
novel flower
celest idol
restive ridge
potent coral
# ancient burrow

Seems even people outside this community realise how bad a downgrade the new 2.5 pro is in terms of knowledge and understanding outside the coding domain.

visual loom
#

Makes you wonder if they didn't even do an AB test and instead only looked at benchmarks

solemn seal
#

or did they purposefully degrade it so deepseek-like models cannot use its data?

#

🥸

#

🤷

tacit ingot
runic ibex
#

Honestly every model release has been weird recently, none of them just a straightforward upgrade.

#

R1 drops on EQBench's creative writing. Then beats the original on long context until 64k where it drops horribly? Maybe a fluke? And that's probably the most uncontested pure upgrade

runic ibex
#

The 2.5 Pro upgrade seems to flipflop on like, everything

novel flower
#

you mean 05 06?

runic ibex
#

Yeah

novel flower
#

ah, just wait for the new endpoint in some weeks

#

according to gosucoder, 05 06 performs a bit better on cline

#

i sent you a video @runic ibex

runic ibex
#

I saw his other video on the new R1 but I'll check it out. Trying out windsurf rn, free so why not. Already used cursor and cline

visual loom
slim turret
#

Hi

runic ibex
sturdy ether
potent coral
sturdy ether
#

some would say

runic ibex
#

Their servers were pretty badly under load a while ago, so could be legit

#

It went up on the UGI knowledge benchmark though, and that's usually positively correlated with model size. So who the hell knows

boreal island
boreal island
# ancient burrow

I hope so but with the kind of rugpull google did with 03-25 people aren't going to trust them until it is stable for what? At least 3 months? keka

sturdy ether
#

general availability

plush bridge
# boreal island What's GA?

General availability. A term that cloud companies use to signal that the product is out of beta and can be used for production workload with SLAs and proper support.

slender ginkgo
#

in other words, "give us your money now if you weren't already"

#

but also they know you were already

royal ocean
#

Where did the thinking summaries go 😑

pulsar quest
#

anyone having horrible gemini hallucinations today

#

what the heck all gemini models, especially this one just having a bad time

indigo jasper
pulsar quest
#

who is deathmax

indigo jasper
#

new 2.5 pro should be released tmrw

#

or very soon if not tomorrow

#

but leaks and semi-public info suggest tmrw

pulsar quest
#

do we know why like

#

a bunch of the gemini models today in general

#

have kind of been tweaking

indigo jasper
#

nope

pulsar quest
#

welp

#

let's hope things stabilize

#

._.

indigo jasper
tranquil drift
wheat quest
#

If plans don't change, we'll get a new checkpoint in a few hours.

midnight venture
#

wen

wheat quest
#

Looks like model is landing on Thursday instead, with thinking budget support

#

Thinking budget for 2.5 Pro will be disabled or 64-32768

abstract plover
#

also , 2.5 flash GA ? Hope they release the model on batch api

#

batch api is still stuck with 2.0 flash 001

wheat quest
ancient burrow
#

Found on reddit post

tacit ingot
#

Hmm

wheat quest
#

And from my teaser on another server

midnight venture
ancient burrow
#

What is diff-fenced

abstract plover
indigo jasper
# wheat quest

wait a sec so despite not being an insider, you got it - the api for new gemini is exposed to the public? 😂

wheat quest
#

👀

indigo jasper
#

me rn

abstract plover
restive locust
indigo jasper
#

yeah

abstract plover
indigo jasper
#

I've seen independent confirmation of the same as deathmax in another server from an insider

restive locust
#

deathmax is not an insider haha

indigo jasper
indigo jasper
abstract plover
#

wait deathmax might be an insider here

indigo jasper
#

just saying it confirms that deathmax is the goat :)

wheat quest
#

Shrug they turned it off

indigo jasper
#

like

#

just now??

abstract plover
#

calling it fake , puts on deathmax

indigo jasper
#

noooo

#

they're watching this chat

#

😭

abstract plover
#

I bet 100$ toven leaked it

wheat quest
#

window wasn't open that long

indigo jasper
#

do you just have scripts going monitoring this kind of thing

#

like how plinny gets the system prompt changes

#

so... is this more or less completion_tokens than 05-06

#

because the cost is more

#

and I'm concerned that it's gonna still be a slow loser

#

Seconds per case : 45.3
gemini 2.5 pro 03-25...

#

Seconds per case : 165.3
05-06...

#

so only a slight improvement over current 05-06

#

that's quite sad

abstract plover
#

well this model sucked at everything but coding, so I assume the next model is going to be better at the rest of the tasks with slight coding degradation?

wheat quest
#

I wouldn't read too much into the test time

indigo jasper
indigo jasper
wheat quest
#

throughput would have been jank given the situation

indigo jasper
#

fair

#

rip no token count at this point

#

it is interesting that the total_cost went up assuming costs are accurate... are they raising prices?

novel flower
#

raising prices?

#

oh no no no

true token
#

You will pay for it

#

And you will enjoy it

true token
#

🤑

ancient burrow
#

disnejoy it so much

true token
ancient burrow
#

cuz like, why use api if i got a model right there ready to answer, that i already paid for

abstract plover
#

chatgpt subs is a good vfm

ancient burrow
#

but even if i did mainly use api i'd be pissed to find out there is a price markup for no reason

novel flower
#

chatgpt plus that good huh?

abstract plover
# ancient burrow but even if i did mainly use api i'd be pissed to find out there is a price mark...

Why do reasoning models cost more than non-reasoning ones even though they have the same architecture? This video provides a great explanation!

I am seeing a lot of people confused about why reasoning models cost more than their non-reasoning counterparts even if they share exactly the same architecture. It has everything to do with the fact th...

#

for the Nth time, there is no such thing as a thinking tax

ancient burrow
#

does it go on about token count?

#

i'm talking about price per token

#

if it's about context length, it looks like they could just implement context length-specific pricing, like they've already done with 2.5 pro

#

otherwise i could rack up a lot of context on 2.5 flash non-thinking, and have it cost them just the same, but for some reason they'd be charging me less

#

it just smells like they have it cost more only because the user gets better results, and nothing else.
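
To put numbers on the "context length-specific pricing" idea, a minimal sketch of a two-tier cost function where thinking tokens are billed as ordinary output. The rates and the 200k threshold below are illustrative assumptions, not figures quoted in this thread; swap in the current price list before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    input_per_m: float    # USD per million prompt tokens
    output_per_m: float   # USD per million output tokens (answer + thinking)


# Assumed values for illustration only.
THRESHOLD = 200_000
SHORT = Tier(input_per_m=1.25, output_per_m=10.00)
LONG = Tier(input_per_m=2.50, output_per_m=15.00)


def request_cost(prompt_tokens: int, answer_tokens: int,
                 thinking_tokens: int = 0) -> float:
    """Cost of one request; the tier is picked by prompt length alone."""
    tier = LONG if prompt_tokens > THRESHOLD else SHORT
    billed_output = answer_tokens + thinking_tokens
    return (prompt_tokens * tier.input_per_m
            + billed_output * tier.output_per_m) / 1_000_000
```

With these assumed rates, the per-token price only changes with prompt length; thinking raises the bill by adding output tokens, not by changing the rate.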

true token
#

Depending on your use case, (and the model)

#

The thinking is very much worth it

abstract plover
ancient burrow
#

I think I answered all the points on the slide on the video

indigo jasper
#

Deepseek has somewhat disproved this

#

with R1

#

and them releasing their figures on it

ancient burrow
#

@wheat quest omg ur famous

#

Wasnt me who posted

novel flower
#

famous deathmax

celest idol
#

my prediction

#

its overfitted slop

ancient burrow
celest idol
#

but overfitted

ancient burrow
#

Especially when they have the edge

celest idol
#

i mean i think r1 has the edge imo

#

but eh

#

btw evidence shows deepseek distilled from gemini lol

#

i saw an article proving it

ancient burrow
#

2.5 pro has completed a couple coding tasks I gave it that no model that i tested before it was able to

#

One of them was figuring out which elements in a pygame app overlapped and fixing the UI

celest idol
#

i mean for me i like switching models

#

sometimes o3 or o4-mini can solve a problem r1 cant

#

sometimes 2.5 pro is better

#

and sometimes sonnet 4 takes the win

#

but in general ive been using r1 the most

foggy flax
#

86.2%

#

that's not even their upcoming deep think

plush bridge
#

Aider at this point is probably leaked, reward hacked, overfitted and outdated for agentic flows.

#

Probably need an aider benchmark V3 for it to become useful again.

ancient burrow
indigo jasper
#

I've stopped using 2.5 pro ENTIRELY recently

#

the fact that it takes 3+ minutes on many tasks

#

is insane

#

even if it gets way better in the next update, if it's not a lot faster, I'm not sure I'll use it!

midnight venture
novel flower
novel flower
indigo jasper
novel flower
celest idol
#

yes

celest idol
#

not leaked

plush bridge
celest idol
#

ah

#

ngl we need someone trusted to make a closed source benchmark

plush bridge
#

ARC-AGI being one

ionic solar
true token
#

I switch between o3, Gemini pro 2.5, r1 and sometimes sonnet 4

#

It all depends

#

Pro 2.5 and sonnet 4 explain code better on average

plush bridge
#

lol google messed up (or genius marketing). even i can see it.

sleek cave
#

King fall has only 64k context weird…

midnight venture
plush bridge
#

not working for me lol

midnight venture
#

64k is still a lot tbh

plush bridge
#

i think just some intern messed up probably, not the actual new 2.5 pro model

#

and it's gone!

true token
#

gone

kind condor
#

nah why would they label CONFIDENTIAL in a publicly available service lmao

#

messed up hard or genius marketing

plush bridge
#

cheap way to generate hype and get attention lol

true token
#

should have been named Kingfall YOLO 360 noscope GPT Killer x

#

to scare people even more

kind condor
#

sam altman or elon musk would

fringe rapids
celest idol
fringe rapids
celest idol
#

oh

celest idol
#

i mean if arc was open source llms prob wouldve gotten like 50%

spark obsidian
celest idol
#

gemma doesnt support structured output, code execution, etc

#

unless its gemma 4?

#

but it seems a bit early

shrewd plaza
#

Google has launched exp models with 32-64k context lengths before.

raven fractal
#

I dont know if this was here before but seems like ai studio now has framerate and time options for video attachments

runic ibex
#

Google has been cooking on absolutely everything except 05-06 so I'm expecting good things.

pallid wren
#

So today is the expected new model?

solemn seal
#

Well Logan K didn't tweet "Gemini" yet

#

Which he usually does before releasing something

solemn seal
#

What!

#

I just checked his account and didn't see that 😠

#

X is broken 🚬

solemn seal
dry ingot
#

so where is this new model at

mortal solstice
#

not yet

wheat quest
#

gemini-2.5-pro-preview-06-05 is now rolling out.

tacit ingot
restive locust
mellow turret
#

👀

heavy aspen
kind condor
#

when will it NOT be a preview version?

restive locust
#

updated model coming out in ~5mins

#

AI Studio first, vertex up next

copper pilot
#

05-06 and 06-05 is great naming, guys

lyric pilot
hardy osprey
near ore
#

hey

#

google/gemini-2.5-pro-preview

#

this was unavailable for 5 mins

restive locust
#

yeah sorry. back up now

#

it's the new endpoint now

hardy osprey
open pond
#

knowing google this is gonna be op for 4 days

#

then go to shit

#

unfortunately

near ore
#

@restive locust did the model get an update?

#

or just endpoint refresh

hardy osprey
#

I tried passing "'extra_json': {'reasoning': {'max_tokens': 0}}", but got a 400 error.

{'error': {'message': 'Provider returned error', 'code': 400, 'metadata': {'raw': '{\n "error": {\n "code": 400,\n "message": "The thinking budget (0) is invalid.",\n "status": "INVALID_ARGUMENT"\n }\n}\n', 'provider_name': 'Google AI Studio'}}

#

But Google says the new pro model already supports budget control.

restive locust
#

you can't set thinking budget 0

#

minimum 128

open pond
hardy osprey
#

Alright, I've confirmed this from Google's doc. Thank you.
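
A small sketch of guarding against that 400 before sending, assuming the limits mentioned in this channel: a minimum of 128, and 32768 as the upper bound from the earlier pre-release message (the upper bound is not confirmed here). The helper names are made up, and only the `reasoning.max_tokens` field from the failing request is reproduced; how it gets passed through an SDK is left out.

```python
MIN_THINKING_BUDGET = 128      # from the reply above
MAX_THINKING_BUDGET = 32768    # assumed from the earlier message in this channel


def clamp_thinking_budget(requested: int) -> int:
    """Pull a requested budget into the accepted range instead of failing."""
    return max(MIN_THINKING_BUDGET, min(MAX_THINKING_BUDGET, requested))


def build_request(prompt: str, thinking_budget: int) -> dict:
    """Request body with a reasoning budget that should not trip the 400."""
    return {
        "model": "google/gemini-2.5-pro-preview",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning": {"max_tokens": clamp_thinking_budget(thinking_budget)},
    }
```

Asking for 0 this way silently becomes 128, which is the closest thing to "thinking off" that the model accepts.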

kind condor
#

better? worse? can't be worse right?

tacit ingot
plush bridge
dry ingot
copper pilot
#

Minimum thinking: "Alright, the user wants me to [whatever user just input]." -> [begin response] 🙄

steady pelican
#

so we can only get one version of gemini 2.5 pro via openrouter, it always points to latest?