Gemini 2.5 Pro | OpenRouter | Page 2

midnight venture May 7, 2025, 8:26 PM

#

You can’t turn off thinking

open pond May 7, 2025, 9:42 PM

#

ya

#

that is just how it goes

true token May 7, 2025, 9:49 PM

#

The new Gemini 2.5 is still very good if not better for my use cases.

abstract plover May 7, 2025, 10:06 PM

#

I see people rarely talk about the "hidden" costs of running reasoning models

#

I mean sure, it's 10/mtoken, which is cheaper than sonnet, but sonnet doesn't think.

crimson moon May 7, 2025, 10:48 PM

#

It's always been like this. It's super annoying.

abstract plover May 7, 2025, 11:33 PM

#

damn this new bastard thinks a lot

potent coral May 8, 2025, 12:21 AM

#

Crazy drop 😂

#

Thinks longer than the previous model, but dumber if it's not a coding task

copper pilot May 8, 2025, 3:19 AM

#

Just noticed the new 2.5 Pro no longer inserts hidden reasoning when prefilling, which was wonky before.

#

Understood, here's the summary:
Before: 1k reasoning, 1k response (doubling the output cost)
After: 1k response (faster vs no prefill too, so it's not just not reporting it)

plush bridge May 8, 2025, 3:33 AM

#

abstract plover I see people rarely talk about the "hidden" costs of running reasoning models

I've been thinking about this a lot. Maybe the providers should not charge for thinking tokens if they are not exposed. The non-exposed nature makes counting and tracking usage very tricky for both developers and users.

sturdy ether May 8, 2025, 3:35 AM

#

i try to account for this in my personal project lmb's price scatter chart by using dubesor's reasoning token usage data as a multiplier for the output cost
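A minimal sketch of the multiplier idea, assuming reasoning tokens are billed at the output rate and that you have a measured ratio of reasoning tokens to visible response tokens. The interface, the function name and both sets of numbers are hypothetical; this is not the chart's actual code.

```typescript
// Hypothetical pricing record; the numbers below are illustrative, not real price data.
interface ModelPricing {
  outputPricePerMTok: number; // USD per million output tokens
  reasoningRatio: number;     // average reasoning tokens per visible response token
}

// Effective price per million *visible* output tokens when hidden reasoning
// tokens are billed as output tokens.
function effectiveOutputPrice(p: ModelPricing): number {
  return p.outputPricePerMTok * (1 + p.reasoningRatio);
}

// 1k reasoning for every 1k of response doubles the effective output cost.
const reasoningModel: ModelPricing = { outputPricePerMTok: 10, reasoningRatio: 1 };
const plainModel: ModelPricing = { outputPricePerMTok: 15, reasoningRatio: 0 };

console.log(effectiveOutputPrice(reasoningModel)); // 20
console.log(effectiveOutputPrice(plainModel));     // 15
```

With these made-up numbers, the nominally cheaper reasoning model ends up costing more per visible token than the non-reasoning one.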

abstract plover May 8, 2025, 3:36 AM

#

plush bridge I've been thinking about this a lot. Maybe the providers should not charge for t...

I mean it's expensive for them to serve a reasoning model and competitors WILL distill their model

#

still you got a point there.

#

streaming a summarized COT seems the only way to combat this

plush bridge May 8, 2025, 3:39 AM

#

On the UX side, maybe the API can return something like: "<think_stats>Thought for 2.2 seconds using 2548 tokens.</think_stats>" as part of streaming response.
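If an API did emit such a tag, a client could pull the numbers out with a small parser. This is only a sketch: the `<think_stats>` tag is the proposal from the message above, not something any provider is known to return, and `parseThinkStats` is a hypothetical helper.

```typescript
interface ThinkStats {
  seconds: number; // wall-clock thinking time
  tokens: number;  // hidden reasoning tokens billed
}

// Extracts the stats from a streamed chunk, or returns null when the
// (hypothetical) tag is absent.
function parseThinkStats(chunk: string): ThinkStats | null {
  const pattern = /<think_stats>Thought for ([\d.]+) seconds using (\d+) tokens\.<\/think_stats>/;
  const match = pattern.exec(chunk);
  if (match === null) {
    return null;
  }
  return { seconds: Number(match[1]), tokens: Number(match[2]) };
}

const sampleChunk = "<think_stats>Thought for 2.2 seconds using 2548 tokens.</think_stats>";
console.log(parseThinkStats(sampleChunk)); // { seconds: 2.2, tokens: 2548 }
```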

abstract plover May 8, 2025, 3:47 AM

#

3.5 is still the best model imo

plush bridge May 8, 2025, 3:51 AM

#

abstract plover 3.5 is still the best model imo

You are not alone

#

On my personal evals, Gemini 2.5 Pro is behind GPT-4.1 and Claude 3.5 #1354107710437724221 message

I should really add back Claude 3.5 to my tests, since 3.5 is indeed better than 3.7 in some.

abstract plover May 8, 2025, 4:02 AM

#

plush bridge On my personal evals, Gemini 2.5 Pro is behind GPT-4.1 and Claude 3.5 https://di...

new v3 is slept on

#

I hope deepseek focuses more on long context

plush bridge May 8, 2025, 4:03 AM

#

abstract plover new v3 is slept on

It's also doing well on my eval. Just that humans have a hard time focusing on more than 3 things.

#

I wish there's only 3 labs putting out SOTA models, but now we have 4.

random panther May 8, 2025, 4:30 AM

#

problem with deepseek v3 is it's slow as fuck on all providers. why use it when gemini 2.5 can hit 500 tps? it really limits its use cases in comparison. I ain't waiting 2 minutes for it to write a dozen lines of code

sturdy ether May 8, 2025, 4:31 AM

#

random panther problem with deepseek v3 is it's slow as fuck on all providers. why use it when ...

[funny you mention that...](#general message)

random panther May 8, 2025, 4:32 AM

#

nice, good it's progressing but not immediately useful. no mention of serverless offerings?
I don't have the $2000 a day they want for an endpoint

sturdy ether May 8, 2025, 4:33 AM

#

guess not

#

seems the only other option is sambanova which is only 200 tps

plush bridge May 8, 2025, 4:36 AM

#

random panther problem with deepseek v3 is it's slow as fuck on all providers. why use it when ...

Honestly DeepSeek V3.1 is not that slow. It can get to 60 tokens per second on Fireworks. On par with Claude 3.5 Sonnet, etc. See my tests here: #1369678362330529875 message

#

It's just some providers don't optimize it well enough to make it fast.

random panther May 8, 2025, 4:37 AM

#

you're right, some providers were hidden

plush bridge May 8, 2025, 4:41 AM

#

@restive locust we really need better UX for the provider list. The best providers in terms of speed are sometimes hidden and neglected, which gives a wrong impression of how fast the model can be. 👆

unreal marsh May 8, 2025, 4:43 AM

#

plush bridge @restive locust we really need better UX for the provider list. The best p...

yep good point. you can always sort by throughput!

hot willow May 8, 2025, 8:32 AM

#

Guys, if I add $10 billing to open router, do I get 1000 RPD for 2.5 Pro?

plush bridge May 8, 2025, 9:09 AM

#

Anyone else find Gemini 2.5 Pro not great in practice? It is consistently worse than other SOTA models for me in coding and writing tasks. I mainly care about instruction following and whether the response was concise.

torpid lake May 8, 2025, 9:34 AM

#

plush bridge Anyone else find Gemini 2.5 Pro not great in practice? It is consistently worse ...

try "be concise" in system instructions, otherwise you're essentially comparing default styles instead of the model's capabilities

open pond May 8, 2025, 9:35 AM

#

2.5 pro is wonderful

#

i just use it for planning, not actual coding

plush bridge May 8, 2025, 9:48 AM

#

torpid lake try "be concise" in system instructions, otherwise you're essentially comparing ...

thanks. this indeed improves the style to my personal liking. i tend to not mess with system instructions, but this works well enough in normal prompts.

torpid lake May 8, 2025, 9:53 AM

#

I personally recommend treating default response style and word choices orthogonal to the model performance itself

#

can change response style, can't change how smart it is

plush bridge May 8, 2025, 9:56 AM

#

torpid lake I personally recommend treating default response style and word choices orthogon...

makes sense. maybe not performance, but i'd say still a consideration for personal preference.

plush bridge May 8, 2025, 10:16 AM

#

torpid lake I personally recommend treating default response style and word choices orthogon...

well actually adding "be concise" improved response style for coding tasks, but made writing tasks perform way worse; now the response is way too short. so this is not a universal silver bullet for fixing gemini.

#

i have to think about this more, whether this makes sense and how to go about evaluating the models.

torpid lake May 8, 2025, 10:20 AM

#

plush bridge well actually adding "be concise" improve response style for coding tasks, but m...

well, there are many things you can put into the system message to control what you get, "be concise" is just an example. Can be "give a response of at least 200 words, but no more than 500" (it won't give the exact number of words, but it'll be longer than with "be concise")

#

style control is a thing, and instructions for the LLM are in plain English, so simpler to come up than, say, python

#

imagine LLM as an evil genie - if you don't tell it what to do, it'll do the worst possible way. If you tell it what to do, it'll follow the instruction but in worst possible way. Just like programming - you need to be precise in what you want and cover as many bases as possible.

plush bridge May 8, 2025, 10:23 AM

#

torpid lake imagine LLM as an evil genie - if you don't tell it what to do, it'll do the wor...

i am very aware of giving specific instructions. in fact i already had "minimize prose" in all prompts since 2023. this had worked well for all previous models until gemini 2.5 pro came out, which forces me to add "be concise".

torpid lake May 8, 2025, 10:23 AM

#

plush bridge i am very aware of giving specific instructions. in fact i already had "minimize...

every model treats same phrase differently, so you can't have universal prompt that works on every model, despite OpenRouter offering easy switch between them

#

even model versions in same family will treat same phrases differently

#

due to the nature of the machine learning, that's inevitable

plush bridge May 8, 2025, 10:24 AM

#

yeah i am acutely aware of those

torpid lake May 8, 2025, 10:24 AM

#

so in your case "minimise prose" worked in model 1, but won't work in model 2, and you'll need to find a different phrasing of "minimise prose" that works

#

"be concise" in my experience is more universal and worked since gpt3.5-turbo, but again, how exactly concise - depends on the model

#

some treat it as "respond in two words", some treat it as "two paragraphs"

plush bridge May 8, 2025, 10:26 AM

#

yeah thanks for sharing. i am just thinking about how to evaluate them objectively given this understanding

torpid lake May 8, 2025, 10:26 AM

#

also location of the instruction is important. system instruction > end of prompt > beginning of prompt

#

in case you weren't aware

#

not as relevant for CoT models, but non-CoT models give priority to instructions in the bottom

plush bridge May 8, 2025, 10:26 AM

#

torpid lake also location of the instruction is important. system instruction > end of promp...

this is not universal, OpenAI has different recommendations from Anthropic

torpid lake May 8, 2025, 10:27 AM

#

plush bridge this is not universal, OpenAI has different recommendations from Anthropic

different recommendations yes, but in practice instruction in the end works better than instruction in the beginning for both openai, anthropic and gemini (all non-cot)

#

due to how SFT teaches them, that's what they infer:
- question 1
- answer 1
- question 2
- answer 2
- question 3

which question should LLM answer to? 1, 2 or 3? the one in the bottom. Almost all LLM's generalize that for instructions.
Sometimes they generalize that question 3 should take into account question 1, so they also prioritize question 1 + question 3 when answering, therefore instructions at top and bottom have more strength than ones in the middle. openai's cookbook has same recommendation about top+bottom

#

(This is for non-CoT)

#

but because this is thread about gemini 2.5 pro, that doesn't apply, since you can't have non-CoT version of it

#

but still I think it's a generally good advice and something to look out for when comparing models
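The placement variants discussed above (instruction at the front, at the back, or both) can be sketched as a small prompt builder, which makes it easy to run the same task under each layout when comparing models. `buildPrompt` and the `Placement` type are hypothetical names used only for illustration.

```typescript
type Placement = "front" | "back" | "both";

// Assembles a prompt with the instruction before the document, after it, or both.
function buildPrompt(instruction: string, documentText: string, placement: Placement): string {
  const parts: string[] = [];
  if (placement === "front" || placement === "both") {
    parts.push(instruction);
  }
  parts.push(documentText);
  if (placement === "back" || placement === "both") {
    parts.push(instruction);
  }
  return parts.join("\n\n");
}

const instructionText = "Is the headline below clickbait? Answer yes or no.";
const headlineText = "You won't believe what this model does next";

console.log(buildPrompt(instructionText, headlineText, "back"));
```

Running all three placements over the same evaluation set is one way to check whether position matters for a given model, rather than relying on general advice.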

plush bridge May 8, 2025, 10:34 AM

#

thanks for sharing. I have been experimenting with F vs B vs F+B for a while and observed no significant difference.

torpid lake May 8, 2025, 10:35 AM

#

plush bridge thanks for sharing. I have been experimenting F vs B vs F+B for a while and obse...

I've found it to depend on complexity of the prompt. If it's "2+2=?", then it won't matter. If it's something out-of-distribution, then in my experience it's bottom for non-CoT models (in 2023-2024, didn't recheck in 2025).

torpid lake May 8, 2025, 10:36 AM

#

plush bridge thanks for sharing. I have been experimenting F vs B vs F+B for a while and obse...

one example is clickbait detection. gpt-4-0613 consistently detected at better rate, with less false positives, if instruction was after the document.

#

but that's off-topic for gemini 2.5 pro

#

your issue still seems to be style control (not enough or too much text)

#

so I recommend taking a typical task for your usecase, and then make separate style control prompts for each model until responses match. If you want strict template, you can provide that template for it to fill. If you parse programmatically, then you can ask for JSON with JSON schema or JSON template. With JSON make sure to disable penalties and set temperature to 0 wherever you can, or use JSON mode (gemini and openai support that).
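A sketch of what such a request body might look like, assuming an OpenAI-style chat completions schema (`temperature`, `frequency_penalty`, `presence_penalty`, `response_format`). Whether a given provider honors every field is an assumption to verify, and `buildJsonRequest` is a hypothetical helper that only builds the body without sending it.

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface JsonRequestBody {
  model: string;
  messages: ChatMessage[];
  temperature: number;
  frequency_penalty: number;
  presence_penalty: number;
  response_format: { type: "json_object" };
}

// Per-model style instruction goes in the system message; penalties are
// disabled and temperature is 0 so the JSON output stays as stable as possible.
function buildJsonRequest(model: string, styleInstruction: string, task: string): JsonRequestBody {
  return {
    model,
    messages: [
      { role: "system", content: styleInstruction },
      { role: "user", content: task },
    ],
    temperature: 0,
    frequency_penalty: 0,
    presence_penalty: 0,
    response_format: { type: "json_object" },
  };
}

const requestBody = buildJsonRequest(
  "google/gemini-2.5-pro-exp-03-25",
  'Be concise. Reply with JSON matching {"summary": string}.',
  "Summarize the release notes in one sentence."
);
console.log(JSON.stringify(requestBody, null, 2));
```

Keeping one style instruction per model, as suggested above, means only the second argument changes between models while the task stays fixed.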

plush bridge May 8, 2025, 10:47 AM

#

torpid lake so I recommend taking a typical task for your usecase, and then make separate st...

believe or not, that's exactly what i do as my main job now 😆

plush bridge May 8, 2025, 12:11 PM

#

torpid lake so I recommend taking a typical task for your usecase, and then make separate st...

Do you mind if I dm you to chat more on the topic?

torpid lake May 8, 2025, 12:11 PM

#

I don't

clever whale May 8, 2025, 3:14 PM

#

is it possible to use a paid version of 03-25 on OR?

dim ibex May 8, 2025, 3:22 PM

#

i hope google brings back gemini 03-25, the latest snapshot is quite stupid imo

restive locust May 8, 2025, 7:12 PM

#

We are working on fully supporting Gemini 2.5 Pro implicit caching, but for now, if you route to AI Studio you will get the implicit cache (read at .625 price, since we are currently implementing context length cache costs)

kind condor May 8, 2025, 7:46 PM

#

hey! what's the difference between Google's Vertex and AI studio providers in practice?

restive locust May 8, 2025, 7:46 PM

#

kind condor hey! what's the difference between Google's Vertex and AI studio providers in pr...

how quickly they support new features mostly haha

#

otherwise basically the same

kind condor May 8, 2025, 7:47 PM

#

huh ok! thanks

floral skiff May 8, 2025, 7:48 PM

#

restive locust We are working on fully supporting Gemini 2.5 Pro implicit caching, but for now,...

So the api key from aistudio or one created at gcp ?

restive locust May 8, 2025, 7:48 PM

#

ai studio

floral skiff May 8, 2025, 7:49 PM

#

I’m using it from https://aistudio.google.com/apikey

and doesn’t see any caching

#

Or is it behind the scene ?

restive locust May 8, 2025, 7:49 PM

#

what model are you using? how are you making the calls? how are you checking for caching?

#

in my testing it works, our data shows it's working, but it's not always going to just happen automatically, it is not very consistent

floral skiff May 8, 2025, 7:50 PM

#

2.5 pro preview through roo code and checking in my activity tab

#

So you think it’s maybe fault on roo code side ?

restive locust May 8, 2025, 7:50 PM

#

does your activity show you hitting AI Studio or Vertex?

floral skiff May 8, 2025, 7:51 PM

#

Both

restive locust May 8, 2025, 7:51 PM

#

you have to be consistently hitting AI Studio

#

and if it switches from your key to ours it could break

#

so doesn't seem to be roo code issue

#

just tricky to get it to happen consistently until we have full support with cache stickiness

floral skiff May 8, 2025, 7:52 PM

#

Actually I can’t move from page 1 to page 2 at the activity to go there where it was jumping

restive locust May 8, 2025, 7:52 PM

#

you should try this setting in your openrouter settings

#

or ignore vertex

floral skiff May 8, 2025, 7:53 PM

#

floral skiff May 8, 2025, 7:54 PM

#

restive locust or ignore vertex

Have set it up now. Let’s check

floral skiff May 8, 2025, 7:56 PM

#

floral skiff

@restive locust this issue is at your side too at the activity while switching sides ?

restive locust May 8, 2025, 7:56 PM

#

yes I just flagged to the team

abstract plover May 8, 2025, 9:00 PM

#

I would prefer using explicit caching, implicits are hit or miss

random panther May 8, 2025, 9:59 PM

#

abstract plover I would prefer using explicit caching, implicits are hit or miss

implicits are great. no work required, automatic, no downside. explicit caching with gemini is a lot of work to implement and expensive

abstract plover May 8, 2025, 10:03 PM

#

random panther implicits are great. no work required, automatic, no downside. explicit caching ...

why is it expensive? also implicits are known to miss, they aren't reliable.

random panther May 8, 2025, 10:07 PM

#

abstract plover why is it expensive? also implicits are known to miss, they aren't reliable.

of course they're known to miss. you have to start at the same prefix and it only goes as far as your content is static
it's expensive for gemini because you have to rent the cache. it's not just pay for write like anthropic

#

you need a good amount of usage to justify the explicit cache before it saves you any money
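The break-even point can be estimated with a little arithmetic: the hourly rent on the cached prefix has to be covered by the discount earned on each request that reads it. The sketch below ignores the one-time cost of writing the cache, and every price in it is a made-up placeholder, not Gemini's actual pricing.

```typescript
// All prices are hypothetical placeholders, in USD per million tokens.
interface CachePricing {
  inputPricePerMTok: number;       // normal (uncached) input price
  cachedInputPricePerMTok: number; // discounted price for cached reads
  storagePricePerMTokHour: number; // rent for keeping the prefix cached for one hour
}

// Requests per hour needed before the read discount pays for the storage rent.
// Rent and savings both scale with the prefix size, so the size cancels out.
function breakEvenRequestsPerHour(p: CachePricing): number {
  const savingPerRequest = p.inputPricePerMTok - p.cachedInputPricePerMTok;
  return p.storagePricePerMTokHour / savingPerRequest;
}

const examplePricing: CachePricing = {
  inputPricePerMTok: 2,
  cachedInputPricePerMTok: 0.5,
  storagePricePerMTokHour: 4.5,
};

console.log(breakEvenRequestsPerHour(examplePricing)); // 3
```

Below that request rate the rent outweighs the savings, which is the "good amount of usage" point made above.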

abstract plover May 8, 2025, 10:09 PM

#

random panther of course they're known to miss you have to start at the same prefix and it only...

you dont pay for storage in implicit cache?

#

btw, even with the same prefix you will sometimes miss the cache. It's been a known issue with claude, oai and deepseek

random panther May 8, 2025, 10:13 PM

#

abstract plover you dont pay for storage in implicit cache?

no it does not appear you have to pay for it. it's probably significantly lower TTL than explicit (likely 5min)
Implicit caching is enabled by default for all Gemini 2.5 models. We automatically pass on cost savings if your request hits caches. There is nothing you need to do in order to enable this. It is effective as of May 8th, 2025. The minimum input token count for context caching is 1,024 for 2.5 Flash and 2,048 for 2.5 Pro.
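The minimum sizes quoted above translate into a simple eligibility check. The thresholds come from the quoted text; the model keys and the function name are hypothetical labels. Meeting the threshold only makes a request eligible, it does not guarantee a cache hit.

```typescript
// Minimum input token counts for context caching, as quoted above.
const MIN_CACHE_TOKENS = {
  "2.5-flash": 1024,
  "2.5-pro": 2048,
} as const;

type CacheableModel = keyof typeof MIN_CACHE_TOKENS;

// True when the request is large enough to be considered for implicit caching.
function eligibleForImplicitCache(model: CacheableModel, inputTokens: number): boolean {
  return inputTokens >= MIN_CACHE_TOKENS[model];
}

console.log(eligibleForImplicitCache("2.5-pro", 1500));   // false
console.log(eligibleForImplicitCache("2.5-flash", 1500)); // true
```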

random panther May 8, 2025, 10:13 PM

#

abstract plover btw , even with the same prefix sometimes you will miss cache. Its been a known ...

still I'm not going to complain about free, mostly reliable implicit caching as long as it has no downsides

#

having support for explicit is great as well of course

abstract plover May 8, 2025, 10:14 PM

#

random panther no it does not appear you have to pay for it. it's probably significantly lower ...

idk will have to test

abstract plover May 8, 2025, 10:14 PM

#

random panther having support for explicit is great as well of course

true that

dusky kettle May 8, 2025, 10:17 PM

#

Logan confirmed that there's no storage cost for implicit caching, see https://x.com/OfficialLoganK/status/1920530345553748480 👍

Logan Kilpatrick (@OfficialLoganK) on X

@ClassicMain @mag_pl No storage cost for implicit caching

abstract plover May 8, 2025, 10:39 PM

#

Interesting

shrewd plaza May 8, 2025, 10:40 PM

#

Oh wow, that's really nice to hear, and makes OR's life a lot easier

restive locust May 8, 2025, 10:42 PM

#

yes, yes it does

abstract plover May 8, 2025, 10:46 PM

#

Now all we have to wait for is TTL

restive locust May 8, 2025, 10:48 PM

#

abstract plover Now all we have to wait for is TTL

i'm testing our pending PR now, hard to say what the TTL is or anything

#

I'll time it now...

abstract plover May 8, 2025, 10:49 PM

#

restive locust I'll time it now...

🙏

restive locust May 8, 2025, 10:51 PM

#

also asked the AI Studio team

#

well it's at least <3min in my n=1 sample

abstract plover May 8, 2025, 10:53 PM

#

hmm, makes sense if implicit's TTL is less than explicit's

restive locust May 8, 2025, 10:53 PM

#

i mean with explicit you set the TTL

#

by default it's 1hr but you can pay for 1281298219328 hours if you want lol

wheat quest May 8, 2025, 11:10 PM

#

restive locust We are working on fully supporting Gemini 2.5 Pro implicit caching, but for now,...

but if you want thinking summaries then you need to route to vertex blobreeee

torpid cedar May 8, 2025, 11:20 PM

#

hey, does anyone know how to change the gemini safety settings so it isn't so aggressive at rejecting input?
before the update my customers' input didn't seem to be a problem, but now it just keeps rejecting their input. thanks if anyone can help me with both google AI studio and google vertex.

restive locust May 8, 2025, 11:26 PM

#

torpid cedar hey, does anyone know how to change gemini safety setting so it didnt being to a...

we default to safety settings OFF now btw. If you are getting the "PROHIBITED_CONTENT" finish reason there's no way to adjust the safety settings to prevent that

#

it means you are being flagged as breaking TOS

abstract plover May 8, 2025, 11:31 PM

#

stop gooning so much yal

restive locust May 8, 2025, 11:44 PM

#

abstract plover Now all we have to wait for is TTL

from google:

No worries! It's closer to the latter - it's best effort. For guaranteed cache hits and TTL, we'd recommend using explicit caching

abstract plover May 9, 2025, 12:13 AM

#

restive locust from google: > No worries! It's closer to the latter - it's best effort. For gua...

guaranteed TTL ????

restive locust May 9, 2025, 12:16 AM

#

abstract plover guaranteed TTL ????

if you create your own cache

#

yeah

#

you set your own TTL

#

but obviously you pay for the cache token input price + storage price

abstract plover May 9, 2025, 12:17 AM

#

implicit has no guaranteed TTL? and we only have the option of a 5 min TTL for explicit through OR

#

?

#

we need AGI to parse through all of ORs if and else

restive locust May 9, 2025, 12:17 AM

#

implicit has no guaranteed TTL. Explicit through OR has a 5m TTL yes

restive locust May 9, 2025, 12:19 AM

#

abstract plover we need AGI to parse through all of ORs if and else

For the meme, I've counted the if statements in your codebase's .ts files. The grand total is 17,692!
Gemini 2.5 pro via Cline w/ grep

restive locust May 9, 2025, 12:49 AM

#

implicit caching with full proper pricing (long context etc) should be live through AI Studio in ~5 mins.

distant shell May 9, 2025, 12:51 AM

#

restive locust implicit caching with full proper pricing (long context etc) should be live thro...

If tool calling can be more consistent with Gemini I'd use it 100%, with zed no Gemini model can do file edits through openrouter, while Claude and 4.1 can do the same through openrouter. It's quite weird

shrewd plaza May 9, 2025, 1:22 AM

#

Already feels so much better. 10c calls become 4c calls.

abstract plover May 9, 2025, 1:45 AM

#

restive locust > For the meme, I've counted the `if` statements in your codebase's `.ts` files....

Hahaah

abstract plover May 9, 2025, 1:46 AM

#

restive locust implicit caching with full proper pricing (long context etc) should be live thro...

So to confirm , there is no specific TTL for implicit caching ?

restive locust May 9, 2025, 1:47 AM

#

abstract plover So to confirm , there is no specific TTL for implicit caching ?

yeah

#

it varies

abstract plover May 9, 2025, 2:00 AM

#

restive locust yeah

Seems weird

restive locust May 9, 2025, 2:00 AM

#

abstract plover Seems weird

same thing as openai really

abstract plover May 9, 2025, 2:01 AM

#

"5-10 minutes of inactivity, though sometimes lasting up to a maximum of one hour during off-peak periods."

abstract plover May 9, 2025, 2:02 AM

#

restive locust same thing as openai really

Openai has some clue atleast

#

Does google too ?

restive locust May 9, 2025, 2:06 AM

#

not officially

abstract plover May 9, 2025, 3:05 AM

#

2.5 pro has gone to shit

#

unnecessary thinking tokens

#

lol , and the code doesnt even work.

potent coral May 9, 2025, 4:35 AM

#

abstract plover 2.5 pro has gone to shit

They focused on fine-tuning for coding, which leaves it with less knowledge of coding problems in wider domains

Their previous version is better imo, and right now this feels like a downgrade

novel flower May 9, 2025, 7:52 AM

#

potent coral They to focus on coding fine-tune which make it have less knowledge of wider dom...

context7 fixes this sir?

ancient burrow May 9, 2025, 11:47 AM

#

Ratatatatata

wheat quest May 9, 2025, 1:07 PM

#

https://developers.googleblog.com/en/gemini-2-5-video-understanding/

The Gemini API now offers a 'low' media resolution parameter enabling Gemini 2.5 Pro to process ~6 hours of video with 2 million token context.
but google, 2.5 pro is still limited to 1M context

Advancing the frontier of video understanding with Gemini 2.5- Goog...

floral skiff May 9, 2025, 1:24 PM

#

@restive locust Caching not working with ai studio after ignoring vertex provider

restive locust May 9, 2025, 3:52 PM

#

floral skiff @restive locust Caching not working with ai studio after ignoring vertex ...

I mean there's no way for me to debug this, it's not something we do at all

#

it either works or it doesn't, not really up to me haha

floral skiff May 9, 2025, 4:25 PM

#

Ok

abstract plover May 9, 2025, 4:35 PM

#

restive locust not officially

Implicit caching has dynamic TTL and is approx 6 min

#

-Logan

restive locust May 9, 2025, 4:37 PM

#

yep

wheat quest May 9, 2025, 4:40 PM

#

meanwhile I've never been able to trigger the implicit cache ever

restive locust May 9, 2025, 4:41 PM

#

it works in our chatroom

#

proof from my testing yesterday KEKcry

wheat quest May 9, 2025, 4:45 PM

#

cache modCheck

abstract plover May 9, 2025, 4:50 PM

#

wheat quest meanwhile I've never been able to trigger the implicit cache ever

Same

restive locust May 9, 2025, 4:56 PM

#

proof !

#

prime wave May 9, 2025, 5:42 PM

#

@restive locust does it ONLY work in the chatroom, because I'm sending IDENTICAL prompts and there is no cache hit. I assume that if the prompt is exactly the same, there would be a cache hit.

restive locust May 9, 2025, 5:42 PM

#

There's nothing OpenRouter is doing for it to work or not work

#

it's not consistent or guaranteed even

#

sometimes you have to send the same thing 3, 4, 5 times for it to work

#

if you keep going on a long multiturn convo, sometimes it works and hits like half of the convo

#

it also helps if you send the requests pretty quickly one after the other
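Since hits are inconsistent, one way to see how often the cache is actually working is to summarize the usage numbers across a batch of requests. This is a sketch over hypothetical records: the field names `promptTokens` and `cachedTokens` are assumptions, so map them to whatever your activity export or API usage object actually reports.

```typescript
// Hypothetical per-request usage record.
interface UsageRecord {
  promptTokens: number; // total input tokens sent
  cachedTokens: number; // input tokens served from cache (0 on a miss)
}

interface CacheSummary {
  requestsWithHit: number; // how many requests had any cached tokens
  cachedShare: number;     // cached tokens as a fraction of all input tokens
}

function summarizeCacheHits(records: UsageRecord[]): CacheSummary {
  let totalPrompt = 0;
  let totalCached = 0;
  let requestsWithHit = 0;
  for (const record of records) {
    totalPrompt += record.promptTokens;
    totalCached += record.cachedTokens;
    if (record.cachedTokens > 0) {
      requestsWithHit += 1;
    }
  }
  const cachedShare = totalPrompt === 0 ? 0 : totalCached / totalPrompt;
  return { requestsWithHit, cachedShare };
}

// Two identical prompts sent back to back; only the second one hits.
const sampleUsage: UsageRecord[] = [
  { promptTokens: 4000, cachedTokens: 0 },
  { promptTokens: 4000, cachedTokens: 2000 },
];
console.log(summarizeCacheHits(sampleUsage)); // { requestsWithHit: 1, cachedShare: 0.25 }
```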

prime wave May 9, 2025, 5:44 PM

#

okay, so that sounds like basically I shouldn't count on it then.

boreal island May 9, 2025, 5:48 PM

#

abstract plover Implicit caching has dynamic TTL and is approx 6 min

Could use an extension if using ST @floral skiff https://github.com/OneinfinityN7/Cache-Refresh-SillyTavern

Add automatic cache refresh for chat completions to SillyTavern - OneinfinityN7/Cache-Refresh-SillyTavern

copper pilot May 9, 2025, 8:04 PM

#

boreal island Could use an extension if using ST @floral skiff https://github.com/One...

That's Claude only.

boreal island May 9, 2025, 8:05 PM

#

copper pilot That's Claude only.

Won't it work with any models that have a TTL-based cache? Damn

copper pilot May 9, 2025, 8:06 PM

#

oh my bad I didn't read it (I haven't used it, just remembered someone made it for Claude)

While designed primarily for Claude Sonnet, it works with other models as well.
sounds like it will just send again every x minutes
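The "send again every x minutes" idea comes down to a timing check like the one below. This is a generic sketch, not the extension's code. It assumes a TTL of roughly 6 minutes (the figure reported earlier in the thread), and `refreshDue` and the one-minute margin are illustrative choices. Each refresh is itself a billed request, and with a best-effort cache it may still miss.

```typescript
// True when enough time has passed that a keep-alive request should be sent
// before the assumed TTL lapses.
function refreshDue(lastRequestMs: number, nowMs: number, ttlMs: number, safetyMarginMs: number): boolean {
  return nowMs - lastRequestMs >= ttlMs - safetyMarginMs;
}

const ONE_MINUTE_MS = 60 * 1000;
const ASSUMED_TTL_MS = 6 * ONE_MINUTE_MS; // approximate and dynamic, per the thread

console.log(refreshDue(0, 4 * ONE_MINUTE_MS, ASSUMED_TTL_MS, ONE_MINUTE_MS)); // false
console.log(refreshDue(0, 5 * ONE_MINUTE_MS, ASSUMED_TTL_MS, ONE_MINUTE_MS)); // true
```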

abstract plover May 9, 2025, 11:32 PM

#

unusable right now

#

damn

royal ocean May 10, 2025, 3:10 AM

#

It literally is unusable ☹️ Slow, overthinking, etc.

#

We need a thinking budget at least

merry geode May 10, 2025, 3:17 AM

#

royal ocean It literally is unusable ☹️ Slow, overthinking, etc.

Yeah it's so slow now lol

arctic vessel May 10, 2025, 3:17 AM

#

floral skiff @restive locust Caching not working with ai studio after ignoring vertex ...

Do you have a sample request? Or are you asking about the implicit caching?

plush bridge May 10, 2025, 5:40 AM

#

royal ocean We need a thinking budget at least

The model supports thinking budget including zero budget right? Is it not supported in OR?

#

https://ai.google.dev/gemini-api/docs/thinking#set-budget

abstract plover May 10, 2025, 5:55 AM

#

plush bridge The model supports thinking budget including zero budget right? Is it not suppor...

thinking budget is only for 2.5 flash afaik

plush bridge May 10, 2025, 5:57 AM

#

abstract plover thinking budget is only for 2.5 flash afaik

Ah I see. This was not obvious in the docs. As usual.

#

Vertex AI is quite explicit on only Flash, but Google AI Studio doesn't mention it.

#

so i tested @google/genai, setting thinkingBudget = 0 for Gemini 2.5 Pro doesn't actually cause any errors, but it indeed doesn't stop the model from thinking. interesting behavior.

#

maybe they do plan to support it in the future
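For reference, a sketch of the request shape being tested, assuming the budget is nested as `config.thinkingConfig.thinkingBudget`, which is how the linked thinking docs present it. The local interface and `withThinkingBudget` are hypothetical stand-ins so the snippet runs without the SDK or an API key. As reported above, a budget of 0 was accepted without error on 2.5 Pro but did not stop the model from thinking.

```typescript
// Local stand-in for the request shape; not imported from @google/genai.
interface ThinkingRequest {
  model: string;
  contents: string;
  config: {
    thinkingConfig: {
      thinkingBudget: number; // maximum thinking tokens; 0 asks for no thinking
    };
  };
}

function withThinkingBudget(model: string, contents: string, budgetTokens: number): ThinkingRequest {
  if (budgetTokens < 0 || Math.floor(budgetTokens) !== budgetTokens) {
    throw new RangeError("budgetTokens must be a non-negative integer");
  }
  return {
    model,
    contents,
    config: { thinkingConfig: { thinkingBudget: budgetTokens } },
  };
}

const zeroBudgetRequest = withThinkingBudget("gemini-2.5-pro-exp-03-25", "Write a haiku about caches.", 0);
console.log(JSON.stringify(zeroBudgetRequest.config)); // {"thinkingConfig":{"thinkingBudget":0}}
```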

floral skiff May 10, 2025, 6:21 AM

#

arctic vessel Do you have a sample request? Or are you asking about the implicit caching?

I have it within roo code and it's nearly the same every time. Pricing at OR matches the normal pricing without caching (implicit/explicit).

Google lowered the minimum token size to 2048 for 2.5 pro and 1024 for flash

„To make more requests eligible for cache hits, we reduced the minimum request size for 2.5 Flash to 1024 tokens and 2.5 Pro to 2048 tokens.“

copper pilot May 10, 2025, 5:52 PM

#

Speed is okay today, but implicit caching is shaky. Had it work for the first few times then suddenly stopped, even during swipes.

true token May 10, 2025, 8:09 PM

#

I have the impression that new Pro 05-06 consumes slightly more thinking tokens than before

merry geode May 10, 2025, 8:12 PM

#

true token I have the impression that new Pro 05-06 consumes slightly more thinking tokens ...

You would be right, and I wish it was "slightly more"

indigo jasper May 10, 2025, 10:06 PM

#

Sigh, I'm disappointed in the new 2.5 pro

#

Aider bench confirms that it's taking 2x as long to complete each task.

#

basically the same as my experience

#

sorry, 3x as long

#

Seconds per case : 165.3 (new)
Seconds per case : 45.3 (old)
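A quick check of the ratio behind the "3x" correction, assuming 165.3 is the new model's figure and 45.3 is the previous version's:

```typescript
const secondsPerCaseNew = 165.3;
const secondsPerCaseOld = 45.3; // assumed to be the previous version's figure

// How many times longer each case takes on the new model.
const slowdownFactor = secondsPerCaseNew / secondsPerCaseOld;
console.log(slowdownFactor.toFixed(2)); // "3.65"
```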

abstract plover May 10, 2025, 10:09 PM

#

Yup same experience

#

can't do much, will have to use this shittier model.

#

Nice way for google to curb multiple requests and make each requests cost more

true token May 10, 2025, 10:26 PM

#

yes

#

at least for my coding use cases it is still very good, slower yes

midnight venture May 10, 2025, 10:50 PM

#

abstract plover cant do much , will have to use this shitter model.

Can’t you just go back to exp?

merry geode May 10, 2025, 10:52 PM

#

I feel like this version is smarter, at least for coding, but it is way slower than the previous version

abstract plover May 10, 2025, 11:04 PM

#

midnight venture Can’t you just go back to exp?

Nope , no endpoints for old models.

abstract plover May 10, 2025, 11:04 PM

#

merry geode I feel like this version is smarter, at least for coding, but it is way slower t...

way slower means more thinking, more cost.

sturdy ether May 10, 2025, 11:04 PM

#

abstract plover Nope , no endpoints for old models.

yup, yes endpoints for old models

merry geode May 10, 2025, 11:04 PM

#

abstract plover way slower means more thinking, more cost.

I am aware, I think I would've preferred some versioning so we can use the older version if we need quick answers

abstract plover May 10, 2025, 11:04 PM

#

sturdy ether yup, yes endpoints for old models

there are no endpoints for old models

sturdy ether May 10, 2025, 11:05 PM

#

abstract plover there are no endpoints for old models

there are endpoints for old models

merry geode May 10, 2025, 11:05 PM

#

now 2.5 pro is less attractive for my use case

abstract plover May 10, 2025, 11:05 PM

#

sturdy ether there are endpoints for old models

where?

sturdy ether May 10, 2025, 11:05 PM

#

abstract plover where?

on major platforms

abstract plover May 10, 2025, 11:05 PM

#

sturdy ether on major platforms

can you give the link?

sturdy ether May 10, 2025, 11:05 PM

#

    "OpenRouter Free": {
      model: "google/gemini-2.5-pro-exp-03-25",
      fixedPrice: OR_PRICE,
    },
    "Vertex": {
      model: "google/gemini-2.5-pro-exp-03-25",
      fixedPrice: equivalentPrice(1000),
    },
    "Google": {
      model: "gemini-2.5-pro-exp-03-25",
      fixedPrice: equivalentPrice(25),
      maxTokens: 250000,
    },

#

https://openrouter.ai/google/gemini-2.5-pro-exp-03-25

Gemini 2.5 Pro Experimental - API, Providers, Stats

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. Run Gemini 2.5 Pro Experimental with API

#

heavily limited on openrouter

#

somewhat limited on ai studio

#

barely limited on vertex

abstract plover May 10, 2025, 11:06 PM

#

this is still the new model

sturdy ether May 10, 2025, 11:06 PM

#

abstract plover this is still the new model

03 25

abstract plover May 10, 2025, 11:06 PM

#

OR didnt fix the name

midnight venture May 10, 2025, 11:06 PM

#

abstract plover OR didnt fix the name

No, it’s 03 25

sturdy ether May 10, 2025, 11:06 PM

#

sturdy ether ```ts "OpenRouter Free": { model: "google/gemini-2.5-pro-exp-03-25", ...

it is accessible on all platforms as 03 25 ^

midnight venture May 10, 2025, 11:06 PM

#

exp didn’t update

abstract plover May 10, 2025, 11:07 PM

#

@restive locust can you confirm?

midnight venture May 10, 2025, 11:07 PM

#

But idk why people think it did

midnight venture May 10, 2025, 11:07 PM

#

abstract plover @restive locust can you confirm?

You can just go on the website where the model comes from and check

abstract plover May 10, 2025, 11:07 PM

#

Because logan said there are no endpoints for old models

restive locust May 10, 2025, 11:08 PM

#

abstract plover @restive locust can you confirm?

exp didn’t update

#

the experimental endpoint does not point to the new model. Only preview

from our vertex rep

abstract plover May 10, 2025, 11:11 PM

#

Hmm got it

merry geode May 11, 2025, 12:57 AM

#

Ugh this model is such a pain to use now

#

sometimes it thinks for so long that it times out

potent coral May 11, 2025, 1:13 AM

#

sturdy ether https://openrouter.ai/google/gemini-2.5-pro-exp-03-25

It's not a good choice, they heavily limited it.

I hope someone from OR actually contacts Google and tells them that their updated model is worse for a lot of people than the older one, then asks them to redeploy the older checkpoint.

That way we'd have 2 endpoints, and exp could go away and be replaced by it.

novel flower May 11, 2025, 3:47 AM

#

potent coral It's not good choice, they heavily limited it. I hope someone from OR actually...

T_T

dim ibex May 11, 2025, 4:03 AM

#

gemini 2.5 pro is unusable for me.
see, all requests have the same cache (looks like it's only the system instruction)
all the problems I hate exist on gemini 2.5 (slow, expensive, no cache)
it seems Google's implicit caching is very bad.
the screenshot shows only 4 requests, but I made several more, around 10, and they all have the same cached tokens / usage_cache

abstract plover May 11, 2025, 6:33 AM

#

merry geode sometimes it thinks for so long that it times out

exactly

abstract plover May 11, 2025, 6:33 AM

#

potent coral It's not good choice, they heavily limited it. I hope someone from OR actually...

exactly

#

Logan is working on thinking budget btw

#

I think they dumbed the model down to save cost and upped the thinking to mitigate some of the retardness

#

will give us thinking budget to milk more money

plush bridge May 11, 2025, 6:37 AM

#

abstract plover Logan is working on thinking budget btw

Thanks for the info.

abstract plover May 11, 2025, 6:38 AM

#

plush bridge Thanks for the info.

yeah , its available for flash they are implementing it for pro

dry ingot May 11, 2025, 9:09 AM

#

abstract plover will give us thinking budget to milk more money

thinking budget? so it will have the option to toggle thinking?

digital warren May 11, 2025, 9:27 AM

#

dry ingot thinking budget? so it will have the option to toggle thinking?

it seems to now be able to skip the thinking process entirely (I have seen this on 05-06 multiple times, but never on 03-25). It makes sense for super mundane question ("hello") but I have seen it do it on more complex stuff, too. Will see how it impacts overall capability.

abstract plover May 11, 2025, 9:35 AM

#

digital warren it seems to now be able to skip the thinking process entirely (I have seen this ...

Yup, experiencing the same thing

dry ingot May 11, 2025, 10:50 AM

#

abstract plover yeah , its available for flash they are implementing it for pro

any source on this btw?

#

the newest gemini 2.5 seems to overthink almost everything

abstract plover May 11, 2025, 11:07 AM

#

dry ingot any source on this btw?

Logan said it

ancient burrow May 11, 2025, 12:15 PM

#

gemini 2.5 pro is the first model that could fix overlapping UI elements in an app I gave it

#

pygame app

boreal island May 11, 2025, 7:07 PM

#

restive locust > the experimental endpoint does not point to the new model. Only preview from ...

Just to make it absolutely clear, https://openrouter.ai/google/gemini-2.5-pro-exp-03-25

This points to the March variant?

restive locust May 11, 2025, 7:14 PM

#

boreal island Just to make it absolutely clear, https://openrouter.ai/google/gemini-2.5-pro-ex...

yep

digital warren May 11, 2025, 7:14 PM

#

boreal island Just to make it absolutely clear, https://openrouter.ai/google/gemini-2.5-pro-ex...

restive locust May 11, 2025, 7:15 PM

#

it likely won't last though

restive locust May 11, 2025, 7:15 PM

#

digital warren

TBF they alias / forward the march preview endpoint to may

#

just not the exp

boreal island May 11, 2025, 7:15 PM

#

cringeharold Pain

digital warren May 11, 2025, 7:16 PM

#

restive locust TBF they alias / forward the march preview endpoint to may

ugh, i hate when they do that. aliasing a generic name (e.g. 2.5 pro preview) is fine, but if I explicitly call 03-25, then an alias to a non 03-25 is dodgy

boreal island May 11, 2025, 7:17 PM

#

digital warren ugh, i hate when they do that. aliasing a generic name (e.g. 2.5 pro preview) is...

Don't worry everyone does

copper pilot May 11, 2025, 7:18 PM

#

At the very least I would like for Google to return the actual model name in modelVersion of the response. I.e. aliasing is fine (as opposed to outright error), but tell us what it's aliased to.
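
A minimal sketch of the check this would enable, assuming the response body carries a `modelVersion` string as Gemini's generate-content responses do. The helper name `detectAlias`, the prefix handling, and the sample values are hypothetical and only illustrate the idea:

```ts
// Hypothetical helper: compares the model ID that was requested with the
// `modelVersion` string reported in the response body.
interface AliasCheck {
  requested: string;
  served: string | null; // null when the response did not report a version
  aliased: boolean;
}

function detectAlias(
  requested: string,
  response: { modelVersion?: string },
): AliasCheck {
  // Strip an optional "models/" or "google/" prefix before comparing.
  const normalize = (id: string): string => id.replace(/^(models|google)\//, "");
  const served = response.modelVersion ? normalize(response.modelVersion) : null;
  const wanted = normalize(requested);
  return {
    requested: wanted,
    served,
    aliased: served !== null && served !== wanted,
  };
}

// Sample values for illustration only, not captured from a real response.
const check = detectAlias("google/gemini-2.5-pro-preview-03-25", {
  modelVersion: "gemini-2.5-pro-preview-05-06",
});
console.log(check.aliased); // true
```

Logging `served` next to each request would also make the timestamp cross-referencing mentioned in this thread unnecessary, provided the provider reports the real version.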

boreal island May 11, 2025, 7:19 PM

#

Frustrating part is everyone went silent on this from Google's end

midnight venture May 11, 2025, 7:21 PM

#

boreal island Frustrating part is everyone went silent on this from Google's end

I feel there’s a small team handling most of this and everyone else is just sitting in the dark

digital warren May 11, 2025, 7:21 PM

#

makes any data collection a pain in the butt. need to carefully inspect each timestamp and cross reference.

boreal island May 11, 2025, 7:26 PM

#

midnight venture I feel there’s a small team handling most of this and it everyone else is just s...

Na it's all hands on deck for I/O so they're probably making mental notes to hopefully have this not repeat going forwards

#

Well that's the guess anyway

#

Yours is as good as mine

novel flower May 11, 2025, 9:07 PM

#

restive locust it likely won't last though

1 more month pleaseeeeeeeeeeee

abstract plover May 11, 2025, 9:30 PM

#

so many rate limits, insane

plush bridge May 12, 2025, 4:09 AM

#

digital warren ugh, i hate when they do that. aliasing a generic name (e.g. 2.5 pro preview) is...

Technically it's still a preview model so Google is entitled to point it to a new version. They won't do it with a stable model I assume.

Screenshot_2025-05-12-12-08-58-241_com.android.chrome-edit.jpg

#

But I agree Google is lacking experience in terms of rolling out models compared to OAI or Anthropic.

abstract plover May 12, 2025, 6:46 AM

#

2.5 pro is basically unusable right now

plush bridge May 12, 2025, 7:03 AM

#

true

plush bridge May 12, 2025, 7:38 AM

#

i think OR should update the coloring for uptime, i wouldn't consider 96.61% to be green, maybe slightly more yellowish?

plush bridge May 12, 2025, 8:34 AM

#

lol google ai docs is also down it seems, and dashboard is returning 500

dry ingot May 12, 2025, 8:46 AM

#

(Google AI Studio) Provider returned error: {"error":{"code":500,"status":"INTERNAL"}}

slim dome May 12, 2025, 12:38 PM

#

Hello gemini down for u too?

restive ridge May 12, 2025, 12:38 PM

#

I am interested in setting the thinking budget to 0 to add iteration speed

carmine spoke May 12, 2025, 12:55 PM

#

so gemini is down rn then?

slim dome May 12, 2025, 1:00 PM

#

Ye

carmine spoke May 12, 2025, 1:01 PM

#

for how long?

restive ridge May 12, 2025, 1:04 PM

#

it's fine here

floral skiff May 12, 2025, 2:36 PM

#

@restive locust Any solution?

slim dome May 12, 2025, 2:38 PM

#

Gemini still out for me too

midnight venture May 12, 2025, 2:40 PM

#

Gemini is heavily rate limited on OR, also through AI Studio i think
But vertex has been slightly better from my experience

restive locust May 12, 2025, 2:41 PM

#

midnight venture Gemini is heavily rate limited on OR, also through AI Studio i think But verte...

no we're not being rate limited on the preview endpoint

restive locust May 12, 2025, 2:41 PM

#

floral skiff @restive locust Any solution?

this is a google error unfortunately

floral skiff May 12, 2025, 2:41 PM

#

midnight venture Gemini is heavily rate limited on OR, also through AI Studio i think But verte...

Im using preview paid one. No rate limit there

midnight venture May 12, 2025, 2:41 PM

#

restive locust no we're not being rate limited on the preview endpoint

Oh

floral skiff May 12, 2025, 2:41 PM

#

restive locust this is a google error unfortunately

Why is it working with direct API access?

midnight venture May 12, 2025, 2:41 PM

#

floral skiff Im using preview paid one. No rate limit there

Well OR is getting rate limited on that endpoint (even if you paid)

restive locust May 12, 2025, 2:41 PM

#

no we're not

floral skiff May 12, 2025, 2:41 PM

#

midnight venture Well OR is getting rate limited on that endpoint (even if you paid)

Not 429 issue

midnight venture May 12, 2025, 2:43 PM

#

Oops

midnight venture May 12, 2025, 2:43 PM

#

floral skiff Not 429 issue

Yeah they aren’t limited on the preview endpoint, my bad

floral skiff May 12, 2025, 2:43 PM

#

restive locust this is a google error unfortunately

Are you sure ? As it’s related to a timeout error 🤔

midnight venture May 12, 2025, 2:43 PM

#

Sorry I completely misread it

slim dome May 12, 2025, 2:53 PM

#

Why do I get a 429 back? I have an account with credits.

slim dome May 12, 2025, 3:29 PM

#

I have json.loads errors

#

@restive locust

#

Do you have an idea?

restive locust May 12, 2025, 4:01 PM

#

slim dome Why do I get a 429 back? I have an account with credits.

#announcements message

midnight venture May 12, 2025, 4:31 PM

#

@restive locust it only affects exp or also preview?

restive locust May 12, 2025, 4:32 PM

#

only exp

#

anything I can do to make that announcement clearer?

slim dome May 12, 2025, 4:39 PM

#

restive locust https://discord.com/channels/1091220969173028894/1092729520181739581/13715175612...

Arf okay

#

New quota ?

midnight venture May 12, 2025, 4:45 PM

#

restive locust anything I can do to make that announcement clearer?

Announcement is actually clear, sorry for pressing you with questions but I somehow connected the preview conversation we had with this exp announcement
My bad

restive locust May 12, 2025, 4:45 PM

#

no worries! I added a sentence to note that it doesn't impact preview

#

if you had the question I am sure others would too

novel flower May 12, 2025, 5:01 PM

#

what they doing to my pro exp

#

😭

distant shell May 12, 2025, 5:01 PM

#

well free high demand model, I'm surprised they haven't cut it off already

restive locust May 12, 2025, 5:49 PM

#

Shipped implicit caching and reasoning summaries through vertex, model should be more stable now

celest idol May 12, 2025, 7:42 PM

#

plush bridge Anyone else find Gemini 2.5 Pro not great in practice? It is consistently worse ...

Nope, but the tasks I use it for are dependent on training data (Godot)

#

When i do normal webdev i find it's better, but deepseek r1 and 3.5 can compete

#

3.7 is gold too

carmine spoke May 13, 2025, 12:57 AM

#

so pro exp is being deprecated?

novel flower May 13, 2025, 12:59 AM

#

not yet

carmine spoke May 13, 2025, 1:02 AM

#

wdym not yet? its only erroring out now with no actual response it seems unusable

novel flower May 13, 2025, 1:04 AM

#

carmine spoke wdym not yet? its only erroring out now with no actual response it seems unusabl...

#

seems its down on ai studio @restive locust , also its very rate limited today @carmine spoke

#

@carmine spoke #announcements message

carmine spoke May 13, 2025, 1:05 AM

#

yes i saw the announcement about it being further rate limited and being down for half a day + it seems like they are getting rid of it

novel flower May 13, 2025, 1:08 AM

#

i hope not, im still using it on vertex i hope its back up tomorrow on ai studio

#

i be sad if they completely remove it

potent coral May 13, 2025, 1:15 AM

#

I think people came back to using exp because the new version isn't the same as the previous pro preview lol

novel flower May 13, 2025, 1:15 AM

#

probably

#

any news from ai studio team @restive locust ?

restive locust May 13, 2025, 1:15 AM

#

nope

novel flower May 13, 2025, 1:16 AM

#

sadge

restive locust May 13, 2025, 1:16 AM

#

they are seeing what they can do, but it does sound like they need the capacity for the paid endpoint

#

the 429 error now directs you to the paid preview model

#

"You exceeded your current quota. Please migrate to Gemini 2.5 Pro Preview (models/gemini-2.5-pro-preview-03-25) for higher quota limits."

novel flower May 13, 2025, 1:17 AM

#

noooo please give me 1 more week 😔

potent coral May 13, 2025, 1:28 AM

#

Could we get a representative to talk with Google so they also deploy the older version of pro preview?

I mean, they can deploy multiple Sonnet versions, so is there a reason they can't host multiple pro preview versions?

restive locust May 13, 2025, 4:04 AM

#

these are preview models, sonnet models are production models

#

they are not comparable

rancid stream May 13, 2025, 5:06 AM

#

restive locust "You exceeded your current quota. Please migrate to Gemini 2.5 Pro Preview (mode...

yes but this is the message direct from google. not OR. I tried bypassing OR and hitting this as of today or maybe it was yesterday.

novel flower May 13, 2025, 5:21 AM

#

my vertex key is still going hope i dont get rate limited

#

u.u

carmine spoke May 13, 2025, 5:39 AM

#

if it does get deleted is there any similar free models?

plush bridge May 13, 2025, 7:05 AM

#

i have no problem with gemini-2.5-pro-exp-03-25 via @google/genai directly. occasionally 429, but otherwise pretty fast response (faster than yesterday in fact)

plush bridge May 13, 2025, 7:58 AM

#

nvm i take that back, i am getting all 429 now...

slim dome May 13, 2025, 8:24 AM

#

😭😭

abstract shoal May 13, 2025, 8:55 AM

#

I'm receiving empty strings as response now.

plush bridge May 13, 2025, 9:05 AM

#

{"error":{"code":500,"message":"An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting","status":"INTERNAL"}}

anyone having issues with gemini 2.5 pro tool call? i get random 500 errors but i am not sure what is the issue. same prompt sometimes work but sometimes return 500.
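
Since the error body itself says "Please retry", one common workaround is to retry with exponential backoff. The following is only a sketch under assumptions: `withRetry`, the `status` field on the thrown error, and the set of retryable codes are hypothetical and not part of any SDK, so adapt them to whatever your client actually throws:

```ts
// Hypothetical wrapper: retries a request when the provider answers with a
// transient status (500 INTERNAL, 429 quota, 503), doubling the wait each time.
interface HttpLikeError {
  status: number;
}

const RETRYABLE = new Set([429, 500, 503]);

async function withRetry<T>(
  call: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const status = (err as Partial<HttpLikeError> | null)?.status;
      const retryable = status !== undefined && RETRYABLE.has(status);
      // Give up on non-transient errors or once the attempt budget is spent.
      if (!retryable || attempt >= maxAttempts) throw err;
      await sleep(baseDelayMs * 2 ** (attempt - 1)); // 500, 1000, 2000, ...
    }
  }
}
```

The `sleep` parameter is injectable so the wrapper can be exercised without real waiting. A retry does not help when the quota has been removed entirely, as with the 429s on the experimental endpoint discussed here.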

potent coral May 13, 2025, 9:13 AM

#

restive locust these are preview models, sonnet models are production models

Is it tho? Seems like Google already treats it as production, from the way it's used.

You can see it in other Google products already, and they've even collaborated with some companies on AI coding products.

slim dome May 13, 2025, 9:13 AM

#

If we add a free Google AI Studio API key as a provider, does OpenRouter combine its own API quota with Google's?

north plume May 13, 2025, 2:51 PM

#

how come the pro preview previously didnt have reasoning, and now it does?

potent coral May 13, 2025, 3:06 PM

#

north plume how come the pro preview previously didnt have reasoning, and now it does?

Previously they had it, but it ran in the background

midnight venture May 13, 2025, 3:48 PM

#

It’s strange that when trying to use exp through OR, you get rate limited with a specific “OpenRouter traffic is heavily throttled” message (even with BYOK)
But when you run through vertex directly you don’t have any issue at all

#

https://tenor.com/view/shrek-reaction-really-gif-14971007105596864554

Tenor

plush bridge May 13, 2025, 3:57 PM

#

midnight venture It’s strange that when trying to use exp through OR, you get rate limited with a...

No 429 on vertex?

midnight venture May 13, 2025, 4:00 PM

#

plush bridge No 429 on vertex?

Nope

#

Sometimes a 500 will come up here and there, but mostly smooth sailing

#

OR on the other side just flat out doesn’t work

plush bridge May 13, 2025, 4:00 PM

#

midnight venture Nope

nice let me try. i am using the google official sdk, not sure how to get it to work with vertex

midnight venture May 13, 2025, 4:01 PM

#

plush bridge nice let me try. i am using the google official sdk, not sure how to get it to w...

I just switched roo to Vertex instead of OR

wheat quest May 13, 2025, 4:01 PM

#

Vertex has a 10RPM quota across all Gemini experimental models, so OR permanently has no quota for it given the amount of demand.
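
For anyone calling the experimental model with their own Vertex project, a client-side limiter can keep requests under that quota. This is a sketch only: `RpmLimiter` is a hypothetical class, the default of 10 requests per 60 seconds is taken from the figure quoted above, and it tracks only what this one client sends, not other users of the same project:

```ts
// Hypothetical sliding-window limiter: allows at most `limit` requests in
// any `windowMs` span and reports how long to wait otherwise.
class RpmLimiter {
  private readonly sent: number[] = []; // timestamps (ms) of recent requests
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 10, windowMs = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns 0 and records the request if it may go out at `now`,
  // otherwise the number of milliseconds until a slot frees up.
  tryAcquire(now: number = Date.now()): number {
    // Drop timestamps that have fallen out of the window.
    while (this.sent.length > 0 && now - this.sent[0] >= this.windowMs) {
      this.sent.shift();
    }
    if (this.sent.length < this.limit) {
      this.sent.push(now);
      return 0;
    }
    return this.windowMs - (now - this.sent[0]);
  }
}

const limiter = new RpmLimiter();
const waitMs = limiter.tryAcquire();
console.log(waitMs === 0 ? "send now" : `wait ${waitMs} ms`);
```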

midnight venture May 13, 2025, 4:03 PM

#

wheat quest Vertex has a 10RPM quota across **all Gemini experimental models**, so OR perman...

I just checked, seems 10 RPM yes

plush bridge May 13, 2025, 4:06 PM

#

midnight venture I just checked, seems 10 RPM yes

429?

midnight venture May 13, 2025, 4:07 PM

#

plush bridge 429?

I’m not getting a 429

#

I did get a lot of 500s

plush bridge May 13, 2025, 4:07 PM

#

Vertex does use a different set of credentials from MLDev it seems, so maybe we can double the quota by using both. 🤔

midnight venture May 13, 2025, 4:07 PM

#

I’m way over 10 RPM lol

plush bridge May 13, 2025, 4:07 PM

#

let me try

plush bridge May 13, 2025, 4:11 PM

#

midnight venture I’m way over 10 RPM lol

are you using the same api for vertex as ai studio? or are you authenticating with gcloud cli?

slim dome May 13, 2025, 4:14 PM

#

Hello, why is the 2.5 pro exp model blocked for those who have 10 credits even when we import our key from Gemini AI Studio??

midnight venture May 13, 2025, 4:14 PM

#

plush bridge are you using the same api for vertex as ai studio? or are you authenticating wi...

I’m using what roo code uses under the hood, so let me check

slim dome May 13, 2025, 4:14 PM

#

@restive locust

midnight venture May 13, 2025, 4:14 PM

#

slim dome Hello, why is the 2.5 pro exp model blocked for those who have 10 credits even w...

It’s not gonna work through OR, they’re blocked

#

You can get around it with Vertex (I think, works for me but not verified)

slim dome May 13, 2025, 4:15 PM

#

But i have a gemini api key

midnight venture May 13, 2025, 4:15 PM

#

slim dome But i have a gemini api key

It doesn’t matter

#

I also have a BYOK setup and it doesn’t change the outcome

plush bridge May 13, 2025, 4:15 PM

#

i think it is the same api key for google ai studio and vertex, let me check...

slim dome May 13, 2025, 4:16 PM

#

Yes on my account with more than 10$ and add my gemini key it's working

#

But not on my account with less than 10$

midnight venture May 13, 2025, 4:16 PM

#

plush bridge i think it is the same api key for google ai studio and vertex, let me check...

I’m using the “@anthropic-ai/vertex-sdk”

midnight venture May 13, 2025, 4:16 PM

#

slim dome Yes on my account with more than 10$ and add my gemini key it's working

It won’t work for me and I have more than $10

#

So I don’t think it’s a question of credits

slim dome May 13, 2025, 4:17 PM

#

Working for me are u sure ??

#

I have a return message limited to people who have more than 10$...

#

But want to use my own api key

#

Why integrations was blocked

#

Is blocked

abstract shoal May 13, 2025, 6:33 PM

#

Is Gemini 2.5 Pro available as an API in Google AI Studio? I can only see older flash versions

wheat quest May 13, 2025, 6:46 PM

#

Google is currently scrubbing all references to 2.5 Pro Experimental from their docs, it's very likely they'll pull it entirely (my guess is during Google IO next week)

wheat quest May 13, 2025, 7:16 PM

#

@restive locust google just yoinked all quota for all users on experimental on AI Studio

distant shell May 13, 2025, 8:40 PM

#

wheat quest Google is currently scrubbing all references to 2.5 Pro Experimental from their ...

makes sense, launch 2.5 pro officially, remove exp and preview

#

well preview becomes launched

slim dome May 13, 2025, 8:49 PM

#

midnight venture May 13, 2025, 8:58 PM

#

slim dome

But vertex still works?

slim dome May 13, 2025, 8:59 PM

#

Idk

abstract plover May 13, 2025, 9:25 PM

#

2.5 pro is gonna go public on I/O

novel flower May 13, 2025, 10:59 PM

#

Vertex still works?

plush bridge May 14, 2025, 5:19 AM

#

abstract plover 2.5 pro is gonna go public on I/O

You mean out of experimental / preview into GA?

abstract plover May 14, 2025, 5:19 AM

#

plush bridge You mean out of experimental / preview into GA?

yeah

plush bridge May 14, 2025, 5:20 AM

#

I wasn't able to test vertex because I don't have access to vertex express mode (I'm an existing GCP user). Will test out the proper gcloud cli authentication required for vertex api soon.

plush bridge May 14, 2025, 6:10 AM

#

i just tested via vertex ai and gcloud cli authentication, gemini-2.5-pro-exp-03-25 still works there. but strangely, all the stats (except QPS) are empty.

#

if you have vertex express mode access (via the same Gemini api key as Google AI Studio), you likely can also use it, though i can't test it.

novel flower May 14, 2025, 6:39 AM

#

how to use gemini-2.5-pro-exp-03-25 , get vertex key?

plush bridge May 14, 2025, 6:41 AM

#

novel flower how to use gemini-2.5-pro-exp-03-25 , get vertex key?

if you have never used GCP before, you should be able to access the express mode and use the same api key as google ai studio: https://cloud.google.com/vertex-ai/generative-ai/docs/start/express-mode/overview

novel flower May 14, 2025, 6:44 AM

#

plush bridge if you have never used GCP before, you should be able to access the express mode...

thanks sir i will follow your guide

carmine spoke May 14, 2025, 8:55 AM

#

i wonder how long "paused" means

slim dome May 14, 2025, 11:18 AM

#

How use vertex ?

novel flower May 14, 2025, 11:55 AM

#

carmine spoke i wonder how long "paused" means

O.o

pallid wren May 14, 2025, 1:30 PM

#

carmine spoke i wonder how long "paused" means

I'm going to guess until I/O day when they release something else and 2.5 pro is no longer the new hotness

#

midnight venture May 14, 2025, 2:32 PM

#

slim dome How use vertex ?

Just use the vertex anthropic sdk it should be pretty easy that way

#

I still don’t understand why vertex lives in a realm of its own when it comes to models and limits
They should unify ai studio and vertex into a single product

#

Or atleast the AI side of stuff

slim dome May 14, 2025, 3:07 PM

#

How get an api key i dont understand... @midnight venture

midnight venture May 14, 2025, 3:17 PM

#

slim dome How get an api key i dont understand... @midnight venture

1. Create or log into your Google Cloud account.
2. Open the Google Cloud Console.
3. Create a new Google Cloud project.
4. Enable billing for your newly created Google Cloud project.
5. Enable the Vertex AI API.
6. Enable the Gemini API from the API overview page.
7. In your project dashboard, navigate to APIs & Services → Credentials.
8. Click "Create Credentials" → "API Key".
9. Copy the generated API key and save it securely.

potent coral May 14, 2025, 3:17 PM

#

midnight venture I still don’t understand why vertex lives in a realm of its own when it comes to...

Nah..
It's better if they stay separated, I already see the difference between them

midnight venture May 14, 2025, 3:17 PM

#

Took this from Reddit, idk how accurate it is
I think there’s some extra steps involving service accounts but I’m not sure tbh

midnight venture May 14, 2025, 3:17 PM

#

potent coral Nah.. It's better if they separated, I already see the difference between them

What’s the difference between the two?

potent coral May 14, 2025, 3:20 PM

#

Rate limit and capacities.

Then there are also things that you need to check and figure out by yourself; these are the most important differences

midnight venture May 14, 2025, 3:22 PM

#

potent coral Rate limit and capacities. Then there also things that you need to check and fi...

From my understanding the differences in rate limits and capacity are minimal

#

Idk about what the check and figure out part is

slim dome May 14, 2025, 3:35 PM

#

Thanks, but I saw a lot of people complaining that their Google invoice jumped after activating billing

midnight venture May 14, 2025, 3:39 PM

#

slim dome Thanks but i saw lot of people cry on google for jump their invoice with billing...

I never had that issue, but then again I set a $1 spend limit so idk how google could even charge me lol

#

I’ve also been using gcloud for like almost 10 years so maybe it’s different for me
I’m still a noob at it but I got a lot of the basics over a long time ago

slim dome May 14, 2025, 3:46 PM

#

How could u set a $1 limit? It's very hard to find all the options, vertex has a lot of things we can do

plush bridge May 14, 2025, 4:28 PM

#

midnight venture ```1. ⁠Create or log into your Google Cloud account. 2. ⁠Open the Google C...

This looks wrong. For the Google Cloud project route you don't have any API key, you authenticate via the gcloud CLI. For an API key you need to use Vertex express mode, which I linked the docs for above.

midnight venture May 14, 2025, 4:29 PM

#

plush bridge This looks wrong. For Google cloud project route you don't have any api key, you...

I never used gcp cli so not sure how that works

plush bridge May 14, 2025, 4:30 PM

#

midnight venture I never used gcp cli so not sure how that works

You linked vertex AI using anthropic sdk, which is simpler. It also uses `gcloud auth application-default login`

#

https://docs.anthropic.com/en/api/claude-on-vertex-ai#making-requests

Anthropic

Vertex AI API - Anthropic

Anthropic’s Claude models are now generally available through Vertex AI.

#

I had no idea Claude models were available on Vertex AI. TIL.

#

So now you can have a first-party model served by another first-party provider that didn't develop the model. 🤯

midnight venture May 14, 2025, 4:39 PM

#

plush bridge You linked vertex AI using anthropic sdk, which is simpler. It also uses `gcloud...

I didn’t know, that’s cool

novel flower May 14, 2025, 7:49 PM

#

slim dome Thanks but i saw lot of people cry on google for jump their invoice with billing...

As long as you use the free endpoint why would they charge you?

copper pilot May 14, 2025, 8:16 PM

#

👀 (cropped)

novel flower May 14, 2025, 8:21 PM

#

o.o

#

👀

copper pilot May 14, 2025, 8:23 PM

#

.

carmine spoke May 14, 2025, 9:01 PM

#

whats the most similar model to 2.5 while its down? thats also free, i tried a few but they all seem kinda worse

novel flower May 14, 2025, 9:01 PM

#

that's a good question

ancient burrow May 14, 2025, 9:34 PM

#

carmine spoke whats the most similar model to 2.5 while its down? thats also free, i tried a f...

It's the best out there. There's nothing as good.

restive ridge May 14, 2025, 10:00 PM

#

I think Claude 3.7 is still extremely popular but I have never tried it.

abstract plover May 14, 2025, 11:05 PM

#

carmine spoke whats the most similar model to 2.5 while its down? thats also free, i tried a f...

flash thinking

novel flower May 14, 2025, 11:14 PM

#

abstract plover flash thinking

huh? which one?

novel flower May 14, 2025, 11:14 PM

#

ancient burrow It's the best out there. There's nothing as good.

T_T

abstract plover May 14, 2025, 11:14 PM

#

novel flower huh? which one?

gemini 2.5 flash thinking

novel flower May 14, 2025, 11:15 PM

#

abstract plover gemini 2.5 flash thinking

not free though isn't it?

abstract plover May 14, 2025, 11:15 PM

#

comes close and I mean like 60% of that quality

abstract plover May 14, 2025, 11:15 PM

#

novel flower not free though isn't it?

yeah its not

novel flower May 14, 2025, 11:15 PM

#

yeah he asked for free

abstract plover May 14, 2025, 11:18 PM

#

ahh nvm

autumn agate May 15, 2025, 4:03 AM

#

copper pilot .

what gui is that for the api

restive locust May 15, 2025, 4:05 AM

#

autumn agate what gui is that for the api

SillyTavern

autumn agate May 15, 2025, 4:06 AM

#

restive locust SillyTavern

ty

abstract plover May 15, 2025, 4:57 AM

#

The gemini reasonings are worthless

#

I suspect they arent even summaries of the reasoning

dim ibex May 15, 2025, 6:40 AM

#

it seems the Gemini 2.5 preview is now so much better compared to last week.

very smart at coding tasks (Gemini 2.5 Pro vs Claude is like Gundam vs RoboCop (Claude))
Fast (sometimes it uses reasoning delta, sometimes not)
Cache is getting better, but there's still room for big improvement: it's still not consistent for at least 5 minutes, most of the time it only lasts 2-3 minutes

i hope they make the cache renew while it's actively being used (like Claude)
on efficiency, Claude is still probably cheaper for long-running tasks.

kudos to google team.
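
To illustrate the "renew while actively used" behaviour asked for above, here is a toy model of a cache entry with a fixed versus a sliding expiry. `TtlEntry` and the 5-minute figure are assumptions for illustration only, and this does not describe how either provider actually implements caching:

```ts
// Hypothetical model of a cache entry lifetime, for illustration only.
// sliding = false: the entry expires `ttlMs` after it was written.
// sliding = true: every hit pushes the expiry out by another `ttlMs`.
class TtlEntry {
  private expiresAt: number;
  private readonly ttlMs: number;
  private readonly sliding: boolean;

  constructor(createdAt: number, ttlMs: number, sliding: boolean) {
    this.ttlMs = ttlMs;
    this.sliding = sliding;
    this.expiresAt = createdAt + ttlMs;
  }

  // Returns true on a cache hit at time `now` (ms).
  hit(now: number): boolean {
    if (now >= this.expiresAt) return false;
    if (this.sliding) this.expiresAt = now + this.ttlMs;
    return true;
  }
}

const MIN = 60_000;
const fixedEntry = new TtlEntry(0, 5 * MIN, false);
const slidingEntry = new TtlEntry(0, 5 * MIN, true);
// One request every 2 minutes against a 5-minute TTL.
const times = [2 * MIN, 4 * MIN, 6 * MIN];
const fixedHits = times.map((t) => fixedEntry.hit(t));
const slidingHits = times.map((t) => slidingEntry.hit(t));
console.log(fixedHits, slidingHits);
```

With a request every 2 minutes, the fixed entry misses on the third request while the sliding entry keeps hitting, which is the difference that matters for long-running sessions.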

abstract plover May 15, 2025, 7:07 AM

#

yes I agree

novel flower May 15, 2025, 7:26 AM

#

dim ibex it seems the Gemini 2.5 preview now is becoming so much better compare to last w...

thanks

boreal island May 15, 2025, 12:34 PM

#

dim ibex it seems the Gemini 2.5 preview now is becoming so much better compare to last w...

Hearing rumours of Anthropic dropping something next week

#

PagMan

#

But yeah 3.7 can't cut it atm

abstract plover May 15, 2025, 12:39 PM

#

even grok is dropping a model or two

abstract plover May 15, 2025, 1:03 PM

#

insane rate limits on this model these days

graceful robin May 15, 2025, 1:25 PM

#

there's a banner on the model page explaining

#

https://openrouter.ai/google/gemini-2.5-pro-exp-03-25

carmine spoke May 15, 2025, 1:27 PM

#

isnt that saying just use the paid version the free is dead now?

graceful robin May 15, 2025, 1:29 PM

#

it's not an ex model yet, it has not ceased to be

carmine spoke May 15, 2025, 1:34 PM

#

the banner does say to use the paid version and twitter does have it paused

dim ibex May 15, 2025, 1:35 PM

#

boreal island Hearing rumours of Anthropic dropping something next week

hoping its better than 2.5 pro

boreal island May 15, 2025, 1:37 PM

#

Think Opus tier

graceful robin May 15, 2025, 1:37 PM

#

https://www.cnbc.com/2025/05/14/youtube-gemini-ai-feature-will-target-ads-when-viewers-most-engaged.html they had to cuck it a little to spare some compute for the real moneymaker

CNBC

YouTube announces Gemini AI feature to target ads when viewers are ...

"Peak Points" will allow advertisers to use Google's Gemini AI model to target ads to viewers when they are most engaged with a YouTube video.

midnight venture May 15, 2025, 2:23 PM

#

graceful robin https://www.cnbc.com/2025/05/14/youtube-gemini-ai-feature-will-target-ads-when-v...

Wow

#

Trough full of slop

midnight venture May 15, 2025, 2:24 PM

#

carmine spoke the banner does say to use the paid version and twitter does have it paused

Vertex still works so this is just for AI studio I assume

carmine spoke May 15, 2025, 2:25 PM

#

midnight venture Vertex still works so this is just for AI studio I assume

yea but vertex is a bit more complicated to sign up for than AI Studio, that's probably why it's still up, to get people buying into their stuff

umbral bane May 15, 2025, 5:21 PM

#

Does 2.5 pro support tool calling now?

true token May 15, 2025, 7:55 PM

#

dim ibex it seems the Gemini 2.5 preview now is becoming so much better compare to last w...

Gemini 2.5 preview is faster here too.

They just have had some serious problems. It was very slow.

kind condor May 15, 2025, 9:59 PM

#

umbral bane Does 2.5 pro support tool calling now?

it does and i think it always did

distant shell May 16, 2025, 3:01 AM

#

boreal island But yeah 3.7 can't cut it atm

Using 3.7 in Claude code has been a revelation, it's just so much better and more useful than ones built into editors.

#

But I think that's more a combination of the software and Claude.

novel flower May 16, 2025, 3:31 AM

#

distant shell Using 3.7 in Claude code has been a revelation, it's just so much better and mor...

o.o

#

thanks sir chapel @distant shell

distant shell May 16, 2025, 4:14 AM

#

it does help I paid for Claude max, and as such feel like I don't have to worry about nickel-and-diming myself

#

I've had 3 going at the same time, work + two personal projects

#

just setup a task list and let them go

#

as long as you're okay picking up the pieces if shit hits the fan

plush bridge May 16, 2025, 4:43 AM

#

distant shell just setup a task list and let them go

Curious what the max duration is that Claude Code can go on autonomously and continuously, while making progress without human intervention?

distant shell May 16, 2025, 4:52 AM

#

plush bridge Curious for what's max duration can Claude Code go on autonomously and continuou...

Umm, well that kind of depends on how you want to define it. On time, I've let it go for a while (like over 30 minutes) but I can't say to the clock how long it was running because I was doing other things and it may have completed before then. The other aspect is sometimes their AI is congested and slower, so it isn't always 100% active even if still working. But if it's told to keep working, has a clear todo list and actions to complete, and in general has enough user instructions about when it should touch base, it will do quite a bit for better or worse on its own. I've queued up a job from 0 with a fleshed out multi document design foundation, had it make a task list and go. It did about 30% of the total on its own by the time I came back in the morning. Mind you that 30% was the bulk of making something useful, well beyond what other tools have done in one shot. It was also not a frontend/web app.

#

There's no coded limit on what it can do or for how long though, unlike Cursor and others. It will churn through tokens, and with the max plan, they have their own tracking so that happens at the API layer and not on the client. It's actually quite nice to get full Claude capacity, versus other tools that restrict it, which is probably why Claude feels so dumb in those.

plush bridge May 16, 2025, 4:57 AM

#

distant shell Umm, well that kind of depends on how you want to define it. On time, I've let i...

Nice. So definitely can go beyond 10 minutes on average with a clear todo list?

#

I'm trying to figure out where Claude Code fits in the spectrum from Devin to Cursor.

distant shell May 16, 2025, 4:59 AM

#

its more a partner than a tool on that front

#

but not fully autonomous, though tbh not that far from it, with some type of meta layer to drive it

#

I've thought about creating a wrapper that uses gemini or something, and have it think about the big picture and meta stuff, and when claude comes back it does checks and verifies things and then sends claude back out
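
A rough sketch of the wrapper idea described above, assuming nothing about either vendor's API: `worker` stands in for whatever drives Claude and `verifier` for the Gemini-style checker. Both are hypothetical callables supplied by the caller, and `supervise` is an illustrative name, not an existing tool.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical signatures: the worker gets a task plus feedback from the last
# check; the verifier returns (accepted, feedback for the next attempt).
Worker = Callable[[str, str], str]
Verifier = Callable[[str, str], Tuple[bool, str]]


def supervise(
    tasks: List[str],
    worker: Worker,
    verifier: Verifier,
    max_rounds: int = 3,
) -> Dict[str, Tuple[str, bool]]:
    """Send each task to the worker, check the result, and retry with feedback."""
    results: Dict[str, Tuple[str, bool]] = {}
    for task in tasks:
        feedback, output, ok = "", "", False
        for _ in range(max_rounds):
            output = worker(task, feedback)
            ok, feedback = verifier(task, output)
            if ok:
                break
        # Keep the last output even if it never passed, so a human can review it.
        results[task] = (output, ok)
    return results
```

The `max_rounds` cap is a design choice to stop the loop from burning tokens when the checker and the worker never agree.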

#

two cool things about claude code, built in todos (I've been leveraging it to put things in even for myself) and it can batch tool calls, so if it decides it needs to do a bunch of things at once, it can set them up to go, that includes changes and I think it can trigger sub agents, there's been at least one time where it batched a bunch of things, then went "I need to edit this file... looks at file.... oh I guess my batch task already did it..." heh I was like wat?

#

note I don't have many mcp servers, I have tried context7, it can be useful, but claude can search the web too and that can be just as useful in some cases, the postgres server is useful for having them discover data and schemas themselves when writing code that touches it

#

I barely explicitly give context (including a file). I almost always just refer to it by name or vague naming, or sometimes just describe what I think it is, much as I would with a coworker, and let it find it itself. If I were paying for the api calls, I might be a bit more conservative, but I'm gonna make that $100 worth it

novel flower May 16, 2025, 5:32 AM

#

distant shell note I don't have many mcp servers, I have tried context7, it can be useful, but...

context7 is good but i think using crawl4ai and building RAG content for your llm might be worth a shot

plucky gazelle May 16, 2025, 11:39 AM

#

Quick question here... I tried using google vertex in openrouter but it just doesn't work. Can someone help?

sick latch May 16, 2025, 1:33 PM

#

How do I use latest gemini-2.5 pro weights through openrouter? Coz I think the gemini 2.5 pro listed on openrouter points to gemini-2.5-pro-preview-03-25 and not to gemini-2.5-pro-preview-05-06.

restive locust May 16, 2025, 1:36 PM

#

sick latch How do I use latest gemini-2.5 pro weights through openrouter? Coz I think the g...

you can see what endpoint we use in this icon here

#

our version points to the latest checkpoint. there is no way to hit the march checkpoint for the preview model

#

(this is a google limitation)

sick latch May 16, 2025, 1:38 PM

#

ohh, great. Thanks for correcting me. @restive locust

restive locust May 16, 2025, 1:38 PM

#

no worries!

sick latch May 16, 2025, 1:40 PM

#

@restive locust Do you find this most recent Gemini difficult to control, or is it just me? I've invested a lot of effort in carefully guiding it to produce the results in the format I want. Gemini has bothered me more than any other model.

restive locust May 16, 2025, 1:41 PM

#

I definitely think the newer reasoning models are worse at instruction following, yeah

#

I have seen people use gemini to plan / architect, and GPT-4.1 or Sonnet 3.5 to implement

wheat quest May 16, 2025, 2:56 PM

#

AI Studio has just rolled out batch requests for the Gemini 2.5 and 2.0 series of models

void elm May 16, 2025, 3:42 PM

#

j

restive locust May 17, 2025, 3:43 AM

#

2.5 Pro Experimental is officially deprecated

novel flower May 17, 2025, 3:47 AM

#

you will be missed

#

tubbySalute

merry geode May 17, 2025, 4:16 AM

#

I probably used hundreds of dollars worth of tokens

restive locust May 17, 2025, 4:18 AM

#

merry geode I probably used hundreds of dollars worth of tokens

yeah I did the math. it was a lot of inference. like a lot.

novel flower May 17, 2025, 4:20 AM

#

merry geode I probably used hundreds of dollars worth of tokens

same

midnight venture May 17, 2025, 7:39 AM

#

merry geode I probably used hundreds of dollars worth of tokens

We all did

#

Long live exp

#

Good model

novel flower May 17, 2025, 7:41 AM

#

good model

#

maybe we get to enjoy another good model in the future

plush bridge May 17, 2025, 8:02 AM

#

Looking forward to the next experimental / stealth model!

boreal island May 17, 2025, 10:20 AM

#

restive locust 2.5 Pro Experimental is officially deprecated

SadgeInTheRain

#

Vertex preview doesn't point to a different model like the google post says?

https://discuss.ai.google.dev/t/urgent-feedback-call-for-correction-a-serious-breach-of-developer-trust-and-stability-update-google-formally-responds-8-days-later/82399/54

Google AI Developers Forum

Urgent Feedback & Call for Correction: A Serious Breach of Develope...

This is very unfortunate. I spent a lot of time working on our use cases in the area of legal text comparison with gemini-2.5-pro-exp-03-25. The results were great. Now things are broken. I hope there’s a way to get back API access to 03-25.

novel flower May 17, 2025, 11:39 AM

#

plush bridge Looking forward to the next experimental / stealth model!

Same friend, same

void elm May 17, 2025, 8:05 PM

#

cause gemini is not LOTS OF MONEY..... ill start more video games now...
and enjoy me unemployency payment

#

and anime

#

until next vibecoding free sota model

dry ingot May 17, 2025, 8:17 PM

#

restive locust 2.5 Pro Experimental is officially deprecated

wtf

midnight venture May 17, 2025, 8:41 PM

#

boreal island Vertex preview doesn't point to a different model like the google post says? ht...

It’s not that easy, everything is pretty thrashed now

novel flower May 17, 2025, 8:57 PM

#

void elm cause gemini is not LOTS OF MONEY..... ill start more video games now... and enj...

Wtf invite me

novel flower May 17, 2025, 8:57 PM

#

void elm until next vibecoding free sota model

Same

void elm May 17, 2025, 8:57 PM

#

novel flower Wtf invite me

inv to what

novel flower May 17, 2025, 8:58 PM

#

midnight venture It’s not that easy, everything is pretty thrashed now

😦

novel flower May 17, 2025, 8:58 PM

#

void elm inv to what

Unemployment payment and video games

void elm May 17, 2025, 8:58 PM

#

ben stop it

copper pilot May 18, 2025, 12:02 AM

#

Whoaaaa I'm watching it stream thought summaries as a single bolded header plus paragraph every 3 seconds, meaning they're processing their thoughts near real time. There's 35 headers in this one for a total of 14.8k output.

mystic zodiac May 18, 2025, 12:54 AM

#

Hi !

#

Is there any way to make gemini-2.5-pro-preview on OR also give its reasoning tokens ?

graceful robin May 18, 2025, 2:37 AM

#

mystic zodiac Is there any way to make gemini-2.5-pro-preview on OR also give its reasoning to...

No, Google has to provide it and they only provide it to allowlisted large accounts

novel flower May 18, 2025, 4:16 AM

#

any similar model to gemini 2.5 03 25 for coding on openai or any other?

flint lion May 18, 2025, 5:23 AM

#

So what are the odds we'll see a 3.0 Gemini pro experimental sometime within the month?

novel flower May 18, 2025, 5:56 AM

#

flint lion So what are the odds we'll see a 3.0 Gemini pro experimental sometime within the...

maybe in i/o google?

flint lion May 18, 2025, 5:58 AM

#

novel flower maybe in i/o google?

That's pretty much what I was angling around

plush bridge May 18, 2025, 7:08 AM

#

If OpenAI, Claude and DeepSeek are good examples, it won't be 3.0, but a new checkpoint for 2.5.

#

gemini-2.5-pro-05-20 or something. GA version, not preview or experimental.

#

RIP experimental mentions and references are completely gone in the docs.

Screenshot_2025-05-18-15-11-31-090_com.android.chrome-edit.jpg

celest idol May 18, 2025, 10:46 AM

#

nooo

graceful robin May 18, 2025, 11:48 AM

#

Haven't scrubbed it everywhere 😅

plush bridge May 18, 2025, 1:53 PM

#

graceful robin Haven't scrubbed it everywhere 😅

Their marketing department is weird. I'm literally using it everyday and I still get ads for Gemini everyday.

wheat quest May 18, 2025, 6:09 PM

#

gemini-2.5-pro-deepthink POGGERS

restive locust May 18, 2025, 6:11 PM

#

wheat quest gemini-2.5-pro-deepthink POGGERS

claude ultrathink vibes

wheat quest May 18, 2025, 6:12 PM

#

no free tier though

pallid wren May 18, 2025, 10:32 PM

#

gemini-2.5-ultra-5-20

novel flower May 18, 2025, 11:42 PM

#

pallid wren gemini-2.5-ultra-5-20

??????

#

wow

#

sidkcmodel

boreal island May 19, 2025, 5:46 AM

#

pallid wren gemini-2.5-ultra-5-20

kekwuh

abstract plover May 19, 2025, 6:00 AM

#

gemini-2.5-ultra-pro-max-6-09

plush bridge May 19, 2025, 9:51 AM

#

lol the new Google AI Studio usage tab is classifying 2.5 Pro Preview as 2.5 Pro Exp

wheat quest May 19, 2025, 9:53 AM

#

plush bridge lol the new Google AI Studio usage tab is classifying 2.5 Pro Preview as 2.5 Pro...

It's lifting the data from the quota service, so gemini-2.0-pro-exp maps to gemini-exp-1206, gemini-2.0-pro-exp-02-05 and gemini-2.5-pro-exp-03-25, and gemini-2.5-pro-exp maps to gemini-2.5-pro-preview-03-25 and gemini-2.5-pro-preview-05-06.
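
The mapping in the message above, written out as a lookup table. This is only a sketch: the bucket and model names come from the message, while `QUOTA_BUCKETS` and `bucket_for` are hypothetical names for illustration and not part of any Google API.

```python
from typing import Optional

# Quota bucket -> models whose usage is reported under it, per the message above.
QUOTA_BUCKETS = {
    "gemini-2.0-pro-exp": [
        "gemini-exp-1206",
        "gemini-2.0-pro-exp-02-05",
        "gemini-2.5-pro-exp-03-25",
    ],
    "gemini-2.5-pro-exp": [
        "gemini-2.5-pro-preview-03-25",
        "gemini-2.5-pro-preview-05-06",
    ],
}


def bucket_for(model: str) -> Optional[str]:
    """Return the quota bucket a model's usage shows up under, if listed."""
    for bucket, models in QUOTA_BUCKETS.items():
        if model in models:
            return bucket
    return None
```

This is why 2.5 Pro Preview usage ends up labelled as 2.5 Pro Exp in the usage tab.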

plush bridge May 19, 2025, 9:55 AM

#

wheat quest It's lifting the data from the quota service, so `gemini-2.0-pro-exp` maps to `g...

i can imagine the Google AI Studio team taking every chance to cut corners and ship fast while the GCP team sighs and shakes head lol

#

must be a nightmare to get the GenAI APIs working with GCP infra

wheat quest May 19, 2025, 12:17 PM

#

Looks like Gemini is getting a urlContext built in tool that can fetch the contents of URLs to feed into the model.

#

the built in googleSearch tool is also getting the ability to specify a time range of results to search

plush bridge May 19, 2025, 2:38 PM

#

Damn. So many startups killed again.

#

Might need to think extra hard what AI to build now.

wheat quest May 19, 2025, 4:24 PM

#

* the built in search tool also now allows specifying a lat/lon location to geolocate searches
* the Google SDKs are getting MCP support
* you will be able to set your own video FPS to sample videos at instead of the fixed 1FPS
* live API is getting multi speaker support

#

urlContext built in tool is now live.

{"contents":[{"role":"user","parts":[{"text":"Hi there! What are the headlines on https://bbc.com?"}]}],"generationConfig":{"thinkingConfig":{"includeThoughts":true},"temperature":0,"seed":0},"tools":{"urlContext":{}}}

📎 urlcontextresp.json
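
For anyone who wants to script this, here is a minimal sketch that rebuilds the request body posted above. The field names are copied from that message as-is; `build_url_context_request` is a hypothetical helper name, and the endpoint and auth are left out on purpose since the message does not show them.

```python
import json


def build_url_context_request(prompt: str) -> dict:
    """Build a request body shaped like the one in the message above."""
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"includeThoughts": True},
            "temperature": 0,
            "seed": 0,
        },
        # Copied as posted; check the official docs for the exact tools shape.
        "tools": {"urlContext": {}},
    }


body = build_url_context_request(
    "Hi there! What are the headlines on https://bbc.com?"
)
# Compact separators mirror the single-line JSON style of the original post.
print(json.dumps(body, separators=(",", ":")))
```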

novel flower May 19, 2025, 5:10 PM

#

wheat quest * the built in search tool also now allows specify a lat/lon location to geoloca...

damn

languid crest May 20, 2025, 12:07 AM

#

Gemini has the most confusing names, everything else I can follow just fine

mellow turret May 20, 2025, 12:18 AM

#

Basically

Model + family number + fluff + date

Model is going to be Gemini
The currently relevant family numbers are 2.0 and 2.5
Some fluff we've seen:

exp: Experimental, free models. Huge no for production use (Google prohibits that in their terms), very few guarantees
preview: Slightly less experimental, paid models. Still not fit for production use, though at least google doesn't straight up prohibit it
thinking: Used in the 2.0 family, as these were not hybrid models. There's a separate 2.0 thinking model, unlike 2.5 Flash
flash: Fast
pro: Slower but better
lite: Cheaper and worse than Flash

And then the date is given in format mm-dd. In actual production releases, they may use incremental numbers instead of dates (like they e.g. did with Gemini 1.5 Pro and Gemini 1.5 Pro 002)
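
The scheme above (model + family number + fluff + date) can be sketched as a small parser. This is an illustration under assumptions, not an official grammar: `parse_model_name`, `NAME_RE` and `KNOWN_FLUFF` are made-up names, and names that skip the family number (like gemini-exp-1206) or use incremental suffixes (like 002) are deliberately not handled.

```python
import re
from typing import Optional

# Fluff words listed in the message above.
KNOWN_FLUFF = {"exp", "preview", "thinking", "flash", "pro", "lite"}

# model - family number - zero or more fluff words - optional mm-dd date
NAME_RE = re.compile(
    r"^(?P<model>gemini)-(?P<family>\d+\.\d+)"
    r"(?P<fluff>(?:-[a-z]+)*)"
    r"(?:-(?P<date>\d{2}-\d{2}))?$"
)


def parse_model_name(name: str) -> Optional[dict]:
    """Split a model name into its parts, or return None if it does not fit."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    fluff = [part for part in match.group("fluff").split("-") if part]
    return {
        "model": match.group("model"),
        "family": match.group("family"),
        "fluff": fluff,
        # Fluff words not covered by the list above, e.g. "deepthink".
        "unknown_fluff": [part for part in fluff if part not in KNOWN_FLUFF],
        "date": match.group("date"),
    }
```

For example, `parse_model_name("gemini-2.5-pro-preview-05-06")` yields family `2.5`, fluff `pro` and `preview`, and date `05-06`.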

novel flower May 20, 2025, 12:38 AM

#

thanks kyle

slow sage May 20, 2025, 7:52 AM

#

thanks kyle

open pond May 20, 2025, 7:57 AM

#

thanks kyle

wheat quest May 20, 2025, 8:18 AM

#

Gemini 2.5 Pro is getting audio output today at $20/million audio tokens

abstract plover May 20, 2025, 11:17 AM

#

Damn

plush bridge May 20, 2025, 11:19 AM

#

Damn

elder rain May 20, 2025, 12:50 PM

#

Damn

kindred matrix May 20, 2025, 2:25 PM

#

🥲

fringe rapids May 20, 2025, 2:34 PM

#

Damn

restive locust May 20, 2025, 2:35 PM

#

Damn

digital warren May 20, 2025, 3:41 PM

#

They have removed raw thoughts from aistudio, replacing it with summaries only. This is a major bummer 😦

boreal island May 20, 2025, 5:13 PM

#

digital warren They have removed raw thoughts from aistudio, replacing it with summaries only. ...

Of course they're doing that

#

They don't want people stealing reasoning via training keka

digital warren May 20, 2025, 5:21 PM

#

but I didn't train on it, I just liked reading it, it was valuable content.

spark obsidian May 20, 2025, 6:26 PM

#

That is a bummer.

boreal island May 20, 2025, 6:29 PM

#

Unfortunately they called it a feature earlier in I/O

#

Deadge

#

In other news flash 2.5 05-20 better than 2.5 Pro for RP keka

spark obsidian May 20, 2025, 6:30 PM

#

digital warren but I didn't train on it, I just liked reading it, it was valuable content.

The new Gemini Flash seems to have the full thinking in AI studio

wraith crown May 20, 2025, 8:10 PM

#

boreal island In other news flash 2.5 05-20 better than 2.5 Pro for RP keka

How? Did you test this?

clear parcel May 20, 2025, 8:21 PM

#

boreal island In other news flash 2.5 05-20 better than 2.5 Pro for RP keka

https://tenor.com/view/hot-day-are-you-ready-for-some-football-gif-16278812894835706568

Tenor

#

lmao flash is trash for RP

#

unless we doing RP in 2023 with the first LLMs

dry ingot May 21, 2025, 8:58 AM

#

wheat quest Gemini 2.5 Pro is getting audio output today at $20/million audio tokens

wow

#

no thinking budget for 2.5 pro?

wheat quest May 21, 2025, 9:16 AM

#

2.5 Pro thinking budget is coming in June, closer to when it goes GA

restive ridge May 21, 2025, 2:51 PM

#

Ah they pushed it back a month

#

Probably sucks then lol

solemn vigil May 22, 2025, 1:34 AM

#

anyone else find that the new pro preview absolutely sucks compared to the march version ? Its such a frustrating model to use now

potent coral May 22, 2025, 1:37 AM

#

solemn vigil anyone else find that the new pro preview absolutely sucks compared to the march...

Yes

#

Google has made the model dumber by making it tamer

novel flower May 22, 2025, 1:42 AM

#

solemn vigil anyone else find that the new pro preview absolutely sucks compared to the march...

yes, you and 95% of the AI community

#

like @potent coral said enshittification

potent coral May 22, 2025, 1:43 AM

#

It's actually quite funny that the older version was the smarter one. It was actually able to see bad in good and good in bad, and didn't totally reject a concept when you argued with it. When it did reject something, it provided a good argument while still being able to see and understand different views and possibilities.

But the new one just rejects it without even providing a good argument.

novel flower May 22, 2025, 1:43 AM

#

so now gemini 2.5 flash thinking is good or o4 mini

open pond May 22, 2025, 1:44 AM

#

they are def pushing more compute into the new shiny model for a few weeks (flash)

#

so keep that in mind :p although 2.5 pro > flash generally no matter wat imo

novel flower May 22, 2025, 1:45 AM

#

yeah for now 2.5 flash very good, in a few weeks it will get the same treatment and become useless again

#

2.5 pro 03 25 was so fucking good and so fast, i fucking miss it 😭

#

oh well when 2.5 flash goes to shit ill go with either claude 3.7 thinking or o4 mini/o3 if i still have free tokens

abstract plover May 22, 2025, 10:51 AM

#

damn this new model is dogshit

#

the new model is as smart as a rock

#

thought 120 seconds for a basic python task.

fringe rapids May 22, 2025, 9:45 PM

#

Yep, same experience here

novel flower May 22, 2025, 10:37 PM

#

abstract plover the new model is as smart as a rock

what use now sir

solemn vigil May 22, 2025, 11:57 PM

#

I'm glad I'm not the only one finding this. My Twitter feed was uncharacteristically quiet on the matter but it seems a common experience here & on the r/Bard subreddit

solemn vigil May 22, 2025, 11:58 PM

#

open pond so keep that in mind :p although 2.5 pro > flash generally no matter wat imo

price to performance IDK. flash is killing it for me still in the usecases where I would want to use flash, but pro sucks at pro usecases... pro now feels like a 10x price flash for 1.2x performance .... when previous iteration was a huge improvement for those usecases

novel flower May 22, 2025, 11:59 PM

#

solemn vigil price to performance IDK. flash is killing it for me still in the usecases where...

enshittification

solemn vigil May 23, 2025, 12:06 AM

#

novel flower enshittification

definitely feels that way

novel flower May 23, 2025, 12:06 AM

#

solemn vigil definitely feels that way

you're not alone, everyone feels the same

dry ingot May 23, 2025, 5:49 PM

#

still no thinking budget

wheat quest May 23, 2025, 7:11 PM

#

Google's going down the Anthropic route of providing signed thoughts to be able to reuse thoughts in subsequent requests.

ionic solar May 23, 2025, 8:05 PM

#

solemn vigil price to performance IDK. flash is killing it for me still in the usecases where...

I just came here investigating the same. We pushed out Gemini pro 2.5 a couple of weeks back and now everything is breaking in production. It is randomly stopping in between a response, refusing to do stuff, and sometimes just filling up the thinking response with repetitive garbage. Shocking move by google. Do you think flash is better than Sonnet 3.5 ?

novel flower May 23, 2025, 8:13 PM

#

ionic solar I just came here investigating the same. We pushed out Gemini pro 2.5 a couple o...

#1375116913109372968 message

ionic solar May 23, 2025, 8:26 PM

#

Thanks !

abstract plover May 23, 2025, 8:34 PM

#

wheat quest Google's going down the Anthropic route of providing signed thoughts to be able ...

huh wdym?

abstract plover May 23, 2025, 8:35 PM

#

ionic solar I just came here investigating the same. We pushed out Gemini pro 2.5 a couple o...

Having a similar experience. I assume its turbulence before making it GA.

#

Token speed dropped from 400 to ~100 now, which is an artificial limit. Summaries have much more BS and the model is a bit dumber.

pallid wren May 23, 2025, 10:26 PM

#

abstract plover Token speed dropped from 400 to 100~ now , which is an artifical limit. Summarie...

The number of times I've seen the summary repeat the same thing over and over is too high.

ionic solar May 24, 2025, 7:02 AM

#

abstract plover Token speed dropped from 400 to 100~ now , which is an artifical limit. Summarie...

It's crazy. Was just trying to confirm what I am seeing. I am seeing more errors (repetitive garbage, replies being cut short) rather than issues with logic.

hexed rapids May 24, 2025, 8:57 PM

#

RP/ERP has also worsened compared to EXP-03-25 (now with very long contexts it suffers from repetitions).
It seemed strange to me that Google was getting them all right!

novel flower May 24, 2025, 9:54 PM

#

hexed rapids RP/ERP has also worsened compared to EXP-03-25 (now with very long contexts it s...

welcome to the club, 03 25 was the best they dumb down the model.. now we wait

dry ingot May 24, 2025, 10:31 PM

#

novel flower welcome to the club, 03 25 was the best they dumb down the model.. now we wait

bro i'm tired of this shit it's insane

#

they keep ruining good shit

novel flower May 24, 2025, 11:11 PM

#

dry ingot they keep ruining good shit

Yes brother, maybe they will release the gemini 3.0 or 2.5 GA soon

dry ingot May 24, 2025, 11:11 PM

#

novel flower Yes brother, maybe they will release the gemini 3.0 or 2.5 GA soon

big doubt

novel flower May 24, 2025, 11:13 PM

#

Hehehe

dry ingot May 24, 2025, 11:14 PM

#

novel flower Hehehe

where is 2.5 thinking budget ?

novel flower May 24, 2025, 11:45 PM

#

dry ingot where is 2.5 thinking budget ?

what mean

dry ingot May 24, 2025, 11:45 PM

#

novel flower what mean

shouldnt we be getting thinking budget?

#

for gemini 2.5 pro

novel flower May 24, 2025, 11:47 PM

#

not quite sure

restive locust May 25, 2025, 12:21 AM

#

dry ingot for gemini 2.5 pro

not available in the gemini api yet

#

they said june

indigo jasper May 25, 2025, 5:21 AM

#

I'm curious about what the raw # of calls looks like for this

#

assuming it stayed pretty constant, this chart seems like great evidence of how 2.5 pro has become such a yapper 😂

graceful robin May 25, 2025, 10:23 AM

#

interesting how the gemini webapp after deepresearch will offer to generate an infographic - this was one: https://www.jdoodle.com/ih/1HBq
and the prompt is something like this

Tailwind CSS and Chart.js loaded via CDN.

The "Brilliant Blues" color palette applied throughout.

Responsive design with a grid layout for content sections.

Chart.js visualizations for Context Window Comparison, Architectural Pillars (Doughnut), and MRCR Benchmark Performance. These charts include the required label wrapping for labels longer than 16 characters and the specified tooltip configuration.

Chart containers are styled according to the requirements (full width of parent, max-width, centered, controlled responsive height).

HTML/CSS diagrams for the "Thinking Model" paradigm and "Context Caching" process, avoiding SVG and Mermaid JS.

Content derived from the "Gemini 2.5 Long Context Excellence" report, with introductory paragraphs for each section and explanatory text for all visualizations.

No SVG or Mermaid JS has been used.

The output starts with <!DOCTYPE html> and ends with </html>, with no extraneous characters or comments (the planning comments present in the <style> block during generation are not functional HTML/CSS/JS comments and are for context; they wouldn't appear in a rendered page's comment section and are within the rules provided).

abstract plover May 25, 2025, 11:30 AM

#

graceful robin interesting how the gemini webapp after deepresearch will offer to generate an i...

could you please fix the link , it aint working

abstract plover May 25, 2025, 11:31 AM

#

indigo jasper assuming it stayed pretty constant, this chart seems like great evidence of how ...

google just converted it into a money printing machine

graceful robin May 25, 2025, 12:16 PM

#

abstract plover could you please fix the link , it aint working

fixed soz

dry ingot May 25, 2025, 12:19 PM

#

gemini 2.5 pro degraded alot in performance

abstract plover May 25, 2025, 12:37 PM

#

and sadly its still the best

celest idol May 25, 2025, 5:34 PM

#

dry ingot gemini 2.5 pro degraded alot in performance

yep

#

aider benchmarks have been retried

#

-10%

fringe rapids May 25, 2025, 5:37 PM

#

jeez

#

why do companies love to do that

#

they didn't have any issues with capacity

#

i was really betting on google...

midnight venture May 25, 2025, 5:45 PM

#

fringe rapids why do companies love to do that

Only theory which makes sense to me is Google realised they hit a ceiling on Gemini improvement, quickly retired the insanely good experimental checkpoint in favour of a lighter counterpart
Next release will be an improved exp checkpoint, so people will feel the exp rush all over again as it will easily crush all other competitors and show a "massive" improvement over previous versions

#

When in reality its just a better exp checkpoint which was retired early

fringe rapids May 25, 2025, 5:45 PM

#

Maybe they are nerfing the pro model for their upcoming ultra model

#

So the difference is larger

#

Maybe they made the old pro the new ultra xD

midnight venture May 25, 2025, 5:47 PM

#

fringe rapids So the difference is larger

definitely something along those lines

abstract plover May 25, 2025, 7:33 PM

#

midnight venture Only theory which makes sense to me is Google realised they hit a ceiling on Gem...

isnt it better to just quantize the model , make it think more to get back some intelligence and then sell this version?

midnight venture May 25, 2025, 7:35 PM

#

abstract plover isnt it better to just quantize the model , make it think more to get back some...

idk if you can just make a model think extra hard to avoid quantisation loss

#

google has been doing a lot of work towards quantisation and training, but thats a separate topic and requires training from scratch
On top of that, your flagship model shouldnt ideally be quantised, especially if you're google

indigo jasper May 25, 2025, 8:39 PM

#

celest idol -10%

where are you seeing that?

celest idol May 26, 2025, 3:13 AM

#

indigo jasper where are you seeing that?

aider benchmarks

#

on discord

indigo jasper May 26, 2025, 3:13 AM

#

celest idol aider benchmarks

ran by Paul himself or other people?

celest idol May 26, 2025, 3:13 AM

#

other people

#

but when i run it

#

i also get similar results

indigo jasper May 26, 2025, 3:13 AM

#

I wouldn't trust it, people get very varied results on aider benchmarks that Paul doesn't get

#

up to and exceeding 10% often

#

(I don't know why it varies, but the same thing happened with GPT 4.1 / Quasar, people reporting very different results than what he got)

restive ridge May 26, 2025, 3:48 AM

#

I ran the bad run, it was just 1 run, we would need more runs to draw conclusions.

steady kite May 27, 2025, 4:20 AM

#

anyone has problem with json_schema structured output on gemini? somehow if I use @google/genai directly to aistudio, the JSON response is correct; but using openai sdk via openrouter, the structure got messed up (especially with literals)
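
One way to narrow this down is to check returned JSON against the schema's literals locally. The sketch below is an assumption-heavy illustration: the payload follows the OpenAI-style `response_format` with `json_schema`, the model slug is a guess, and `SCHEMA`, `build_payload` and `literal_violations` are hypothetical names. It makes no network call.

```python
import json

# Example schema with a literal (enum) field; purely illustrative.
SCHEMA = {
    "type": "object",
    "properties": {
        "sentiment": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        "score": {"type": "number"},
    },
    "required": ["sentiment", "score"],
    "additionalProperties": False,
}


def build_payload(prompt: str, model: str = "google/gemini-2.5-pro-preview") -> dict:
    """Build an OpenAI-style chat payload; the default model slug is an assumption."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "review", "strict": True, "schema": SCHEMA},
        },
    }


def literal_violations(raw_json: str) -> list:
    """Return the property names whose values fall outside the schema's enum."""
    data = json.loads(raw_json)
    bad = []
    for key, spec in SCHEMA["properties"].items():
        if "enum" in spec and data.get(key) not in spec["enum"]:
            bad.append(key)
    return bad
```

Running the same prompt through both routes and comparing `literal_violations` on each response would show whether the literals are being lost on one path only.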

lyric owl May 27, 2025, 4:52 AM

#

any way to get gemini 2.5 pro free on open router like before?

novel flower May 27, 2025, 7:01 AM

#

lyric owl any way to get gemini 2.5 pro free on open router like before?

the free version was good, the current version is ass

lyric owl May 29, 2025, 3:17 AM

#

Is the current version not free? What changed

lyric owl May 29, 2025, 3:17 AM

#

novel flower the free version was good, the current version is ass

See above

ancient burrow May 29, 2025, 3:18 AM

#

lyric owl Is the current version not free? What changed

google wants money now

lyric owl May 29, 2025, 3:20 AM

#

ancient burrow google wants money now

So no more Gemini for free? I heard new DeepSeek was good

true token May 30, 2025, 7:05 PM

#

The google gemini (2.5 pro) API is very weird sometimes. One complex prompt, takes almost 2 minutes to complete, gives a very very high quality response. Shortly after, I give it another even harder prompt. Instantly, almost real-time, it replies with a very high quality response. lmao

fringe rapids May 30, 2025, 7:55 PM

#

true token The google gemini (2.5 pro) API is very weird sometimes. One complex prompt, tak...

Thinking isn’t actually very good, there are multiple papers proving this

abstract plover May 30, 2025, 8:09 PM

#

true token The google gemini (2.5 pro) API is very weird sometimes. One complex prompt, tak...

Experienced this too, it's weird indeed

true token May 30, 2025, 9:08 PM

#

Yeah... Maybe the tokens per second went BRRR suddenly

royal ocean May 30, 2025, 10:06 PM

#

The thinking is rough after the update, it used almost 14k tokens one time

restive ridge May 31, 2025, 3:12 AM

#

RE: thinking, I tried flash without thinking and was getting some weird behavior. I will keep playing with it

#

Excited for june release of pro thinking budgets

wheat quest May 31, 2025, 12:36 PM

#

another checkpoint in a few days notlikethis

plush bridge May 31, 2025, 12:52 PM

#

wheat quest another checkpoint in a few days notlikethis

Still no GA?

ancient burrow May 31, 2025, 1:29 PM

#

wheat quest another checkpoint in a few days notlikethis

Source

abstract plover May 31, 2025, 1:51 PM

#

Deathmax is the source

#

he is damn good at this

abstract plover May 31, 2025, 2:15 PM

#

Pro suddenly aint giving reasoning summaries?

novel flower May 31, 2025, 7:55 PM

#

https://cdn.discordapp.com/emojis/408727526261915660.webp?size=44

wheat quest May 31, 2025, 8:07 PM

#

plush bridge Still no GA?

Signs point to 2.5 Flash going GA with the current 05-20 checkpoint, but we're getting another preview model for Pro before GA

indigo jasper May 31, 2025, 8:44 PM

#

abstract plover Deathmax is the source

I thought @wheat quest you said all your info was from public sources

#

😄

indigo jasper May 31, 2025, 8:44 PM

#

wheat quest another checkpoint in a few days notlikethis

Honestly thank god, it’s been such a drought dealing with much dumber models like Claude 4 sonnet

wheat quest May 31, 2025, 8:59 PM

#

indigo jasper I thought @wheat quest you said all your info was from public sources

It is public, but saying where exactly tends to get things patched (like how Google avoids updates to their open source SDKs after flash thinking got leaked)

indigo jasper May 31, 2025, 9:06 PM

#

wheat quest It is public, but saying where exactly tends to get things patched (like how Goo...

True true

sturdy ether May 31, 2025, 9:20 PM

#

at least secondary sources align

novel flower Jun 1, 2025, 12:19 AM

#

sturdy ether at least secondary sources align

1000 requests for free, insane if true

ancient burrow Jun 1, 2025, 12:34 AM

#

novel flower Jun 1, 2025, 12:53 AM

#

ancient burrow

couple of weeks? 😭

celest idol Jun 1, 2025, 6:23 AM

#

sturdy ether at least secondary sources align

whaaaa?

restive ridge Jun 1, 2025, 11:01 AM

#

wheat quest Signs point to 2.5 Flash going GA with the current 05-20 checkpoint, but we're g...

My guess would be it has thinking budget

potent coral Jun 1, 2025, 2:41 PM

#

ancient burrow

Seems even people outside of this community also realise how bad of a downgrade the new 2.5 Pro is in terms of knowledge and understanding outside of the coding domain.

visual loom Jun 1, 2025, 2:57 PM

#

Makes you wonder if they didn't even do an AB test and instead only looked at benchmarks

solemn seal Jun 1, 2025, 3:02 PM

#

or did they purposefully degrade it so DeepSeek-like models cannot use its data?

#

🥸

#

🤷

tacit ingot Jun 1, 2025, 6:18 PM

#

sturdy ether at least secondary sources align

Is it true?

runic ibex Jun 1, 2025, 9:58 PM

#

visual loom Makes you wonder if they didn't even do an AB test and instead only looked at be...

In their own model card they show it dropping on literally every benchmark aside from code.

runic ibex Jun 1, 2025, 10:28 PM

#

Honestly every model release has been weird recently, none of them just a straightforward upgrade.

#

R1 drops on EQBench's creative writing. Then beats the original on long context until 64k where it drops horribly? Maybe a fluke? And that's probably the most uncontested pure upgrade

novel flower Jun 1, 2025, 10:38 PM

#

runic ibex R1 drops on EQBench's creative writing. Then beats the original on long context ...

o.o

runic ibex Jun 1, 2025, 10:38 PM

#

The 2.5 Pro upgrade seems to flipflop on like, everything

novel flower Jun 1, 2025, 10:39 PM

#

runic ibex The 2.5 Pro upgrade seems to flipflop on like, everything

huh, what upgrade?

#

you mean 05 06?

runic ibex Jun 1, 2025, 10:39 PM

#

Yeah

novel flower Jun 1, 2025, 10:40 PM

#

ah, just wait for the new endpoint in some weeks

#

according to gosucoder, 05 06 performs a bit better on cline

#

i sent you a video @runic ibex

runic ibex Jun 1, 2025, 11:07 PM

#

I saw his other video on the new R1 but I'll check it out. Trying out windsurf rn, free so why not. Already used cursor and cline

visual loom Jun 2, 2025, 2:09 AM

#

runic ibex In their own model card they show it dropping on literally every benchmark aside...

Yet they still decided to release it lol

slim turret Jun 2, 2025, 2:18 AM

#

Hi

runic ibex Jun 2, 2025, 2:32 AM

#

visual loom Yet they still decided to release it lol

Code is important. I think every top lab is mad that Claude is just crushing it and has been for quite a long time now

sturdy ether Jun 2, 2025, 2:38 AM

#

visual loom Yet they still decided to release it lol

it's theorized that it's more efficient

potent coral Jun 2, 2025, 2:40 AM

#

sturdy ether it's theorized that it's more efficient

Smaller model?

sturdy ether Jun 2, 2025, 2:40 AM

#

some would say

runic ibex Jun 2, 2025, 2:49 AM

#

Their servers were pretty badly under load a while ago, so could be legit

#

It went up on the UGI knowledge benchmark though, and that's usually positively correlated with model size. So who the hell knows

boreal island Jun 2, 2025, 3:58 AM

#

plush bridge Still no GA?

What's GA?

boreal island Jun 2, 2025, 3:59 AM

#

ancient burrow

I hope so but with the kind of rugpull google did with 03-25 people aren't going to trust them until it is stable for what? At least 3 months? keka

sturdy ether Jun 2, 2025, 4:18 AM

#

general availability

plush bridge Jun 2, 2025, 4:32 AM

#

boreal island What's GA?

General availability. A term that cloud companies use to signal that the product is out of beta and can be used for production workload with SLAs and proper support.

slender ginkgo Jun 2, 2025, 3:06 PM

#

in other words, "give us your money now if you weren't already"

#

but also they know you were already

royal ocean Jun 2, 2025, 7:41 PM

#

Where did the thinking summaries go 😑

pulsar quest Jun 3, 2025, 3:13 AM

#

anyone having horrible gemini hallucinations today

#

what the heck all gemini models, especially this one just having a bad time

indigo jasper Jun 3, 2025, 3:24 AM

#

pulsar quest anyone having horrible gemini hallucinations today

well if @wheat quest is right, your woes should be over tomorrow :)

pulsar quest Jun 3, 2025, 3:29 AM

#

indigo jasper well if @wheat quest is right, your woes should be over tomorrow :)

can i get some context XD

#

who is deathmax

indigo jasper Jun 3, 2025, 3:29 AM

#

new 2.5 pro should be released tmrw

#

or very soon if not tomorrow

#

but leaks and semi-public info suggest tmrw

pulsar quest Jun 3, 2025, 3:30 AM

#

do we know why like

#

a bunch of the gemini models today in general

#

have kind of been tweaking

indigo jasper Jun 3, 2025, 3:30 AM

#

nope

pulsar quest Jun 3, 2025, 3:30 AM

#

welp

#

let's hope things stabilize

#

._.

indigo jasper Jun 3, 2025, 3:33 AM

#

indigo jasper new 2.5 pro should be released tmrw

thursday*

tranquil drift Jun 3, 2025, 3:39 AM

#

not sure if this has been discussed here before but i just ran into this https://discord.com/channels/1091220969173028894/1379302807320137728

looks like google turned on thinking for 2.5 flash and now the first tokens streamed are thinking by default

wheat quest Jun 3, 2025, 10:29 AM

#

If plans don't change, we'll get a new checkpoint in a few hours.

novel flower Jun 3, 2025, 10:58 AM

#

wheat quest If plans don't change, we'll get a new checkpoint in a few hours.

🫡

boreal island Jun 3, 2025, 2:41 PM

#

wheat quest If plans don't change, we'll get a new checkpoint in a few hours.

@vital locust

midnight venture Jun 3, 2025, 2:48 PM

#

wen

wheat quest Jun 3, 2025, 4:41 PM

#

Looks like model is landing on Thursday instead, with thinking budget support

#

Thinking budget for 2.5 Pro will be disabled or 64-32768

abstract plover Jun 3, 2025, 4:51 PM

#

wheat quest Thinking budget for 2.5 Pro will be disabled or 64-32768

whats 64-32768?

#

also , 2.5 flash GA ? Hope they release the model on batch api

#

batch api is still stuck with 2.0 flash 001

wheat quest Jun 3, 2025, 4:54 PM

#

abstract plover whats 64-32768?

the thinking budget can be set between 64 tokens and 32K tokens.

ancient burrow Jun 3, 2025, 9:08 PM

#

wheat quest If plans don't change, we'll get a new checkpoint in a few hours.

What is this

#

Found on reddit post

#

https://www.reddit.com/r/Bard/s/ptIqGjLoQn

From the Bard community on Reddit: Almost there 🚬

Explore this post and more from the Bard community

tacit ingot Jun 3, 2025, 9:10 PM

#

Hmm

wheat quest Jun 3, 2025, 9:13 PM

#

ancient burrow What is this

#1354107710437724221 message

#

And from my teaser on another server

midnight venture Jun 3, 2025, 9:21 PM

#

wheat quest

wow, benchmaxxed slop?

ancient burrow Jun 3, 2025, 9:33 PM

#

What is diff-fenced

abstract plover Jun 3, 2025, 9:56 PM

#

ancient burrow What is diff-fenced

aider format

#

https://aider.chat/docs/more/edit-formats.html

aider

Edit formats

Aider uses various “edit formats” to let LLMs edit source files.

indigo jasper Jun 3, 2025, 10:12 PM

#

wheat quest

wait a sec so despite not being an insider, you got it - the api for new gemini is exposed to the public? 😂

wheat quest Jun 3, 2025, 10:13 PM

#

👀

indigo jasper Jun 3, 2025, 10:13 PM

#

me rn

abstract plover Jun 3, 2025, 10:14 PM

#

wheat quest

insane

restive locust Jun 3, 2025, 10:15 PM

#

indigo jasper wait a sec so despite not being an insider, you got it - the api for new gemini ...

i will continue to vouch that he is not making this stuff up KEKcry

indigo jasper Jun 3, 2025, 10:15 PM

#

yeah

abstract plover Jun 3, 2025, 10:15 PM

#

wheat quest 👀

give a review, does it think more? Sucks at frontend? Dumber than before?

indigo jasper Jun 3, 2025, 10:15 PM

#

I've seen independent confirmation of the same as deathmax in another server from an insider

restive locust Jun 3, 2025, 10:16 PM

#

deathmax is not an insider haha

indigo jasper Jun 3, 2025, 10:16 PM

#

indigo jasper Jun 3, 2025, 10:16 PM

#

restive locust deathmax is not an insider haha

yeah I know

abstract plover Jun 3, 2025, 10:16 PM

#

wait deathmax might be an insider here

indigo jasper Jun 3, 2025, 10:16 PM

#

just saying it confirms that deathmax is the goat :)

wheat quest Jun 3, 2025, 10:16 PM

#

Shrug they turned it off

indigo jasper Jun 3, 2025, 10:16 PM

#

kekwait

#

like

#

just now??

abstract plover Jun 3, 2025, 10:16 PM

#

calling it fake , puts on deathmax

indigo jasper Jun 3, 2025, 10:16 PM

#

noooo

#

they're watching this chat

#

😭

abstract plover Jun 3, 2025, 10:17 PM

#

I bet 100$ toven leaked it

wheat quest Jun 3, 2025, 10:17 PM

#

window wasn't open that long

indigo jasper Jun 3, 2025, 10:17 PM

#

do you just have scripts going monitoring this kind of thing

#

like how plinny gets the system prompt changes

#

LULW

#

so... is this more or less completion_tokens than 05-06

#

because the cost is more

#

and I'm concerned that it's gonna still be a slow loser

#

Seconds per case : 45.3
gemini 2.5 pro 03-25...

#

Seconds per case : 165.3
05-06...

#

so only a slight improvement over current 05-06

#

that's quite sad

abstract plover Jun 3, 2025, 10:21 PM

#

well this model sucked at everything but coding, so I assume the next model is going to be better at the rest of the tasks with slight coding degradation?

wheat quest Jun 3, 2025, 10:21 PM

#

I wouldn't read too much into the test time

indigo jasper Jun 3, 2025, 10:22 PM

#

abstract plover well this model sucked at everything but coding, so I assume the next model...

this is showing +10 points over 05-06 on both pass 1 / pass 2

indigo jasper Jun 3, 2025, 10:22 PM

#

wheat quest I wouldn't read too much into the test time

why not?

wheat quest Jun 3, 2025, 10:22 PM

#

throughput would have been jank given the situation

indigo jasper Jun 3, 2025, 10:23 PM

#

fair

#

rip no token count at this point

#

it is interesting that the total_cost went up assuming costs are accurate... are they raising prices?

novel flower Jun 3, 2025, 10:39 PM

#

raising prices?

#

oh no no no

true token Jun 3, 2025, 10:42 PM

#

You will pay for it

#

And you will enjoy it

novel flower Jun 3, 2025, 10:42 PM

#

true token And you will enjoy it

https://cdn.discordapp.com/emojis/1373086954136277094.webp?size=96

true token Jun 3, 2025, 10:42 PM

#

🤑

ancient burrow Jun 3, 2025, 10:43 PM

#

true token And you will enjoy it

i refused to use 2.5 flash thinking simply due to the thinking tax. I will disenjoy it

#

disenjoy it so much

true token Jun 3, 2025, 10:43 PM

#

ancient burrow i refused to use 2.5 flash thinking simply due to the thinking tax. I will disen...

I understand you... What's your use case? Do you ever use the 2.5 pro?

ancient burrow Jun 3, 2025, 10:44 PM

#

true token I understand you... What's your use case? Do you ever use the 2.5 pro?

honestly with the chatgpt plus subscription i dont rly use anything else

#

cuz like, why use api if i got a model right there ready to answer, that i already paid for

abstract plover Jun 3, 2025, 10:45 PM

#

chatgpt subs is a good vfm

ancient burrow Jun 3, 2025, 10:45 PM

#

but even if i did mainly use api i'd be pissed to find out there is a price markup for no reason

novel flower Jun 3, 2025, 10:46 PM

#

chatgpt plus that good huh?

abstract plover Jun 3, 2025, 10:46 PM

#

ancient burrow but even if i did mainly use api i'd be pissed to find out there is a price mark...

https://www.linkedin.com/posts/zainhas_why-do-reasoning-models-cost-more-than-non-reasoning-activity-7293788367043866624-ZWzt/

Why do reasoning models cost more than non-reasoning ones even thou...

Why do reasoning models cost more than non-reasoning ones even though they have the same architecture? This video provides a great explanation!

I am seeing a lot of people confused about why reasoning models cost more than their non-reasoning counterparts even if they share exactly the same architecture. It has everything to do with the fact th...

#

for the Nth time, there is no such thing as a thinking tax

ancient burrow Jun 3, 2025, 10:46 PM

#

does it go on about token count?

#

i'm talking about price per token

#

if it's about context length, it looks like they could just implement context length-specific pricing, like they've already done with 2.5 pro

#

otherwise i could rack up a lot of context on 2.5 flash non-thinking, and have it cost them just the same, but for some reason they'd be charging me less

#

it just smells like they have it cost more only because the user gets better results, and nothing else.
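
A rough sketch of the two effects being argued over here, with made-up numbers: reasoning tokens billed as output raise the token count, while a separate thinking price raises the price per token. The prices and token counts below are hypothetical placeholders, not Google's or OpenRouter's actual rates, and `output_cost_usd` is just an illustrative helper.

```python
def output_cost_usd(response_tokens: int, reasoning_tokens: int,
                    usd_per_million: float) -> float:
    """Cost of one reply when reasoning tokens are billed as output tokens."""
    billed_tokens = response_tokens + reasoning_tokens
    return billed_tokens * usd_per_million / 1_000_000


# Hypothetical prices in USD per million output tokens (assumptions only).
BASE_PRICE = 1.0      # thinking off
THINKING_PRICE = 5.0  # thinking on, with a per-token markup

# Same 1,000-token visible reply in all three cases.
plain = output_cost_usd(1_000, 0, BASE_PRICE)
more_tokens_only = output_cost_usd(1_000, 1_000, BASE_PRICE)
more_tokens_and_markup = output_cost_usd(1_000, 1_000, THINKING_PRICE)

print(f"no thinking:             ${plain:.4f}")
print(f"extra tokens, same rate: ${more_tokens_only:.4f}")
print(f"extra tokens + markup:   ${more_tokens_and_markup:.4f}")
```

The middle case is the "token count" effect on its own; the gap between the middle and last case is the per-token markup being complained about.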

true token Jun 3, 2025, 10:57 PM

#

Depending on your use case, (and the model)

#

The thinking is very much worth it

abstract plover Jun 3, 2025, 11:01 PM

#

ancient burrow does it go on about token count?

just watch the video

ancient burrow Jun 3, 2025, 11:44 PM

#

I think I answered all the points on the slide on the video

indigo jasper Jun 4, 2025, 3:05 AM

#

Deepseek has somewhat disproved this

#

with R1

#

and them releasing their figures on it

ancient burrow Jun 4, 2025, 7:26 AM

#

@wheat quest omg ur famous

#

#

Wasnt me who posted

novel flower Jun 4, 2025, 7:37 AM

#

famous deathmax

celest idol Jun 4, 2025, 8:14 AM

#

my prediction

#

its overfitted slop

ancient burrow Jun 4, 2025, 8:32 AM

#

celest idol its overfitted slop

Google wouldn't be so stupid to fail its community a second time in the last 30 days.

celest idol Jun 4, 2025, 8:33 AM

#

ancient burrow Google wouldn't be so stupid to fail its community a second time in the last 30 ...

my prediction is its like 03-25

#

but overfitted

ancient burrow Jun 4, 2025, 8:33 AM

#

Especially when they have the edge

celest idol Jun 4, 2025, 8:33 AM

#

i mean i think r1 has the edge imo

#

but eh

#

btw evidence shows deepseek distilled from gemini lol

#

i saw an article proving it

ancient burrow Jun 4, 2025, 8:34 AM

#

2.5 pro has completed a couple coding tasks I gave it that no model that i tested before it was able to

#

One of them was figuring out which elements in a pygame app overlapped and fixing the UI

celest idol Jun 4, 2025, 8:41 AM

#

i mean for me i like switching models

#

sometimes o3 or o4-mini can solve a problem r1 cant

#

sometimes 2.5 pro is better

#

and sometimes sonnet 4 takes the win

#

but in general ive been using r1 the most

foggy flax Jun 4, 2025, 9:24 AM

#

86.2%

#

that's not even their upcoming deep think

plush bridge Jun 4, 2025, 9:37 AM

#

ancient burrow One of them was figuring out which elements in a pygame app overlapped and fixin...

Did you open source the code somewhere? Would love to take a look.

#

Aider at this point is probably leaked, reward hacked, overfitted and outdated for agentic flows.

#

Probably need an Aider benchmark V3 for it to become useful again.

ancient burrow Jun 4, 2025, 9:59 AM

#

plush bridge Did you open source the code somewhere? Would love to take a look.

I have a folder called "funnygpt" but i recently cleared it of anything i didnt wish to keep. I will check if that app is still there.

indigo jasper Jun 4, 2025, 10:29 AM

#

I've stopped using 2.5 pro ENTIRELY recently

#

the fact that it takes 3+ minutes on many tasks

#

is insane

#

even if it gets way better in the next update, if it's not a lot faster, I'm not sure I'll use it!

midnight venture Jun 4, 2025, 11:06 AM

#

indigo jasper even if it gets way better in the next update, if it's not a lot faster, I'm not...

I read this with trump voice

novel flower Jun 4, 2025, 11:08 AM

#

celest idol but in general ive been using r1 the most

The recent r1? R1 05 28?

novel flower Jun 4, 2025, 11:09 AM

#

indigo jasper even if it gets way better in the next update, if it's not a lot faster, I'm not...

Why you need it to be so fast sir?

indigo jasper Jun 4, 2025, 11:09 AM

#

novel flower Why you need it to be so fast sir?

3 minutes per little task is insane

novel flower Jun 4, 2025, 11:10 AM

#

indigo jasper 3 minutes per little task is insane

Not sure why, for me it's fast sir

celest idol Jun 4, 2025, 11:13 AM

#

yes

celest idol Jun 4, 2025, 11:24 AM

#

plush bridge Aider at this point is probably leaked, reward hacked, overfitted and outdated f...

aider benchmark is open source lol

#

not leaked

plush bridge Jun 4, 2025, 11:29 AM

#

celest idol aider benchmark is open source lol

Yeah that's what data leak means for pre-training

celest idol Jun 4, 2025, 11:29 AM

#

ah

#

ngl we need someone trusted to make a closed source benchmark

plush bridge Jun 4, 2025, 11:31 AM

#

celest idol ngl we need someone trusted to make a closed source benchmark

I remember some benchmarks have hidden or withheld datasets. Can't remember which one.

#

ARC-AGI being one

ionic solar Jun 4, 2025, 11:59 AM

#

ancient burrow One of them was figuring out which elements in a pygame app overlapped and fixin...

Seconded. 2.5 pro can do things no one else can.

true token Jun 4, 2025, 3:39 PM

#

I switch between o3, Gemini pro 2.5, r1 and sometimes sonnet 4

#

It all depends

#

Pro 2.5 and sonnet 4 explain code better on average

plush bridge Jun 4, 2025, 4:13 PM

#

https://x.com/testingcatalog/status/1930292977206800515

#

lol google messed up (or genius marketing). even i can see it.

Screenshot_2025-06-05_at_12.14.22_AM.png

sleek cave Jun 4, 2025, 4:15 PM

#

King fall has only 64k context weird…

midnight venture Jun 4, 2025, 4:16 PM

#

true token Pro 2.5 and sonnet 4 explain code better on average

2.5 pro is the explainor 100%, very easy to follow what it says

plush bridge Jun 4, 2025, 4:16 PM

#

not working for me lol

Screenshot_2025-06-05_at_12.16.02_AM.png

midnight venture Jun 4, 2025, 4:16 PM

#

sleek cave King fall has only 64k context weird…

small context -> less resources -> more compute (?)

#

64k is still a lot tbh

plush bridge Jun 4, 2025, 4:17 PM

#

i think just some intern messed up probably, not the actual new 2.5 pro model

#

and it's gone!

true token Jun 4, 2025, 4:28 PM

#

gone

kind condor Jun 4, 2025, 4:28 PM

#

nah why would they label CONFIDENTIAL in a publicly available service lmao

#

messed up hard or genius marketing

plush bridge Jun 4, 2025, 4:29 PM

#

cheap way to generate hype and get attention lol

true token Jun 4, 2025, 4:29 PM

#

should have been named Kingfall YOLO 360 noscope GPT Killer x

#

to scare people even more

kind condor Jun 4, 2025, 4:29 PM

#

sam altman or elon musk would

fringe rapids Jun 4, 2025, 4:48 PM

#

celest idol ngl we need someone trusted to make a closed source benchmark

That’s why we have dubesor!

celest idol Jun 4, 2025, 4:49 PM

#

plush bridge not working for me lol

i dont see it

fringe rapids Jun 4, 2025, 4:49 PM

#

celest idol i dont see it

It’s gone

celest idol Jun 4, 2025, 4:49 PM

#

oh

celest idol Jun 4, 2025, 4:50 PM

#

plush bridge ARC-AGI being one

ah lol

#

i mean if arc was open source llms prob wouldve gotten like 50%

spark obsidian Jun 4, 2025, 6:56 PM

#

sleek cave King fall has only 64k context weird…

That makes me wonder if it's an open source model and not Gemini. Like maybe a Gemma-based thinking model perhaps

celest idol Jun 4, 2025, 8:21 PM

#

spark obsidian That makes me wonder if it's an open source model and not Gemini. Like maybe a G...

i think kingfall is not gemma

#

gemma doesnt support structured output, code execution, etc

#

unless its gemma 4?

#

but it seems a bit early

shrewd plaza Jun 4, 2025, 8:29 PM

#

Google has launched exp models with 32-64k context lengths before.

raven fractal Jun 4, 2025, 11:10 PM

#

I dont know if this was here before but seems like ai studio now has framerate and time options for video attachments

runic ibex Jun 5, 2025, 12:55 AM

#

Google has been cooking on absolutely everything except 05-06 so I'm expecting good things.

novel flower Jun 5, 2025, 1:33 AM

#

runic ibex Google has been cooking on absolutely everything except 05-06 so I'm expecting g...

o.o

pallid wren Jun 5, 2025, 11:55 AM

#

So today is the expected new model?

solemn seal Jun 5, 2025, 1:52 PM

#

Well Logan K didn't tweet "Gemini" yet

#

Which he usually does before releasing something

mortal solstice Jun 5, 2025, 1:53 PM

#

solemn seal Well Logan K didn't tweet "Gemini" yet

https://x.com/OfficialLoganK/status/1930500218602369344

Logan Kilpatrick (@OfficialLoganK)

Gemini

solemn seal Jun 5, 2025, 1:53 PM

#

What!

#

I just checked his account and didn't see that 😠

#

X is broken 🚬

solemn seal Jun 5, 2025, 1:55 PM

#

mortal solstice https://x.com/OfficialLoganK/status/1930500218602369344

Thanks for this BTW

dry ingot Jun 5, 2025, 3:01 PM

#

so where is this new model at

tacit ingot Jun 5, 2025, 3:12 PM

#

mortal solstice https://x.com/OfficialLoganK/status/1930500218602369344

Is it out ?

mortal solstice Jun 5, 2025, 3:13 PM

#

not yet

wheat quest Jun 5, 2025, 3:29 PM

#

`gemini-2.5-pro-preview-06-05` is now rolling out.

tacit ingot Jun 5, 2025, 3:31 PM

#

wheat quest `gemini-2.5-pro-preview-06-05` is now rolling out.

Looks great

restive locust Jun 5, 2025, 3:33 PM

#

wheat quest `gemini-2.5-pro-preview-06-05` is now rolling out.

so you can set thinking budget you just can't turn it off KEKcry

mellow turret Jun 5, 2025, 3:35 PM

#

👀

heavy aspen Jun 5, 2025, 3:38 PM

#

Noo

kind condor Jun 5, 2025, 3:40 PM

#

when will it NOT be a preview version?

abstract plover Jun 5, 2025, 3:42 PM

#

wheat quest `gemini-2.5-pro-preview-06-05` is now rolling out.

https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-pro

Google Cloud

Gemini 2.5 Pro | Generative AI on Vertex AI | Google Cloud

#

damn

#

good thing 2.5 flash can be used in batch api now

restive locust Jun 5, 2025, 4:05 PM

#

updated model coming out in ~5mins

#

AI Studio first, vertex up next

copper pilot Jun 5, 2025, 4:09 PM

#

05-06 and 06-05 is great naming, guys

lyric pilot Jun 5, 2025, 4:11 PM

#

hardy osprey Jun 5, 2025, 4:13 PM

#

copper pilot 05-06 and 06-05 is great naming, guys

LMAO

near ore Jun 5, 2025, 4:13 PM

#

hey

#

google/gemini-2.5-pro-preview

#

this was unavailable for 5 mins

restive locust Jun 5, 2025, 4:18 PM

#

yeah sorry. back up now

#

it's the new endpoint now

hardy osprey Jun 5, 2025, 4:31 PM

#

near ore hey

But we can not disable thinking right?

open pond Jun 5, 2025, 4:31 PM

#

knowing google this is gonna be op for 4 days

#

then go to shit

#

unfortunately

near ore Jun 5, 2025, 4:31 PM

#

@restive locust did the model get an update?

#

or just an endpoint refresh

hardy osprey Jun 5, 2025, 4:32 PM

#

I tried passing "'extra_json': {'reasoning': {'max_tokens': 0}}", but got a 400 error.

{'error': {'message': 'Provider returned error', 'code': 400, 'metadata': {'raw': '{\n "error": {\n "code": 400,\n "message": "The thinking budget (0) is invalid.",\n "status": "INVALID_ARGUMENT"\n }\n}\n', 'provider_name': 'Google AI Studio'}}

#

But Google says the new Pro model already supports budget control.

restive locust Jun 5, 2025, 4:34 PM

#

you can't set thinking budget 0

#

minimum 128

open pond Jun 5, 2025, 4:34 PM

#

hardy osprey Jun 5, 2025, 4:35 PM

#

restive locust you can't set thinking budget 0

Ok, so the 2.5-pro model can not disable thinking just like the flash one?

#

Alright, I've confirmed this from Google's doc. Thank you.
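
A minimal sketch of what the exchange above implies for an OpenRouter request, assuming the bounds mentioned in this thread (minimum 128, maximum 32768) and the `reasoning.max_tokens` field from the failed attempt above. `clamp_thinking_budget` and `build_request_body` are hypothetical helper names, and no request is actually sent.

```python
import json

# Assumed bounds, taken from messages in this thread rather than verified docs.
MIN_THINKING_BUDGET = 128
MAX_THINKING_BUDGET = 32768


def clamp_thinking_budget(requested: int) -> int:
    """Force a requested budget into the assumed valid range.

    A budget of 0 was rejected with a 400 above, so it is raised to the
    minimum instead of being passed through.
    """
    return max(MIN_THINKING_BUDGET, min(MAX_THINKING_BUDGET, requested))


def build_request_body(prompt: str, thinking_budget: int) -> dict:
    """Build a chat-completions style body with a capped reasoning budget."""
    return {
        "model": "google/gemini-2.5-pro-preview",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning": {"max_tokens": clamp_thinking_budget(thinking_budget)},
    }


# Asking for 0 ends up as the assumed minimum rather than an invalid request.
body = build_request_body("Summarize this thread.", 0)
print(json.dumps(body["reasoning"]))
```

Clamping keeps the request valid but does not turn thinking off; with these bounds the smallest possible budget is still 128 tokens.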

kind condor Jun 5, 2025, 5:24 PM

#

better? worse? can't be worse right?

tacit ingot Jun 5, 2025, 5:46 PM

#

kind condor better? worse? can't be worse right?

It is the top model on LMArena rn

#

plush bridge Jun 5, 2025, 6:01 PM

#

copper pilot 05-06 and 06-05 is great naming, guys

Lol that's literally the worst thing you could have done. At least it's ISO order.

dry ingot Jun 5, 2025, 6:12 PM

#

restive locust you can't set thinking budget 0

lol what's the point of thinking mode button if you can't disable thinking lol

copper pilot Jun 5, 2025, 6:18 PM

#

Minimum thinking: "Alright, the user wants me to [whatever user just input]." -> [begin response] 🙄

steady pelican Jun 5, 2025, 6:26 PM

#

so we can only get one version of gemini 2.5 pro via openrouter, it always points to latest?
