#Rate limited on google but not using fallback models?


crimson mantle
#

Hi guys, I'm getting a rate limit error:

message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.",\n "status": "RESOURCE_EXHAUSTED",\n "details": [\n {\n "@type": "type.googleapis.com/google.rpc.QuotaFailure",\n "violations": [\n {\n "quotaMetric": "generativelanguage.googleapis.com/generate_content_paid_tier_3_input_token_count",\n "quotaId": "GenerateContentPaidTierInputTokensPerModelPerMinute-PaidTier3",\n "quotaDimensions": {\n "location": "global",\n "model": "gemini-2.5-flash"\n },\n "quotaValue": "8000000"\n }\n ]\n },\n {\n "@type": "type.googleapis.com/google.rpc.Help

Sorry for formatting, this came from sentry.

However, I have other non-Google models in my fallback list, and it's not falling back to them. This is how I'm sending it:

    "https://openrouter.ai/api/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {OPENROUTER_API}",
    },
    json={
        "model": "google/gemini-2.5-flash-preview",
        "extra_body": {
            "models": ["google/gemini-2.0-flash-exp:free", "google/gemini-2.0-flash-001", "anthropic/claude-3.7-sonnet"],
        },            
        "temperature": 0.1,
        "messages": context,
        "tools": [{

Please advise on what I'm doing wrong. Thank you!

long jewelBOT
#
Automated Response

Your usage of the 'models' parameter for fallback is correct, but fallback may not trigger if the error occurs mid-stream or is non-recoverable. Google Gemini free models also impose independent strict rate limits. To better control fallback behavior, configure the 'provider' routing settings. See OpenRouter docs on model fallback and provider routing for details.

Question

Why am I getting a rate limit error on the google model and fallback models are not used?

Confidence Score

5/10

⚠️ Disclaimer

This is an AI-generated response and may not be 100% accurate. If you have further questions or need clarification, please ping @foggy ridge for assistance.

chilly cove
crimson mantle
#

An update, I guess: from what the automated response is telling me, it seems like OpenRouter's fallback models aren't super reliable, so I'm just going to implement a custom fallback myself
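
For reference, a client-side fallback of the kind described here might look roughly like the sketch below. It is only an illustration: `RateLimitedError`, `complete_with_fallback`, and the `send` callable are hypothetical names, and the sketch assumes the caller raises `RateLimitedError` whenever a provider answers with HTTP 429.

```python
from typing import Any, Callable, Optional, Sequence, Tuple


class RateLimitedError(Exception):
    """Hypothetical error the caller raises when a request comes back as HTTP 429."""


def complete_with_fallback(
    models: Sequence[str],
    send: Callable[[str], Any],
) -> Tuple[str, Any]:
    """Try each model in order; return (model, result) for the first success.

    `send` is a caller-supplied function that performs the request for one
    model and raises RateLimitedError when that model is rate limited.
    """
    last_error: Optional[Exception] = None
    for model in models:
        try:
            return model, send(model)
        except RateLimitedError as exc:
            # Remember why this model failed, then move on to the next one.
            last_error = exc
    raise RuntimeError("all models were rate limited") from last_error
```

Only rate-limit errors trigger the next model in this sketch; any other exception propagates immediately, which keeps genuine request bugs visible instead of hiding them behind a fallback.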

foggy ridge
#

looks like you're doing it right. I would expect a 429 to move on from your primary model into your model fallbacks

crimson mantle
foggy ridge
#

it doesn't have to be the exact one, just any of them

#

click the little arrow on the right and send me the generation url from one of the requests

crimson mantle
#

DM or is this okay to share public?

foggy ridge
#

you can share it here, no one can see it but you and our team

#

it's only the request metadata too, not prompt or completion

foggy ridge
#

yep

#

hmmm I can't really tell why this isn't working

crimson mantle
#

That's okay, it's not a blocker on my side. It's more of a nice to have, the custom solution isn't super complicated

#

But I have noticed this issue a lot over the past few weeks, which is why I finally reached out

foggy ridge
#

okay I think I narrowed down the request

#

for some reason we're not considering these other models

#

could be your request shape or maybe some kind of account setting?

#

do you have any providers ignored or data policy toggles?

crimson mantle
#

Pretty sure I have all defaults

foggy ridge
#

what about on the privacy page?

crimson mantle
#

The models work

foggy ridge
#

yeah

crimson mantle
#

It's just that sometimes, they don't

foggy ridge
#

I think the request shape is somehow wrong

crimson mantle
#

And when they don't, they don't use the fallbacks

foggy ridge
#

on my end, what I'm looking at is a list of potential endpoints for each of your requests. none of the requests with the models param include all of the endpoints that should theoretically be possible

#

oh wait, is this the OpenAI client or just a raw request to us?

#

you don't need to nest models under extra_body unless you're calling it over the OpenAI Client SDK
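
To make the difference concrete, here is a minimal sketch of the raw request with `models` placed at the top level of the JSON body instead of under `extra_body`. The helper name `build_request` is hypothetical, the standard library's `urllib` stands in for whatever HTTP client is actually used, and the `tools` field from the original request is left out because it was not shown in full.

```python
import json
import urllib.request
from typing import Any, Dict, List, Sequence

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(
    api_key: str,
    primary: str,
    fallbacks: Sequence[str],
    messages: List[Dict[str, Any]],
) -> urllib.request.Request:
    """Build (but do not send) a raw chat completions request with fallbacks."""
    payload = {
        "model": primary,
        # Fallback list sits at the top level of the body for raw requests;
        # nesting it under "extra_body" is only for the OpenAI client SDK.
        "models": list(fallbacks),
        "temperature": 0.1,
        "messages": messages,
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. With the `requests` library used in the original snippet, the equivalent change is simply moving the `"models"` key out of `"extra_body"` and into the top level of the `json=` dict.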

crimson mantle
#

That might be it, that makes sense

#

It's raw

foggy ridge
#

yeah don't nest it

#

and try that

#

should work
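
One way to check whether the change took effect is to compare the model reported in the response with the primary model that was requested. The sketch below assumes the response body carries a top-level `model` field naming the model that actually served the request, as OpenAI-compatible chat completion responses generally do; `used_fallback` is a hypothetical helper name.

```python
from typing import Any, Mapping


def used_fallback(response_body: Mapping[str, Any], primary_model: str) -> bool:
    """Return True if the response reports a model other than the primary one.

    Assumes `response_body` is the parsed JSON of a chat completions response
    and that its "model" field names the model that produced the completion.
    """
    served = response_body.get("model")
    return served is not None and served != primary_model
```

The comparison is a plain string match, so if the reported slug differs from the requested one only by a version or variant suffix, this would report a fallback that did not really happen. Treat it as a rough signal and adjust the comparison to whatever the responses actually contain.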

crimson mantle
#

ty

foggy ridge
#

FYI @chilly cove @crimson mantle the 2.5 flash model should no longer be rate limited at all