Hi guys, I'm getting a rate limit error:
message": "You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.",\n "status": "RESOURCE_EXHAUSTED",\n "details": [\n {\n "@type": "type.googleapis.com/google.rpc.QuotaFailure",\n "violations": [\n {\n "quotaMetric": "generativelanguage.googleapis.com/generate_content_paid_tier_3_input_token_count",\n "quotaId": "GenerateContentPaidTierInputTokensPerModelPerMinute-PaidTier3",\n "quotaDimensions": {\n "location": "global",\n "model": "gemini-2.5-flash"\n },\n "quotaValue": "8000000"\n }\n ]\n },\n {\n "@type": "type.googleapis.com/google.rpc.Help
Sorry for formatting, this came from sentry.
However, I have other non-google models that are in my fallback that it's not falling back to my other models. This is how I'm sending it:
"https://openrouter.ai/api/v1/chat/completions",
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {OPENROUTER_API}",
},
json={
"model": "google/gemini-2.5-flash-preview",
"extra_body": {
"models": ["google/gemini-2.0-flash-exp:free", "google/gemini-2.0-flash-001", "anthropic/claude-3.7-sonnet"],
},
"temperature": 0.1,
"messages": context,
"tools": [{
Please advise on what I'm doing wrong. Thank you!