#model hitting rate limit quite frequently
If it’s google/gemini-3.1-pro-preview, frequent rate limits honestly wouldn’t surprise me; preview Gemini models can be pretty spiky.
First thing, can you paste:
openclaw --version
openclaw models status
And also:
- the exact error text / 429 message
- whether you’re using google (API key) or google-gemini-cli (OAuth)
A couple useful refs:
- models/status docs: https://docs.openclaw.ai/cli/models
- Google provider setup: https://docs.openclaw.ai/providers/google
- model discussions/community notes: https://docs.openclaw.ai/concepts/models#model-discussions-community-notes
One relevant detail from the docs: OpenClaw can rotate across multiple provider keys on rate-limit responses, so if you’re on API keys and have more than one, that can help. But let’s get your version + exact error first so I don’t guess.
user1@user1:~/.openclaw$ openclaw --version
openclaw models status
OpenClaw 2026.4.15 (041266a)
Config : ~/.openclaw/openclaw.json
Agent dir : ~/.openclaw/agents/main/agent
Default : google/gemini-3.1-pro-preview
Fallbacks (0) : -
Image model : -
Image fallbacks (0): -
Aliases (1) : gemini -> google/gemini-3.1-pro-preview
Configured models (2): openai-codex/gpt-5.4, google/gemini-3.1-pro-preview
Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (1): openai-codex (2)
- google effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | google:default=AIzaSyBU...05_lQudM
- openai-codex effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=2 (oauth=2, token=0, api_key=0) | openai-codex:default=OAuth, openai-codex:xxxxx@gmail.com=OAuth (xxxxxx@gmail.com)
OAuth/token status
- openai-codex usage: 168h 0% left ⏱1d 19h
- openai-codex:default ok expires in 5d
- openai-codex:xxxxxx@gmail.com (xxxxxx@gmail.com) ok expires in 10d
Nice, that narrows it down a lot.
What your status says:
- your version looks recent enough, so this does not smell like an old-version bug
- your primary is google/gemini-3.1-pro-preview
- you have one Google API key/profile
- you have zero model fallbacks
So when Gemini rate-limits, OpenClaw has:
- no second Google key to rotate to, and
- no fallback model to jump to
One important gotcha: Configured models (2) is not the same thing as fallbacks.
So having openai-codex/gpt-5.4 configured does not make it automatic failover.
Easiest fix
Since you already have Codex auth configured:
openclaw models fallbacks add openai-codex/gpt-5.4
Then OpenClaw can fail over when Google rate-limits.
If you want to stay on Gemini
If you have multiple Gemini keys, OpenClaw can rotate them on rate-limit responses. Docs-backed env vars are:
- GEMINI_API_KEYS
- GEMINI_API_KEY_1, GEMINI_API_KEY_2, etc.
- OPENCLAW_LIVE_GEMINI_KEY for a single live override
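To make the rotation idea concrete, here’s a rough Python sketch of what "rotate keys on rate-limit responses" means. This is not OpenClaw’s actual code: RateLimited, call_with_key_rotation, and the send callback are names I made up for illustration, and the key list is whatever you have configured.

```python
from typing import Callable, Optional, Sequence


class RateLimited(Exception):
    """Hypothetical error raised by `send` when a key gets a 429."""


def call_with_key_rotation(keys: Sequence[str], send: Callable[[str], str]) -> str:
    """Try each key in order; move to the next key only on a rate-limit error."""
    if not keys:
        raise ValueError("no API keys configured")
    last_error: Optional[RateLimited] = None
    for key in keys:
        try:
            return send(key)
        except RateLimited as err:
            last_error = err  # this key is throttled, try the next one
    # Every key was rate-limited, so surface the last error to the caller.
    assert last_error is not None
    raise last_error
```

With a single key, the loop has nowhere to go after the first 429, which is the situation your status output shows.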
My read
This is probably mostly an upstream quota / preview-model issue, plus the fact that your setup currently has no safety net.
Useful docs:
- fallbacks: https://docs.openclaw.ai/concepts/models
- failover behavior: https://docs.openclaw.ai/concepts/model-failover
- Google provider: https://docs.openclaw.ai/providers/google
If you paste the exact 429/error text, I can usually tell whether it’s quota exhaustion vs concurrency throttling.
can i add claude as well as a fallback?
Yep, absolutely.
OpenClaw can use multiple fallbacks, and it tries them in order.
So the shape is basically:
agents: {
  defaults: {
    model: {
      primary: "google/gemini-3.1-pro-preview",
      fallbacks: [
        "openai-codex/gpt-5.4",
        "anthropic/<your-claude-model>",
      ],
    },
  },
}
Or via CLI:
openclaw models fallbacks add openai-codex/gpt-5.4
openclaw models fallbacks add anthropic/<your-claude-model>
openclaw models fallbacks list
Two important bits:
- you need Anthropic auth configured first
- fallbacks are tried in the order you set them
Since your current status only shows Google + Codex auth, Claude won’t work as a fallback until Anthropic is set up.
Docs:
If you want, paste your current openclaw.json model section and I can tell you the cleanest order to use.
i believe claude recently banned using its subscription with openclaw?
Not according to the current docs.
OpenClaw’s docs explicitly say:
- Claude CLI reuse / Claude subscription auth is allowed again
- OpenClaw treats claude -p style usage as sanctioned unless Anthropic publishes a new policy
So the current docs-backed answer is no, it’s not banned right now.
That said, there’s an important nuance:
- Claude CLI / subscription: supported
- Anthropic API key: still the safer / more predictable path for long-lived or production setups
So if you want Claude only as a fallback, you can do it either way, but if you want the least surprising setup, I’d use an Anthropic API key.
Refs:
If you want, I can show the cleanest fallback setup for:
- Gemini primary + Codex fallback + Claude fallback via API key, or
- Gemini primary + Codex fallback + Claude fallback via Claude CLI.
does gemini have different limits for pro, flash, and flash lite? do we need to configure all of them separately? does openclaw have the intelligence to use lighter models for lighter tasks and heavier models for complex tasks? i mean, can we configure several of them and rely on openclaw to pick one based on the query?
Short version:
- Different Gemini tiers probably do have different upstream limits, yes, but that’s mostly a Google-side quota question
- OpenClaw does not currently do “this prompt looks easy, use Flash Lite; this one looks hard, use Pro” routing by default
- What OpenClaw does do well is:
  - primary model
  - fallbacks in order
  - auth/key rotation inside a provider
  - separate modality routing like imageModel, pdfModel, etc.
So if you configure:
model: {
  primary: "google/gemini-3.1-pro-preview",
  fallbacks: [
    "google/<flash-model>",
    "google/<flash-lite-model>",
    "openai-codex/gpt-5.4",
  ],
}
OpenClaw will use those as a failover chain, not as a “smart complexity router”.
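If it helps, here’s the "failover chain" behavior as a small Python sketch. It’s only an illustration under my own assumptions, not OpenClaw’s implementation: ModelUnavailable, run_with_failover, and the ask callback are hypothetical names. The thing to notice is that the prompt never influences which model gets picked; only failures do.

```python
from typing import Callable, List, Sequence, Tuple


class ModelUnavailable(Exception):
    """Stand-in for a rate limit or provider failure on one model."""


def run_with_failover(primary: str, fallbacks: Sequence[str],
                      ask: Callable[[str], str]) -> Tuple[str, str]:
    """Return (model_used, reply). Candidates are tried strictly in order."""
    errors: List[str] = []
    for model in [primary, *fallbacks]:
        try:
            return model, ask(model)
        except ModelUnavailable as err:
            errors.append(f"{model}: {err}")  # remember why this one failed
    raise ModelUnavailable("all candidates failed: " + "; ".join(errors))
```

A complexity router would have to inspect the prompt before choosing a model; nothing in this loop does that.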
So, do you need to configure them separately?
Only if you want them available.
A good practical pattern is:
- Primary: the model you want most of the time
- Fallback 1: a cheaper/faster same-provider fallback
- Fallback 2: a different provider in case Google is rate-limiting hard
What OpenClaw is smart about today
More like:
- “primary failed/rate-limited, try next”
- “this tool needs image/pdf support, use the configured image/pdf route”
Not really:
- “this user asked a simple question, downgrade automatically”
Docs:
- models overview: https://docs.openclaw.ai/concepts/models
- failover behavior: https://docs.openclaw.ai/concepts/model-failover
- config reference: https://docs.openclaw.ai/gateway/configuration-reference#agent-defaults
If you want, I can suggest a clean Gemini Pro + Gemini Flash + cross-provider fallback order for your setup.
separate modality routing like imageModel, pdfModel, etc.?
Yep, exactly, it means routing by task/input type, not by “how hard the prompt feels”.
For example:
- model = normal chat/default text model
- imageModel = used for image understanding / vision
- pdfModel = used by the pdf tool
- imageGenerationModel = used for creating/editing images
- musicGenerationModel = used for music generation
- videoGenerationModel = used for video generation
A few useful details from the docs:
- imageModel is also used when the selected/default model can’t accept image input
- pdfModel is used by the pdf tool, and if you don’t set it, it falls back to:
  - imageModel
  - then the normal default model
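That pdfModel fallback order can be sketched in a few lines of Python. Treat it as my reading of the rule above, not OpenClaw’s real resolver: resolve_pdf_model is a hypothetical helper, and I’m assuming each field can be either a plain string or a { primary, fallbacks } object, as in the config examples in this thread.

```python
from typing import Optional


def resolve_pdf_model(defaults: dict) -> Optional[str]:
    """Pick the model for the pdf tool: pdfModel, then imageModel, then model."""
    for field in ("pdfModel", "imageModel", "model"):
        value = defaults.get(field)
        if isinstance(value, dict):  # object form: {"primary": ..., "fallbacks": [...]}
            value = value.get("primary")
        if value:
            return value
    return None  # nothing configured at all


defaults = {
    "model": {"primary": "google/gemini-3.1-pro-preview"},
    "imageModel": "google/<vision-capable-model>",
}
print(resolve_pdf_model(defaults))  # no pdfModel set, so imageModel is used
```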
So a setup can look like:
agents: {
  defaults: {
    model: {
      primary: "google/gemini-3.1-pro-preview",
      fallbacks: ["openai-codex/gpt-5.4"],
    },
    imageModel: "google/<vision-capable-model>",
    pdfModel: "google/<pdf-friendly-model>",
  },
}
That’s smart modality routing.
What it is not:
- “simple text question → use lite”
- “hard reasoning question → use pro”
That kind of complexity-based auto-routing is not the default behavior today.
Docs:
- https://docs.openclaw.ai/concepts/models
- https://docs.openclaw.ai/gateway/configuration-reference#agent-defaults
If you want, I can sketch a practical config for:
- text chat
- image understanding
- PDF reading
- failover when Gemini rate-limits
yes.. and configure some good models and failovers for all categories
Yep, I’d go with a balanced setup like this.
I wouldn’t treat these exact model picks as eternal truth, because model quality/limits churn fast, but this is a docs-compatible starting point for your case:
{
  agents: {
    defaults: {
      // Normal chat
      model: {
        primary: "google/gemini-3.1-pro-preview",
        fallbacks: [
          "google/gemini-3-flash-preview",
          "openai-codex/gpt-5.4",
        ],
      },
      // Image understanding / vision
      imageModel: {
        primary: "google/gemini-3-flash-preview",
        fallbacks: [
          "google/gemini-3.1-pro-preview",
          "openai-codex/gpt-5.4",
        ],
      },
      // PDF analysis
      pdfModel: {
        primary: "google/gemini-3-flash-preview",
        fallbacks: [
          "google/gemini-3.1-pro-preview",
          "openai-codex/gpt-5.4",
        ],
      },
      // Optional, only set these if you actually use those tools a lot:
      // imageGenerationModel: { primary: "google/gemini-3.1-flash-image-preview" },
      // musicGenerationModel: { primary: "google/lyria-3-clip-preview" },
      // videoGenerationModel: { primary: "google/veo-3.1-fast-generate-preview" },
    },
  },
}
Why this layout
- Gemini Pro for your main heavy chat/reasoning
- Gemini Flash as the first fallback, so simple/image/PDF work doesn’t burn the expensive tier first
- Codex as the cross-provider escape hatch when Google is being annoying
Important nuance
If Google is rate-limiting at the account/key level, switching from Pro to Flash may still hit limits.
That’s why having one non-Google fallback is smart.
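A quick way to sanity-check a chain for this is to look at the provider prefix of each model ref. Here’s a tiny Python sketch; providers_in_chain and has_cross_provider_fallback are names I invented, and the only assumption is the provider/model ref format used throughout this thread.

```python
from typing import List, Sequence


def providers_in_chain(primary: str, fallbacks: Sequence[str]) -> List[str]:
    """List the distinct providers in a chain, in order of first appearance."""
    seen: List[str] = []
    for ref in [primary, *fallbacks]:
        provider = ref.split("/", 1)[0]  # "google/gemini-..." -> "google"
        if provider not in seen:
            seen.append(provider)
    return seen


def has_cross_provider_fallback(primary: str, fallbacks: Sequence[str]) -> bool:
    """True if at least one candidate lives on a different provider."""
    return len(providers_in_chain(primary, fallbacks)) > 1
```

A Pro -> Flash chain alone returns False here, which is exactly the case where an account-level limit can block everything at once.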
My recommendation
For your setup, I’d start with:
- text: Gemini Pro -> Gemini Flash -> Codex
- image: Gemini Flash -> Gemini Pro -> Codex
- pdf: Gemini Flash -> Gemini Pro -> Codex
If you add Claude later
Then PDF is where Claude is often a nice extra fallback. I’d put it after the Google models or as the main non-Google fallback, depending on your billing/preferences.
Docs
- models: https://docs.openclaw.ai/concepts/models
- failover: https://docs.openclaw.ai/concepts/model-failover
- config reference: https://docs.openclaw.ai/gateway/configuration-reference#agent-defaults
- Google provider: https://docs.openclaw.ai/providers/google
If you want, next I can turn this into:
- a minimal exact openclaw config set ... command sequence, or
- a full cleaned-up openclaw.json snippet you can paste.
for pdf and image, you said gemini? i mean, you don't just have to stick to gemini and codex
Yep, totally, you’re not limited to Gemini + Codex.
I only used those because your earlier models status showed those were the providers you already had auth for.
A better way to think about it:
- model = best general chat model
- imageModel = whichever provider/model you want for vision
- pdfModel = whichever provider/model you want for PDFs
And for PDFs specifically, the docs call out that Anthropic and Google get native PDF mode. Other providers can still work, but via extraction fallback.
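As a rough illustration of that split, here’s a small Python sketch. The provider set comes straight from the sentence above; pdf_mode and the "native" / "extraction" labels are my own shorthand, not OpenClaw identifiers.

```python
# Providers that get native PDF mode, per the sentence above.
NATIVE_PDF_PROVIDERS = {"anthropic", "google"}


def pdf_mode(model_ref: str) -> str:
    """Return "native" or "extraction" based on the provider prefix of the ref."""
    provider = model_ref.split("/", 1)[0]
    return "native" if provider in NATIVE_PDF_PROVIDERS else "extraction"


for ref in ("google/gemini-3.1-pro-preview", "openai-codex/gpt-5.4"):
    print(ref, "->", pdf_mode(ref))
```

So if PDFs matter to you, it makes sense to put a native-mode provider first in pdfModel and keep the others as fallbacks.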
So a mixed setup is absolutely valid, for example:
- text chat → Gemini
- image understanding → Anthropic or Google
- PDF reading → Anthropic or Google
- fallback if Google is rate-limiting → Codex or Anthropic
So yes, if you add Anthropic, a very sensible layout is:
{
  agents: {
    defaults: {
      model: {
        primary: "google/gemini-3.1-pro-preview",
        fallbacks: [
          "anthropic/<your-claude-model>",
          "openai-codex/gpt-5.4",
        ],
      },
      imageModel: {
        primary: "anthropic/<your-claude-model>",
        fallbacks: [
          "google/<your-gemini-vision-model>",
          "openai-codex/gpt-5.4",
        ],
      },
      pdfModel: {
        primary: "anthropic/<your-claude-model>",
        fallbacks: [
          "google/<your-gemini-pdf-model>",
          "openai-codex/gpt-5.4",
        ],
      },
    },
  },
}
My actual recommendation would be:
- Gemini for normal chat if you like it
- Anthropic for image/PDF once you add it
- Codex as a non-Google fallback
Docs:
- PDF tool: https://docs.openclaw.ai/tools/pdf
- models: https://docs.openclaw.ai/concepts/models
- Anthropic: https://docs.openclaw.ai/providers/anthropic
If you want, I can sketch a Gemini + Anthropic + Codex config next, with cleaner real-world ordering and no placeholders except the Claude model name.
gemini quota is there but even for small prompt, i keep getting
⚠️ Rate-limited — ready in ~18s. Please wait a moment.
Yep, that usually means short-window rate limit / cooldown, not “you have zero quota left”.
So even a tiny prompt can still hit it if Google is throttling that model/profile right now.
OpenClaw is basically telling you:
- this model/profile is temporarily cooling down
- the soonest retry it knows is about 18s
- it’s not necessarily about prompt size
A couple important bits from the failover docs:
- rate-limit cooldowns can be model-scoped
- a sibling model on the same provider can still work
- if every candidate is blocked, OpenClaw surfaces the “ready in ~18s” message
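Here’s how I’d picture that logic, as a Python sketch under my own assumptions. It’s not OpenClaw’s real code: pick_or_wait and the cooldown_until map (model ref -> timestamp when it’s usable again) are hypothetical. Cooldowns are tracked per model, and the wait time only shows up when every candidate is still cooling down.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def pick_or_wait(candidates: Sequence[str], cooldown_until: Dict[str, float],
                 now: float) -> Tuple[Optional[str], Optional[int]]:
    """Return (model, None) if a candidate is ready, else (None, seconds_to_wait)."""
    if not candidates:
        raise ValueError("no candidate models configured")
    soonest: Optional[float] = None
    for model in candidates:
        ready_at = cooldown_until.get(model, 0.0)  # no entry means not cooling down
        if ready_at <= now:
            return model, None  # first ready candidate in chain order wins
        if soonest is None or ready_at < soonest:
            soonest = ready_at
    # Everything is blocked: report the shortest remaining wait, rounded up.
    return None, math.ceil(soonest - now)
```

With only one candidate in the chain, any cooldown on it goes straight to the "ready in ~Ns" branch, no matter how small the prompt is.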
So in your case, the practical fix is: