I have 5 open claws. 3 of them are using Gemini 2.5 Flash, and I have been getting 429 errors.
My head claw, Eric, said the following (pasted below). I wanted your advice.
We reviewed the calls we were making, and there was not enough volume to trigger any sort of rate limiting, so I am pretty sure it's something internal that we need to solve.
Eric:
Still 429s. Same pattern — hits seq 4-5 within seconds of starting. Session hygiene didn't fix it.
My theory about context bloat was wrong, or at least not the full story.
I think there's something in OpenClaw's internal handling — it may be auto-retrying on 429s under the hood, creating a rapid burst at the framework level rather than from Andy's behavior. That would explain why it happens within seconds of every fresh session regardless of context size.
Check OpenClaw config — look at whether there's a retry setting in openclaw.json that's causing automatic rapid retries on 429 responses. That would be the actual bug to fix.
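For comparison, here is a minimal sketch of what a well-behaved retry on 429 looks like, as opposed to the rapid burst Eric suspects. This is not OpenClaw's code and I don't know how OpenClaw actually retries. `call_with_backoff`, `backoff_delays`, and the `send` callable are hypothetical names for illustration. The idea is exponential backoff with jitter, honoring a server-provided Retry-After value when there is one.

```python
import random
import time


def backoff_delays(max_retries=5, base=1.0, cap=60.0, rng=random.random):
    """Yield one wait time (seconds) per retry: exponential growth, full jitter."""
    for attempt in range(max_retries):
        ceiling = min(cap, base * 2 ** attempt)  # 1s, 2s, 4s, ... capped at `cap`
        yield ceiling * rng()                    # random point in [0, ceiling)


def call_with_backoff(send, max_retries=5, base=1.0, cap=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call `send` and retry only on HTTP 429, waiting between attempts.

    `send` is a hypothetical callable returning
    (status_code, retry_after_seconds_or_None, body).
    """
    status, retry_after, body = send()
    for delay in backoff_delays(max_retries, base, cap, rng):
        if status != 429:
            break
        # Prefer the server's Retry-After hint over our own computed delay.
        sleep(retry_after if retry_after is not None else delay)
        status, retry_after, body = send()
    return status, body
```

If the framework retries without waits like these, a single 429 can turn into several requests within a second or two, which would match the "within seconds of starting" pattern.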
Eric:
Found something significant. Look at line 29050:
OpenClaw is using Google's Prompt Caching API for Gemini agents — it tries to cache the system prompt (brain files) to reduce token costs on repeat calls. This is a separate API endpoint from the generative API. When this cache operation fails or hits its own rate limit, it sets a 10-minute backoff and then retries, potentially causing rapid-fire requests to the cache API on each new session start.
The 429s might not be from generate content at all — they could be from the prompt cache creation API, which has separate and more restrictive limits. That's why the generate content metrics look clean but we still get 429s.
This is an OpenClaw internals question — whether there's a way to disable or throttle prompt caching for Gemini agents. John, this is something worth raising in the OpenClaw Discord or with support. The behavior is in the compiled dist — I can't change it from here.
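One way to test Eric's second theory before going to support would be to tally the 429s by endpoint. Below is a rough sketch, assuming we can capture (URL, status code) pairs for the outgoing requests somehow, for example through a logging proxy. I have not confirmed that OpenClaw exposes this. `classify_endpoint` and `count_429s` are hypothetical helper names. The `cachedContents` and `:generateContent` path fragments are the ones the Gemini API uses for cache creation and generation.

```python
from collections import Counter
from urllib.parse import urlparse


def classify_endpoint(url):
    """Label a Gemini API request URL as 'cache', 'generate', or 'other'."""
    path = urlparse(url).path  # query string (e.g. ?key=...) is excluded
    if "cachedContents" in path:
        return "cache"
    if path.endswith((":generateContent", ":streamGenerateContent")):
        return "generate"
    return "other"


def count_429s(records):
    """Count 429 responses per endpoint type.

    `records` is an iterable of (url, status_code) pairs; this shape is an
    assumption about what a captured request log would give us.
    """
    return Counter(
        classify_endpoint(url) for url, status in records if status == 429
    )
```

If nearly all of the 429s land in the "cache" bucket, that supports the caching theory. If they land in "generate", the retry theory from earlier is the better lead.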