Gemini 2.5 Pro and Flash support implicit caching with an average TTL of 3-5 minutes, but this TTL does not update with repeated requests. Cached data is automatically managed by the model, so you don't need to manually create or delete caches. The caching behavior you observe—sometimes using cached prompts and sometimes re-caching—can be due to this TTL expiration or internal cache management, not specifically because of OpenRouter.ai.
OpenRouter simplifies Gemini cache management by abstracting cache handling; it does not control when caching hits or misses happen. To maximize cache hits, keep the initial part of your prompt consistent and push dynamic content toward the end.
Google's caching on Gemini 2.5 Pro/Flash has minimum token thresholds (1028 tokens for Flash, 2048 for Pro) to qualify for caching, and cache writes include a fixed 5-minute storage charge. The cache expiration after 3-5 minutes and the model's internal logic can cause the inconsistent caching pattern you see, including mid-conversation re-caching when the TTL expires.
This behavior is consistent with Google's own API and not unique to OpenRouter.ai.
For details on Gemini caching and usage:
Are you primarily using system prompts with large, static content for caching, or do you also cache user messages?