#Free-tier Gemini model configured as primary but paid model getting used instead — no fallback loggi


ornate moat

I have google-free (free Gemini API) configured as the primary model for both image analysis and audio transcription, with the paid google provider as a fallback. Despite the config being correct and the free API key working when tested directly, the paid model is sometimes used. There's no way to trace why.

Config

Image analysis:

{
"primary": "google-free/gemini-3.1-flash-lite",
"fallbacks": ["google/gemini-3.1-flash-lite"]
}

Audio transcription:

[
{ "provider": "google-free", "model": "gemini-3.1-flash-lite" },
{ "provider": "google", "model": "gemini-3.1-flash-lite" }
]

Both google and google-free providers have their own separate API keys configured. The free key is functional — tested it directly against Google's API and it returned valid results.

What I've checked

  1. Config is correct per openclaw.json and config.get
  2. Both API keys are present and different from each other
  3. Free key works — sent a test image directly, got a valid response
  4. Searched trajectory logs for the model used by the image tool — found nothing. Trajectory logs don't record which model was selected for media processing, only the main text model

The problem

The image tool uses runWithImageModelFallback(), which builds a candidate list from primary → fallbacks[] and tries each in order. On any error (HTTP 429, timeout, empty response, etc.) it moves to the next candidate silently. When the paid model ends up being used, I can see it on the billing side but can't tell what caused the fallback — there's nothing in the logs to work from.

Looking for guidance on:

• How to add visibility into which model actually processed a given image/audio request
• Whether there are known issues with the free Gemini API key's per-minute/per-day quotas triggering fallback
• Any config changes that could help reduce the frequency of fallback to the paid tier

magic valley BOT

Yeah, your read of the code matches what I’m seeing. There are two slightly different paths here:

• image / pdf tools use runWithImageModelFallback() in src/agents/model-fallback.ts. That helper records attempts in the tool result, but unlike the normal text model fallback path it does not call logModelFallbackDecision() for candidate failure/success. So gateway/trajectory logs won’t reliably tell you “google-free failed with 429, then google succeeded”. That’s a real observability gap.

• automatic media/audio understanding goes through src/media-understanding/runner.ts. It builds a decision object with attempts/chosen provider+model. On total failure it warns; on success it only emits the summary under verbose logging, e.g. Media understanding audio: success ... via google-free/gemini... or via google/....

Immediate things you can do:

  1. Turn on verbose while reproducing:
    /verbose on
    then tail logs:
    openclaw logs --follow

For audio/media-understanding this should at least show the chosen provider/model on success. Docs: https://docs.openclaw.ai/help/faq

  2. If paid fallback is unacceptable, remove it temporarily. For image analysis, set fallbacks: [] or omit the paid fallback. For audio transcription, remove the second { provider: "google", ... } entry. That turns silent paid fallback into a visible failure, which is often better while debugging cost leaks.
  3. If you’re running from source, the clean instrumentation point is:
    src/agents/model-fallback.ts → runWithImageModelFallback()
    Add logging on both candidate failure and success, ideally reusing logModelFallbackDecision() from src/agents/model-fallback-observation.ts, or at minimum log:
    capability=image provider=<candidate.provider> model=<candidate.model> attempt=N/M outcome=failed/succeeded error=<sanitized>

For audio/media-understanding, useful places are:
src/media-understanding/runner.ts around runAttachmentEntries() / final formatDecisionSummary(decision)
src/media-understanding/runner.entries.ts where runProviderEntry() returns { provider, model }
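
For the image path, the per-candidate log line described above could look roughly like the sketch below. This is not OpenClaw's actual code: runWithLoggedFallback, sanitizeError, and the Candidate type are hypothetical names made up for illustration, and the real runWithImageModelFallback() may have a different signature. It only shows the idea of logging both failure and success for every candidate.

```typescript
// Hypothetical sketch; none of these names are OpenClaw's real API.
type Candidate = { provider: string; model: string };
type AttemptLog = (line: string) => void;

// Collapse whitespace and cap the length so an error message cannot
// break the log line or dump a large payload into the logs.
function sanitizeError(err: unknown): string {
  const msg = err instanceof Error ? err.message : String(err);
  return msg.replace(/\s+/g, " ").slice(0, 200);
}

// Try each candidate in order and log the outcome of every attempt.
async function runWithLoggedFallback<T>(
  candidates: Candidate[],
  run: (c: Candidate) => Promise<T>,
  log: AttemptLog = console.warn,
): Promise<{ result: T; used: Candidate }> {
  let lastError: unknown;
  for (let i = 0; i < candidates.length; i++) {
    const c = candidates[i];
    const prefix =
      `capability=image provider=${c.provider} model=${c.model} ` +
      `attempt=${i + 1}/${candidates.length}`;
    try {
      const result = await run(c);
      log(`${prefix} outcome=succeeded`);
      return { result, used: c };
    } catch (err) {
      lastError = err;
      log(`${prefix} outcome=failed error="${sanitizeError(err)}"`);
    }
  }
  throw new Error(
    `all ${candidates.length} image candidates failed: ${sanitizeError(lastError)}`,
  );
}
```

With a line like this per attempt, a paid request on the billing side can be matched to a preceding google-free failure and its error text.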

On the Gemini free tier: yes, 429 / RESOURCE_EXHAUSTED is a very plausible cause. A direct one-off API test only proves the key works; it doesn’t prove you stay under the RPM/TPM/RPD limits (requests per minute, tokens per minute, requests per day) during real OpenClaw traffic, especially with large image/audio payloads and concurrent requests. Google’s free-tier quotas are lower and can apply per minute, per day, per project, and per model, so intermittent fallback to paid is exactly the pattern I’d expect from quota pressure or from timeout/empty-response handling.
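
If you do add logging, it helps to tag each failure with a coarse cause so quota errors stand out from timeouts. The sketch below is a hypothetical helper, not something that exists in OpenClaw. The error shape (an optional status and message) and the exact message strings it matches are assumptions about what the provider call throws, so adjust them to what you actually see.

```typescript
// Hypothetical helper; the error shape is an assumption, not a real OpenClaw type.
type FallbackCause = "quota" | "timeout" | "empty_response" | "other";

function classifyFallbackCause(err: {
  status?: number;
  message?: string;
}): FallbackCause {
  const message = err.message ?? "";
  // HTTP 429 and RESOURCE_EXHAUSTED both indicate a rate or quota limit.
  if (err.status === 429 || message.includes("RESOURCE_EXHAUSTED")) {
    return "quota";
  }
  // Matches "timeout", "timed out", "time out", and common Node error names.
  if (/timed? ?out|ETIMEDOUT|AbortError/i.test(message)) {
    return "timeout";
  }
  if (/empty response/i.test(message)) {
    return "empty_response";
  }
  return "other";
}
```

Counting these causes over a day of traffic should show whether the paid fallbacks cluster around quota errors or around timeouts on large payloads.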

Config knobs to reduce paid fallback frequency:

• Remove the paid fallback while measuring.
• Narrow media/audio scope so fewer attachments get transcribed automatically.
• Lower attachment size/max bytes if large media is pushing token/latency limits.
• Avoid bursts/concurrency where possible.
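
For the last point, if you are running from source and can wrap the media calls yourself, one way to smooth out bursts is to serialize requests and enforce a minimum gap between them. This is a minimal sketch under that assumption: createThrottle is a hypothetical helper, not an OpenClaw setting, and the right interval depends on the quota your project actually has.

```typescript
// Hypothetical sketch: run tasks one at a time with a minimum gap between starts.
function createThrottle(minIntervalMs: number) {
  let chain: Promise<unknown> = Promise.resolve();
  let lastStart = 0;
  const sleep = (ms: number) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms));

  return function schedule<T>(task: () => Promise<T>): Promise<T> {
    const run = chain.then(async () => {
      // Wait until at least minIntervalMs has passed since the previous start.
      const wait = lastStart + minIntervalMs - Date.now();
      if (wait > 0) await sleep(wait);
      lastStart = Date.now();
      return task();
    });
    // Swallow rejections on the internal chain so one failed task
    // does not prevent later tasks from running.
    chain = run.catch(() => undefined);
    return run;
  };
}
```

For example, `const schedule = createThrottle(4000)` followed by `schedule(() => analyzeImage(file))` would start at most one call every four seconds, where analyzeImage stands in for whatever function issues the free-tier request.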