# Google Vertex on gpt-oss 20b and 120b started failing on ~50% of requests, and OR routes only to this provider


sour patio

All my requests are routed to Google Vertex because of BYOK.

Half of those requests fail because Google returns 15-25k tokens of "\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n" without anything meaningful.

I'm using structured outputs.

Disabling Google Vertex BYOK fixed the issue because requests started routing to other providers.

I'm allowing multiple gpt-oss providers and also have fallback models defined, so my expectation is that OpenRouter should detect that something is wrong with Google Vertex and either deprioritize it or fall back to different models after it generates nothing.
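
A client-side guard is one possible stopgap for the blank-output case described above. The sketch below is only an illustration under stated assumptions: `send` is a hypothetical caller-supplied function that performs the real request with a given `provider` object and returns the completion text, and `fake_send` stands in for it so the example runs without network access. It is not a description of how OpenRouter itself routes.

```python
from typing import Callable, Iterable, Optional


def is_degenerate(text: Optional[str]) -> bool:
    """True when a completion is missing or contains only whitespace."""
    return text is None or text.strip() == ""


def first_usable_completion(
    send: Callable[[dict], Optional[str]],
    provider_options: Iterable[dict],
) -> Optional[str]:
    """Try each provider configuration in order; return the first non-blank text.

    `send` is a hypothetical function (an assumption of this sketch) that sends
    the request with the given `provider` object and returns the output text.
    """
    for options in provider_options:
        text = send(options)
        if not is_degenerate(text):
            return text
    return None  # every attempt came back blank


def fake_send(options: dict) -> str:
    # Stand-in for a real request: the Vertex-only attempt returns whitespace.
    if options["only"] == ["google-vertex"]:
        return "\n \n\n \n\n \n"
    return '{"ok": true}'


attempts = [{"only": ["google-vertex"]}, {"only": ["groq", "fireworks"]}]
print(first_usable_completion(fake_send, attempts))
```

For streamed responses, the same `is_degenerate` check could be applied to the accumulated text after a size threshold so that a blank stream is cancelled early instead of running for minutes.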

pallid auroraBOT

I'm an AI assistant - my answers may be incorrect and I cannot see screenshots. Please verify information with the OpenRouter docs.

To clarify your situation and help escalate this effectively:

  1. What is the exact model identifier (e.g., google/gemini-2.5-flash, oss 20b, 120b) that’s returning these "\n" tokens?
  2. Can you share the full error response or payload you receive (especially HTTP status code and any error fields)?
  3. Can you provide a sample of the request body (or relevant code) you’re sending, including your fallback configuration?

These details will help the support team investigate if OpenRouter’s fallback logic is working as intended with BYOK and structured outputs.

For reference, OpenRouter always prioritizes your provider key when BYOK is enabled, and “will only use your key for requests to that provider, which may result in rate limit errors if your key is exhausted, but ensures all requests go through your account” (see the "Key Priority and Fallback" section of the BYOK documentation). If “Always use this key” is enabled, OpenRouter does not automatically fall back to other providers—even on model/provider-level failures.

Once you provide the above details, I can help you determine if there’s a configuration workaround or if this requires escalation to support.

-# This is an automated response powered by Answer HQ. Use !mute to stop the bot from responding.


sour patio
  1. Those are my routing options:

"models" => ["openai/gpt-oss-20b", "google/gemini-2.5-flash-lite", "openai/o4-mini"], "provider" => %{"only" => ["fireworks", "google-ai-studio", "google-vertex", "groq", "hyperbolic", "inference-net", "ncompass", "novita", "openai", "phala", "together"], "quantizations" => [], "require_parameters" => true}

  2. Request metadata:

{
"id": 25736918293,
"generation_id": "gen-1760374167-B0UFhQdqF2L0QKPLSbhw",
"provider_name": "Google",
"model": "openai/gpt-oss-20b",
"app_id": null,
"external_user": redacted,
"streamed": true,
"cancelled": false,
"generation_time": 599378,
"latency": 413,
"moderation_latency": null,
"created_at": "2025-10-13T16:59:27.63713+00:00",
"tokens_prompt": 173,
"tokens_completion": 10795,
"native_tokens_prompt": 232,
"native_tokens_completion": 26827,
"native_tokens_completion_images": null,
"native_tokens_reasoning": 26827,
"native_tokens_cached": 0,
"num_media_prompt": null,
"num_media_completion": 0,
"num_search_results": null,
"origin": "",
"usage": 0,
"finish_reason": null,
"usage_cache": null,
"usage_data": null,
"usage_web": null,
"usage_file": 0,
"byok_usage_inference": 0.0080655,
"provider_responses": [
{
"id": "ly3taPPAGvinnvgPxqnq4Q4",
"endpoint_id": "d14eeefb-d9fc-4732-b0d7-db609aad8308",
"provider_name": "Google",
"status": 200,
"latency": 413,
"is_byok": true
}
],
"provider_api_key_id": 271544,
"api_type": "completions",
"is_byok": true,
"native_finish_reason": null
}

  3. The request body includes a JSON schema output requirement, plus the routing options provided above.

pallid auroraBOT

Thank you for the detailed information. Here’s what’s happening based on your config and OpenRouter's documented behavior:

  • When BYOK (Bring Your Own Key) is enabled and used for a provider (in your case, Google Vertex), OpenRouter must route all requests for that provider through your key, and cannot automatically fall back to other providers, even if failures or malformed outputs occur. This is confirmed in the BYOK documentation:
    “If ‘Always use this key’ is enabled, OpenRouter does not automatically fall back to other providers—even on model/provider-level failures.”
    BYOK Guide

  • Your provider routing config with "require_parameters": true and "only" set does allow routing to multiple providers, but when BYOK is active for a given provider, requests to that provider will only use your key.
    Provider Routing

  • If the selected provider (Google Vertex) returns a malformed or meaningless response (as in your repeated "\n" output), OpenRouter will not automatically deprioritize or disable that provider for you. You must manually change your provider config or BYOK setting to avoid the problematic provider.

Summary:
OpenRouter does not currently support automatic fallback or deprioritization of a BYOK provider when it returns invalid/empty outputs. You must:

  • Disable BYOK for Google Vertex if you want OpenRouter to route around this provider.
  • Manually adjust your provider.only list to exclude problematic providers.
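
The second workaround above (dropping a provider from `provider.only`) can be sketched as a small helper. This is a minimal illustration, assuming the routing object has the same keys as the one posted earlier in the thread. `without_providers` is a hypothetical name, and refusing an empty `only` list is a defensive choice of this sketch, not documented OpenRouter behavior.

```python
from typing import Iterable


def without_providers(provider: dict, blocked: Iterable[str]) -> dict:
    """Return a copy of a `provider` routing object with blocked slugs removed from `only`."""
    blocked_set = set(blocked)
    remaining = [slug for slug in provider.get("only", []) if slug not in blocked_set]
    if not remaining:
        # Assumption: an empty allow-list is ambiguous, so fail loudly instead.
        raise ValueError("no providers left in `only` after filtering")
    updated = dict(provider)  # shallow copy; the original object is left untouched
    updated["only"] = remaining
    return updated


provider = {
    "only": [
        "fireworks", "google-ai-studio", "google-vertex", "groq", "hyperbolic",
        "inference-net", "ncompass", "novita", "openai", "phala", "together",
    ],
    "quantizations": [],
    "require_parameters": True,
}

safe = without_providers(provider, ["google-vertex"])
print(safe["only"])
```

The filtered object can then be sent as the `provider` field of the request in place of the original one.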

If you believe this behavior should change, or if you need advanced fallback for structured outputs with BYOK, you should contact support for feature requests or additional troubleshooting.

For further details, see the BYOK Guide.

If you have a feature request or need to report unexpected routing behavior with BYOK, please contact the mods on Discord or email OpenRouter support.


sour patio

I have "Always use this key" disabled.


this AI response is bullshit