Issue 1: Listing more than 3 models in the models parameter results in a 400.
The identical request with any one model removed succeeds. If this is an intentional limit, it should be documented.
Request payload (formatted for readability):
{
  "model": "google/gemini-2.5-flash-preview",
  "models": [
    "google/gemini-2.0-flash-001",
    "qwen/qwq-32b",
    "openai/gpt-4.1-mini",
    "deepseek/deepseek-chat-v3-0324"
  ],
  "messages": [
    {
      "role": "system",
      "content": "You are a multi-lingual translator. The user may provide text in any language to be translated.\nReturn translations for these languages: **[\"es\"]**.\n\n• Your goal is to **produce translations** that accurately reflect the **original meaning**, while maintaining an overall **friendly** tone.\n• The current user is: **doc**.\n• The entire chat history (if any) will be provided as context, but **do not translate the history itself**.\n• **Use context selectively** to clarify or enhance meaning **without changing the intent**. For instance, if a message references a choice made earlier, you can include that in the translation if it improves clarity—but **do not invent or assume** details not given.\n• **Do not assume** every message is a direct reply to the previous one. Consider each message on its own unless there is **clear contextual evidence** they are connected.\n• **Match the original tone and level of formality** as closely as possible. If the tone is ambiguous, choose a **friendly** style in line with casual conversation, but **avoid adding extra flair or significantly altering** the message.\n• **Avoid drastically changing or expanding** words, phrases, or greetings (e.g., do not change \"Hi\" to \"Howdy\").\n\nOutput must be valid JSON strictly matching the provided schema."
    },
    {
      "role": "user",
      "content": "Translate this message from doc:\n\ntest message"
    }
  ],
  "provider": {
    "require_parameters": true
  },
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "multi_translation",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "translations": {
            "type": "object",
            "properties": {
              "es": {
                "type": "string",
                "description": "Translation in language es."
              }
            },
            "required": ["es"],
            "additionalProperties": false
          }
        },
        "required": ["translations"],
        "additionalProperties": false
      }
    }
  },
  "temperature": 0.5,
  "max_tokens": 1000,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0
}
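For anyone trying to reproduce Issue 1, here is a rough Python sketch that builds the full four-model request plus every variant with one model removed, so the status codes can be compared side by side. It is only an illustration under assumptions: the endpoint URL is my guess at the chat completions route, the helper names (build_payload, variants, status_for) are made up for this post, and the payload is trimmed to the fields that matter for the models list. Swap in the full payload above if the trimmed one does not trigger the 400.

```python
import json
import urllib.error
import urllib.request

# Assumed endpoint; adjust if your setup uses a different base URL.
API_URL = "https://openrouter.ai/api/v1/chat/completions"

# The four fallback models from the failing request above.
FALLBACKS = [
    "google/gemini-2.0-flash-001",
    "qwen/qwq-32b",
    "openai/gpt-4.1-mini",
    "deepseek/deepseek-chat-v3-0324",
]


def build_payload(fallbacks):
    """Return a trimmed version of the request with the given fallback list."""
    return {
        "model": "google/gemini-2.5-flash-preview",
        "models": list(fallbacks),
        "messages": [{"role": "user", "content": "test message"}],
        "max_tokens": 1000,
    }


def variants(fallbacks):
    """Yield the full list first, then every list with exactly one model removed."""
    yield list(fallbacks)
    for i in range(len(fallbacks)):
        yield fallbacks[:i] + fallbacks[i + 1:]


def status_for(payload, api_key):
    """POST the payload and return the HTTP status code (requires network access)."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses raise HTTPError; the code is what we want to compare.
        return error.code
```

To run the comparison, loop over variants(FALLBACKS) and print len(models) next to status_for(build_payload(models), api_key) with your own key. Going by the behaviour described above, the four-model request would be expected to return 400 and each three-model request to succeed.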
Issue 2: google/gemini-2.5-flash-preview returns 404 with structured outputs
This same request (without the models parameter) worked fine until about 12 hours ago, when it started returning a 404. I believe the structured outputs are the cause, but the online documentation lists both structured outputs and response_format as supported. I'm not sure what is going on there.
With the models parameter (provided it lists no more than 3 models) the request succeeds, but only because it is served by a fallback model.
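To confirm that a fallback actually served the request, one option is to compare the model named in the response body with the one requested. The sketch below assumes the response JSON carries a top-level "model" field and that model ids match exactly when no fallback occurred. The function name served_by_fallback is hypothetical, and neither assumption is verified against the documentation.

```python
def served_by_fallback(response_body, requested_model):
    """Return True when the response names a model other than the requested one.

    Assumes the parsed response JSON has a top-level "model" field (unverified).
    """
    served = response_body.get("model")
    if served is None:
        raise ValueError("response has no 'model' field")
    return served != requested_model
```

For example, passing the parsed response together with "google/gemini-2.5-flash-preview" would return True if one of the listed fallback models answered instead.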