Voxtral TTS voice parameter unclear (voice vs cloning / supported voices?) | OpenRouter

signal quarry · 2026-04-21T00:11:18.955Z

Hey, I’m trying to use `mistralai/voxtral-mini-tts-2603` via OpenRouter and I’m confused about voice handling.

* The OpenAI-style endpoint `/audio/speech` accepts a `voice` parameter
* This works for `openai/gpt-4o-mini-tts`
* But for Voxtral:
  * `voice` doesn’t seem documented
  * behavior is inconsistent (ignored? default? unsupported?)

Questions:

1. Does Voxtral support **named voices at all** via OpenRouter?
2. If yes, is there a **list of valid voices**?
3. Or is the intended usage **reference audio (voice cloning)** only?
4. If cloning is supported, what is the correct request format on OpenRouter?

Right now it's unclear whether:

* `voice` is silently ignored
* mapped internally
* or not supported at all

Would appreciate clarification or docs 🙏

I'm an AI assistant - my answers may be incorrect and I cannot see screenshots. Please verify information with the OpenRouter docs.

Here's what is currently documented for mistralai/voxtral-mini-tts-2603 and voice handling on OpenRouter:

Named voices via OpenRouter:
Neither the OpenRouter docs nor the Voxtral model page states that Voxtral supports named voices or a `voice` parameter, and neither lists valid voice names for Voxtral models. The audio guide documents named voice support only for OpenAI models (e.g., `openai/gpt-4o-audio-preview`) and does not mention Voxtral in this context (Audio Docs).
List of valid voices:
There is no published list of valid voice names for Voxtral on OpenRouter. The `voice` parameter appears to be undocumented and is likely unsupported for this model.
Voice cloning / reference audio:
The documentation does not mention reference audio or voice cloning for Voxtral on OpenRouter. The only documented audio support is for audio input (for transcription/analysis), not for TTS with selectable or cloned voices (Voxtral Small API).
Correct request format:
For Voxtral, the documented use case is sending an audio file for transcription/understanding, not TTS voice synthesis with named or cloned voices. The API examples show only input audio for analysis, not speech output with voice selection (Audio Docs).
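
Since the docs do not settle this, one way to check empirically is to send the same text with different `voice` values and compare what comes back. The following is a minimal sketch, not a documented OpenRouter recipe. It assumes the base URL `https://openrouter.ai/api/v1`, an OpenAI-style JSON body (`model`, `input`, `voice`), and bearer-token auth. The helper names (`build_speech_request`, `fingerprint`, `all_identical`) are hypothetical. Nothing is sent over the network until a request is passed to `urllib.request.urlopen`.

```python
import hashlib
import json
import urllib.request

# Assumption: OpenRouter's OpenAI-compatible base URL.
# The /audio/speech path comes from the question above.
BASE_URL = "https://openrouter.ai/api/v1"


def build_speech_request(api_key, model, text, voice=None):
    """Build (but do not send) an OpenAI-style /audio/speech request."""
    body = {"model": model, "input": text}
    if voice is not None:
        # Left out entirely when None, so the default behavior can be probed.
        body["voice"] = voice
    return urllib.request.Request(
        BASE_URL + "/audio/speech",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fingerprint(audio_bytes):
    """Return the SHA-256 hex digest of a response body."""
    return hashlib.sha256(audio_bytes).hexdigest()


def all_identical(audio_by_voice):
    """True if every voice setting produced byte-identical audio."""
    return len({fingerprint(a) for a in audio_by_voice.values()}) <= 1
```

To use it, build one request per setting (for example no `voice`, a known OpenAI voice name such as `alloy`, and a deliberately invalid name), send each with `urllib.request.urlopen(req).read()`, and record either the audio bytes or the HTTP error. Byte-identical audio across settings suggests `voice` is ignored, and a 4xx response for the invalid name suggests the value is validated. Differing audio alone does not prove the parameter is honored, because synthesis may not be deterministic.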

Summary:

The voice parameter is not docum...
