Can I choose the version of the Conversational AI TTS model? | ElevenLabs

twin spire (BOT) Jan 20, 2025, 1:06 PM

AI-generated Answer for kelkka7

Currently, you cannot manually select the speech model through the web interface. The model is automatically determined based on the language of the agent. For English, the system defaults to Turbo V2, while for other languages, it switches to Turbo V2.5, our multilingual model.
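The selection rule described in this answer can be sketched as a small function. This is only an illustration of the stated behavior, not ElevenLabs code: the function name `default_tts_model` and the use of language codes such as "en" or "fi" are assumptions, and the returned strings are the display names from the answer, not API model identifiers.

```python
def default_tts_model(agent_language: str) -> str:
    """Return the model name the answer above says is used for a language.

    Hypothetical helper: the language-code input format is an assumption.
    """
    lang = agent_language.strip().lower()
    # Treat "en" and regional variants such as "en-us" as English.
    is_english = lang == "en" or lang.startswith("en-")
    # English agents default to Turbo V2; every other language uses Turbo V2.5.
    return "Turbo V2" if is_english else "Turbo V2.5"
```

Under these assumptions, a Finnish agent (`"fi"`) would resolve to Turbo V2.5, the multilingual model.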

sleek trellis Jan 20, 2025, 1:07 PM

When was the multilingual model last updated?

twin spire (BOT) Jan 20, 2025, 1:07 PM

AI-generated Answer for kelkka7

The multilingual model was last updated one month ago.

sleek trellis Jan 20, 2025, 1:08 PM

I'd like to roll back to the previous version because the current version performs worse in Finnish than the previous one.

twin spire (BOT) Jan 20, 2025, 1:08 PM

AI-generated Answer for kelkka7

Currently, there is no option to roll back to a previous version of the AI model. Variations in voice output can occur due to the non-deterministic nature of AI and periodic updates to voice models. If you're experiencing issues with Finnish, consider using properly cloned voices specifically for Finnish and utilizing the Projects feature for better pronunciation management. If you have further questions or need assistance, feel free to reach out.

sleek trellis Jan 20, 2025, 1:09 PM

The issue I'm facing is with the transcription of spoken Finnish. So I can't control that?

twin spire (BOT) Jan 20, 2025, 1:09 PM

AI-generated Answer for kelkka7

Variations in voice output can occur due to several reasons, including the non-deterministic nature of AI and periodic updates to voice models. Unfortunately, there is no set timeline for ensuring voice consistency post-update, and rolling back to a previous version is not currently an option.

To improve consistency, consider using properly cloned voices specifically for Finnish, utilizing the Projects feature for better pronunciation management, writing out numbers in full, and keeping text segments below 800 characters to minimize language switching. If you have further questions or need assistance, feel free to reach out.
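One way to follow the advice about keeping text segments below 800 characters is to split long text at sentence boundaries before sending it for synthesis. The sketch below is a minimal illustration under stated assumptions: the helper name `split_text` and the `MAX_CHARS` constant are hypothetical, the sentence detection is a simple punctuation-based regex, and nothing here calls an ElevenLabs API.

```python
import re

MAX_CHARS = 800  # limit suggested in the answer above (assumed to be exclusive)


def split_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into segments shorter than max_chars, preferring sentence ends."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    segments: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence that is too long on its own gets hard-wrapped.
        while len(sentence) >= max_chars:
            if current:
                segments.append(current)
                current = ""
            segments.append(sentence[: max_chars - 1])
            sentence = sentence[max_chars - 1 :]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) < max_chars:
            current = candidate  # still fits, keep growing this segment
        else:
            segments.append(current)  # close the segment, start a new one
            current = sentence
    if current:
        segments.append(current)
    return segments
```

Each returned segment can then be submitted separately. Splitting at sentence ends rather than at a fixed offset is a design choice meant to avoid cutting words or phrases in half.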

sleek trellis Jan 20, 2025, 1:10 PM

The output of the voices is fine. The issue is that they don't understand what I'm saying because the transcription from speech to text fails.

twin spire (BOT) Jan 20, 2025, 1:10 PM

AI-generated Answer for kelkka7

The transcript of the audio is for information only and isn't used by the AI to generate the audio. To improve the accuracy of speech-to-speech conversion, ensure that the Similarity setting is high, ideally between 90% and 95%, and that Speaker Boost is on. Additionally, using the Multilingual STS model can help reduce mispronunciations, even for English audio. If you encounter errors, you can regenerate the audio under certain conditions: for example, the prompt, voice, and model must stay the same, and the first generation must have been made less than two hours ago.

sleek trellis Jan 20, 2025, 1:11 PM

Do Speaker Boost and Similarity also apply to Conversational AI?

twin spire (BOT) Jan 20, 2025, 1:11 PM

AI-generated Answer for kelkka7

Speaker Boost and Similarity settings are specifically related to the Speech-to-Speech (STS) feature. They help in replicating the delivery aspects of the original speaker, such as accent, tone, and intonation. If you are using the Conversational AI feature, these settings may not directly apply.

sleek trellis Jan 20, 2025, 1:12 PM

How do I control these for Conversational AI, then?
