# Hello, I want to have a voice cloned from Tunisian dialect audio, but in the generation I have also
When creating a voice clone, the accent is determined by the audio samples you provide. If the generated voice includes an accent you didn't intend, it might be due to the characteristics of the audio samples used during the cloning process. To achieve a specific accent, like a Tunisian dialect, ensure that the audio samples you provide distinctly feature that accent. If you need further assistance or have more questions, feel free to ask!
They were very high-quality samples, with Tunisian and some French-pronounced words. Is that where the problem comes from?
The issue you're experiencing might be due to the presence of multiple languages or accents in your audio samples. When creating a voice clone, it's important to maintain a consistent language and accent throughout the sample to avoid confusing the AI, which can lead to unexpected results like mixing accents.
For optimal results, ensure your audio samples are high-quality, with a single language and accent, and minimal background noise. If you're using a mix of languages or accents, consider editing the samples to focus on one consistent style. This should help improve the accuracy of the voice cloning process.
Hello, to get the best accent from a cloned voice, how many samples do you recommend?
To achieve the optimal accent in a cloned voice, it's important to provide high-quality audio samples that capture the unique nuances of the original voice, including the accent. For Professional Voice Cloning, it's recommended to use around 3 hours of audio data. This ensures the cloned voice closely resembles the original speaker, including their accent.
If you're aiming for a native accent in another language, make sure to use audio samples featuring that specific accent during the cloning process. This approach will result in a more authentic voice in the chosen language.
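To check how close a set of recordings is to the roughly 3 hours mentioned above, the durations can be summed before uploading. The following is a minimal sketch, assuming the samples are uncompressed WAV files in a single folder; the function name `total_wav_seconds` and the folder name `samples` are illustrative and not part of any product API.

```python
import wave
from pathlib import Path


def total_wav_seconds(folder: str) -> float:
    """Return the combined duration, in seconds, of all .wav files in folder."""
    total = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # duration of one file = number of frames / frames per second
            total += wav.getnframes() / wav.getframerate()
    return total


if __name__ == "__main__":
    target_hours = 3.0  # figure quoted above for Professional Voice Cloning
    hours = total_wav_seconds("samples") / 3600  # "samples" is a placeholder folder
    print(f"{hours:.2f} h recorded, target is about {target_hours:.0f} h")
```

Other formats such as MP3 would need a different reader; this sketch only covers WAV because the standard library can parse it directly.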
Can I get an idea of the dataset used in training for the Arabic language? Is it MSA or mixed?
I'm unable to provide specific details about the dataset used for training in the Arabic language, such as whether it is Modern Standard Arabic (MSA) or a mix of dialects. If you have any other questions or need further assistance, feel free to ask!
Which TTS streaming model do you use?
I'm sorry, but I don't have specific information about the dataset used for training the Arabic language model, including whether it is Modern Standard Arabic (MSA) or a mixed dataset. If you have further questions or need more details, feel free to ask!
No, I mean which TTS streaming model do you use?
I'm sorry, but I don't have specific information about the TTS streaming model used. If you have any other questions or need further assistance, feel free to ask!
What do you mean by latency?
Latency refers to the time delay between the input of text and the output of audio in a text-to-speech (TTS) system. In the context of TTS streaming, lower latency means faster response times, which is crucial for real-time applications like conversational AI. Various factors can influence latency, including server distance, the specific TTS model used, and the configuration of the streaming setup.
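One way to make this concrete is to time how long the first audio chunk takes to arrive, separately from the time to receive the whole stream. The sketch below assumes only that the streaming response can be consumed as an iterable of byte chunks; `measure_stream_latency` and `fake_tts_stream` are hypothetical names, and the fake stream simply stands in for a real TTS response.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_stream_latency(chunks: Iterable[bytes]) -> Tuple[float, float, int]:
    """Return (seconds to first non-empty chunk, total seconds, total bytes)."""
    start = time.perf_counter()
    first_chunk_at = None
    total_bytes = 0
    for chunk in chunks:
        if first_chunk_at is None and chunk:
            # the moment playback could begin
            first_chunk_at = time.perf_counter()
        total_bytes += len(chunk)
    end = time.perf_counter()
    if first_chunk_at is None:
        raise ValueError("stream produced no audio data")
    return first_chunk_at - start, end - start, total_bytes


def fake_tts_stream(n_chunks: int = 3, delay: float = 0.01) -> Iterator[bytes]:
    """Stand-in for a real streaming response: yields silence after a delay."""
    for _ in range(n_chunks):
        time.sleep(delay)
        yield b"\x00" * 1024


if __name__ == "__main__":
    first, total, size = measure_stream_latency(fake_tts_stream())
    print(f"first chunk after {first * 1000:.1f} ms, "
          f"{size} bytes in {total * 1000:.1f} ms")
```

For conversational use, the time to the first chunk is usually the number that matters, since playback can start while the rest of the audio is still arriving.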
How can I adjust the volume of the generated cloned voice?