# If I am on a Creator Plan, can I submit my voice to be available in the Voice Changer?
Yes, if you are on a Creator Plan and create a voice clone, it will be added to your voice library. Once added, it will be available for use across all tools, including the Voice Changer.
Is it possible to train a voice that has different types of emotions? Or is it better to stick to one particular emotion? For example, I was thinking of recording a "sad" voice and a "happy" voice for a voice character.
It is possible to train a voice with different types of emotions, but the AI performs best when the training samples maintain a consistent style. If you want to capture distinct emotional tones, such as "sad" and "happy," it is recommended to create separate Professional Voice Clones for each emotion. This approach ensures the AI can accurately mimic the intended emotional variations.
Using varied training samples with mixed emotions might lead to less consistent results, as the AI relies on the patterns in the provided data to generate the voice. For optimal quality, focus on training each voice clone with audio samples that clearly exhibit the desired emotional tone.
OK, I understand, thanks! And what is the preferable length of the dataset? Should I remove the silences in the dataset? Or keep a "natural" edit by removing only the long gaps and silences?
For training a voice, the preferable length of the dataset depends on the type of service you're using. If you're using the Professional Voice Cloning service, you can upload up to 3 hours of high-quality audio as training data. For standard voice creation, the total audio duration of all uploaded files must not exceed 45 minutes.
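As a rough way to check a dataset against these limits before uploading, the sketch below sums the duration of a set of recordings using Python's standard `wave` module. It assumes the recordings are uncompressed WAV files. The function and constant names are hypothetical and are not part of any ElevenLabs tooling.

```python
import wave

# Limits quoted in the answer above, converted to seconds.
PVC_MAX_SECONDS = 3 * 60 * 60    # Professional Voice Cloning: up to 3 hours
STANDARD_MAX_SECONDS = 45 * 60   # standard voice creation: up to 45 minutes


def wav_duration_seconds(source):
    """Return the duration of one WAV file (path or file object) in seconds."""
    with wave.open(source, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def total_duration_seconds(sources):
    """Sum the durations of several WAV files."""
    return sum(wav_duration_seconds(source) for source in sources)


def fits_limit(total_seconds, limit_seconds):
    """True if the dataset does not exceed the given limit."""
    return total_seconds <= limit_seconds
```

For example, `fits_limit(total_duration_seconds(paths), PVC_MAX_SECONDS)` would tell you whether a list of WAV paths stays within the 3-hour limit.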
Regarding silences, it’s generally recommended to use audio that is free from background noise and distractions. However, intentional long pauses can be included if they are part of the desired rhythm and pace, such as in meditation guidance recordings. These pauses can help the AI understand the intended flow of speech. For shorter silences or gaps, you can keep a natural editing style by removing only excessively long silences that might not contribute to the voice's intended characteristics.
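To illustrate the "natural editing" idea, here is a minimal sketch that caps overly long silent stretches while leaving shorter pauses untouched. It works on a plain list of sample amplitudes; the function name, `threshold`, and `max_silence` are hypothetical parameters chosen for illustration, and suitable values depend on your recordings. If long pauses are intentional, set `max_silence` high enough to keep them.

```python
def shorten_long_silences(samples, threshold, max_silence):
    """Cap every run of near-silent samples at max_silence samples.

    samples: sequence of integer amplitudes
    threshold: absolute amplitude at or below which a sample counts as silent
    max_silence: longest silent run (in samples) to keep
    """
    output = []
    silent_run = 0
    for sample in samples:
        if abs(sample) <= threshold:
            silent_run += 1
            if silent_run > max_silence:
                continue  # drop only the excess part of a long pause
        else:
            silent_run = 0
        output.append(sample)
    return output
```

At a 44.1 kHz sample rate, for instance, `max_silence=44100` would keep pauses of up to one second and shorten anything longer to one second.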
OK! Any tips or best practices for recording a Professional Voice Clone?
To achieve the best results for Professional Voice Cloning, here are some tips and best practices:
- Recording Environment: Choose a quiet space with minimal background noise and no reverb. Avoid environments with echo or distracting sounds.
- Microphone Position: Keep the microphone 6-8 inches from your mouth to reduce noise and ensure clarity.
- Audio Quality: Use high-quality equipment if possible, but even an iPhone X or similar device can work. Consistency in audio quality is key.
- Duration: Aim for at least 30 minutes of audio, though 2-3 hours is ideal for optimal results.
- Consistency: Maintain a steady pace, tone, and volume throughout the recording. Avoid sudden changes in style or delivery.
- Editing: Remove background noise, long pauses, and mistakes. Ensure all audio samples are of similar quality and style.
- Short Sessions: Record in shorter sessions to avoid vocal fatigue and maintain quality.
- Language and Style: If cloning voices in multiple languages, ensure recordings are consistent in style and quality for each language.
By following these guidelines, you can create a high-quality dataset for Professional Voice Cloning.
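One way to sanity-check the volume consistency mentioned above is to compare the RMS level of each clip and flag the ones that sit far from the rest. The sketch below is only an illustration: it assumes 16-bit samples already loaded as lists of integers, and the function names and the 3 dB tolerance are hypothetical choices, not an official recommendation.

```python
import math
import statistics


def rms_dbfs(samples, full_scale=32768):
    """Return the RMS level of a clip in dB relative to full scale."""
    if not samples:
        raise ValueError("clip has no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(math.sqrt(mean_square) / full_scale)


def flag_inconsistent_clips(levels_db, tolerance_db=3.0):
    """Return the names of clips whose level differs from the median
    level by more than tolerance_db."""
    median_level = statistics.median(levels_db.values())
    return sorted(
        name for name, level in levels_db.items()
        if abs(level - median_level) > tolerance_db
    )
```

Clips that get flagged are candidates for re-recording or level adjustment before they go into the dataset.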
Thanks! Actually, I am a native French speaker. I can still record in English (but with a French accent). Is it possible to record both languages in the same dataset? If I record in French only, will my voice be available in "multilingual"?
For Professional Voice Cloning, it is recommended to use recordings in a single language for optimal results. Mixing languages, such as French and English, in the same dataset can lead to confusion during the cloning process. If you record in French only, your cloned voice will still be able to express text in other languages, including English, using the multilingual v2 model. However, the cloned voice will perform better in the original language it was trained on.
If I want to record an English voice (Professional Voice Cloning) with a French accent, and let's say I have a dataset of 2-3 hours, will it keep my accent?