#How many pro voices can I create?
Currently, you are allowed to have one professional voice clone per account. This service only permits the cloning of your own voice, not anyone else's. If you require more slots, it is recommended to try out instant voice cloning, which can yield excellent results and provides more slots.
What is the difference between instant voice cloning and pro voice cloning?
Professional Voice Cloning (PVC) and Instant Voice Cloning (IVC) are two different methods of voice cloning offered.
Instant Voice Cloning (IVC) allows you to clone voices with very short samples nearly instantaneously. It does not train or create a custom AI model. Instead, it relies on prior knowledge from training data to make an educated guess about the voice. This method works extremely well for a lot of voices. However, if you are trying to clone a highly distinctive voice or accent that the AI might not have heard during training, creating a custom model with explicit training using Professional Voice Cloning (PVC) might be the best option.
On the other hand, Professional Voice Cloning (PVC) allows you to train a hyper-realistic model of a voice. This is achieved by training a dedicated model on a large set of voice data to produce a model that’s indistinguishable from the original voice. Since the custom models require fine-tuning and training, it will take some time before you can use your voice clone. The estimated training time is roughly 3-6 hours for English PVCs and 4-8 hours for non-English PVCs.
Is it possible to train a PVC and then use IVC on that PVC for further adjustments (for instance, when I want the voice to sound like a podcast host)?
The information provided does not explicitly state whether you can use Instant Voice Cloning (IVC) on a Professional Voice Clone (PVC) for further adjustments.
Instant Voice Cloning and Professional Voice Cloning are two different methods of voice cloning. Instant Voice Cloning allows you to clone voices with short samples nearly instantaneously. It does not train or create a custom AI model, but instead, it makes an educated guess based on prior knowledge from training data.
On the other hand, Professional Voice Cloning allows you to train a hyper-realistic model of a voice by training a dedicated model on a large set of voice data. This process takes more time and resources compared to Instant Voice Cloning.
If you have a unique voice or accent, Professional Voice Cloning might be the best option as Instant Voice Cloning might not provide a perfect replication of your voice. However, the information provided does not specify whether you can use Instant Voice Cloning on a voice that has been cloned using Professional Voice Cloning.
How much audio does Instant Voice Cloning need?
Instant Voice Cloning doesn't require a large amount of audio. The total runtime of the audio is more important than the number of samples. Approximately 1-2 minutes of clear audio without any reverb, artifacts, or background noise of any kind appears to be the sweet spot. The AI will attempt to mimic everything it hears in the audio: the speed of the person talking, the inflections, the accent and tonality, the breathing pattern and strength, and also noise, mouth clicks, and other artifacts, which can confuse it.
It's also important to note that the AI will try to replicate the performance of the voice you provide. If you talk in a slow, monotone voice without much emotion, that is what the AI will mimic. On the other hand, if you talk quickly with much emotion, that is what the AI will try to replicate.
It is crucial that the voice remains consistent throughout all the samples, not only in tone but also in performance. If there is too much variance, it might confuse the AI, leading to more varied output between generations.
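Since total runtime matters more than the number of samples, it can help to add up the length of your files before uploading. The following is a minimal sketch that assumes the samples are uncompressed PCM WAV files and uses only Python's standard `wave` module. The function names are hypothetical and are not part of any official tooling.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration is the number of frames divided by frames per second.
        return wav.getnframes() / wav.getframerate()


def total_runtime_seconds(paths):
    """Sum the durations of several WAV samples."""
    return sum(wav_duration_seconds(p) for p in paths)
```

Calling `total_runtime_seconds(["take1.wav", "take2.wav"]) / 60` (file names are placeholders) gives the combined length in minutes. This only measures length; it says nothing about noise, reverb, or consistency of the performance.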
Can I also upload longer audio?
Yes, you can upload longer audio files. However, for Instant Voice Cloning, having more than 2-3 minutes of audio will yield little improvement and can, in some cases, even be detrimental to the stability of the clone. For Professional Voice Cloning, you can use between 30 and 180 minutes of good audio.
It's important to note that the quality of the audio is more important than the length. The audio should be recorded professionally without any background noise, room reverb, multiple speakers, at a consistent volume with a consistent tone, and no extremely long gaps of silence.
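As a rough illustration of the lengths mentioned in this thread, the sketch below compares a total runtime in minutes against those figures. The numbers come from the answers above, but treating them as hard cutoffs is an assumption made for the example, and the names `AUDIO_RANGES_MINUTES` and `assess_total_audio` are hypothetical.

```python
# Ranges in minutes, taken from the answers in this thread.
# Treating them as hard cutoffs is an assumption made for illustration.
AUDIO_RANGES_MINUTES = {
    "ivc": (1.0, 3.0),     # about 1-2 min is the sweet spot; beyond 2-3 min adds little
    "pvc": (30.0, 180.0),  # 30 to 180 minutes of good audio
}


def assess_total_audio(minutes, method):
    """Return 'below', 'within', or 'above' for the given cloning method."""
    low, high = AUDIO_RANGES_MINUTES[method.lower()]
    if minutes < low:
        return "below"
    if minutes > high:
        return "above"
    return "within"


for method, minutes in [("ivc", 1.5), ("ivc", 10), ("pvc", 45)]:
    print(method, minutes, assess_total_audio(minutes, method))
```

A result of "within" only reflects length. As noted above, audio quality matters more than duration.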
What are the recommended gaps of silence?
The content provided does not specify the recommended gaps of silence for audio uploads.
Is there a sample input file available?
I'm sorry, but the provided content does not include information about a sample input file for voice cloning.
How many IVCs am I allowed to make?