I'm able to get pretty close to a cloned voice; however, the system doesn't match speed, speech cadence, tone, etc. Fortunately, these are very easy (and reasonable) controls to provide. They're easily done in an external audio editor, but it would be appropriate to add them to the UI for fine-tuning the end result. Is this something that is being planned?
#Improved controls and adjustments needed
@worn solar speed control is on the roadmap
So, I can bring these into other programs for post-processing. But what I'm finding is that there is a LOT of trial-and-error with your AI engine. Working with those variables and not being able to set a specific "setting" for a fixed result can be frustrating, but as with all things it's growing (I get it). BUT, it eats up your credits (that's an issue). There are speed (tempo), pitch, and other controls that should be reasonable to implement -- there are plenty of open-source APIs for that, even. I'm guessing you're using a lot of Python on the back end. Anyway, looking forward to more improvements, but consider the credit usage issue -- once things are better defined and more fine-grained, I think you can start eating up the credits, but for the trial-and-error stuff, we need some wiggle room.
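To make the speed/pitch point concrete, here's a rough sketch of the simplest kind of control I mean, in plain Python with no audio libraries. It's only an illustration and says nothing about how the engine actually works: `change_speed` and `semitones_to_factor` are made-up names, and this naive resampling shifts pitch and tempo together (like changing tape speed). Adjusting them independently needs a proper time-stretch algorithm such as a phase vocoder.

```python
import math


def change_speed(samples, factor):
    """Resample a mono signal by linear interpolation.

    factor > 1 plays faster (and higher in pitch); factor < 1 plays
    slower (and lower). Hypothetical helper, for illustration only.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    n_out = max(1, int(round(len(samples) / factor)))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        # Position in the input signal for this output sample,
        # clamped so we never read past the final sample.
        pos = min(i * factor, last)
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def semitones_to_factor(semitones):
    """Convert a pitch shift in semitones to a speed factor (12 per octave)."""
    return 2.0 ** (semitones / 12.0)


# One second of a 220 Hz test tone at an assumed 16 kHz sample rate.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]

faster = change_speed(tone, 1.25)  # 25% faster, shorter clip
slower = change_speed(tone, 0.8)   # 20% slower, longer clip
```

A fixed numeric setting like `factor=1.25` is exactly the kind of repeatable knob that would cut down on the trial-and-error.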
So, other features include emotion -- like humor and the inflection used in telling jokes, anger, frustration, and other vocal nuances that will become important for an AI engine like this.