Hi, I’ve noticed that models consistently struggle with certain sounds. In my tests, picking any model at random and just saying a drawn-out “heyyyyy” is almost guaranteed to break the conversion, even in models that otherwise perform excellently. The result is usually something garbled like “heyyuuuuaaaaaaaeeeee”.
I’m wondering if this is a limitation of RVC2, or just a lack of training data with drawn-out vowels?
I can reproduce this with any model I’ve tried so far, so I don’t think it’s me, but I’d happily try any suggestions.
