Hello,
I am trying to create a model of a character from the game Mordhau. I want the AI model to keep the intonation/accent that makes the character's voice unique.
I have attached three audio files to demonstrate:
1: A sample from the original audio that I trained on (about 5 minutes in total).
2: A sample line generated with so-vits-fork using auto pitch (Dio).
3: A sample generated with RVC using a 700-epoch Dio V2 model. I've also trained a 250-epoch Harvest V2 model. The two sound very similar and share the same issue: both sound fairly lifeless.
I recently switched from so-vits-fork to RVC, hoping it would give better results, but I was disappointed to find that the output still sounds lifeless.
I've tried most of the settings under "Model inference", including the different pitch extraction algorithms. The one exception is the feature index settings, which I don't understand; I just chose the corresponding one in the dropdown.
Does anyone have any tips? I feel like I must be missing something in RVC.
