I'm a beginner in RVC training, and I'm trying to figure out what I'm doing wrong.
The issue: when I train a voice model on the 42-minute dataset I created, as shown in the image provided, the results are very bad. The model can't even form words, and when I load models created by other people, they sound noticeably better than mine.
Computer Specs:
GPU: RTX 5070 Ti
CPU: i7-14700K
RAM: 32 GB DDR5
App Used For Training: Applio V3.6.0
Pretrain: Legacy Core V1.5
(I'm unable to upload my dataset audio since the file is too big to attach)
(I truncated the silence and normalized the audio with Audacity)
(Also, I did try the default Applio settings for preprocessing the dataset, and it was still just as bad as my custom settings)