#Which Data to feed the model? (languages, singing, speaking, shouting, etc)

maiden berry
#

I trained my model with a few songs and a long speech. The dataset is over an hour in total. Sometimes the voice model doesn't seem to pick up the correct pitch or volume when I use it. Does it make sense to record singing, or data from the same person in multiple languages, or something like shouting as well? Or does that confuse the AI? Up till now I've trained it with some ballads and some normal narrator-type readings.

timber shoal
#

Yes, it does make sense to record singing, as it gives RVC more data in the higher pitch range

#

Multiple languages are also fine, I guess

#

I wouldn't put too much shouting in it though, it could affect the normal voice overall

#

Just make sure it's not a big part of the dataset

#

...or make a whole shouting model

#

Also, you shouldn't really use datasets over an hour long. Try going for 20-25 mins of that, it might be better than the full hour
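
If it helps, here is a rough sketch of how you could trim a folder of clips down to roughly that length. It is not part of RVC; it only uses Python's standard `wave` module and assumes your clips are plain WAV files. The names `wav_duration`, `pick_clips`, `pick_from_folder` and the 25-minute budget are made up for illustration.

```python
import wave
from pathlib import Path

# Assumption: 25 minutes, the upper end of the 20-25 minute suggestion.
TARGET_SECONDS = 25 * 60


def wav_duration(path):
    """Return the length of a WAV file in seconds."""
    with wave.open(str(path), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def pick_clips(durations, target_seconds=TARGET_SECONDS):
    """Keep clips, in the given order, while they still fit the budget.

    `durations` is a list of (name, seconds) pairs. A clip that would
    push the running total past the budget is skipped.
    """
    chosen, total = [], 0.0
    for name, seconds in durations:
        if total + seconds <= target_seconds:
            chosen.append(name)
            total += seconds
    return chosen, total


def pick_from_folder(folder, target_seconds=TARGET_SECONDS):
    """Measure every *.wav in `folder` (sorted by name) and pick a subset."""
    clips = [(p.name, wav_duration(p)) for p in sorted(Path(folder).glob("*.wav"))]
    return pick_clips(clips, target_seconds)
```

Something like `pick_from_folder("dataset")` (a hypothetical folder name) would then give you the file names to keep plus their total length in seconds. The order matters here, so put the clips you care about most first, for example by naming them so they sort to the top.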