#Pretrain Merge Dataset


cinder brook
#

Okay, so a lot of pretrains have been released recently. What if we merged the datasets from the different pretrains (Italian, Russian, VCTK, Korean, etc.) and trained on all of them? I'm not talking about model merging, but about merging the datasets to train a single model.

The idea is to have diverse audio: different voices, different frequency responses, etc., not just a pretrain that is specialized for certain languages.

I've noticed that some pretrains sound thin or lack mic pops, some are raspy or have weird static artifacts above 12 kHz, some have too much clarity, etc.
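
To make the dataset-merge idea concrete, here is a minimal sketch of pooling several dataset folders into one shuffled training file list. The folder layout, the labels, and the `max_per_source` cap are assumptions for illustration only; nothing here reflects how any existing pretrain was actually built.

```python
import random
from pathlib import Path


def build_merged_filelist(source_dirs, max_per_source=None, seed=0):
    """Collect .wav paths from several dataset folders into one shuffled list.

    source_dirs: mapping of label -> folder path (hypothetical layout).
    max_per_source: optional cap so one large dataset cannot dominate the mix.
    """
    rng = random.Random(seed)
    merged = []
    for label, folder in source_dirs.items():
        # Sort first so the result is reproducible for a given seed.
        files = sorted(Path(folder).rglob("*.wav"))
        rng.shuffle(files)
        if max_per_source is not None:
            files = files[:max_per_source]
        merged.extend((label, path) for path in files)
    # Interleave the sources so training batches see mixed material.
    rng.shuffle(merged)
    return merged
```

A call might look like `build_merged_filelist({"italian": "data/italian", "vctk": "data/vctk"}, max_per_source=5000)`, where the paths are placeholders. Capping each source is one possible way to keep a single large corpus from drowning out the smaller ones.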

azure python
cinder brook
#

Also, we can include raw singing samples, not just speech. This website has archived multitrack sessions that contain dry vocal singing as well.

cinder brook
#

If anyone is interested in going through them one by one and getting the lead vox track, that would be a good start.

sharp briar
honest plover
#

a100 go brrrrr

late gyro
manic lagoon
#

That's all I'll say.

lime yacht
#

Rip mustar's gpu 😭

sharp briar
#

Well, let's wait for the results.

slate roost
#

I'd think larger datasets with a broad focus would yield a generalist that's not a master of anything, because of how RVC generalizes.

cinder brook
#

The goal is to have as many vocals as possible while still keeping the datasets high quality, so the audio is diverse and the pretrain is more robust.

The output model's quality will still correlate with the end user's dataset quality.
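
As a rough illustration of the "keep it high quality" part, a first-pass filter could reject files by their header properties before anything is merged. This is only a sketch: the function name and the thresholds are assumptions, and a header check like this cannot catch thin, raspy, or static-laden recordings, which would still need listening or spectral analysis.

```python
import wave


def passes_basic_checks(path, min_sample_rate=40000, min_seconds=1.0):
    """Return True if a PCM WAV file meets minimum sample-rate and length thresholds.

    The thresholds are illustrative assumptions, not values from this thread.
    """
    try:
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
            seconds = wav.getnframes() / rate if rate else 0.0
    except (wave.Error, EOFError, OSError):
        # Unreadable, missing, or non-PCM file: treat it as failing the check.
        return False
    return rate >= min_sample_rate and seconds >= min_seconds
```

Running every candidate file through a check like this before building the merged list would at least keep low-sample-rate or near-empty clips out of the pool.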

slate roost
#

It's not like the D or G files get bigger in response to bigger datasets. I don't understand exactly how it works, but there's a point where it all kind of mushes together no matter how much you put in.