#Legacy Core V1.5 [refinegan, hifigan, spinv2, contentvec, 48k, 40k, 32k]

obtuse mantle
#

And what has changed since v1?

languid marlin
#

I believe the model now has the same stability as the original pretrain

obtuse mantle
#

ok good

#

I'm telling you this because I used v1 for my Italian entries.

languid marlin
#

40k version of this pretrain will be posted later

languid marlin
#

Legacy Core V1.5 [hifigan, contentvec, 40k, 32k]

languid marlin
#

Legacy Core V1.5 [refinegan, hifigan, spin, contentvec, 40k, 32k]

#

released legacy core 1.5 for refinegan and hifigan spinv2

languid marlin
#

Legacy Core V1.5 [refinegan, hifigan, spinv2, contentvec, 40k, 32k]

patent osprey
#

I'm gonna test refinegan 32k contentvec and hifigan contentvec tomorrow

hot rune
#

what are the improvements of this over og/default pretrain?

languid marlin
hot rune
languid marlin
#

(in reply to hot rune: "differences between 1.5 and 2.5? also it looks like 2.5 came before 1.5 .. ?")

1.5 was trained with a batch size of 16 on the m4singer dataset, 21 singers (sid 0 and 1 are the same for technical reasons; this is not bad, the og pretrain does the same). 32k and 40k were trained in mainline, 48k is being trained in applio

2.5 was trained with a batch size of 64 on an ai upscaled version of ljspeech (22050 to 48k) plus m4singer without the bass singers; sid 0 and 1 are different, and it was trained in applio

#

i did not name this legacy core 2.6 because the dataset is not the same; this uses the exact same dataset as the first legacy core, which is why i named it 1.5

hot rune
languid marlin
#

you have to train on the same dataset with both pretrains and compare the results yourself

#

both pretrains are universal, they work for both speech and singing

languid marlin
hot rune
languid marlin
#

best is 32k hifigan contentvec

languid marlin
#

Legacy Core V1.5 [refinegan, hifigan, spinv2, contentvec, 48k, 40k, 32k]

#

added 48k hifigan contentvec pretrain

west summit
languid marlin
patent osprey
#

32k tested pretty good!

uncut briar
#

God damn, there's so many

languid marlin
#

The 40k versions have very poor sibilant and breath handling because RVC boss created a poor 40k config.json file (the OG 40k also has poor sibilant and breath handling).

48k has better sibilants than 40k; it still produces artefacts quite often, but not as often as 40k. This is a problem with RVC itself, due to limitations of the architecture. Legacy Core 48k is fine as a pretrain.

The 32k handles unvoiced sounds better than the others because it's closer to the 24k.
Refinegan is... OK, but I prefer NSF Hifigan.

The pretrain works fine though. It's just that I don't like Refinegan; it doesn't sound natural to me.
So, for me, the best options are the 32k and 48k Hifigan ContentVec versions.

Oh, and SpinV2 doesn't seem to work well with Hifigan. In this case specifically, it made my model sound robotic, probably due to technical limitations (RVC is very old and Spin was made recently).

Just use the ContentVec pretrained models for the best results.

dull rune
#

alr gonna try out ur pretrain with my kanye west dataset

#

let you know how it goes

#

using the 48k pretrain, hifigan / contentvec for it

#

sounds pretty good right now, and this was all isolated by the way, 12:47 mins of dataset with a batch size of 4 (trained with rmvpe)

#

dataset did have some distortion in it

dull rune
dull rune
patent osprey
#

Wait, NSF Hifigan, is that something new?

languid marlin
uncut briar
#

What's the recommended batch size?

languid marlin
uncut briar
#

Alr

languid marlin
#

a user told me that this pretrain made their speech model randomly add vibrato to held vowels in realtime
this is normal because i only used singing data to train the pretrain, and realtime inference is more delicate than normal inference; most likely this problem is not present if the inference is done in applio/rvc

uncut briar
#

So 1.5 or 2.5 for realtime then, you think?

plush inlet
#

1.5 is usually not recommended for realtime as it has those "wobbly pitch changes" as lyery mentioned

#

lyery's currently working on 2.9 which is trained on ears.

#

it's available on 48k as of now

steep widget
plush inlet
#

only 48k tho

steep widget
#

oh

#

I shall wait for 32k since that's peak

plush inlet
#

the only problem with 2.9 tho is the raspiness which lyery couldn't fix

steep widget
#

what caused the raspiness?

plush inlet
plush inlet
uncut briar