#RefineGAN base pretrains (experimental)


foggy moat
dusty prairie
#

F I R S T

#

H A H A

foggy moat
#

Correction, the model names should look like this

dusty prairie
#

o

median marlin
foggy moat
#

I'll run re-training for 32k tomorrow

#

for now I just want people to try 44k (40k and 48k may need a bit of training too)

#

I made these 40k and 48k as copies, but it seems they do need a bit of re-train to re-adjust parameters

foggy moat
#

RefineGAN VCTK base pretrain (experimental)

#

Note: My plan is to run more epochs on top of the 44k model; the current re-training is at 15e

hardy hornet
#

Is this one a bit undertrained or no

foggy moat
#

likely yes

hardy hornet
#

Oh rip, so you need a dataset longer than 5 to 10 minutes to train with this, right?

foggy moat
#

for sure

#

at best it will be at the level of the og pretrain

hardy hornet
#

The OG pretrain needs more than 10 minutes?

wheat oak
hardy hornet
#

Same

foggy moat
#

Updated 44k weights with 50 epochs

hardy hornet
#

When will there be a full pretrain

foggy moat
#

wdym "full"?

hardy hornet
#

One that is fully pretrained on the original pretrain level

#

Since this one's undertrained or something, right?

foggy moat
#

parts of it are well trained (encoders and flow layers, same as 48k model + 70 epochs), the generator is at ~240 epochs overall

hardy hornet
#

Parts?

foggy moat
#

"This is a combination model that was assembled from the original HiFiGAN 48k pretrain components and a new RefineGAN generator (170e VCTK set)."

hardy hornet
#

So is this like a combination pretrain, or at least a combination of HiFi-GAN and RefineGAN?

foggy moat
#

HiFi-GAN is just a vocoder, like RefineGAN is

#

basically, same car, different engine
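
The "same car, different engine" idea can be sketched as a checkpoint merge: keep every weight from the base model except the vocoder's, and take the vocoder weights from the newly trained generator. This is only a rough illustration, not the script that was actually used. The `dec.` prefix for generator weights, the plain-dict checkpoints, and all parameter names below are assumptions made for the example.

```python
def assemble_combined_weights(base_weights, new_generator_weights, generator_prefix="dec."):
    """Merge two flat weight dicts.

    Keeps encoder/flow entries from `base_weights` and replaces everything
    under `generator_prefix` with entries from `new_generator_weights`.
    The prefix "dec." is an assumption for this sketch.
    """
    # Drop the old generator (vocoder) weights from the base checkpoint.
    combined = {
        name: value
        for name, value in base_weights.items()
        if not name.startswith(generator_prefix)
    }
    # Add only the generator weights from the new training run.
    for name, value in new_generator_weights.items():
        if name.startswith(generator_prefix):
            combined[name] = value
    return combined


# Toy example with made-up parameter names and values.
hifigan_base = {"enc_p.weight": 1.0, "flow.weight": 2.0, "dec.ups.0.weight": 3.0}
refinegan_run = {"enc_p.weight": 9.0, "dec.upsample.0.weight": 4.0}

merged = assemble_combined_weights(hifigan_base, refinegan_run)
print(sorted(merged))
```

In this toy case the encoder and flow entries come from the base dict, and the only `dec.` entry left is the one from the new run.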

hardy hornet
#

So RefineGAN is slightly better right

foggy moat
#

I saw it produce very good results on test sets compared to hifigan

#

not slightly

dusty prairie
#

also refinegan

hardy hornet
#

Is RefineGAN not fully open source

foggy moat
#

vs typical hifigan model

foggy moat
#

anyway, the question is how long it needs to train to get amazing results. I have no idea.. this is why I've published this for people to try

hardy hornet
#

Am I right about HiFi-GAN? It's just a generator network and a discriminator network exposed to a ton of speech samples to learn general speech patterns, and then you can fine-tune that on unseen data. HuBERT or ContentVec is for accent or F0 guidance and turns everything into embeddings for the GAN to synthesize or vocode

hardy hornet
foggy moat
#

not that either

hardy hornet
#

So the only real open source GAN is HiFiGAN

foggy moat
#

main problem is training a pretrain from 0, and we just don't know how it was done originally

#

there are 50 other vocoders all claiming to be superior... at 24000 sampling rate comparisons

hardy hornet
#

What if you move on from GANs to maybe diffusion models like seed VC or using transformers instead of retrieval based voice conversion

dense fractal
#

how long is the dataset for this pretrain?

foggy moat
#

45h VCTK 0.92 set with 109 speakers

dusty prairie
foggy moat
#

no, english only

dusty prairie
foggy moat
#

Jan 14 changes: RefineGAN generator has been updated in Applio repo, removed the old models from huggingface, re-uploaded latest 44k model (can be used for 40/48k at your own risk of wasting time)

foggy moat
#

44k models have been updated to match the latest generator version

dusty prairie
#

yoooooo

foggy moat
#

RefineGAN 44KHz base pretrain (experimental)

lunar crest
foggy moat
#

nope

lunar crest
#

oh

#

i'll wait until someone does it ig

dull wharf
#

Using RefineGAN vocoder
Starting training...
The parameters of the pretrain model such as the sample rate or architecture do not match the selected model.

#

this is what happens when i use the 44k version on a 44k dataset with 44k selected in the model settings
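
The message suggests the checkpoint's stored settings differ from what was picked in the UI, even when both are labeled "44k". A rough sketch of that kind of comparison follows. The field names (`sample_rate`, `vocoder`, `generator_version`), the values, and the function itself are hypothetical and are not Applio's actual code.

```python
def find_pretrain_mismatches(pretrain_info, selected):
    """Return the names of settings that differ between a pretrain and the UI selection."""
    return [
        key
        for key in sorted(selected)
        if pretrain_info.get(key) != selected[key]
    ]


# Made-up values for illustration only.
pretrain_info = {"sample_rate": 44100, "vocoder": "RefineGAN", "generator_version": 1}
selected = {"sample_rate": 44100, "vocoder": "RefineGAN", "generator_version": 2}

mismatches = find_pretrain_mismatches(pretrain_info, selected)
if mismatches:
    print("Pretrain does not match the selected model:", ", ".join(mismatches))
```

Here the sample rate and vocoder agree, so the only difference reported is the hypothetical generator version, the sort of thing that can change between a fork and the main repository.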

#

other refinegan pretrains work just not this one

#

using codenames fork btw if that changes anything

#

but this pretrain is the best one and i really wanna test it out so

foggy moat
dusty prairie
foggy moat
#

good?

dusty prairie
#

yes it's not bad

#

I'll stop when it overtrains or undertrains

foggy moat
#

i'm gonna run some more epochs on top, now that I have a lil better training set

dusty prairie
#

aight

dusty prairie
dusty prairie
#

@foggy moat does the pretrain have a singing voice?

#

cuz i tried it and it sounded bad

#

oh wait i used the wrong model

#

sorry for the ping

foggy moat
foggy moat
#

but I don't recommend high pitch singing though

#

if the audio you're trying to infer has 16KHz+ harmonics, it will be a mess
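
One rough way to check whether a clip carries much energy above 16 kHz before inference is to compare spectral energy above the cutoff with the total. The sketch below uses a naive DFT from the standard library on a very short window, so it is only practical for tiny excerpts. The function names, the window size, and the idea of using an energy ratio as a warning sign are assumptions for illustration, not something taken from the Applio code.

```python
import cmath
import math


def high_band_energy_ratio(samples, sample_rate, cutoff_hz=16000.0):
    """Fraction of spectral energy at or above `cutoff_hz` (naive DFT, short clips only)."""
    n = len(samples)
    total = 0.0
    high = 0.0
    # Skip DC (k = 0) and stop at Nyquist; that is enough for a real-valued signal.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        energy = abs(coeff) ** 2
        total += energy
        if k * sample_rate / n >= cutoff_hz:
            high += energy
    return high / total if total > 0 else 0.0


def sine(freq_hz, sample_rate, n):
    """Generate n samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


SAMPLE_RATE = 44100
N = 441  # 10 ms window, which gives 100 Hz per DFT bin

# A 1 kHz tone alone, and the same tone plus an equally loud 18 kHz tone.
speech_like = sine(1000, SAMPLE_RATE, N)
bright = [a + b for a, b in zip(sine(1000, SAMPLE_RATE, N), sine(18000, SAMPLE_RATE, N))]

print(round(high_band_energy_ratio(speech_like, SAMPLE_RATE), 3))
print(round(high_band_energy_ratio(bright, SAMPLE_RATE), 3))
```

A ratio near zero means almost nothing sits above the cutoff; the second signal splits its energy evenly between the two tones, so about half of it lands above 16 kHz.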

#

for regular speech it is great

dusty prairie
#

yep it's very great

dull wharf
foggy moat
#

latest code, not latest codename's fork 🙂

foggy moat
#

updated 44k model with 24 more epochs of cleaned vctk + 10h of singing

late vault
#

a tad bit confused

foggy moat
#

same way you got the previous

#

pull the latest repository

late vault
#

sorry if this sounds dumb

foggy moat
late vault
foggy moat
#

if you're using windows, there was never a reason to use 'releases'

#

there's a compiled version you just unzip and run, no installation is needed

#

anyway, the refinegan code has not been released yet, but people just take the latest code snapshot and use that

dull wharf
foggy moat
#

If I'm gonna do it, I have to restart from 0

foggy moat
#

RefineGAN 32k/44KHz base pretrains (experimental)

#

added 32k base pretrain, 50e done so far

dusty prairie
foggy moat
#

I have 120e done

dusty prairie
foggy moat
#

uploaded

dusty prairie
dull wharf
foggy moat
#

my plan is to get at least 300

dusty prairie
foggy moat
#

my electric bill will be thru the roof

dusty prairie
#

there was also a pretrain with 1000 hours right?

#

the rigel pretrain?

#

how the hell did he train that thing o - o

median marlin
dusty prairie
#

if he trained that pretrain locally electric bill would go brrrt

foggy moat
#

uploaded 200e 32k model

foggy moat
#

RefineGAN base pretrains (experimental)

#

32k 300e done. 40k/44k/48k have been converted from the 300e 32k model and run for 5e, but they seem to need more training, so maybe 50e when I have time

foggy moat
#

I'm going to do a small update to Applio main for 44k models. If you made any with 44100 sampling rate, that's gonna break.
That's for both MRF and RefineGAN

foggy moat
#

44k model 150e posted

foggy moat
#

Restored 32k 200e weights back from this point

foggy moat
#

yeah

foggy moat
#

Trained 40k model with the latest code for 300 epochs

dusty prairie
#

yay

median marlin
foggy moat
#

yes, i did

median marlin
foggy moat
#

so it is a choice of training a pretrain with or without perfect zero silence, then finetuning with the opposite

median marlin
#

ah i see

median marlin
foggy moat
#

two or 10 mute files don't change much

#

consider them a requirement so the model does not forget how to handle it
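
For reference, a "perfect zero silence" mute file is just a WAV whose samples are all exactly zero. A minimal sketch using only the standard library is below. The file name, the 3-second length, and the 40000 Hz rate are placeholders chosen for the example, not the values any particular training setup expects.

```python
import os
import tempfile
import wave


def write_mute_wav(path, sample_rate=40000, seconds=3.0):
    """Write a mono 16-bit PCM WAV file containing only zero-valued samples."""
    frame_count = int(sample_rate * seconds)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * frame_count)
    return frame_count


# Placeholder output location inside the system temp directory.
mute_path = os.path.join(tempfile.gettempdir(), "mute_example.wav")
frames_written = write_mute_wav(mute_path, sample_rate=40000, seconds=3.0)
print(frames_written)
```

The sample rate of the mute file should match the rate of the dataset it is added to.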

foggy moat
#

posted 32k model, made by converting 40k and training for 20 epochs

dull wharf
foggy moat
median marlin
#

maybe one day we will also get applio's realtime gui with refinegan

foggy moat
#

you need to ask @willow shard 🙂