# Is There Any 40K Pretrained Model for High-Pitch Singing Inference?


burnt moss
#

My GPU does not work with Python 3.10 or above, so I am using Mainline RVC, not Applio. That means I can only use the standard HiFi-GAN vocoder, not RefineGAN.

My training dataset covers low, mid, and high pitch ranges in singing, but it contains little high-pitch material.

I'm hoping to use a pretrained model that already covers the high pitch range, so that after training, when I infer high-pitch vocals, that range is covered and the result has no electric-sounding artifacts.

So far, I have found KLM 4.2 by Seoul and SingerPretrain by Sztef, which both cover high pitch. However, both models are 32K only.

If my inference target is singing that is 1/3 English, 1/3 Chinese, and 1/3 Japanese, at standard 40K in Mainline RVC, which pretrained model would be the best choice for this high-pitch coverage goal?

burnt moss
#

So, you are saying that X4 is better than X1, X2, and X3 for KLM 4.3?

burnt moss
#

Yes, we installed Linux Mint in a virtual machine, and it is pretty good!

#

However, Mainline RVC is so solid on Python 3.8 that we simply don't bother building a new Torch for Python 3.10 for Applio and Mangio.

burnt moss
#

I don't have much high-pitch singing in my training dataset, but I want to infer high-pitch singing. To avoid artifacts, should I go with KLM 4.3 X4?

#

The KLM 4.2 description marked in yellow is exactly what I want. I wonder: does KLM 4.3 have the same high-pitch feature as 4.2, or is 4.3 a completely different model?

buoyant knot
# burnt moss

then why not try it, as well as some of the others mentioned above

#

also there are cloud alternatives for training, e.g. some Colab and Kaggle notebooks

burnt moss
#

Most of our sources are recorded at 44.1K, so a 40K model is the most suitable.
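
As a rough illustration of the sample-rate reasoning above, here is a minimal sketch of the bandwidth each model rate can carry. It assumes "32K", "40K", and "44.1K" refer to sample rates in Hz, as is usual for RVC models; the helper name `nyquist_hz` is made up for this example.

```python
# Nyquist limit: a signal sampled at fs can represent frequencies up to fs / 2.
def nyquist_hz(sample_rate_hz: float) -> float:
    return sample_rate_hz / 2.0

source_rate = 44_100  # sample rate of the recordings mentioned above

for model_rate in (32_000, 40_000):
    # Bandwidth that survives is limited by whichever rate is lower.
    retained = min(nyquist_hz(model_rate), nyquist_hz(source_rate))
    print(f"{model_rate} Hz model keeps content up to {retained:.0f} Hz")
```

This only compares the theoretical bandwidth; it says nothing about how well a given pretrained model handles high pitch.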

buoyant knot
#

I mean try whatever is available so far, as there might be nothing that satisfies 100% of your needs

burnt moss
#

Also, Colab recently updated to Python 3.11 and CUDA 12.4.
I can run training on the native Python 3.11.
I can also install a Conda environment and run Colab on Python 3.8 or 3.10.
Both are working.
Which Python version would you choose for training, if you were me?

buoyant knot
#

regarding the dynamic range, compressor and limiter effects on vocals may reduce it

burnt moss
#

My understanding is that the narrower the dynamic range, the better. Am I right?

burnt moss
#

You want to apply compression and normalization to keep the volume as consistent as possible, right?

#

You want to normalize to -23.0 LUFS, right?
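
For reference, a minimal sketch of the arithmetic behind normalizing to a loudness target. It assumes the integrated loudness of the file has already been measured with some loudness meter; the -18.5 LUFS input is a made-up example value, and `gain_to_target` is a hypothetical helper, not part of RVC.

```python
def gain_to_target(measured_lufs: float, target_lufs: float = -23.0):
    """Return (gain in dB, linear gain factor) needed to reach the target."""
    gain_db = target_lufs - measured_lufs       # negative means turn down
    linear_gain = 10 ** (gain_db / 20.0)        # dB to amplitude factor
    return gain_db, linear_gain

# Hypothetical measurement: a vocal take measured at -18.5 LUFS.
gain_db, linear_gain = gain_to_target(-18.5)
print(f"apply {gain_db:+.1f} dB (multiply samples by {linear_gain:.3f})")
```

The same factor is applied to every sample, so this step changes the overall level only.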

burnt moss
#

Got you hater!
You mean: normalization changes only the overall level and does not alter the relative relationships inside the waveform.
Compression does!
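
The distinction above can be sketched with a toy example. This is a simplified illustration, assuming peak normalization and a static, sample-by-sample compressor working on linear amplitude (no attack or release, which real compressors have); the function names and the threshold and ratio values are invented for the example.

```python
def normalize_peak(samples, target_peak=1.0):
    # One gain for the whole signal: every sample is scaled by the same factor.
    peak = max(abs(s) for s in samples)
    gain = target_peak / peak
    return [s * gain for s in samples]

def compress(samples, threshold=0.5, ratio=4.0):
    # Toy compressor: the part of each sample above the threshold is divided by ratio.
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out

quiet, loud = 0.2, 0.8
samples = [quiet, -loud, loud, -quiet]

norm = normalize_peak(samples)
comp = compress(samples)

ratio_before = loud / quiet        # loud-to-quiet relationship in the source
ratio_norm = norm[2] / norm[0]     # same relationship after normalization
ratio_comp = comp[2] / comp[0]     # same relationship after compression

print(ratio_before, ratio_norm, ratio_comp)
```

In this sketch the loud-to-quiet ratio is unchanged by normalization and becomes smaller after compression, which is the reduction in dynamic range discussed earlier in the thread.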
