Simple help | AI HUB

distant dawn Jan 27, 2026, 10:21 AM

I’m trying to make a singing model and I’m confused by all the 32kHz or 40kHz or 48kHz stuff, as well as the RefineGAN and HiFi-GAN stuff and the pretrains.
Can anyone simply lay out what is best to use for a singing model?
I.e. what sample rate, RefineGAN or HiFi-GAN, and which pretrain should I use?
Thanks

spiral prawn Jan 28, 2026, 3:08 PM

distant dawn I’m trying to make a singing model and I’m confused by all the 32khz or 40khz or...

32000, 40000 and 48000 Hz are the audio sample rates used in RVC voice models. I'm not sure what your dataset audio looks like as a frequency spectrum (I mean the spectrum, not the actual file), but the right choice depends on it.

golden rain Jan 29, 2026, 10:31 AM

Check the frequency spectrum of your dataset and find the highest frequency that still has real content (ignore outliers, if any). Multiply that frequency by 2 and choose the next sample rate at or above the result. E.g. if your spectrogram tops out at 16kHz, choose 32kHz. If it tops out at 18, choose 40 or 32. You can go lower, but shouldn't go higher than needed (e.g. if it tops out at 20, don't choose 48, as 40kHz already covers the whole spectrum).
It'll also probably depend on the pretrain, but supposedly 32kHz is in many cases superior to the higher sample rate variants in terms of sibilants and breath noises.
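
The rule of thumb above can be sketched in a few lines of Python. This is only an illustration, assuming the three sample rates mentioned in this thread and a cutoff frequency you have already read off the spectrogram yourself. The name `pick_sample_rate` is hypothetical and not part of RVC or any other tool.

```python
# Sample rates discussed in this thread, in Hz.
RATES_HZ = (32000, 40000, 48000)


def pick_sample_rate(cutoff_hz, rates=RATES_HZ):
    """Return the smallest rate that is at least 2 * cutoff_hz.

    cutoff_hz is the highest frequency with real content, as read off
    the spectrogram (outliers ignored). If no rate is high enough,
    fall back to the highest one available.
    """
    needed = 2 * cutoff_hz  # a sample rate can represent content up to half its value
    for rate in sorted(rates):
        if rate >= needed:
            return rate
    return max(rates)


if __name__ == "__main__":
    for cutoff in (16000, 18000, 20000):
        print(cutoff, "->", pick_sample_rate(cutoff))
```

Treat the result as an upper bound rather than a recommendation: it only encodes the "don't go higher than needed" part, and going one step lower (e.g. 32kHz for an 18kHz cutoff) may still be the better choice.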

Honestly just try various stuff and compare it, the best way to learn

distant dawn Jan 29, 2026, 11:19 AM

golden rain Honestly just try various stuff and compare it, the best way to learn

It’s hard to try things when you only have limited compute 😭

golden rain Jan 29, 2026, 11:34 AM

Understandable. Especially if your dataset is not very small.
I think a good start is 32k with the original pretrain, or perhaps legacy core 1.5.

(bear in mind that I'm by no means a pro in the field, I'm still learning as well)