# [Solved] Model sounds worse after cleaning dataset

cunning cradle
#

I can't hear the samples rn because I'm in school

#

Things you can improve:

#
  • Use MDX 23C 8K FFT instead of Kim Vocal 2
  • Denoise once instead of twice, because a second pass might compress the audio further so it loses some detail
  • Don't normalize your audio if it's already loud enough
  • Handpick 20 minutes of clean audio from your processed dataset
    Finally, train it, then compare with your previous versions
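
For the "loud enough already" point, a rough sketch of a peak check is below. It assumes the audio is already loaded as float samples in the range -1.0 to 1.0. The -6 dBFS cutoff and the function names are illustrative assumptions, not values from this thread.

```python
import math

def peak_dbfs(samples):
    """Peak level in dBFS for float samples in the range [-1.0, 1.0]."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return float("-inf")  # pure digital silence has no finite level
    return 20.0 * math.log10(peak)

def needs_normalization(samples, threshold_db=-6.0):
    # threshold_db is an assumed cutoff; pick whatever "loud enough" means for you
    return peak_dbfs(samples) < threshold_db

loud = [0.0, 0.9, -0.8, 0.5]
quiet = [0.0, 0.05, -0.04, 0.02]
print(needs_normalization(loud))   # False
print(needs_normalization(quiet))  # True
```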
#

also I would like to see spectrograms of your dataset, you can check it via Audacity or Spek

#

I'm at school rn, will talk to you later

#

no worries mate

cunning cradle
#

the part from 3:40 to 6:40 is sus

#

also for this one you need to truncate silence
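
If you would rather script the silence truncation, here is a minimal sketch. It assumes float samples in the range -1.0 to 1.0. The amplitude threshold and the maximum silence length are made-up defaults to tune by ear, and `truncate_silence` is a hypothetical helper, not a function from any tool mentioned here.

```python
def truncate_silence(samples, sample_rate, threshold=0.01, max_silence_s=0.3):
    """Shorten every run of near-silent samples to at most max_silence_s seconds."""
    max_run = int(max_silence_s * sample_rate)
    out = []
    run = 0  # length of the current silent run, in samples
    for s in samples:
        if abs(s) < threshold:
            run += 1
            if run > max_run:
                continue  # drop silence beyond the allowed length
        else:
            run = 0  # a non-silent sample ends the run
        out.append(s)
    return out

# 10 silent samples between two loud ones, at a toy rate of 10 samples per second
audio = [0.5] + [0.0] * 10 + [0.5]
trimmed = truncate_silence(audio, sample_rate=10, max_silence_s=0.5)
print(len(audio), len(trimmed))  # 12 7
```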

austere marsh
#

Since no one has mentioned this yet: I see that you have static noise above 15 kHz, which can cause artifacts. Never use models that have static noise in the first place. Also, the secret sauce for this is to split your audio into 10-30 second clips and normalize each clip. Yes, it might sound like too much work, but that's how it is.

#

still not sure if there's bulk editing for normalization of audio
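
The split-and-normalize step can be scripted instead of done by hand. The sketch below assumes the audio is already loaded as a list of float samples in the range -1.0 to 1.0. The 10 second clip length comes from the 10-30 second range above, while the 0.95 target peak and the function name are assumptions for illustration.

```python
def split_and_normalize(samples, sample_rate, clip_seconds=10.0, target_peak=0.95):
    """Cut audio into fixed-length clips and peak-normalize each clip on its own."""
    clip_len = int(clip_seconds * sample_rate)
    if clip_len <= 0:
        raise ValueError("clip_seconds * sample_rate must be at least 1 sample")
    clips = []
    for start in range(0, len(samples), clip_len):
        clip = samples[start:start + clip_len]  # the last clip may be shorter
        peak = max(abs(s) for s in clip)
        if peak > 0.0:
            gain = target_peak / peak  # assumed target; leaves a little headroom
            clip = [s * gain for s in clip]
        clips.append(clip)
    return clips

# toy example: 2 samples per second, 1 second clips
clips = split_and_normalize([0.1, -0.2, 0.4, 0.05], sample_rate=2, clip_seconds=1.0)
print(len(clips))  # 2
```

Each clip gets its own gain, so a quiet passage ends up at the same peak as a loud one. A fully silent clip is left untouched rather than amplified.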

#

Another thing: use the correct sample rate for training, based on the maximum frequency response on your spectrogram. That keeps HiFi-GAN from predicting sounds that weren't present in the dataset, which is what causes robotic sibilance like the 'S' sounds.

#

32K = 16 kHz
40K = 20 kHz
48K = 24 kHz
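
The mapping above is just the Nyquist limit: a sample rate can only represent frequencies up to half its value. A small sketch of picking a rate from the highest frequency visible on the spectrogram is below. The rule "lowest rate that still covers the content" is one reading of the advice here, and the names are hypothetical.

```python
# label -> sample rate in Hz; the usable bandwidth (Nyquist limit) is half the rate
SAMPLE_RATES = {"32k": 32000, "40k": 40000, "48k": 48000}

def nyquist_hz(sample_rate):
    return sample_rate / 2

def pick_sample_rate(max_content_hz):
    """Lowest listed rate whose Nyquist limit covers max_content_hz."""
    for label, rate in sorted(SAMPLE_RATES.items(), key=lambda kv: kv[1]):
        if nyquist_hz(rate) >= max_content_hz:
            return label
    return "48k"  # content above 24 kHz: fall back to the highest option

print(pick_sample_rate(15000))  # 32k
print(pick_sample_rate(18000))  # 40k
print(pick_sample_rate(22000))  # 48k
```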

austere marsh
#

just match your pretrain sample rate to your audio dataset to avoid artifacting

#

32K is your best bet

cunning cradle
#

it always uses a pretrain