[Solved] Model sounds worse after cleaning dataset | AI HUB | Page 1

cunning cradle Feb 18, 2024, 10:08 PM

#

i can't listen to the samples rn because i'm at school

#

Things you can improve:

#

Use MDX 23C 8K FFT instead of Kim Vocal 2
Denoise once instead of twice, because a second pass might compress the audio further and lose some detail
Don't normalize your audio if it's already loud enough
Handpick 20 minutes of clean audio from your processed dataset
Finally, train on it, then compare with your previous versions

#

also i would like to see spectrograms of your dataset, you can check it via Audacity or Spek
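
For anyone who wants a number rather than a picture, here is a rough sketch of what a spectrogram check looks for: the highest frequency that still carries energy in a frame of audio. It uses a naive DFT from the standard library so it runs anywhere, but it is slow and only meant for short frames. The helper names and the 1% threshold are assumptions for illustration, not something Audacity or Spek exposes.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..n/2 (fine for short frames only)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def highest_active_hz(frame, sample_rate, rel_threshold=0.01):
    """Highest frequency whose magnitude is at least rel_threshold of the peak.

    rel_threshold is an assumed cutoff; tune it for real recordings.
    """
    mags = magnitude_spectrum(frame)
    peak = max(mags)
    if peak == 0:
        return 0.0  # silent frame, nothing to report
    active_bins = [k for k, m in enumerate(mags) if m >= rel_threshold * peak]
    return max(active_bins) * sample_rate / len(frame)

# Usage: a 1000 Hz sine sampled at 8000 Hz, 64 samples long.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]
top_hz = highest_active_hz(frame, sample_rate)
```

On a real dataset you would run this over many frames and look at where the energy stops, which is the same thing you read off the top edge of a spectrogram.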

#

i'm at school rn, will talk to you later

#

no worries mate

cunning cradle Feb 19, 2024, 3:50 AM

#

the part from 3:40 to 6:40 is sus

#

also for this one you need to truncate silence
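
A minimal sketch of what truncating silence does, assuming the audio is already loaded as a list of float samples in the range -1.0 to 1.0. The function name, the 0.01 threshold, and the 0.5 second cap are all illustrative assumptions and not Audacity's actual settings: any quiet run longer than the cap is shortened to the cap.

```python
def truncate_silence(samples, sample_rate, threshold=0.01, max_silence_s=0.5):
    """Shorten every run of quiet samples to at most max_silence_s seconds.

    A sample counts as quiet when its absolute value is below threshold.
    Both defaults are assumed values for illustration.
    """
    max_run = int(max_silence_s * sample_rate)
    out = []
    quiet_run = 0
    for s in samples:
        if abs(s) < threshold:
            quiet_run += 1
            if quiet_run <= max_run:
                out.append(s)  # still within the allowed silence length
        else:
            quiet_run = 0
            out.append(s)
    return out

# Usage: 10 silent samples at 4 Hz (2.5 s) get cut down to 2 samples (0.5 s).
shortened = truncate_silence([0.5] + [0.0] * 10 + [0.5], sample_rate=4)
```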

austere marsh Feb 19, 2024, 5:27 PM

#

since no one has mentioned this yet: I see that you have static noise above 15 kHz, which can cause artifacts. never use models that have static noise in the first place. Also, the secret sauce for this is to split your audio into 10-30 second clips and normalize each clip. yes, it might sound like too much work, but that's how it is.

#

still not sure if there's bulk editing for normalization of audio
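
On bulk normalization: the split-and-normalize step is simple enough to script. This is a rough sketch, assuming the audio is already loaded as a list of float samples in the range -1.0 to 1.0 and that "normalize" means peak normalization. The function names and the 0.95 target peak are assumptions for illustration.

```python
def split_into_clips(samples, sample_rate, clip_seconds=10.0):
    """Cut samples into consecutive clips; the last clip may be shorter."""
    clip_len = int(clip_seconds * sample_rate)
    if clip_len < 1:
        raise ValueError("clip_seconds * sample_rate must be at least 1 sample")
    return [samples[i:i + clip_len] for i in range(0, len(samples), clip_len)]

def peak_normalize(clip, target_peak=0.95):
    """Scale a clip so its loudest sample hits target_peak (assumed value)."""
    peak = max((abs(s) for s in clip), default=0.0)
    if peak == 0.0:
        return list(clip)  # silent clip, nothing to scale
    gain = target_peak / peak
    return [s * gain for s in clip]

def normalize_clips(samples, sample_rate, clip_seconds=10.0, target_peak=0.95):
    """Split into clips, then peak-normalize each clip independently."""
    clips = split_into_clips(samples, sample_rate, clip_seconds)
    return [peak_normalize(c, target_peak) for c in clips]

# Usage: tiny example with 1-second clips at 2 samples per second.
clips = normalize_clips([0.1, -0.2, 0.05, 0.4], sample_rate=2, clip_seconds=1.0)
```

Each clip gets its own gain, so a quiet passage and a loud one end up at the same peak level.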

#

another thing: use the correct sample rate for training based on the maximum frequency you see on your spectrogram. this prevents the HiFi-GAN from predicting sounds that weren't present in the dataset, which is what causes robotic sibilance like the 'S' sounds

#

32k = 16 kHz
40k = 20 kHz
48k = 24 kHz
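
The pairs above are the Nyquist limit: a sample rate can only represent frequencies up to half its value. A small sketch of picking a training rate from the highest frequency in your spectrogram, assuming the rule is "lowest listed rate that still covers the content". The helper names are hypothetical and the rate list is just the three rates from the thread.

```python
# The three training sample rates listed in the thread, in Hz.
TRAINING_RATES_HZ = (32000, 40000, 48000)

def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

def pick_training_rate(max_content_hz, rates=TRAINING_RATES_HZ):
    """Lowest rate whose Nyquist limit covers max_content_hz.

    Falls back to the highest rate if the content exceeds all of them.
    """
    for rate in sorted(rates):
        if nyquist_hz(rate) >= max_content_hz:
            return rate
    return max(rates)

# Usage: a dataset whose real content stops around 15 kHz.
rate = pick_training_rate(15000)
```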

cunning cradle Feb 19, 2024, 6:13 PM

#

austere marsh still not sure if there's bulk editing for normalization of audio

you can do that in vegas pro

austere marsh Feb 20, 2024, 3:40 PM

#

just match your pretrain's sample rate to your audio dataset's to avoid artifacts

#

32K is your best bet

oblique pagoda Feb 21, 2024, 12:53 PM

#

[Solved] Model sounds worse after cleaning dataset

cunning cradle Feb 21, 2024, 8:27 PM

#

it always uses a pretrain
