#How to get rid of AI vocal tone
can i refine my inference to prevent it somehow?
ive heard examples of almost flawless sounding ais without that sort of tonality
i feel like there has to be something i can do to aid it
Use RMVPE
That's how it sounds using RMVPE during inferencing
Try decreasing index ratio
It will have less accent but will sound better
And make sure you're using clean .wav vocals
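A rough illustration of why lowering the index ratio trades accent for cleanliness. This is a sketch that assumes the ratio works as a simple mixing weight between features retrieved from the training index and features taken from the input audio. The function name and the plain-list representation are made up for illustration, and your tool's actual implementation may differ.

```python
def blend_features(original, retrieved, index_ratio):
    """Linearly mix two equally long feature lists.

    ASSUMPTION: index_ratio is a weight in [0, 1], where 1.0 means
    "use only the features retrieved from the index" and 0.0 means
    "use only the features from the input audio".
    """
    if not 0.0 <= index_ratio <= 1.0:
        raise ValueError("index_ratio must be between 0 and 1")
    return [
        index_ratio * r + (1.0 - index_ratio) * o
        for o, r in zip(original, retrieved)
    ]


if __name__ == "__main__":
    # A lower ratio keeps the result closer to the input features.
    print(blend_features([0.0, 1.0], [1.0, 0.0], 0.25))
```

Under that assumption, a lower ratio leans less on the trained voice's features (less accent) and also pulls in fewer artifacts from the index.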
this is the input for inferencing, is this clean enough?
Don’t post copyrighted content
Delete
Make sure it’s wav or flac
(Don’t convert mp3 to wav that won’t do anything)
Audio downloaded from YouTube is compressed too
Make sure to isolate vocals
I always do that using BS-Roformer via UVR
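The format advice above can be partly checked with a quick script. This is only a sketch: it reads the first bytes of each file to guess the container, so it catches an MP3 that was merely renamed, but it cannot tell whether a real WAV was converted from a lossy source. The function names are hypothetical and the folder path is whatever you pass in.

```python
from pathlib import Path


def sniff_audio_format(header: bytes) -> str:
    """Guess the container type from the first 12 bytes of a file."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"fLaC":
        return "flac"
    # MP3 files start with an ID3 tag or an 11-bit frame sync (0xFFE...).
    if header[:3] == b"ID3":
        return "mp3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


def find_suspect_files(folder: str) -> list:
    """Return names of files in `folder` that are not really WAV or FLAC."""
    suspects = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        with path.open("rb") as fh:
            kind = sniff_audio_format(fh.read(12))
        if kind not in ("wav", "flac"):
            suspects.append(path.name)
    return suspects
```

Calling `find_suspect_files` on a dataset folder lists anything that should be replaced with a lossless source rather than re-encoded.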
k
That is definitely an improvement, is there anything I can do to aid it more, or is that about the max I can do?
I don't think so.
The only way to get an almost 1:1 model of an artist done is using raw vocal takes from sessions.
At least i think so.
That is using raw vocal takes
The only problem with the session is that a lot of takes are the same line repeated
Just in a different tone or way
Does it make sense to include that in my dataset?
I think it doesn't hurt to use repeated lines with different tone.
As long as these are raw
Then it should be fine.
How much data is too much data though?
I'm refusing to put some sessions in because they were recorded differently
as in the acoustics and whatnot
but i have a good amount of raw raw data
Umm.. over 30-35 mins is too much data
With less than 30 mins it should be fine if the audio is raw anyway
But you can also use as much data as you want
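To see where a dataset sits relative to that rough 30 minute mark, the clip lengths can be summed. A minimal sketch using only the standard library: it assumes the dataset is a flat folder of PCM `.wav` files (Python's `wave` module does not read FLAC or float WAVs), and the 30 minute limit is just the ballpark mentioned above, not a hard rule.

```python
import wave
from pathlib import Path

LIMIT_MINUTES = 30  # ballpark from the discussion, not a hard rule


def wav_seconds(path) -> float:
    """Length of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def summarize(durations_s, limit_minutes=LIMIT_MINUTES):
    """Return (total minutes rounded to 2 places, whether it exceeds the limit)."""
    total_min = sum(durations_s) / 60
    return round(total_min, 2), total_min > limit_minutes


def dataset_minutes(folder):
    """Sum the length of every .wav directly inside `folder`."""
    durations = [wav_seconds(p) for p in sorted(Path(folder).glob("*.wav"))]
    return summarize(durations)
```

`dataset_minutes` returns the total in minutes plus a flag saying whether it is past the limit, which makes it easy to decide which sessions to leave out.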
Alrightie, sounds good!
I'll train with more data and we will see how it goes
It's a matter of testing
Of course, different variants can turn out better
How do I know what to fine tune/change?
I can't really help you on that.
Only if your output sounds kinda noisy, you could double check the dataset and clean it further
Alright, running it with 28 minutes of raw session data
Wish me luck
OOPS swapped d and g
LMAO
we're good now
its slightly different
1 - I would suggest retraining with Titan 32k
2 - You swapped the paths to the G and D files.
The G path is where the path to the G file goes, not the D file
In the G path, change the D file for the G file, and in the D path, change the G file for the D file
3 - Maybe you can also try retraining with KLM4 Test 2
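The G/D mix-up in point 2 is easy to catch before training starts. This is a sketch under one big assumption: that the pretrain filenames mark the generator and discriminator with a standalone `G` or `D`, as in the hypothetical names `f0G40k.pth` and `f0D40k.pth`. Files named differently will simply not be recognized, and the function names are made up for illustration.

```python
import re
from pathlib import PurePath

# ASSUMPTION: the role letter sits at the start of the name or after "_"/"-",
# optionally prefixed by "f0", and is followed by a digit, "_", "-" or the end.
_ROLE = re.compile(r"(?:^|[_\-])(?:f0)?([GD])(?=$|[_\-\d])")


def guess_role(filename):
    """Return "G", "D", or None if the filename gives no clear hint."""
    match = _ROLE.search(PurePath(filename).stem)
    return match.group(1) if match else None


def check_pretrain_paths(g_path, d_path):
    """List likely mistakes in the two pretrain path fields."""
    problems = []
    if guess_role(g_path) == "D":
        problems.append("G path looks like a D file")
    if guess_role(d_path) == "G":
        problems.append("D path looks like a G file")
    return problems
```

An empty list means nothing looks swapped; it does not prove the files are the right pretrain.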
I did, and I fixed that right after
Why is that?
And this?
SCRFilms (a contributor) and I both tested it, and 32k is kinda less artifacty
If your dataset has little to no noise, you can use that pretrain since it's trained on singing vocals.
Oh okay, I'll attempt with both tonight and I'll share my results here again
Thank you so much for your help btw! I really appreciate it
trying with klm4 test 2
Is it the repo im using to train?
Theres no option for 32
Try clicking on v1 and then click on V2 again
That fixed it!
I wonder why its like that
Maybe it's a GUI bug
i presume so
thats pretty strange though haha
alright ill retrain and share my results
wish me luck!
In case of anything you can also use the #✨│ai-help channel.
I will if needed, thank you!
im happy with this model :D thank you for your help!!
Wow, sounds almost real.
having raw sessions helped a lot
And using an HQ pretrain like KLMV2 helped too.
Yup especially klm 4 v2
yeah i noticed that i just followed what you said haha
now to record in his tonality 😭
thats gonna be a process in and of itself
Also, when showing samples, please don't use instrumentals.
Delete the files where you used insts.
done!