#✨│ai-help
Hello friends, here is a guide that I wrote about the best way I found on making AI vocals sound natural with more body, more depth and bringing it back to life.
Hmm.... Your guide seems pretty detailed, good job.
If you think you know enough about RVC, maybe you can apply for QC/Helper role using "/jointeam" command
I'm a newbie with it, and I don't know which link I should click to download
Can someone show me the specific link I have to go to?
thanks, will check it out
after model training I got a file called "G_2500.pth". is that something from training that I need, or do I not need it?
just model name.pth and .index ?
Nope, you don't need the G and D files.
You only need the .index and the .pth
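A minimal sketch of that rule, assuming the training output sits in a single folder and the checkpoints follow the `G_<step>.pth` / `D_<step>.pth` naming seen above. `files_to_keep` is a hypothetical helper, not part of RVC:

```python
from pathlib import Path


def files_to_keep(folder):
    """Return names of .pth/.index files, skipping G_*/D_* training checkpoints."""
    keep = []
    for path in sorted(Path(folder).iterdir()):
        name = path.name
        # G_ and D_ files are generator/discriminator checkpoints used to resume training.
        is_checkpoint = name.startswith(("G_", "D_")) and name.endswith(".pth")
        if path.suffix in {".pth", ".index"} and not is_checkpoint:
            keep.append(name)
    return keep
```

Calling `files_to_keep("path/to/your/model/folder")` would list only the final model `.pth` and the `.index`, which are the two files you share or load for inference.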
aight thanks
@odd shale aww well thanks, never thought of that
Added index file yes, between 50 to 200 mb
Keep in mind that QC means you must check the models that get uploaded to the server.
And delete those ones that don't meet the minimum requirements or directly break the rules.
how big is a typical pth file after training?
And helper obviously means you must help people with problems and questions they may have with RVC/model making and etc.
the final .pth will always be around 50 MB, or a bit bigger.
ok perfect
Okok, thats something that I would see myself doing yes
Okie.
i need help with rvc
but i cant send images
ok i guess im copy pasting
JSONDecodeError Traceback (most recent call last)
<ipython-input-12-afbc89fd2d38> in <cell line: 31>()
31 if os.path.exists(config_path):
32 # File exists, proceed with creation of creds and client
---> 33 creds = Credentials.from_service_account_file(config_path, scopes=scope)
34 client = gspread.authorize(creds)
35 else:
5 frames
/usr/lib/python3.10/json/decoder.py in raw_decode(self, s, idx)
353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
can someone please help
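For context, "Expecting value: line 1 column 1 (char 0)" means the JSON parser found nothing it could parse at the very first character, which usually happens when the file is empty or is not JSON at all (for example an HTML error page). A small sketch for checking a file like the `config_path` in the traceback; `check_json_file` is a hypothetical helper, not part of the colab:

```python
import json


def check_json_file(path):
    """Return a short diagnosis of why a JSON file might fail to parse."""
    with open(path, "r", encoding="utf-8") as handle:
        text = handle.read()
    if not text.strip():
        return "empty file"
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno/colno point at the first character the parser could not accept.
        return f"invalid JSON at line {err.lineno} column {err.colno}"
    return "valid JSON"
```

If the file turns out to be empty or invalid, the problem is in whatever produced or downloaded that file, not in your model.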
FL Studio
kk
so used to it by now
I did learn a little bit Cubase over the years and slightly Ableton but I always come back
for Recording, FL Studio is absolute dog shit lmao
ya i tried cubase but it messes with my audio
it muted my headphones and i cant hear
do you have a audio interface?
just removed anything steinberg cba
if not, then its your ASIO4ALL driver that needs to be configured
probably HDMI was set in your main output
happens a lot
yeah FL is the most understandable DAW for real its user friendly
most stock plugins are good enough to work, but I recommend checking out the Melda plugin pack, it's free and it has over 40 plugins
sounds good will do
RVC Guides (How to Make AI Cover)
📚 All-In-One English documentation
Full AI Voice Model Training Guide (Local)
Link: YouTube
Credits: Christopher Villanueva
Model training with Mainline RVC
Link: Rentry
credits: Raven (ravencutie21)
AICoverGen Colab Guide
Link: Google Docs
Credits: Eddy (Spanish Helper)
Create a model with RVC disconnected (colab)
Link: Google Docs
Credits: Angetyde
i downloaded this: GPT-SoVITS, by RVC-Boss GitHub. demo vid is in chinese and i have no idea how to use the TTS with the pth and index file that i already have 
i dont see either option
u using mainline or applio?
GPT-SoVITS, by RVC-Boss GitHub
Guys, does anyone know why the breathing in some voice models sounds like a robot lol
Hello boys and girls, can you help me with a problem? The RES (Buf) value jumps around a lot, is that normal?
And because of this my voice comes out shredded
And laggy
Rvc can't really handle breathing sounds
well, not that well anyways, but you could prob make a model that's just composed of breathing fine. Just does not handle breathing very well with models that can speak.
i have a problem with "RVC Realtime"
every time it infers a voice, the first word glitches or gets cut off
-for example, if I count 1 to 5, the 1 glitches or gets cut off and the rest are fine
-then if I stop speaking for several seconds and try speaking again, it happens again
-it only happens on the first word after silence (at the start of inference)
any guess what's happening, or a solution?
Windows 11
i5 11400
RTX 3070 Laptop
Could it be because I often press start and stop, turning the conversion off and on?
And what? Do I need to hold my breath when I speak?
you need noise suppression on your mic
so it doesn't pick up your breathing
and other noises in general
It's silent around me and the microphone is normal; I'm the only one speaking and there's no other noise
Oh really? Bc when they like say S sound or F or H they sound like a robot🥲 i thought maybe there is something i can change in the settings:)
okay i downloaded applio. i see the logs folder in the directory. there is no weights folder.. do i create it?
Can anyone help me set it up so that it doesn't lag?
But how can some people’s ai covers sound so real?
replace the breath with better breath sound
even if its silent around you, it's still going to pick up your breathing lol
i was asking you which one you were using but anyways
open applio and go to the download tab then drop the pth and index
Now, sibilants like the s, f, or h sounding weird is a model training issue.
How?
and can often be made worse by someone not using the correct sample rate for their audio when training
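A quick way to check that point before training, as a sketch only: it reads the sample rate from each WAV header with Python's standard `wave` module (PCM WAV files only) and lists the files that differ from the rate you plan to train at. `wav_sample_rate` and `mismatched_files` are hypothetical helper names:

```python
import wave


def wav_sample_rate(path):
    """Read the sample rate (Hz) from a PCM WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def mismatched_files(paths, target_rate):
    """Return (path, rate) pairs whose sample rate differs from the training target."""
    mismatches = []
    for path in paths:
        rate = wav_sample_rate(path)
        if rate != target_rate:
            mismatches.append((path, rate))
    return mismatches
```

For example, `mismatched_files(my_wavs, 40000)` would flag any dataset clip that is not 40 kHz if you intend to train a 40k model.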
For this I enabled all the options in the program (Echo, Sup1, Sup2)
audacity
I dunno how well Okada's suppression works
How to make an ai cover on android?
okay one last question
what do i select here: TTS Voices
Select the TTS voice to use for the conversion.
Help
hmm do u want to use tts or nah
i gotta be real i've never used applio ¯\_(ツ)_/¯
yes! so basically i have a script. I want characters like sonic the hedgehog to say it
i rarely use tts so i can't help 😭 soz
https://docs.aihub.wtf/tts/tts-tools/ u might want to see this
i loaded the voice model and index file of optimus prime but what do i select in the TTS voices
im confused lmao
nw thank you tho
what's this new thing with the new applio version?
Applio TTS uses Edge TTS, for the voices just select one that has the same gender and language of the model you wanna use
Hey so, how do I find the best setting to use for my computer
There is an issue, but nothing I can solve as far as I know.
i selected this US-EN voice but the output doesnt sound like optimus prime AT ALL 
The best you can do is play around with the pitch & search feature ratio, and try a different voice, ofc applio tts won't be always perfect as rvc is made only for speech to speech and applio tts just generates an audio with edge and then converts it
Could also depend on the quality of the rvc model you chose
Is there any alternative where I can get the best results for TTS?
somebody good with x minus?
you could try gpt-sovits
rvc models won't work though. you have to train it with voice files first
You could either try by generating an eleven labs tts audio and use it as an input, or its better you use gpt so vits that is built for tts, but u cant use rvc models here
Else you can just play around with the applio tts settings till u get something decent
okay holy shit i did this and it actually work 😭
thank you!!!!!
You're welcome!

it has this weird glitchiness to it
are those settings in the website you gave me
yep
crap
Kinda like this:
yw
so is the fm graph on rvc disconnected accurate?
Apparently not...
It is glitched out
HI HI so, I'm trying to upload a vip-file of a model I have but it just wont upload into the bot (public colab no ui) am I just being too impatient or is this a common issue? ty!
it is a google drive link and that didn't work so I downloaded it however it's still not liking it if I upload
whats this program?
Can't I use a pth file for rvc with svc?
anyone help me whats this?
run the .bat file
you can disable the flask server in settings
how can i enable?
already enabled...
??
i dont understand what do you mean sorry
you see this because you have the flask server enabled
and where can I turn this off?
or does it cause problems when it is on?
?
turn off in Applio settings
which one?
second one
thanks
hello anyone here?
Does anyone have a link to the latest local rvc?
https://docs.aihub.wtf/ have em
Last update: Mar 10, 2024
Thanks!
not welcome
Hi everyone
I'm using the weights ai platform but the male voice doesn't sound like a female voice
Which platform do you recommend?
Please help me
Is there an active person?
Change the pitch
Hello everyone, can someone send me or tell me if there is a tutorial on using the voice models? I forgot how to do it
Check our docs https://docs.aihub.wtf/
However, it did not give the result I expected
-uvr
for whatever reason my voice isnt being converted :( any help?
i've read over the voice changer stuff but nothing seems to help me much
im using the w-okada one
check your command prompt
I don't see anything necessarily out of the ordinary
i can't post an image but my settings dont seem to be anything out of the ordinary either from what i know. @proper shale Any ideas?
oh i can post now, here
and heres my cmd also
i hope the ping is okay also
sorry if not
I saw someone say something about how passthru should never be red, it should be green, or something
Mmm. Seems like something went wrong with the models folder 
Or with this model specifically. Try reuploading it in a new slot
on it sir
Also, don't use crepe - use rmvpe_onnx instead as ur f0
yeah i just tried again and same thing
this is what it says when i try one of the base models
I used colabs for applio and after some hours i lost connection. The latest pth should be in content/program_ml/logs but i cant find a program_ml folder. What am i doing wrong ? oO
alrighty, try this:
delete pretrain and model_dir folders, restart W-Okada, see how that goes
check assets/models too?
not really an applio user soo
hm no assets folder either
also just another thing im under the assumption im starting from here?
maybe i lose all progress after losing connection ?
cause it connects to a different machine?
yeah
well yeah
okay i'll get back to you when it's all up again
kk
🙏 hoping it works
i asked this like 10 times so, sorry but whats a rvc library for python? That does not load the model each time?
I... don't think there is?
maybe i extract it again lol
Kinda lacking on that department but yeah... 
Damn 
im completely new to this so maybe i fucked something up myself without realising

maybe screenshot all the process of command prompt...
just so we can see if there's anything wrong there
good idea
oh damn, i guess i have to try to use some sort of rvc api then
Maybe this could help?
rvc disconnected isn't working for me anymore, it keeps saying filelist not found everytime I try to train a model with titan, this time I used 40k sample rate with titan and it still didn't work
It has something to do with Index... that's just weird
Hm?
could this be because I'm using the autosave feature
Probably Titan is the issue
Just use original pretrains, they seem to not give this issue.

yeah I think that was the issue since now it's training
I'll keep that noted
Ty
Can anyone remember the software or site where you’d get the voice and combine with the song
@proper shale thank you by the way i got it to work but it's super choppy
-rvc
Np 👍
using ov2 for pretrain, any ideas?
What's your current settings rn?
that's normal in the beginning
I switched to VAC instead and that seemed to fix it i guess? I feel like a lot of these voices are for americans lol
so I should evaluate it like any model, by its lowest point?
im kinda at a disadvantage being british
lowest point in g/total isn't always the best model - it's just an average of all the other g stuff
aka. generator
which is your model
oof 
You can use a little bit of index but idk if that's gonna help a lot
I thought g/total is what you should go off though for finding when it starts to OT
well, it's not really the main thing you should look after, since it can go up even when the model is still improving
lmao. the only solution is to make your own b'ri'ish model.
my recommendation:
check mel and kl
mel being how close it's getting to reproducing the clarity of the model
kl is how different the model is from the dataset
lower is better
but like, I can't imagine the divide in between British and American accent is that large in rvc terms, unless you've got a heavy one
and, if it's still going down, don't stop training, it's probably still improving
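The "is it still going down" check can be sketched like this, assuming you have copied the mel or kl loss values out of TensorBoard into a plain list (one value per logged step). The smoothing window and the function names are illustrative choices, not something RVC provides:

```python
def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def still_improving(losses, window=3, tolerance=0.0):
    """True if the smoothed loss at the end is lower than one window earlier."""
    smoothed = moving_average(losses, window)
    if len(smoothed) <= window:
        return True  # not enough history to judge, so keep training
    return smoothed[-1] < smoothed[-1 - window] - tolerance
```

Smoothing matters because the raw curves are noisy; a single higher point does not mean the model stopped improving.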
heard that, thanks man I really appreciate it
i just feel like a lot of these dont sound right on me at all lol
like i know theres british voices
but no british girl voices from what i can see
tbh ive seen once a live example, someone turned up the index of a french model and it sounded actually french 💀
yw 🙏
that's not because of your accent, that's just how it is for 99% of the random voices you're gonna try.
like most of them are going to sound kinda weird and un-natural in realtime
you can make it pretty damn good, just requires a lot of effort
like making your own models, and that's just the first step.
but uh, obviously most people aren't willing to put that much in xD
ye, but if you don't wanna do any of that, I'd try out merging some of the models you sorta like. Merged models tend to sound quite a lot better in realtime.
Hey folks, I'm trying to run TensorVENV, based on the docs.aihub guide. When I run the batch file, it opens Windows cmd but gives me the errors below (I put the file in my applio-main folder, which has the files to open Applio):
Installing virtualenv...
The system cannot find the path specified.
Creating a virtual environment named tensorboard_venv...
The system cannot find the path specified.
Activating the virtual environment...
The system cannot find the path specified.
Installing TensorBoard into the virtual environment...
'pip' is not recognized as an internal or external command,
operable program or batch file.
Downgrading packages for troubleshooting...
'pip' is not recognized as an internal or external command,
operable program or batch file.
'pip' is not recognized as an internal or external command,
operable program or batch file.
'pip' is not recognized as an internal or external command,
operable program or batch file.
'pip' is not recognized as an internal or external command,
operable program or batch file.
Launching TensorBoard...
'tensorboard' is not recognized as an internal or external command,
operable program or batch file.
Keeping the command prompt open...
Press any key to continue . . .
as long as you're not merging garbage, anyways
merged?
sorry i am incredibly new to this lol
you can merge the weights of different models to create a mix of them.
you can do it pretty easily in the w-okada merge lab
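The general idea of merging can be sketched as a weighted average of matching parameters. This is only an illustration with plain lists standing in for tensors, and it is not a claim about how the W-Okada merge lab is implemented; `merge_weights` is a hypothetical name:

```python
def merge_weights(models, percentages):
    """Blend same-shaped weight dicts; percentages are normalized to sum to 1."""
    total = sum(percentages)
    if total <= 0:
        raise ValueError("percentages must sum to a positive number")
    keys = models[0].keys()
    for model in models[1:]:
        if model.keys() != keys:
            raise ValueError("models must have identical parameter names")
    merged = {}
    for key in keys:
        length = len(models[0][key])
        if any(len(m[key]) != length for m in models):
            raise ValueError(f"shape mismatch for {key}")
        # Each merged value is the percentage-weighted mean of the source values.
        merged[key] = [
            sum(m[key][i] * p / total for m, p in zip(models, percentages))
            for i in range(length)
        ]
    return merged
```

The shape check is the reason models generally need the same architecture and sample rate to be merged at all.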
oh interesting
like here's a merge I made as an example
i just wanted to make a decent girl voice for shits and giggles mainly
apparently people could hear themselves echoing or something through my mic too
that just means your mic is picking up the sound of their voices from your headphones
ah right
and turning that into the ai voice of your model as well
you really need sound suppression on your mic if you wanna use it a lot
and like a noise gate too
enable sup2 + increase s thresh
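For anyone wondering what a noise gate does, here is a rough sketch of the concept: audio is cut into short frames, and any frame whose peak is below a threshold is replaced with silence. This is a simplified illustration, not how W-Okada's threshold setting is implemented, and `noise_gate` with its parameters is hypothetical:

```python
def noise_gate(samples, threshold, frame_size=4):
    """Zero out frames whose peak absolute amplitude is below the threshold."""
    gated = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        peak = max(abs(s) for s in frame)
        # Quiet frames (breathing, headphone bleed) become silence; loud frames pass through.
        gated.extend(frame if peak >= threshold else [0.0] * len(frame))
    return gated
```

Raising the threshold silences more of the quiet input, at the risk of clipping soft speech.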
whats a speaker id
i had that all the way up and sup2 and 1 on i’m 99% sure i’ll check tomorrow
It was supposed to be a way of using multiple voices in one .pth, but it doesn't work really
Just don't mess with it, tbh
friends, does anyone know how I can find the Shrek voice?
is it normal that RVC is taking 700+ seconds to process the path to the training folder

depending on how long ur dataset is... 
uhhhhhh
its over 1k seconds now btw
wait send a screenshot this is weird
its not in english
anyway this was it
forgot to ping ya lol (do you see something that might be off?)
classic, cpu processes breaking stuff
id close off cmd and the page, restart, and use only 2 cpu processes
it just finished now, if i dont do it will the whole thing be slower?
...nvm then
like, next time... do 2 cpu processes
well, no, sadly
BUT
we got something even better, that supports RVC models
and is completely free
EOL - No further Updates
Github - Blanc-dot
Discord - Blanc_dot
Despite being end of life, most if not all information has not really changed, so should be very accurate until actual new stuff comes out.
Other Links
Antasma's Local Error Fixes
Antasma's Colab guide
Sushi's useful Links - You need...
^^^
then this is useless for me
unfortunately I'll stick with voice.ai (i bought a yearly subscription), do you know if there are any model makers that are compatible with that?
no
im very sorry to say, but, you spent that money in the worst platform possible
mm?
that's a saving grace but mhhh
hm, thanks for the help then, i appreciate it
yeah, why do you think so many people come on here talking about it lol
nobody would be using that shit if they didn't add rvc2 support
@slim geyser sorry for the ping but do you know something that might be able to help me?
idk how to add it to voice.ai, but they do let you upload .pth files there, right?
yw man 🙏
if so you should just be able to train a model normally and use it on voice.ai that way
so, here's the thing
apparently not, since most people that have come here with voice ai say it can't upload their models
I have no clue, i added a custom model to it once, dont know what type it is tho
I'll have to search abt that later
Today, I will show you how to use Voice.AI free AI Voice Changer in Realtime and record mode with custom RVC models. Transform your voice into any YouTuber, VTuber, Anime Character or Celebrity.
Download Voice.ai: https://voice.ai/
Most popular RVC models: https://linktr.ee/rvc_bestvoices
you can do it, apparently
but, I think they make you use their coins to actually use any voice you put on there
which is pretty funny
the Voice.ai voice builder fucking sucks and will always generate me an elderly mans voice instead of the one im looking for (Female character from a web series)
Huh... then how come so many people say it doesn't work with them? Kinda weird 
user error prob
but voice.ai sucks so uh... anyone using it should be given direction towards anything but that instead of tech support for it xD
Don't doubt it, but ye
Tbh thats the only thing people have told me to use so far
It's a shame people don't promote free and open-source stuff
before joining this server
Yeah it's insanely popular
We tend to have this bias of being focused on open source AI space and we don't really know what goes out to the masses
its not from them, more like from friends and research online
you can see some of it on that "voice changer guy" channel, it's pretty funny
like super fake ai voice trolling
duckus
i swear, the amount of people ive seen asking for "duckus egirl voice model"
that is a good selling point you gotta be honest tho, im not a programmer and i dont know shit on how to code so yk
Yes, true
bro i watched one of his videos once to see what all the hype was about and man i was disappointed
nah duckus at least looks like he trolls real people. On that voice changer guys channel every short where's he's trolling people is fake lol. It's just ai voices or his own voice masquerading as the team mates he's trolling.
oh
because the entire thing is a front to advertise voice.ai lol
which happens in every short
i just checked and the main model i use on voice.ai is a .pth
So they can do it, probably
That's good yeah
Is that the model that rvc gives ya?
you don't need to know how to code to use rvc at all, or even train models.
it's not that hard honestly
mhhhhhhhhh yes
but
the nature of it all can seem overwhelming for a new user
I mean it's prob pretty hard for little jimmy who wants to troll but uh
especially like, w-okada or rvc realtime
you open up, theres a black box with text downloading shit
you feel like you're about to drop a bomb somewhere
💀
but, once you reach the GUI it's smooth sailing
I mean if you don't even know what a command prompt is
guess i gotta reinstall it (i just deleted it ffs)
then I guess
unless you get an error, and then it's... confusion
but that's a pretty low bar
its alr
Anyone know how to get Mangio to work with zluda?
sadly im not sure
dunno if amd peeps in here even do that
Anyone know where I can find some good sources to use for training an ai voice? Looking to get a m-f egirl setup

Not certain if sources is the proper word for that
Well... probably YouTube
You'd have to dig deep to find an interesting voice... 
Well yeah but I'm not sure what I'd look for, I don't want to end up spending 5 hours cutting down the audio and whatnot
ASMR shit maybe... im trying to think here
Reaction vids, perhaps?
There's a lot of ways tbh
as long as you're smart and choose audio with no background noise already, should not take that long
I had something that was like 15 minutes long and it was pretty clean, but it didn't seem to produce a good output at 300 epochs
Or even up at 500
You need to merge to get a good result for realtime, honestly
Mhhhhh
Not necessarily?
like as long as your models are nice and clean, that's all you need to merge
Yeah true
Personally I'd buy my own but, don't have the money for that atm 🤣
So I've considered making my own
I've never heard a singular model that could not be improved in realtime with merging
voice lines
it's a scam anyways
Is it possible to share merged models?
Yeah I believe so
merged models are just normal .pth files
where is the .pth file placed then?
For flips sakes, I just realized that when I made the ai voice I deleted the wrong file and accidentally trained 500 epochs on a voice that had background noise and random cut offs 😭
...normally like any other model
ah
That'd explain the quality
ive looked there and couldnt find it
Maybe check logs or something, or sort files by most recent and check manually 
they go in here, in the 199 folder
Whats a good software to cut up the audio?
I forget the name of the one I'm thinking of but it's pretty popular
audio something?
audacity?
Ok, it might be a while till i get to test this because im on amd and the person who has all my .pth files and is on nvidia is not home currently.
wut
you cant merge on amd
you can

i cant
18a
might be the new version? I've only got 16
ight let me downgrade rq
Cleaned dataset and sample still sounds a bit noisy (200epochs)
and if you've got amd and use real time a lot, might wanna use rvc's regular voice changer client
don't have to convert to onnx or anything
and it just performs better
i actually use emoji's fork
i dont have to use onnx
might have broken merging on the fork tho
only problem is that rest doesnt work
yea its broken 100%
but I will say, when you merge for the best results you gotta put together voices that are pretty different
to like give the voice more range, y'know
well uhh
cuz I know you like your asmr voices xD
yeah asmr is only good in merges as like a light sprinkling
still cant in 15
Is crepe or harvest better?
just makes you sound weird if you merge the asmr in way too hard
try uploading some models of the same sample rate?
well my models are like hard asmr they are more like mommy voices
wait do i have the iq of a toddler
trying to merge with no uploaded models, perhaps xD
i have the iq of a toddler 
it didnt work in 18a bec i didnt have any .pth files uploaded
only onnx
uhh so when i try to download the model using the rvc v2 easy gui colab it fails for some reason and i cant discern why
yeah but still, if you merge stuff that sounds very similar, it's still just gonna sound kinda weird. Like I've heard someone who merged like 3 vtubers and 3 porn asmr people and it just sounded kinda weird and unnatural. It was very smooth, sure, but not realistic in any way.
is it because he merged 6 voices?
nah
heres the error
---------------------------------------------------------------------------
JSONDecodeError Traceback (most recent call last)
<ipython-input-5-fb5ac1b09727> in <cell line: 31>()
31 if os.path.exists(config_path):
32 # File exists, proceed with creation of creds and client
---> 33 creds = Credentials.from_service_account_file(config_path, scopes=scope)
34 client = gspread.authorize(creds)
35 else:
5 frames
/usr/lib/python3.10/json/decoder.py in raw_decode(self, s, idx)
353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
@junior walrus is having this issue with the latest version of Applio, never seen this error. First time they're installing applio. Anyone know what this is? Sharing because they dont have image perms here
you can merge 6 just fine, my merge has like 7 different voices at this point, but in different percentage merges
sheeesh
like you gotta control for the kinda voice you want with percentages you merge, y'know
and a lot of trial and error
yea
like this is the merge, for example, and I don't think it sound super weird or unnatural
btw i was able to find the merged file
The real time voice is lagging for me, should i get a better video card or more RAM?
like what is this supposed to mean
it does this with any model i put in
@junior walrus check out this thread im replying to
post your settings, +your gpu
Ok
Sorry for some portuguese words there, i'm from Brazil
But you'll understand it
I'm guessing you didn't export your model to onnx, right
are you using the direct-dml version of okada as well?
try exporting your models to onnx
besides that you're using a laptop so uh, would not be surprised if it was not strong enough to run it
i used to make models on a laptop
Yup xd
took me more than 8 hours to make a model but i did it
is onnx a type of archive?
this is an amd laptop
not gonna happen lol
there should be a big button beside save settings
to export the model to onnx
then you need to save that somewhere and reupload it to okada to use it
erm actually you can use rvc disconnected 🤓
in this context you were obviously talking about training locally 🤓
Oh okay
I'm exporting now
Looks better now, though still laggy, but I'm an AMD laptop user so xd
I'm planning to upgrade my laptop
just get an actual desktop
if you're going to be spending enough on something that can run rvc, anyways
i downloaded a model from weights.gg and it didnt give an index with it
what do i do or is it fine without the index
🤔 What batch size would you guys suggest for a 4070
Ayo? @crystal gull level 5 !!! 
woot woot
"It's advisable to align it with the available VRAM of your GPU. A setting of 4 offers improved accuracy but slower processing, while 8 provides faster and standard results." This doesn't make complete sense, how come the lower the batch size the better?
Would it just be sacrificing speed / quality at a range of 1-12 for a 4070
Last time I trained it was at 12
Oh well.. Now I can't even get the training working..
IndexError: list index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\BumiMandias\Downloads\Applio-3.1.1\env\lib\multiprocessing\process.py", line 315, in _bootstrap
self.run()
File "C:\Users\BumiMandias\Downloads\Applio-3.1.1\env\lib\multiprocessing\process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\BumiMandias\Downloads\Applio-3.1.1\rvc\train\train.py", line 220, in run
net_g.module.load_state_dict(
File "C:\Users\BumiMandias\Downloads\Applio-3.1.1\env\lib\site-packages\torch\nn\modules\module.py", line 2152, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for SynthesizerTrnMs768NSFsid:
size mismatch for dec.ups.0.parametrizations.weight.original1: copying a param with shape torch.Size([512, 256, 16]) from checkpoint, the shape in current model is torch.Size([512, 256, 24]).
size mismatch for dec.ups.1.parametrizations.weight.original1: copying a param with shape torch.Size([256, 128, 16]) from checkpoint, the shape in current model is torch.Size([256, 128, 20]).
Saved index file 'C:\Users\BumiMandias\Downloads\Applio-3.1.1\logs\Mary2\added_IVF654_Flat_nprobe_1_v2.index'
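The size-mismatch lines in that traceback say the checkpoint being loaded has different layer shapes than the model Applio is building. One possible cause (an assumption, not confirmed in this chat) is a pretrained file made for a different sample rate than the one selected for training. A small sketch for spotting which parameters disagree, using plain dicts of shapes instead of real tensors; `find_shape_mismatches` is a hypothetical helper, not part of Applio or PyTorch:

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """List parameters present in both dicts whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches
```

With real files you would first build the two dicts from each state dict's parameter names and shapes; if the mismatches are all in the same few layers, as in the traceback above, checking the pretrain and sample rate settings is a reasonable first step.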
your model seems to have a very bad static problem
they always do
Does the audio have a lot of white noise, or? Because that's not normal
._. after raising my volume to 100% i hear crackling in the background of the vocals
i used ultimate vocal remover to clean it up but clearly it did not work too well
I would just try to avoid any audio with big background noise in general, messing with uvr or alternatives is a pain in the ass, and if used incorrectly can often make the audio quality worse
OkOk so I had this installed awhile ago I forgot what the programs called
Do i need collab pro to train my Ai online?
Im really confused rn
or is it better to train it locally
They're both options and not a requirement. So training locally doesn't mean your models will sound better. You can train faster because you have a good GPU. Without a GPU, you're limited to 3-4 hours of colab daily and yk colab pro gives you 100 credits https://docs.aihub.wtf/
Real-time Voice Changer Client Demo
RVC
w-Okada
Applio
Mangio-RVC
SVC So-Vits
Any of those recognizable?
anyone know why on realtime voice changer demo none of the voices work except for the one that says beatrice?
Hey! Quick question: HuBERT or Contentvec?
Beatrice is a different type of model, it's not rvc and I have no idea why it doesn't work also
how do I train an AI voice model on a Mac?
'Morning guys.
I have a question:
I tried the Google Colab to create an RVC model, but at the final stage (downloading the Index and Profile), I got an error message.
Is it because I clicked "Generate Index" a second time after the training?
Thanks in advance
I dont think there's a good backend for that purpose in mac. even like other NPU-equipped "AI" PC/laptops, Mac M3s can't outperform Nvidia GPUs' TOPS performance.
https://www.tomshardware.com/tech-industry/artificial-intelligence/nvidia-criticizes-ai-pcs-says-microsofts-45-tops-requirement-is-only-good-enough-for-basic-ai-tasks
also don't cross-post everywhere 
Which notebook are you using?
help with the pretrain: training doesn't start. why does it create this folder?
?????
Not necessary.
If you're on AMD, onnx.
notebook ?
the google colab url/link
For any problems ask in #🔍│help-w-okada or #1192011222023950368
Contentvec seems to be slightly better but honestly didn't try it
Yes check https://docs.aihub.wtf/
I need help with the rvc google colab
I install the 4 hidden cells, step 1 basically
but on step 2, it gives me a red error
"File exists, proceed with creation of creds and client"
"JSONDecodeError: Expecting value: line 1 column 1 (char 0)"
probably outdated colab
write -colab into this channel for the new ones
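For reference, `Expecting value: line 1 column 1 (char 0)` is what Python's `json` module raises when the text it is asked to parse is empty or doesn't start with JSON at all. A minimal sketch of checking a JSON file before loading it, assuming a hypothetical helper named `check_json_file` (it is not part of any colab mentioned here):

```python
import json
from pathlib import Path


def check_json_file(path):
    """Return (ok, message) describing whether `path` holds parseable JSON.

    Hypothetical helper for illustration only.
    """
    p = Path(path)
    if not p.is_file():
        return False, f"{p} does not exist"
    text = p.read_text(encoding="utf-8")
    if not text.strip():
        # An empty file is a common cause of "line 1 column 1 (char 0)".
        return False, f"{p} is empty"
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"{p} is not valid JSON: {err}"
    return True, f"{p} parsed fine"
```

This only tells you whether the file is readable JSON; it doesn't fix an outdated notebook, so switching to a current colab is still the first thing to try.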
- Applio, by IA Hispano Google Colab
- AICoverGen-WebUI, modded by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], credits to Eddy, Hina and Gdr for translating and fixing Google Colab
- Ilaria RVC, by thestingerx Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- easyGUI, by rejects Google Colab
- 🆕 UVR5 NO UI for Google Colab, by Eddy Google Colab
- Applio, by IA Hispano Huggingface Spaces
- Ilaria RVC, by thestingerx Huggingface Spaces
- RVC-HFv2, by r3gm Huggingface Spaces
- AICoverGen, by r3gm Huggingface Spaces
- Advanced RVC Inference, by r3gm Huggingface Spaces
- RVC v2 Huggingface version, by Clebersla Huggingface Spaces
FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/MyDrive/rvcDisconnected/my_model/%s/my_model/filelist.txt' What is the reason for this error @pastel oak
Disable autosave and try again
Is there any method to make RVC more accurate?
I'm using W Okada and Mangio but the result are just the same
Not much you can do tbh...
Train until it overtrains
Set an initial goal and see if it's overtrained by then is also a good one

Can be, yeah
If you're on Colab (RVC Disconnected especially) make sure to limit ur training sessions
Ah nice
Go above and beyond with it 🙏
I see a lot of people can make a good voice change with particular settings, but when I try their settings it is just not even close sometimes
It really depends on the voice model being used and what voice is being converted
Experiment with pitch and index settings, if it doesn't convert well enough then it's probably the model
I understand about pitch
but what about index?
Index is basically the accent part
Pronunciation
If you increase it, it will try to convert the original pronunciation to that of the model
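One way to picture the index setting, assuming it works as a blend weight between the features taken from your own voice and the closest features stored in the model's .index file. This is a simplified sketch with a hypothetical function name, not RVC's actual code:

```python
def blend_features(original, retrieved, index_rate):
    """Linearly mix two equal-length feature vectors.

    index_rate = 0.0 keeps the input voice's features untouched,
    index_rate = 1.0 uses only the features looked up from the index.
    """
    if not 0.0 <= index_rate <= 1.0:
        raise ValueError("index_rate must be between 0 and 1")
    if len(original) != len(retrieved):
        raise ValueError("feature vectors must have the same length")
    return [
        index_rate * r + (1.0 - index_rate) * o
        for o, r in zip(original, retrieved)
    ]
```

Under that assumption, a higher value pulls pronunciation toward the model and a lower value keeps more of the speaker's own.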
You're welcome :)
Sometimes that can happen, try enabling Sup2 and increasing the S Threshold until it goes away
So I am new to this please be kind 🙏 , I've been training a model recently with this dataset
disc=3.111, loss_gen=2.528, loss_fm=8.051,loss_mel=17.230, loss_kl=1.483
I started this around 1am, now it is 9:30 am. Am I doing something obviously wrong? Is there a way to optimize my training parameters to reduce the epoch duration?
I'm running a gtx 1060 6gb and my usage is around 70% but it fluctuates..
when im starting the voice changer (thru file, start_http), it gets stuck on the black screen with random script-looking texts, and doesnt actually open the voice changing app. any cause to this?
it didnt work! 🙁
yeah it's always only the .pth and .index files
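A small sketch of picking those two files out of a training folder, assuming the intermediate checkpoints are named like `G_2500.pth` and `D_2500.pth`. The function name is made up for illustration:

```python
from pathlib import Path


def files_to_share(folder):
    """Return the .pth and .index file names worth keeping from `folder`.

    Assumes intermediate checkpoints start with "G_" or "D_".
    """
    keep = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        is_checkpoint = path.name.startswith(("G_", "D_"))
        if path.suffix == ".index" or (path.suffix == ".pth" and not is_checkpoint):
            keep.append(path.name)
    return keep
```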
-colab
There shouldn't be a config file
Maybe activate the Echo option too 
So how do you use SVC
...we don't use SVC anymore
guh
only RVC 
still the same
alr is there a download link
https://docs.aihub.wtf/ has all of em
and guides too
ok
RVC > Local section
what's the difference between RVC and SVC
yeah that's probably your mic settings n stuff
better W Okada or Mangio?
RVC is newer and faster to convert and train
ok
Depends on what you wanna do
Okada is for realtime, mangio is for audio voice conversion
mac how to use
https://docs.aihub.wtf/ read the RVC > Cloud section
I guess I need it to realtime
Yeah then use Okada for that
I see
mic bleed, probably
Mangio is for singing and kind of stuff then
Yeah
Not sure what's up, make sure you've followed all the steps accordingly n stuff
I'm having trouble with the AICoverGen on huggingface
it keeps piling up requests on queue and it never ends

does someone know anything about it?
AICoverGen is borderline hell in Huggingface Spaces 
oh
You'd have to wait like 20-30 mins for a result
And if it messes up (which is sadly common) you'd have wasted your time...
ooh
thank you
could you advise me on what I should use instead, then?
I'm mostly new to it
Yeah there's a lot of Colabs and stuff in https://docs.aihub.wtf/ - check the RVC > Cloud section
Mainly use Applio or Ilaria RVC for inferencing, they're great and get the job done
Although you will need to isolate vocals
thank you!
I got that part sorted out
Ah you should be good then
You're welcome :)
If you have any more questions just let us know!
I've not done any AI covers in a while, is it the same as before? Using that Google Colab
Mmmm I'd suggest using the newer ones
https://docs.aihub.wtf/ has them + guides, just look at the RVC > Cloud section ;)
Noted thanks
I use easyGUI 4/20/24
What is the best way of making RVC models rn
nah i aint downloading 3 gb
What is a good audio sample size to begin with when training
I have around 5 minutes rn
go for 10 if possible, 5 mins works too ig
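To see how many minutes a dataset actually adds up to, here is a minimal sketch using Python's built-in `wave` module. It assumes the clips are plain PCM .wav files sitting directly in one folder; the function name is just for illustration:

```python
import wave
from pathlib import Path


def dataset_minutes(folder):
    """Sum the duration of every .wav file directly inside `folder`, in minutes."""
    total_seconds = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # Duration of one file = number of frames / frames per second.
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 60.0
```

Other formats (mp3, ogg, compressed wav) would need converting first, since the `wave` module only reads uncompressed PCM.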
use a different local fork, that's pretty outdated sadly
https://docs.aihub.wtf/ RVC > Local section
mainly use Applio
is it supposed to stay at the phase before it prints "end preprocess" for a long time
if you have too many cpu processes, maybe 
Can I run process data and feature extraction at the same time
CPU's only at 9%
Mmm no, preprocessing needs to be done first
btw what does this mean
why is the rvc gui file 3 gb
Yeah but what did you set in "Number of CPU processes to use" (iirc that's the name)
8
even one whole 5 min file still takes no time
cause it has everything needed for RVC
I see
this thing is gon take 5 hours to finish
Does this change anything
Yea not really, it's a feature that doesn't work
size of the file is 124 mb
It like gets stuck halfway through the files
ah
What batch size are you using?
...or what step are you on?
8GB should be enough for training
8
Mmm, maybe lower it to 6?
the recent Nvidia driver should never cause that OOM error, also batch size 8 should fit on your 8 gb vram
i am using 8
i haven't updated my nvidia driver in like half a year
and might better use mainline RVC 1006 or latest Applio
this fixed it somehow
took 3 minutes for one epoch
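As a back-of-the-envelope check, time per epoch scales to total training time like this. The epoch count of 100 is a made-up example, not a recommendation, and the estimate assumes every epoch takes about as long as the first:

```python
def training_hours(minutes_per_epoch, total_epochs):
    """Rough wall-clock estimate, assuming every epoch takes about the same time."""
    if minutes_per_epoch < 0 or total_epochs < 0:
        raise ValueError("inputs must be non-negative")
    return minutes_per_epoch * total_epochs / 60.0


print(training_hours(3, 100))  # 3 min/epoch for 100 epochs -> 5.0 hours
```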
i'll probably leave it on as I sleep
can i use a realtime gui on a cpu
instead of a gpu
i want to use the realtime audio conversion on a cpu
on a laptop with a not good gpu but a very good cpu
basically a laptop with a igpu
Possibly? It'll be kinda slow though
itd be beneficial to either use emojikages wokada fork for best results, or create a virtual environment but i cant help with that
yea ill need that
cpu works pretty decent on emojikages fork
10+ sec delay to cope with
if u have a decent/good cpu
my laptop has a i3 12 gen
6 cores 8 threads
I'm not sure whether the latest version works good on CPU though... Iirc was ok with b1805, but with b1823 and later it's not the same
Or it's actually fine and I used onnx which provided better performance at lower quality
Question, using Kit.ai, how do you even download specific voices?? Or is everything behind a subscription paywall???
But let's say quality and latency-wise CPU is bearable at these settings on Ryzen 7 5800H for me
Should be 1.2-1.4s delay in total I guess
You can't download models sadly, prefer stuff like weights.gg and Colab
Gotcha, thank you for letting me know. :]
You're welcome, hope it works for ya :)
Hello there, does rvc on colab still work?
Yeah, just use some of the newer ones:
https://docs.aihub.wtf/ has the main ones that are up to date and working
And guides for em too
Ty
is there a way to adjust model for my voice?
In realtime? You can do that with pitch (Tune, up = higher pitch) and pronunciation (Index, up = makes accent similar to model)
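For a feel of what a pitch value does, assuming Tune is counted in semitones (an assumption here, check the client's own docs): in equal temperament each semitone multiplies the frequency by the twelfth root of two, so +12 doubles the pitch and -12 halves it.

```python
def semitones_to_ratio(semitones):
    """Frequency multiplier for a shift of `semitones` in equal temperament."""
    return 2.0 ** (semitones / 12.0)
```

So going from a low voice to a model that sits roughly an octave higher would mean a value around +12 under that assumption, with smaller steps for fine adjustment.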
hey i cant press start on the voice changer
.
i click start and nothing is happening
what is going on
im getting this error [Voice Changer] VC PROCESSING EXCEPTION!!! 'devices' argument must be DML
'devices' argument must be DML
Traceback (most recent call last):
File "voice_changer\VoiceChangerV2.py", line 223, in on_request
File "voice_changer\RVC\RVCr2.py", line 211, in inference
File "voice_changer\RVC\pipeline\Pipeline.py", line 197, in exec
File "voice_changer\RVC\pipeline\Pipeline.py", line 114, in infer
File "voice_changer\RVC\pipeline\Pipeline.py", line 108, in infer
File "voice_changer\RVC\inferencer\RVCInferencerv2.py", line 47, in infer
File "voice_changer\RVC\inferencer\rvc_models\infer_pack\models.py", line 953, in infer
File "torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "voice_changer\RVC\inferencer\rvc_models\infer_pack\models.py", line 112, in forward
File "torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "torch\nn\modules\sparse.py", line 162, in forward
return F.embedding(
File "torch\nn\functional.py", line 2210, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: 'devices' argument must be DML
Failed to infer: 'devices' argument must be DML
Ty!
Yw :)
Did you select CPU?
Ill try that tmrw
I get this error when trying to use applio 3.2.0 or 3.1.1 (I ran the installer)
Hello i'm trying to use this colab https://colab.research.google.com/github/SociallyIneptWeeb/AICoverGen/blob/main/AICoverGen_colab.ipynb#scrollTo=NEglTq6Ya9d0
but i'm getting some issues generating, like an error about missing utils
-colab
use the compiled version
I already took two exams
It wasn't even that big of background noise
Pretty clean
I think
Had light background noise maybe like 35% of the voice volume
What pretrained model do you guys recommend?
titan
I wasn't using any pretrained models previously, if I find a better dataset. Do you guys think next time I should try using a pretrained?
Perhaps titan like blaise suggested?
Gotcha gonna download that and try it
Does it end up taking a shit load of more time to process?
Either way, a reference pretrained model is essential.
Gotcha perhaps that's another reason why my models have always been turds
You weren't using any pretrains?
Nop
not even the normal rvc one?
Don't forget to keep your eye on the graphs.
Just been using applio without any custom pretrain model setting
Oh, applio loads a pretrain automatically
Oh yeah it says pretrained was selected but I wasn't using any custom one
I see now
Yeah that wasn't the problem with your model
And, a constant static that's like 35% of the voices volume is pretty loud
Ah still a good idea to use titan though yeah?
if there's constant background noise you should be able to barely hear it at like max volume
The audio didn't have any static originally, just some light background music. And I tried to remove it but it removed the music and added a light static
Mind if I ask what voice you are trying to train with?
I grabbed some e girl talking off youtube, it has quite a few pauses and stuff like that which probably isn't for the best. I can't find anything better anywhere that wont take me hours to separate. I'm trying to end up with mrmodz quality voices
United recommends these settings:
I just found a more clean youtube video so I'll try that, and I'll also try truncate silent thanks for the tip
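The rough idea behind truncating silence can be sketched like this: split the audio into short frames and drop the ones that are too quiet. This is a simplified illustration with guessed defaults, not how Audacity implements its effect, and hard cuts like these can click, which is why real tools usually keep a little padding around speech:

```python
import math


def drop_silent_frames(samples, frame_size=1024, threshold=0.01):
    """Remove frames whose RMS level is below `threshold`.

    `samples` is a list of floats in [-1.0, 1.0]; the defaults are
    illustrative guesses, not recommended values.
    """
    kept = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms >= threshold:
            kept.extend(frame)
    return kept
```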
and if you really want a good model, you should go through the audio after you do all the processing and make sure you didn't miss anything, like little mic taps, bits of background noise that got through, little sound effects added in if it's a stream, etc...
Can someone help me with removing extreme echo from an audio? I've looked at UVR, can't figure that out, and every website that says it'll remove echo takes forever to load, is there a concrete way I can do it? Or can someone do it for me?
Come to think of it, I don't think I did that on my current model I am training.
wow embed fail
lol
it's pretty important to do, because even if you find like the perfect kind of audio where it's just them speaking with a nice mic, no music or anything, there's often going to be some stuff left in there that you don't want.
True. Even raw game audio files.
jesus audacity is confusing
like, even them putting on a voice that's different from their normal speaking voice for example, like them mocking someone with an exaggerated accent, is something you'd wanna cut out
cuz it's not the kind of thing you want influencing the rest of the audio, y'know
Isn't variety essential? Within reason?
pov have to look up a youtube video on how to cut audio in audacity
Yes, I'm not talking about cutting out yelling, or any emotion, I'm just saying anything that doesn't really sound like them normally.
Because if you're making a model of them, you want their normal right? Not them putting on a weird voice lol
missed a not there in the first sentence
that changed what I was talking about entirely, very cool me
LOL yeah I was a bit confused
same
What about laughing or quick giggles?
I think I know what you mean. Kinda like when we make silly voices little of the time during casual conversation.
completely fine if you wanna use the model for realtime.
that was completely a joke by the way I'm just doing this more for my sake
You do you. :P
I've always just kind of liked.. Having stuff? But not doing anything with it I guess
Hey, if you're willing to put in the effort to make a nice voice, that at least puts you above the million little timmys who wanna catfish and girl troll.
xD
🤣
Yeah I'm working in the mines man seeing what I can produce
Going to get rid of all the pauses and then run a train using titan 👌
Amazing, what are you working on?
aaaaaa
How many epochs?
To 500. For now.
That one right there is at 500?
This is an 18 minute dataset.
daddy?
Wait, no. Training is still in progress. That there is an audio file straight from the game.
OH
I see
getting kratos vibes from that voice
Was shitting myself
Nice how did you retrieve all his audio clips? Go through and play them yourself?
Kratos?
No, I download em from a particular site. An archive of in-game audio files. Both wav and ogg.
kratos when he's being sad, not kratos when he's screaming and murdering
Gotcha
I ended up being at 7:47 on my dataset, going to train and see what happens I guess ;p
I am a huge fan of Warframe in general. There aren't enough voice models of characters from that game.
Not a bad dataset size.
Batch size should be set equal to your vram typically right?
Don't remember if I asked that before bit braindead
guh I have to train the pretrained models myself?
I'm not sure. Not sure what it has to do with vram unless you're training locally.
No you don't, no.
Titan ended up just being a folder of .wavs
Yes
I'm not sure what to tell ya. I'm not experienced with local training.
you downloaded the wrong thing
ofc
you need the D and G files
yay my dataset sounds okay
So these two yeah?
Welp thats another gig download to wait for
Oh I haven't known if I should do this or not, should I make an index?
Ah yes now constant errors whenever I attempt to train
Sometimes just instantly cuts out and just says "Model trained successfully"
.-.
omg im finally training after like 15 minutes of trying to fix the issue (still have no clue what was going on)
What’s your GPU
4070
@noble crater can you check this?
Some issue with index and overlapping names or something? It kept reverting my name I put to an old name and having issues
Though I got it working now whatever was going on
Love my 4070, already at 60 epochs
Indexes just sound weird in realtime, not worth it
Okay can I still train with the index and just not use it or..?
Unless you wanna post your model on here, I guess, then you need an index
Because regardless of what I do it looks like it's generating an index
Oh okay gotcha, welp I have 1 just in case
Oh wow it's sounding pretty good at 120 epochs only
On a second check it just seems quite monotone, when testing tts voices at least. Not sure if that's normal. And it's like it's taking a lot of info from the input voice over the model itself
The tts I use makes a HUGE difference in the voice quality
yeah, that's how rvc works. It's trying to recreate the input voice but with the data from the model
so if you have monotone Microsoft sam going in, it's going to sound a lot like microsoft sam xD
Oh
And, I would not test the model with tts or inferring audio files for your use case, it's not going to sound the same in realtime.
Just as a little test*
Gotcha what would you suggest to test realtime using
don't you have okada?
Ah no I don't have that installed atm, Is that the best choice?
anyone know how to fix this? sounddevice.PortAudioError: Error opening OutputStream: Device unavailable [PaErrorCode -9985]
Oh I see, should I attempt okada then or do you have another suggestion?
I'd recommend the realtime client that comes with RVC natively, idk if applio still has it on their branch locally. But if they do you should have a file named go-realtime-gui.bat
@pastel oak know anything abt this?
Nop they don't appear to have it natively
At least not via applio
you can just get it from rvc's github then
Alright it's sitting at 500 epochs and I'm downloading the rvc realtime :p
also, idk how many checkpoints you saved, but one really repetitive thing you can do to find the best sounding checkpoint is check every like, 20 epochs, choose the best one from that range, and then check every single checkpoint within 10 of that checkpoint, because even one epoch of difference can have a noticeable effect on how the model sounds.
ah
just something to keep in mind for the future if you wanna be a perfectionist about it
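That two-pass search can be sketched as two small helpers that list which epochs to listen to. It assumes a checkpoint was saved at every epoch, and the function names are made up; picking the best-sounding one is still done by ear:

```python
def coarse_epochs(last_epoch, step=20):
    """Epochs to audition in the first pass: every `step` epochs."""
    return list(range(step, last_epoch + 1, step))


def fine_epochs(best_coarse, last_epoch, radius=10):
    """Epochs to audition in the second pass: everything near the coarse winner."""
    low = max(1, best_coarse - radius)
    high = min(last_epoch, best_coarse + radius)
    return list(range(low, high + 1))
```

For a 500-epoch run, the first pass is 25 checkpoints and the second is at most 21, instead of listening to all 500.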
Gotcha
error log haven't trained in a while so I don't remember what it was before someone probably told me before:
villager no 😭
"ModuleNotFoundError Traceback (most recent call last)
<ipython-input-3-957433113790> in <cell line: 4>()
2 #@markdown Link the URL path to the model (Mega, Drive, etc.) and start the code
3
----> 4 from mega import Mega
5 import os
6 import shutil
ModuleNotFoundError: No module named 'mega'
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.
To view examples of installing some common dependencies, click the
"Open Examples" button below."
Any help?
Did you install all the requirements using the command?
no im currently looking at a youtube video that may be outdated so i need to get back on track 😭
pip install -r requirements.txt
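To see which imports are actually missing before running a cell, a minimal sketch using the standard library's `importlib.util.find_spec`. The module list is just the three names from the traceback above, and the helper name is hypothetical:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print(missing_modules(["os", "shutil", "mega"]))
```

Anything it prints still has to be installed with pip, and the pip package name isn't always the same as the import name.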
Old colab, use the newer ones in https://docs.aihub.wtf/
And perhaps that too
Has guides too
How good is Google Colab?
Probably wouldn't be even close to my 4070 but I'm curious how fast people are running epochs on Colab
Good for training n stuff, but limited sessions make it meh
appreciate it 👍
How long does it take to hit 500 epochs on it generally for you?
🙏
Why is it when I train a certain voice, it's heavily accented? I trained using different datasets of the same voice... yet it's still heavily accented. Just wondering, why?
It really varies
anyone know?
Perhaps it's the voice itself
yeah, probably that
It took me like 40 minutes with my dataset to hit 500
ffmpeg error, you probably linked the path to a file and not a folder
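A quick way to rule that out is to check what the path really points at before starting. This is a minimal sketch with a hypothetical function name:

```python
from pathlib import Path


def describe_dataset_path(path):
    """Say whether `path` is a folder, a single file, or missing."""
    p = Path(path)
    if p.is_dir():
        return "folder"
    if p.is_file():
        return "file"
    return "missing"
```

If it reports "folder" and the error persists, the path itself is probably not the cause.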
no it was a folder
Perhaps you are using the wrong pretrained model?
hmm tts rvc is weird, sometimes it generates decent results sometimes it generates absolutely garbage results, why?
just how edge-tts can be sometimes
||how does one edge tts?||
hmm any tips, because 99% of the time the result is trash