#✨│ai-help
Prob u left urself then
AUAHUAHAHUAHUHAUHAHJA
It happened after that
if you are going to make baseless claims actually fact check everything
Poopmaster told me himself other stuff argued with you about the zap thing
yes people argued about something, that everyone said not to do, including me in the messages
wake up
argue with arcelia
zap myself
mfw
anyways as i was saying, the only thing the fork does better is for amd and intel arc. its not even better for nvidia, its the same at the best case scenario
This is a help channel, there's no point going to argue here
as you bring up stuff on ur own!
I already know this argument is useless since y'all were the same trying to forcefully tell @pastel oak to put the original w-okada above the fork in the guide list
man i dont think you should be a sr mod, you are letting ur emotions kick in! must be weird having a mod team with literal children on it!
no i never did
what
what the
i literally told him his issues in his guide, told him how to fix it, i havent reached out to them in like 8 months
once again, if you are going to make claims, please fact check them

me rn
if anyone has reached out to them on my behalf, it still isnt me, since i genuinely dont care about it
i even offered the entire original guide over at one point
but go off king
Anyways
No point in arguing here as I said
This is a help channel
People are here to get help not to see some arguing
i went off of misinformation in the guide, which i had them fix
!!!!!
thats not doing the claim u said
of "put the guide above!!!"
Oh no
.
.
Nah the fixes were fine i mentioned that too
Was after that but aint that deep anymore rlly
Case closed in this channel
Hello Shad my dear mod/helper
🐢
This convo is going past each other
in what channel can we argue then? 
Lmaoo
🥬
🐢: bites the lettuce
i mean, i wanna watch
im there for the memes

i trained a model on RefineGAN and 441k on the main applio branch and its not showing up on my inferencing model list
is there a switch i have to toggle in order to get it to inference in applio mode?
it's probably because the final model pth didn't save
i literally lost a model...
Can anyone help me out with UVR? I tried doing the vocal splitter and just got a peaking buzzing sound... then I tried another model and there's weird beeping sounds in my audio
Like distorted beeping
Post a screenshot of your UVR interface before hitting the convert button in here
where you can see which model you used etc.
1 minute I'm testing to see if it's a problem with my machine or overall
Shiiit I gotta eat dinner
i have a dataset that's at 44.1k, but theres not a sample rate of 44k on applio, should i train with 48k or 40k?
Well im about to head off so ill just drop a temp fix in here based on assumptions
- Assuming vocal splitter is splitting vocal from background music, I hope youre using a good model like bs roformer or mdx23 which are recommendable. If you havent messed with settings the default ones should work just fine
Else, tho uvr is great and runs on my pc i prefer to use https://mvsep.com/en with bs-roformer viperx the newest one listed, think 2024.08 version. Create an account which is for free, but lets you skip 95% of queues and get your splitting ready in usually 3 minutes
This is a split opinion thing. To keep it short and summarize the core argument:
People that say use 40k: because upsampling to 48k means the ai has to generatively fill frequencies your dataset doesnt have. Can sound worse
People that say use 48k: just upsample because you got the range of 40k to 44.1k which is important. Upsampling doesnt damage it too much
Maybe someone else thats analysed these things can give a definite argument for one of them but thats what I heard from most
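To put rough numbers on the argument above: a sample rate can only hold frequencies up to half its value (the Nyquist limit). A minimal sketch, assuming 40k and 48k are the only choices as mentioned; the function name is made up for illustration:

```python
def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


dataset_rate = 44_100  # the dataset's sample rate from the question above
dataset_limit = nyquist_hz(dataset_rate)

for target in (40_000, 48_000):
    target_limit = nyquist_hz(target)
    if target_limit < dataset_limit:
        # Downsampling: everything above the new limit is thrown away
        print(f"{target} Hz: drops content between {target_limit:.0f} and {dataset_limit:.0f} Hz")
    else:
        # Upsampling: nothing is lost, but the extra band has no real data in it
        print(f"{target} Hz: keeps everything, but {dataset_limit:.0f}-{target_limit:.0f} Hz is empty in the dataset")
```

So 40k loses the band from 20 kHz to 22.05 kHz, while 48k keeps it but leaves the model to fill the band above 22.05 kHz on its own. Which one sounds better is still the split opinion described above.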
i can use wtv i think the best right?
i'll prolly stick to 40k even tho i'm losing some frequencies
the datasets like 30 mins
that gon take a while
but imma try both
thank you
Thank you, I'm back now and will try these things. I was doing the "Vocal Split Mode" found in UVR's Additional Settings, which I think is meant for splitting lead/backing vocals. Using the UVR BVE 4B SN 44100-1 model. Kinda weird that it is hidden in the settings. Is that mode janky?
Yeah something's up with UVR on my other machine, it was bugging there but fine on laptop
What realtime voice changer should i use for my amd gpu
You can use deiteris' w-okada fork.
Guide style is in the same as vtarcelia. Thanks vtarcelia for corrections. Most technical information comes from deiteris.
Last update January 17: NEW UPDATE VERSION b2332 (from December) , adjusted known settings
Translations added for:
German: https://rentry.co/ForkVoiceChangerGuide_de
Turkish:...
Scroll down and download the AMD version.
Hey everyone, unsure where to ask so apologies if this is the wrong place
I’m a little behind when it comes to RVC, and it seems like there’s been a lot of developments. I’ve been just using mangio RVC for as long as I can remember
But, now I’m seeing so many options, like weights.gg, Applio, Voicecraft, W Okada, F5, even seeing Sovits reappear again
Where should I go to relearn my stuff?
Or, what should I be using?
Looking to make ai song covers (mostly for a bit in a video)
After reading the guide I have a few questions.
- Which audio isolation program is recommended?
- If I download the original RVC, will it be able to do the song cover or do I have to download an extension for that feature?
- What are the specs for local AI training? I can't find them
depends on what you want to do
Crepe vs harvest which is the better option for songs
I got it to work!!!! But the vocals sound rough maybe I need to isolate it better any suggestions
im looking to just clone voices. I am not as interested in TTS
Ive been looking at this guide, but im still a little overwhelmed by all these choices
It also just doesnt seem to cover exactly what I should use on a case-by-case basis. Moreso a comprehensive beginner’s guide on what things like “epochs” are
to break it down, what is your use case, inference or training?
still using harvest, not rmvpe? I suspect you might be using the old tiger18n's rvc gui
I believe i need both
Oh I saw rmvpe but my gpu is a 7900xtx so I thought it could handle the other options
harvest only runs in cpu mode, and rmvpe is the sota
can i use tts with a rvc model?
Applio has TTS option.
i never figured out how to use it with a custom model
Applio TTS is a demonstration. It is text -> Microsoft Edge screen reader API (online) -> tts output file -> rvc voice change -> output audio
yeye i get it
but i'm getting an error
it says "Value: 0 is not in the list of choices:" and then a bunch of audiofiles name
show a screenshot of the error
aight
im using the huggingface space btw
im currently installing applio cause maybe theres a problem with huggingface
idk
Ok covers colabs definitely hate to see me coming
What Is that space for? I didn't know they had one
I'll see if this one works for me 😩
That one didn't work either 😭
I guess I'll never know how my model sounds...
Inference
you tried ilaria rvc?
hold on
did you include the .index file in the zip file
try uploading your model to huggingface
in a zip file with your .pth and your .index files
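If you'd rather build that zip from a script, here is a minimal sketch using only Python's standard library. The function name and file names are placeholders, not something any RVC tool provides:

```python
import zipfile
from pathlib import Path


def zip_model(pth_file: str, index_file: str, zip_path: str) -> str:
    """Pack a model's .pth and .index files into one zip archive."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file in (pth_file, index_file):
            # arcname drops the folder part so both files sit at the top of the zip
            archive.write(file, arcname=Path(file).name)
    return zip_path


# Example (hypothetical file names):
# zip_model("MyVoice.pth", "MyVoice.index", "MyVoice.zip")
```

Keeping both files at the top level of the archive is a choice made here for simplicity; some uploaders may also accept them inside a subfolder.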
Yeah, i did that already
do you want me to try it out for you
send me a sample that you want me to run with your model
it's all good
It's not fully trained yet, i wanna do around 400 and 500 epochs, but i wanna know how it sounds so far with 200
you should check tensorboard while training
so you can tell when it's starting to overtrain
The graph showed somewhere around 1k, I'm not sure exactly how it works... Overtraining is the lowest point on the graph, isn't it?
yep
you looking at that on loss total right?
and the smoothing set at 0.987
this .5 index rate and default pitch
Damn...
I'll probably have to remake my dataset
its just a lil noisy
Yeah, but it wasn't so easy to clean up since the album i made from is full of distortion and reverb effects...
yeah, makes sense
what you using to denoise your datasets?
USMVP or something like that
mvsep?
YEAH, THAT, i didn't remember the name 😭
I used UVR before, but it didn't have all the models i needed
For denoise and stuff?
mainly to isolate your vocals
what model are you using?
or what's your method to extract vocals
Ok!
which program?
Uninstall it and go back to the download page of vac, download the lite version instead of the trial version
The lite should be at the bottom
Ok i will restart thx man
https://software.muzychenko.net/freeware/vac470lite.zip
you should use this direct link instead of the website itself that may mislead you to downloading the trial one
it's good! much better
which pretrain is the best for Mainline RVC ?
default one is fine
unless you need singing and stuff
okay
and
where can i find "training/weights" from "For syncing graphs you need to train however many epochs you have set you save frequency then go into the file manager and find your model which should be in training/weights"
mainline rvc is dumb so syncing graphs is done this stupid way
you can just set the value to 50 steps and be done with it
or just spitball it to number of samples/batch size
in mainline rvc the model is in /assets/weights
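The "number of samples / batch size" estimate above is just the number of steps in one epoch. A tiny sketch with made-up numbers; whether the trainer counts the last partial batch is an assumption here (this version rounds up):

```python
import math


def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Roughly how many training steps make up one epoch."""
    return math.ceil(num_samples / batch_size)


# Hypothetical dataset: 1000 audio slices, batch size 8
print(steps_per_epoch(1000, 8))  # 125
```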
ty ser
ty ser
Anyone here have connections to big gaming youtubes etc?
Klm uses old refinegan generator, it doesn’t work in the latest update
is ai cover broken again?
@thorn abyss would you mind making my cover too?
covergen has a problem rn
Yeah i think it's completely broken
Not working either for me
Also does the original RVC WebUI work with current python versions?
nope
you just have either spaces in the file name or the file that youre trying to infer with isnt there
The WebUI on colab broke
everyone is having this issue
you sure? it doesnt look like it did from that ss
can you use the google colab webUI then
pls
That's NoUI
and try to output a song
WebUI's broken on Colab.
I know I agree with you
just that if Razar said it isnt then maybe he can go to WebUI and try to output us a song
I think he didn't see it's NoUI.
so do we use NoUI rn since WebUI broken?
that you were running
If NoUI even works then i guess
doesn't seem like it tho
so nothing works? the only ai converts that do work rn is weight but theirs arent so good
And i do not know how to fix it
I suppose you have to extract the vocals, so download the song and use MVSep with SCNet while also having premium usage turned off, then extract the back vocals, run the main vocals through RVC and combine them.
everything u said is new to me, what
before I go up and learn all that are they all working rn?
is the process longer and more complex than webUi?
And as of RVC, use Huggingface
also is it on par atleast with WebUI?
webui was the best tbh for ai cover converting for me (the output always sounds the best)
Unfortunately that's so, but i suppose if there's not a better solution (and you're planning to use several songs), then i guess.
Dont use SCNet, use bs or mel roformer since they are better
you can do it locally depending on your pc, you can use huggingface spaces and yeah weights
I use weight just now, but it sounded terrible
only very few songs ever come out satisfied in weights for me
The huggingface AICoverGen works, but you can only use a few songs and have a few goes at it before you run out of GPU quota or however it's called.
can you do it for me? I dont have alot of space left on my C disk and my GPU cant even load a TTS model for me
its just 1 song
ill send you my hugging face model and the song?
Sure
here? or dm
you did something wrong
what I do
idk
wait ill do it again and then if it happens again ill send the ss of the model used and song generated
If it errors out, send a screenshot with the parameters used.
What is the recommended batch size for 12 min dataset audio ? (medium quality, 16Khz cutoff)
OH WAIT I KNOW WHAT WAS WRONG
@viscid moss There is a bug with the colab version of your Ai cover gen where if you use split audio it gives an error but when you dont split the audio it works fine
depends on the dataset
what was wrong?
Thx

colab's ip was getting blocked from youtube based on the screenshot
what colab are yall using?
HTTP Error 403: Forbidden
I looked at the screenshot again

youtube angy?
Right
this is new to me what, How do I use this
It's a WebUI apparently
The problem is just the audio download
If u put a path to the audio file it will work
I dont know what to be doing here..
It doesnt ask for a link of the video I want my ai to convert to
or asking for my ai model address
I tried other songs same thing
or is it just all youtube no longer works now?
youtube went that hard?
YT

Colab is getting blocked from YouTube, nothing gets downloaded
so no more youtube links?

how do I put an audio file
Gimme 5 mins
File Browser
that's the split audio code I've not seen in a very long time
step 3 says place a youtube link

Well, I just take the ownership and haven't tried all that stuff. I'll check that
Go to Optional :p
Guide also needs to be updated
-colab
Damn, still down
no way
The bot?
ye Automaze
Ah
Also does the original RVC WebUI work with current python versions?
nah, I think only works with 3.9 and 3.10
Any specific pip versions i should look for?
@viscid moss work
And 3.8 as in the README
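A quick way to check what you're running before installing. This sketch only encodes the versions named in the messages above (3.8, 3.9, 3.10); treat that set as an assumption and check the README for the fork you actually use:

```python
import sys

# Versions mentioned in the messages above; not an official compatibility list
SUPPORTED = {(3, 8), (3, 9), (3, 10)}


def is_supported(version=None) -> bool:
    """True if the given (major, minor, ...) version is in the supported set."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor in SUPPORTED


if not is_supported():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} "
        "is probably not going to work with the original RVC WebUI"
    )
```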
💀
Now you know what to do when YouTube doesn't work.
why youtube do this
I told u. It's only that YT sh!t
Because you aren't really supposed to just download the videos, YouTube doesn't like that.

real
Not completely sure tbh
WTF is that error lmao
Just press OK
lmao Colab being Colab
its right there why wont it let me bruh!
try to copy that file into ur Drive to save the cover
rename the file
to?
Cover.wav :p
how do I do this
i renamed it to Cover.wav
copy path
how do I paste it to my drive
To copy to ur drive use this command
Create a new cell of code and replace cover_path and drive_path:
!cp cover_path drive_path
to get the cover path do right click on it and press copy path
same for Drive path
bro what
all I understood was to copy path the output
everything else I dont get
i never learned to use google drive
ur Drive path is basically the path that u place on SONG_INPUT
just delete the audio name
it will be something like this
/content/drive/MyDrive/
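The same copy can be done from a Python cell instead of `!cp`. A minimal sketch, assuming Drive is already mounted at /content/drive and the cover was renamed to Cover.wav as above; the function name and the example path are placeholders:

```python
import shutil
from pathlib import Path


def save_cover(cover_path: str, drive_dir: str = "/content/drive/MyDrive/") -> Path:
    """Copy the generated cover into a Drive folder and return the new path."""
    destination = Path(drive_dir)
    if not destination.is_dir():
        # Usually means Drive is not mounted yet
        raise FileNotFoundError(f"Drive folder not found: {destination}")
    return Path(shutil.copy(cover_path, destination))


# In Colab (hypothetical location of the cover file):
# save_cover("/content/Cover.wav")
```

Right click the file in the Colab file browser and use "copy path" to get the real value for `cover_path`.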
I'm editing that colab notebook rn
hahaha check ur Drive
empty
I press upload
and than it left

ok
Thanks guys I was able to use spikes voice to sing my way :3
can I get your guys opinion? and how do i upload a mp3 here?
can i get some help rq
are there any datasets I can use to try out applio?
I want to figure out how it works in the first place before I commit to making a dataset
Is there a filesize limit on Applio colab?
Grab some from https://www.sounds-resource.com/ ig
what's the issue
hey do I have to stop training my model myself or will it stop?
if I have to stop it manually then how long should I leave it for?
I'm talking about this part
and if you can explain what generate index means that'd help
Is contentvec is better than the original hubert base?
same
Both are the same.
There are colabs working?
Of course, Applio is one that works till this day.
Tho what's your GPU?
what settings should I set for training a model in Applio RVC
hey can you help me with putting good settings in the Applio RVC thing?
or if there's any guide for the settings
All i'll suggest on applio is playing around with pitch and use clean audio to infer with models.
Last update: Oct 21, 2024
There you also got a guide.
can someone help me?
why is Applio RVC training the model using my CPU and not my GPU?
the GPU should be better suited for it so I'm confused
lol
Can anyone help with Refinegan I got errors during the "Save Every Epoch
" stage I would need to dm since I don't have perms 😭
what errors?
go to training tab and see advanced settings
how do i convert models to onnx?
It says "Unfortunately, there is no compatible GPU available to support your training.", my GPU is 6700 XT
Traceback (most recent call last) but more specifics are in dm
I already installed that
then you did not follow the instructions
How do I access kaggle or colab notebooks
you don’t need to for the latest wokada deiteris fork
if you’re talking about realtime voice changer, that’s wokada not rvc, say ur pc gpu and the guide link u followed in #🔍│help-w-okada
which google colab notebook are u using
is there any way to fix the ping when running the deiteris okada alongside a game? I'm trying to run it and marvel rivals at the same time but my ping skyrockets in the voicechanger and makes it to where it just doesnt function at all. As soon as i close the game ping goes down to 0 and it works perfectly
That's because you're running 2 very intensive tasks at the same time
Also, this is the wrong channel
Show ur pc GPU and a screenshot of ur wokada while in game in #🔍│help-w-okada
oh ok i wasnt sure which one thank you
How do I access kaggle or colab notebooks
-colab
-kaggle
Why these commands are not working
the bot is down
Rvc disconnected
damn
rvc disconnected is not supported anymore
check #📰│dev-updates
it had a last update which should fix things, but it will NEVER get supported or updated again
Because it runs on mangio rvc, which is way slower and abandoned since 2023
what’s ur pc gpu
bro should i wait until E300 ? (i write Total E 300)
Epoch: 57 [2025-02-06 11:24:29] | (0:00:10.485343)
sus af
Any alternative?
The bot is up again
- Client's average ping: 48ms
- Time passed since last ready: 2 hours 20 minutes 13.6 seconds
How much difference is there between 300 epochs and 500-800 epochs?
more epochs don't mean more quality
I think we already talked in #🔍│help-w-okada when u asked me which epoch to choose
U need to read the tensorboard to know that
Last update: Dec 24, 2024
I've seen it, choose the one closest to OT, right? I just want to know what the factors are for model quality.
no, you need to first of all set it up like in the guide, then, choose the lowest loss/g/total value
more epochs or steps don't mean more quality, the quality all depends on the dataset length, its quality and the way you read the tensorboard
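To show what "smooth the graph, then pick the lowest loss/g/total value" means in practice, here is a rough sketch. The smoothing is a bias-corrected exponential moving average, which is similar in spirit to TensorBoard's slider but not guaranteed to match it exactly; the function names and the example numbers are made up:

```python
def smooth(values, weight=0.987):
    """Bias-corrected exponential moving average of a loss curve."""
    smoothed, last = [], 0.0
    for count, value in enumerate(values, start=1):
        last = last * weight + (1 - weight) * value
        # Divide out the bias from starting the average at zero
        smoothed.append(last / (1 - weight ** count))
    return smoothed


def lowest_point(steps, losses, weight=0.987):
    """Return (step, smoothed loss) where the smoothed curve is lowest."""
    curve = smooth(losses, weight)
    best = min(range(len(curve)), key=curve.__getitem__)
    return steps[best], curve[best]


# Hypothetical loss/g/total readings, one per saved step
steps = [100, 200, 300, 400, 500]
losses = [32.0, 28.5, 27.9, 28.8, 30.1]
print(lowest_point(steps, losses))
```

The checkpoint saved closest to that step is the one to test first. With heavy smoothing the lowest point of the curve lags behind the raw values, so it is worth listening to the neighbouring checkpoints too.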
aight
at how many steps/epochs does a model start to sound good?
ik ideally it's right before the overtraining point
but if you don't reach the overtraining point at how much it should sound fine
Actually there's no accurate answer for that.
All will depend on your dataset, training settings and the voice itself.
https://discord.com/channels/1159260121998827560/1329255505872551936 anyone know why this doesnt work on w okada :/ ?
why is there no run-install.bat
I keep downloading the zip from the github and huggingface
both have no run-install.bat
that's annoying because I want it on my USB and there's no run-install.bat
I reinstalled the zip file from both hugging face and github
installed a brand new zip file and there's no install.bat file again
I'll try installing the install file straight from the github page
maybe it is a hint you dont need run-install, hmm?
maybe it is a hint to just use run-applio.bat?
mystery
what would be the purpose of 6GB 'compiled' version vs 5MB .zip file with a source I wonder
guess we'll never know
you're using the precompiled release one, try clone the main repo which is the actual latest one
oh
thank you
why is my processor loaded at 80% and the 3080ti at 1-5%?
what app?
Further questions ask in #🔍│help-w-okada
But delete your voice changer and download NVIDIA from this
https://rentry.co/Forkvoicechangerguide
Has optimized hardware usage
somebody know how do i fix the error "unhandlerejection" and TypeError: Cannot read properties of null (reading 'type')
tell ur pc gpu, what are u doing, what program are u using
nah already fixed but ty
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
Good evening @low shard, @proven hill. I don't know if you remember me... Anyway, I wanted to ask how I can make something like this https://www.youtube.com/watch?v=1fWqJD0uXlQ. Looking forward to your reply, see you soon!
I think the only way is still to sing it yourself, like we said before
Isn't there a way to train the model?
I heard you could train the AI model and then make it sing (or talk) like the real one
RVC models are Speech To Speech, if you train it and give it yourself singing as input then yes, it works
Is Applio good?
Yes, of course
Yes
does anyone know if I can test a RVC backup audio before it's done fully
im at like 50 epochs but im going for 300 with a dataset of around an hour and a half
i dont use discord a lot so tag me if you have any answers
anyone got a download for mac?
yea
each epoch is fully finished so you can use it
except if it has a d/g in the front
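If the folder is cluttered, something like this can list only the files worth loading. It assumes the "d/g in the front" files are named with a `D_` or `G_` prefix (training checkpoints, not usable voice models); check your own folder, since naming may differ between forks:

```python
from pathlib import Path


def usable_models(folder: str) -> list[str]:
    """List .pth files that are not D_/G_ training checkpoints."""
    return sorted(
        path.name
        for path in Path(folder).glob("*.pth")
        # Assumed naming: D_*.pth and G_*.pth are the training-only files
        if not path.name.startswith(("D_", "G_"))
    )


# Example (hypothetical folder):
# print(usable_models("assets/weights"))
```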
xD
Explain what you need and what you want to do
Which Applio version works with KLM 5 Refinegan?
Applio 3.2.8?
main branch from github
it has not been released, so the "main" branch
Sorry, how do I find main branch?
Thank you!
can someone train a model for me, my system is not enough pls
you can use cloud
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI, no guide as of right now)
- Applio by Shirou (UI, no guide as of right now)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest way and for free, is using https://weights.gg which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.gg: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio (ui)
Else, u can do requests in #1159289738314919936 or #1191429836321849435
5 dolla
well based off this run, its now 180 epochs, it hasnt loaded the pth file that isnt d,p
yes you can use that one
but which file am i using if i cant see the main pth
what
once I load a backup, do I just export it to save as a zip
Why harvest is better than any other?
are you trying to save it? then yeah
put the pth and index file in a folder then zip it
who told you that?
Rmvpe is way even better than that.
Can anyone help me ?
I have a 1:16m long audio from a Voiceactor in a game and ive been trying to figure out what the optimal training method would be.
i watched a tutorial from a guy that used a 2:40 ish long audio and got almost perfect results. Yet when i do that i get my own voice just cringy higher pitched as a result.
i try to give everything i entered into Mangio RVC
audio datasample length; 1:16m
Transpose : male 2 female -> 12
algorithm: rmvpe
training Epochs: total of 200 i guess
do i have to train longer or do i need more audio samples ?
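For context on the transpose setting listed above: the value is in semitones, and 12 semitones is one octave, which doubles the pitch. A small sketch of the arithmetic (the function name is made up):

```python
def transpose_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift given in semitones."""
    return 2 ** (semitones / 12)


print(transpose_ratio(12))   # 2.0, one octave up
print(transpose_ratio(-12))  # 0.5, one octave down
```

So +12 turns a 110 Hz note into 220 Hz. If the input voice and the model's voice are already in a similar range, a full octave is likely too much and the result will sound unnaturally high.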
YouTube tuts are old
Mangio RVC is abandoned since like 2 years
Mangio Is a Fork of RVC, a modified version which is now slow compared to other forks or the original mainline
There isn't a right amount of epochs, you have to train using high quality and long dataset and use the tensorboard
What's your PC GPU
gtx 1660 Ti
so what exactly do i have to use instead ?
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
oh thx alot
@ashen dirge
Yo everyone, does anyone know how long it takes to train a voice on Apollo with 1000 epochs? Thanks in advance!
does AMD support this voice mod?
Do you mean w-okada?
Because we don't give support to programs like voice.ai
Actually training to 1000 Epochs on Applio would be overkill/and can lead to overtraining.
It's better to watch the tensorboard and stop training when the g/loss total graph goes up after a while.
im trying to download the VB audio cable, and before i do i wanna make sure its compatible with my GPU
First of all, what's your gpu?
AMD Radeon RX 7700 XT
go to #🔍│help-w-okada and download the voice changer from the pinned guide there
Hehe that's exactly what i was gonna say
thx
how to upload .pth file in weights folder?
there's not even a button to upload my model
in mainline
I have no more clue but using kaggle in mobile is a pita
I trained like 30+ model for my own voice on Rmvpe but with the same dataset in the third try I got a perfect model of my voice using the same dataset, batchsize and on 280 epoch
does anybody know any free ai websites that can generate unique and genuine expressive voices that sound real
hey guys, new to ai stuff, is it possible to create an ai song with replaced lyrics while keeping original voice?
People, what is the optimal number of epochs for rmvpe? Everywhere they say different things
also new to whole ai song/cover creating stuff, main focus is on creating rdr2 songs/covers, tried arthur morgan model on one song and it sounds off, distorted, unclear, loads of noise etc, i was using harvest as a method, does anyone have any insights that would help me navigate through RVC and how to use it properly, would be much appreciated
Heyyy i downloaded some models from #1175430844685484042 but none seem to be working in tortoise tts ? I got screenshots but i cant send em apparently
But it says this
error
Unsupported audio format provided: .pth
inside my voice folder I got a pth and an index
the only folder that works is the default "random" which DOES have .pth inside so im confused why the ones i downloaded are just not supported
any alternatives you got in mind ? 🥺
Im using it bc its the only thing i found that allows me to use in python scripts i make
i wanna change voice, text and output to different directories from the comfort of a python file
RVC is STS, not TTS natively
oh im silly, i forgot to mention i have audiofiles created based on the text, i just need to run it through rvc so its not a generic microsoft voice, but instead the voice of the model 😄
Guys what would be the ideal number to set crepe hop length to, i'm new so idk, also any other settings i should change so it works properly?
I think https://pypi.org/project/rvc/#description or https://github.com/blaisewf/rvc-cli could help
What are u doing and using and what's ur PC GPU?
ty ill take a look !
Ryzen 7 4800 h gtx 1650, i'm trying to do some ai covers/songs
Hi, I have trouble training a model, I have 3h good quality in French, I tried epoch from 100-1000, the results are robotic and not similar to dataset
Alright, you should use rmvpe as the pitch extraction since it's robust and the most recommended one
Also, which program are you using?
I'm using RVC
You could also play with the pitch, lower means more masculine, higher means more feminine
Yeah but there's different versions and forks, I'm trying to check out if u got a good or old one
That's why I'm asking, it's to be sure u don't got an old one from a yt tut
Oh i see, i think i have the latest version
May you link me the download link you used or tutorial?
Damn that's really old
RVC GUI is a fork (modified version) that has been abandoned since 2023
YouTube tuts are old, don't follow them
yeah i'm on the latest one from 2023, like i said i'm new idk what i'm doing lol
Dw, uninstall that one you got, I will give you updated ones in a sec
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
For any clarification, ask me
did u succeed in making an rvc model in a language other than english?
Be sure that those 3 hours are good quality, quality matters a bit more, also be sure to be using the tensorboard
The original Pretrain has been trained on English only, maybe there's some french Pretrain but I don't remember if there was one
thanks a lot, will check them out
so every one of these should give me much better results than this old RVC that i used right?
the audio has been cleaned with UVR, tensorboard for loss? i use matplotlib
How do i get mel denoiser on uvr
These ones are more updated and optimized than the one you got
Okay thanks, much appreciated!
you need to start with explaining which RVC you're using
f0 method?
you might be using the voice of multiple speakers
['infer/modules/train/extract/extract_f0_rmvpe.py', '2', '1', '0', 'D:\RVC\RVC\RVC1006Nvidia/logs/Lou', 'True']
todo-f0-1904
f0ing,now-0,all-1904,-D:\RVC\RVC\RVC1006Nvidia/logs/Lou/1_16k_wavs/0_1.wav
['infer/modules/train/extract/extract_f0_rmvpe.py', '2', '0', '0', 'D:\RVC\RVC\RVC1006Nvidia/logs/Lou', 'True']
todo-f0-1905
f0ing,now-0,all-1905,-D:\RVC\RVC\RVC1006Nvidia/logs/Lou/1_16k_wavs/0_0.wav
f0ing,now-381,all-1905,-D:\RVC\RVC\RVC1006Nvidia/logs/Lou/1_16k_wavs/0_1724.wav
f0ing,now-380,all-1904,-D:\RVC\RVC\RVC1006Nvidia/logs/Lou/1_16k_wavs/0_1722.wav
f0ing,now-762,all-1905,-D:\RVC\RVC\RVC1006Nvidia/logs/Lou/1_16k_wavs/0_25.wav
f0ing,now-760,all-1904,-D:\RVC\RVC\RVC1006Nvidia/logs/Lou/1_16k_wavs/0_2497.wav
f0ing,now-1143,all-1905,-D:\RVC\RVC\RVC1006Nvidia/logs/Lou/1_16k_wavs/0_3224.wav
f0ing,now-1140,all-1904,-D:\RVC\RVC\RVC1006Nvidia/logs/Lou/1_16k_wavs/0_322.wav
f0ing,now-1524,all-1905,-D:\RVC\RVC\RVC1006Nvidia/logs/Lou/1_16k_wavs/0_3964.wav
f0ing,now-1520,all-1904,-D:\RVC\RVC\RVC1006Nvidia/logs/Lou/1_16k_wavs/0_3957.wav
f0ing,now-1900,all-1904,-D:\RVC\RVC\RVC1006Nvidia/logs/Lou/1_16k_wavs/0_992.wav
['infer/modules/train/extract_feature_print.py', 'cuda:0', '1', '0', '0', 'D:\RVC\RVC\RVC1006Nvidia/logs/Lou', 'v2']
D:\RVC\RVC\RVC1006Nvidia/logs/Lou
load model(s) from assets/hubert/hubert_base.pt
move model to cuda
all-feature-3809
now-3809,all-0,0_0.wav,(149, 768)
now-3809,all-380,0_1359.wav,(149, 768)
now-3809,all-760,0_1721.wav,(149, 768)
now-3809,all-1140,0_2112.wav,(149, 768)
now-3809,all-1520,0_2496.wav,(149, 768)
now-3809,all-1900,0_2855.wav,(149, 768)
now-3809,all-2280,0_3219.wav,(149, 768)
now-3809,all-2660,0_358.wav,(149, 768)
now-3809,all-3040,0_3956.wav,(149, 768)
now-3809,all-3420,0_632.wav,(149, 768)
now-3809,all-3800,0_991.wav,(149, 768)
all-feature-done
I just asked for the method you used, not the log dump
rmvpe is okay
so see the tensorboard output if you had it enabled
just tried Ilaria RVC Zero with rmvpe and it somehow ended up sounding much worse than the RVC GUI from 2023 that i tried
sorry please i have been away for a while. Please how can i download a model from #🔍│find-models to use in real time gui?????
-rvc
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
-local
Not available yet
what voice changer should I use for this
For W-Okada the "realtime voice changer", go to #🔍│help-w-okada
where would i download it
Because people keep mistaking "RVC" for realtime voice changer, I do this every time.
Again.
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
yo i need some suggestions
I wanna troll my friends so does anyone recommend a female voice model
Heck yeah
No.
Update for Colab AICoverGen and everything else that could download YouTube videos on colab: YouTube blocked all Colab IPs (pretty much) and now it needs a token from a legitimate account to download videos
yup, using puppeteer or some specific user agent could bypass it but not quite sure
Hey- so... can anyone help me with a small thing? I'm new to Voice.AI, and I've been trying to upload an avatar for a voice I downloaded. But it keeps saying "invalid image dimensions," the hell is that supposed to mean?
try download it manually
Are you trying to create models, make an ai cover, or use realtime for calls
We do not offer help for voice.ai, we offer local free open source options here
I can send you a link if youd like, would need to know your GPU first if its possible
Oh- mb. Sure, tho. It would be nice
I just want to make a sample because my device is still low
So an ai cover (converting an audio file into another with a model). I dont know what "device is low" means, I assume you mean you dont have a good pc?
The easiest to do online is via Ilaria RVC:
https://huggingface.co/spaces/TheStinger/Ilaria_RVC
An explanation of how to upload a model there and convert samples ("inference") can be found in the guide
https://docs.google.com/document/d/1YbXcLFPaGjhOdG5NFkK3QrucCEpHZBwFUxkeMO8aB18/edit?tab=t.0
Still need to know your gpu name
Anyone?????
Youre in the wrong help channel, for the future use #🔍│help-w-okada for realtime
Click on "View Model", it goes to weights.gg where you can download it from. Might need to be signed up
Or use #1175430844685484042
Okay thanks
How do I go about creating a song with a cloned voice?
You can record the song yourself with your normal voice, and then use rvc inferencing to turn that into a cover with the voice model
There are apps like suno ai that create a whole song for you with beat and all, but unsure if you can select a specific voice model to be used, havent used it
Is there a text to speech one? Like one I can write the lyrics instead and it just sings the lyrics with the cloned voice?
You can't expect a TTS model to sing a whole song. TTS can only speak like a normal speech.
not TTS bro that is why i asked if "there is anyone" because i am not talking about anyone specifically
can u help me my gpu is rx 5700 xt
For voice changer?
Guide style is in the same as vtarcelia. Thanks vtarcelia for corrections. Most technical information comes from deiteris.
Last update January 17: NEW UPDATE VERSION b2332 (from December) , adjusted known settings
Translations added for:
German: https://rentry.co/ForkVoiceChangerGuide_de
Turkish:...
Download AMD, virtual cable, read audio setup discord&games section
Your gpu will work fine
Any other questions ask in #🔍│help-w-okada
I did everything according to the instructions.(https://docs.google.com/document/d/1XuxQYiqEhYrdYeCZRRLrmV_ciMKo0bV-jTCGHu_-5Cc/edit?tab=t.0) But in the Training Model field it says this(AssertionError: You need to download a pretrain! Please run the "Download Pretrained Model" cell before continuing.)
How can I fix it?
Emm.. I think that guide is outdated, and RVC Disconnected is not maintained anymore.
What's your GPU?
You can either try installing RVC Mainline or Applio locally if you got a decent GPU.
gtx 1060 3gb, good?
quick question: what is the overall best quality base model to finetune from in terms of similarity to human speech?
I'm not sure...
But i think you'll be at least able to install either Mainline or Applio to use models locally.
Last update: Mar 8, 2024
which rvc i should use for training female voice? I have RTX 4060, i can run in local,
should i go for mainline or applio?
We mostly recommend you to stick to OG pretrain on RVC.
that's the card I use, I found it can run small models that don't take a lot of vram, but there is quite a bit of delay if you don't want artifacts
Oh, you were talking about realtime/w-okada?
Applio
more just general processing speed but yes
In that case you can install Deiteris' w-okada fork
Here you got a guide
I just assumed that's what the asker was requesting information on
Ah maybe i got confused.
@languid tulip, did you mean realtime inference, or training (or something else)?
He was talking about RVC training, not realtime
ah, apologies
Dw, it's fine.
I failed to read above 
But for your case, since you mentioned delay, i guess you can use deiteris' fork for realtime.
but yeah, for training, the 1060 is worthless
Certain versions are for a specified GPU.
Yeah, i feel like you're right.
Oh yeah my bad, as GTX is nvidia, then scroll down on the guide and download the nvidia version
guys where can i find a sample dataset of any female model?
first time am using applio so just wanna upload a training dataset and see how it goes
there are no sample datasets, but ig you could use a voice from ears
Wym?
there are no sample datasets specifically for testing applio but you can grab a public dataset like facebook's ears and then grab one of the voices from there and test applio
go find the source and learn how to process & clean it
I’m having issues with KLM 5 model, not working with Replay Application?
RefineGAN is not supported in other application yet
Any that you know of, that do?
Applio, main branch, not officially released yet
Thank you
Why does nothing happen when I run the “TensorVENV.bat” file?
#📰│dev-updates rvc disconnected will never come back again, it reached its end of life with the last update
We don't support it anymore
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI, no guide as of right now)
- Applio by Shirou (UI, no guide as of right now)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest free way, use https://weights.gg, which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.gg: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio (ui)
Did u check ur PC GPU first?
Hey, I have a model already done, what's the easiest way to use them on TTS?
Using this Applio, how can I load a model there?
Thank u so much💙 🫂
how do i stop training
how do i fix this error? 'NoneType' object has no attribute 'setdefault'
I'm using a YT link tho
CoverGen Colab?
ye same
Colab blocks YT downloads, u need to upload the audio manually
oh
that's fine ig
how do i stop training with rvc?
ctrl-c in terminal window
Can someone help me with setting up audio
Yw
Setting up audio on.. Realtime?
In that case head over to the #🔍│help-w-okada channel
I fixed it
Oh, nice.
There are different Text To Speech (TTS) AIs:
GPT-SoVITS: RVC isn't as good as GPT-SoVITS for TTS, but GPT-SoVITS (few-shot TTS, meaning it needs just a lil training for models) can't use RVC models (and vice versa), and it's limited to English, Chinese & Japanese. If you wanna check GPT-SoVITS instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/
Freemium 11labs: An easy way to do TTS is https://elevenlabs.io/, you can't use RVC model on this but its a mostly premium easy way for good quality TTS
FishSpeech: FishSpeech is a 0 shot (no explicit training needed) TTS, if you got a good pc you can use it locally else use their site
With RVC Models:
RVC is natively Speech To Speech, but forks such as Ilaria RVC Mainline & Applio have built-in TTS: they use Microsoft Edge TTS to generate the TTS audio and then convert it with RVC. I suggest choosing a TTS voice with the same gender and language as the RVC model you wanna use.
If you wanna do tts locally with RVC Voice Models (if you got a good pc):
If you don't got a good pc you can do tts with RVC Voice Models on cloud:
- Ilaria RVC Zero (running on A100 GPU, fastest free rvc on cloud) and the guide
- Use Applio UI Colab (with google colab T4 free daily limit gpu)
- if you don't wanna use edge tts, you could try another tts ai from our tts index and use the output as an input in rvc
Hey! I just wanted a little help if possible: I was testing a Japanese RVC I trained myself (200 steps, around 30 minutes) but I was trying to make it pronounce things in other languages like Portuguese. Sometimes it pronounces things very wrong and I don't know what I should do to make it sound right. I'm using Applio 3.2.6. Any suggestions please?
That's because it's not trained on other languages
The most you can do is lower the index rate, which reduces the accent
I trained it just so I could hear this voice saying things clearly in other languages 🥲 So it's kinda impossible?
I'll try it to see if I can get any better results.. TuT
Was it mostly trained on japanese and a lil on Portuguese?
Just Japanese. It's a Japanese voice actress I like the voice and I wished to hear her speak things in my own language
Imagine you studied only the Roman empire, you wouldn't be able to perform well if I asked you about another historical time
Same thing goes with AI, if you trained it only on japanese, you can't expect it to go well with other languages
I see y.y Sometimes she can speak phonemes she wasn't trained for and other times she pronounces the same phonemes completely differently. I thought I could push it to speak clearly
but thank you very much for helping
it is not about the language
it is about pronunciation and having specific sounds present in the dataset
Pronunciation can be different based on the accent and language
yeah, but if the language has no L sound, then trying to infer L sound would result in weirdness
Do I have to manually change the audio frequency of each sample or RVC resamples my audio when I choose it from the drop down menu through "preprocessing"?
you mean 200 epochs?
200 steps = ~5 epochs lol
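For anyone confused by steps vs epochs, here is a rough sketch of the arithmetic. It assumes one step is one batch and one epoch is one full pass over the preprocessed segments. The segment count and batch size are made-up numbers picked to match the "200 steps = ~5 epochs" example, not values from the asker's run.

```python
import math


def steps_per_epoch(num_segments: int, batch_size: int) -> int:
    """Number of training steps needed for one full pass over the dataset."""
    return math.ceil(num_segments / batch_size)


def epochs_from_steps(total_steps: int, num_segments: int, batch_size: int) -> float:
    """Convert a step count into (possibly fractional) epochs."""
    return total_steps / steps_per_epoch(num_segments, batch_size)


# Hypothetical numbers: 320 audio segments at batch size 8 -> 40 steps per epoch.
print(steps_per_epoch(320, 8))         # 40
print(epochs_from_steps(200, 320, 8))  # 5.0
```

So a "200" in the step counter is a very short run; the epoch counter is the one to compare against usual epoch recommendations.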
it automatically converts source audio to the target sample rate
SO I DON'T HAVE TO CONVERT ALL 40 SAMPLES TO 32 kHz?
Thanks!!!
Is dat true?
pretrains wont do miracles, everything depends on the dataset
no
Nice
Is he/she right?
half-true
your dataset needs to have high pitch data and you also have to use a pretrain that has high pitch data as well
Og pretrain isn't the best, as it has noise.
noise is not really a bad thing
Understandable
og pretrain is the best, others will produce noise
og pretrain is bad for singers bc it was trained using monotone data
rvc injects silence to the dataset to teach the model to not generate noise in silence parts
- also learns from the natural silence in your dataset
There was a message from someone converting the og pretrain into a .py file and inferencing w it, and it had noise.
why would you do that
yea thats normal but when you finetune rvc learns what silence is and mitigates that problem
its whatever
Someone else did it
doesnt affect models
Mitigates?
removes it
That's nice
og pretrain naturally adds noise in the whole result
however when you train a model, rvc automatically injects 2 silence files to teach the model to not add such noise
so the result is a model that is able to inference silence without adding noise
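For illustration only: a silence clip is just a WAV full of zero samples. The sketch below writes one with Python's stdlib `wave` module. The function name, duration and sample rate are assumptions for the example; this is not RVC's actual code, just roughly what "injecting silence" into a dataset amounts to.

```python
import wave


def write_silence(path: str, seconds: float = 3.0, sample_rate: int = 40000) -> int:
    """Write a mono 16-bit WAV containing only zeros; return the frame count."""
    num_frames = int(seconds * sample_rate)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 16-bit samples (2 bytes each)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * num_frames)  # every sample is 0
    return num_frames
```

Calling `write_silence("mute_example.wav")` (hypothetical file name) would produce a 3-second clip of pure digital silence.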
Uh that's nice
the pretrain having noise also helps rvc understand what noise is
so for example if u didn't clean your dataset's noise, rvc will know to separate noise vs voice
it'll learn that the noise is a feature of the dataset
pretrains that are extremely clean (like klm) require an extremely clean dataset as well since the pretrain has no idea what noise is
I use Melroformer denoise 2, is it okay
Aggressive
Cause I noticed it even removes noise from speech in talking videos
yea its good, klm requires a very clean dataset
og pretrain works fine with noisy audio
Nice is titan still useful
idk i haven't personally compared it vs the rest
Oki
it was trained in the old fp16 and it might seem undertrained
Pulling Applio from github and huggingface (3.2.8 fix) is different 🥲
now i know 
Question, i tried testing train with KLM 5 RefineGAN, but somehow it said "The parameters of the pretrain model such as the sample rate or architecture do not match the selected model."
what should i do?
what code did you download from github?
the models posted 3 days ago are compatible with the latest main code
Batch Size: 8
Dataset Length: 30
Pretrain: DMR V2
Sample Rate: 32k
where should I put these settings please tell me?
nowhere, it is a model's description
Applio Colab Training is just not working at all, tried Preprocess Dataset and the console just says "Preprocess completed in 0.00 seconds on 00:00:00 seconds of audio.
Backup Complete: 1 new, 0 updated, 0 deleted."
im assuming preprocessing the audio shouldnt be that quick... so something has to be off, anyone know the issue? i've been following the Applio Guide for the colab as well.
show the settings
cant share photos in here, but my Model Name is set to "me" sampling rate "48000" dataset path "/content/drive/MyDrive/datasets/me/me.flac" Dataset Creator UNCHECKED
?
I tried both ways, I did "/content/drive/MyDrive/datasets" as well as "/content/drive/MyDrive/datasets/" with a slash at the end...
both just process it in 0.01 seconds
how i get the rvc?
What's ur PC GPU? And what do u want to do
i have nvidia
Nvidia is a brand that made a LOT of GPUs, which one exactly?
i wanna make ai cover song
laptop gpu i dont know exactly
3050
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours of GPU daily, not guaranteed
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
You can choose whatever, if u want to do it locally without limits then applio is the best, if u want the easiest on cloud then weights.gg is the best
Yw
any help?
again, I'm using Applio colab... i dont even see that cell you just showed anywhere
I'm using the Applio UI version, not the Applio NoUI, i wasnt aware that was even a thing.
still, whats the issue with my colab?
I'm... well aware... i tried multiple different methods, putting it in a folder, putting it out of the folder in the mydrive, tried slash, tried no slash, just finishes the preprocess in 0.01 seconds... which obviously means its doing nothing with the .wav audio file.
should it be a .wav or should it be a zip?
I just attempted "/content/drive/MyDrive/me.wav" and would you look at the result.
"Preprocess completed in 0.01 seconds on 00:00:00 seconds of audio.
Backup Complete: 1 new, 0 updated, 0 deleted.
Files are up to date."
once again, 00 seconds means you got nothing in
yes and that's why im confused, clearly there is something in, im providing it the path to the .wav file, the audio is there, so why isnt it finding it?
Have you tried copying the full path? Usually paths start with the hard drive followed by the folders till you hit the file
Unless you're not doing it locally, in which case I'd have to look at it myself to find anything
path. to. the. folder. with. the. files
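A quick way to sanity-check a dataset path before preprocessing, as a sketch: it assumes the preprocess step wants a folder and scans it for audio files, which is what the replies above describe. The function name and extension list are made up for the example and are not part of Applio.

```python
from pathlib import Path
from typing import List

# Assumed set of audio extensions; adjust to whatever your tool accepts.
AUDIO_EXTS = {".wav", ".flac", ".mp3", ".ogg", ".m4a"}


def list_dataset_audio(dataset_path: str) -> List[Path]:
    """Return the audio files directly inside dataset_path, which must be a folder."""
    path = Path(dataset_path)
    if path.is_file():
        # Pointing at the audio file itself is the mistake discussed above.
        raise ValueError(f"{path} is a file; pass the folder that contains it: {path.parent}")
    if not path.is_dir():
        raise FileNotFoundError(f"{path} does not exist (is Google Drive mounted?)")
    return sorted(p for p in path.iterdir() if p.suffix.lower() in AUDIO_EXTS)
```

If this returns an empty list for the folder you typed in, a "0 seconds of audio" preprocess result is what you'd expect.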
is google drive mounted?
Alright then, I’m trying the Applio No Ui, wish me good luck🤞🏻
Yes it is, I hit the cell at the very bottom, which has 4 cells in it, however, i should note that I stopped running the 4th cell at the very bottom of them, the one that does the "Download all custom pretrains" because from the guide, it seemed as though I didnt need it for training?? unless im mistaken, should I run that?
To save time, unfold it & cancel the custom pretrain download, if you aren't going to use them. <--- the guide does indeed say this
so it should be fine
yep, canceled it and didnt use it
ah, got it working now, just did a different version of the path name
i didnt seem to have to deal with a path name
i went to applio then clicked on the training tab and checked the dataset creator, uploaded my vocals and proceeded to do the rest of the stuff and am currently training
does rvc disconnected still work?
what version of the path name did you use?
with UI version you do need to use a dataset creator, it basically creates a folder and puts your files in it
I Need Help, where did the pth file of it go?
I literally do everything of the video that it says, Raw Onions
Is rvc still considered the best out there for making models?
ngrok mainline colab
I am using the mainline Colab now, wish me good luck as well👍🏻👍🏻
SKAP i need help
Sorry for the ping but what's the difference between the generator loss and the fm loss? I'm kind of new to ML/torch but it looks like the fm loss is based on the output from each conv1d layer within the discriminator, while the generator loss is based on the final output of the discriminator? Does each have a different purpose?
EDIT: If anyone randomly finds this for whatever reason, it's a GAN technique called Feature Matching. It helps avoid mode collapse
the discriminator looks at the real sample through an answer sheet, then looks at the generated sample through the same answer sheet, records the values from each, and FM is the discrepancy between them
generator loss is a result of the discriminator looking at the generated samples and providing a probability of it being fake
yes, feature maps are indeed a result of each layer of discriminator network and the discriminator or generator score is the final value at the end
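A toy sketch of the two losses described above, in plain Python. It assumes a least-squares (LSGAN-style) adversarial loss, as used in HiFi-GAN-style vocoders; the exact formulation and weighting in any given RVC fork may differ. Feature maps are flattened to plain lists here, and all numbers are made up.

```python
def feature_matching_loss(real_feats, fake_feats):
    """Sum over discriminator layers of the mean absolute difference
    between the feature maps for the real and the generated sample."""
    loss = 0.0
    for real_layer, fake_layer in zip(real_feats, fake_feats):
        diffs = [abs(r - f) for r, f in zip(real_layer, fake_layer)]
        loss += sum(diffs) / len(diffs)
    return loss


def generator_adv_loss(fake_scores):
    """Least-squares generator loss: uses only the discriminator's final
    scores for generated audio and pushes them toward 1 ("real")."""
    return sum((1.0 - s) ** 2 for s in fake_scores) / len(fake_scores)


# Two hypothetical layers of intermediate features (flattened).
real = [[0.5, 1.0], [2.0, 0.0]]
fake = [[0.0, 1.0], [1.0, 1.0]]
print(feature_matching_loss(real, fake))  # 1.25
print(generator_adv_loss([0.5, 1.0]))     # 0.125
```

The FM term compares every intermediate layer, while the adversarial term only sees the final score, which is why they show up as separate curves.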
I see, thanks
Hey everyone! I've been trying to get RVC running locally on my machine and I finally got it to work... sort of
I realized my macbook isn't up to the task
So I was wondering if anyone here knows what kind of machine I need to run this stuff
i just download the zip file and then run install.bat
it just downloads everything automatically
so i guess it's the latest one from there
yeah, that will do
i did that last night and "The parameters of the pretrain model such as the sample rate or architecture do not match the selected model." 
the problem is still there
Hey Guys, It Works The Mainline Colab
haven't tried the 40k version of the pretrain but the 32k one should work for the corresponding target sample rate
i will also try it again this night, hope it work
can someone send the link of aihub guide?
-rvc
Suggestions for @near thorn
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
thnk u very much! 👍
what should I see in TB g/total or mel
I tried running it, but the error "no module named dotenv" came up
@acoustic scarab
Why does that error appear ?
its older and less capable than applio but i can assure you it will run
you need to send the entire error message for me to give any advice
pip install python-dotenv
Can I just dm you
no
It said I am missing these stuff
if it says try pip install python-dotenv then you can just do it and try again
I tried that and now its telling me to install other stuff
Yes
download this https://huggingface.co/IAHispano/Applio/resolve/main/Compiled/Windows/ApplioV3.2.8-bugfix.zip, unzip and run run-applio.bat
i think you got the non-compiled version, without the required packages
Ok let me try
Error not found
Looks like my pc is bugging. So I restarted it now it works
👍
What is Applio btw ?
fork of the thing i've sent you in the first message
I mean like is it also rvc-realtime-voice ?
no, as far as i know it's not realtime
more like "make an ai cover"
Can I not download rvc-realtime-voice ?
tts ?
-okada
because that is what I am trying to download from https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/tree/main
it's better to use this https://huggingface.co/wok000/vcclient000/blob/main/MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.18a.zip for realtime
the latest version is actually the main repo that is not officially released yet but has some features like the new model vocoder options: mrf-hifigan and refinegan over the default hifigan since the mainline version. you can clone it with the following way:
Why does my tensor board only display changes every 200 steps?
it's recommended to use fork version of wokada voice changer, go to #🔍│help-w-okada and check out the pinned guide
Does it work on rtx5080?
Why not rvc real time ?
Most likely no, as far as I know 5080/5090 requires latest version of pytorch
oh man, another 50-series failure on their paper launch other than the stability issues 
https://youtu.be/-KDA-h00VRc?t=645
*refer to: #🔍│help-w-okada message
Hey ive downloaded a model with pth from weights.com and using it in Applio but it doesnt sound the same am i doing something wrong?
How much is the google colab limit? They say 12 hours but I used collab only for 2 1/2 models (1/2 because of disconnect in the middle), I'm pretty sure it didn't take 12 hours.
12 hours is the timeout iirc
Who told you that? The free tier in Google Colab limits you to about 4 hours of daily usage when running with a T4 GPU. With CPU only you would of course get those 12 hours while active.
If you leave the Colab session inactive for 90 minutes with no code running, it will automatically disconnect.
12 hours without gpu, the gpu is max 4 hours daily
but it’s random
okay... that's bad ig
what happens if I use Local Mainline with GTX 1650?
Fine for inference, doable for training though not really recommended
what are u looking for?
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI, no guide as of right now)
- Applio by Shirou (UI, no guide as of right now)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest free way, use https://weights.gg, which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.gg: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio (ui)
colab isn’t the only cloud
place to make voice model ?
inference = use models
Training = make models
so it’s kinda doable but i wouldn’t suggest it that much
your choice
BUT what if i use Mainline with GTX 1650 for training model How bad is it? Do I have to wait for hours or can I not do it at all, or will my pc explode?
It is doable, it’s possible, but it will take more time and might be limited on the batch size
Yes, you'd have to wait for many hours if you train a voice model there with GeForce GTX 16 series GPU.
Your GPU won't explode if everything is set properly.
Does Local Mainline not have “timeout”/“AFK” like colab? If not, I have no problem turning on the pc.
Uh.
Locally RVC won't disconnect itself if running for a very long time.
It doesn’t have any timeout: it runs locally on ur pc and it’s open source. Colab has a timeout bc it’s using a remote good pc, which costs google money
btw there’s also kaggle
4 GB vram is not viable for minimum recommended batch size 4
perhaps batch size 2 but not really recommended since it could lead to more model instability
If you close that web UI while it's training, even by accident, RVC will still continue training in the background. If you launch the web UI again after that, everything there will be reset and it won't tell you whether the model is finished or not. So it's better to leave that web UI tab open.
aight thankyou nick, seia, and touhou project person
You're welcome. 
how can i make a voice with ai that i can use with a voice changer?
Anyone?
Hey ive downloaded a model with pth from weights.com and using it in Applio but it doesnt sound the same am i doing something wrong?
What's ur PC GPU and CPU?
Might depend on the settings
Can I upload an mp3 recording to the program and get the output file processed by the neural network? Is there any other way to do this?
make an ngrok account, find the authtoken then change the "gradio" to "ngrok" and put your authtoken in the "ngrok_token" column
Last update: Jan 31, 2025
what is ng used for
Ngrok?
Ngrok is a tunnel to expose the local port from the google colab PC to the internet so you can use the Public url link
Gradio, which is a python package for making web UIs for AI, also has its own tunnel, the gradio tunnel
I understand, thanks, another question, is it possible to train models of 50min of dataset? in either of the two colabs? ui or not ui, I did it in rvc disconnected for each training 100 epochs, will it be possible in these too?
Yes, you can train on 50 mins for as many epochs as you want, but you shouldn't just train every single model to 100 epochs, you should use the tensorboard
Also rvc disconnected wont come back
https://discord.com/channels/1159260121998827560/1337292865407029288
Is it suitable for Applio No UI Collab?
Unfortunately, and thank you. How do I load pretrains? Like klm5
Yes or no?
if you use a main branch then yes
What was the way?
Unfortunately and thank you, how do I load pretrains? Like the klm5
in the clone cell code remove the branch name
you mean change?
you would need to use the main branch of applio
how do i combine two voice models, i remember i saw something about it here but I don't remember where or how
I should remove this then
👌

