#✨│ai-help
-rvc
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
Is there any local alternative for suno ai ?
don't think so yet
Hey everyone,
I have enabled the overtraining threshold in Applio so that training stops automatically if no improvement is detected.
In this case, do I still need to use TensorBoard to monitor the training, or is the threshold enough on its own to prevent overtraining?
Don't use the overtraining detector because it's inaccurate
tensorboard is much better
even more so if you have the avg graphs
Thanks for the advice! I’ll stick with TensorBoard then.
i don't think so, g/total isn't rising up
keep training until g/total starts rising up and never goes down
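A rough sketch of that advice in code: smooth the logged g/total values, find the lowest point, and check whether the curve has stayed above that minimum ever since. The function names, the `alpha` smoothing factor and the `patience` window are all hypothetical, and the smoothing is a plain exponential moving average, not necessarily what TensorBoard's own slider computes. Treat it as a heuristic for reading the chart, not as an overtraining detector.

```python
def ema(values, alpha=0.1):
    """Plain exponential moving average (alpha=1.0 means no smoothing)."""
    smoothed = []
    avg = None
    for v in values:
        avg = v if avg is None else alpha * v + (1 - alpha) * avg
        smoothed.append(avg)
    return smoothed


def lowest_point(values, alpha=0.1):
    """Return (index, smoothed value) of the minimum of the smoothed curve."""
    smoothed = ema(values, alpha)
    idx = min(range(len(smoothed)), key=smoothed.__getitem__)
    return idx, smoothed[idx]


def rising_since_minimum(values, alpha=0.1, patience=5):
    """True if the minimum is at least `patience` logged points in the past,
    i.e. the smoothed curve went up and has not come back down to it."""
    idx, _ = lowest_point(values, alpha)
    return len(values) - 1 - idx >= patience


# Made-up g/total values, one per logged point (hypothetical data).
curve = [40, 38, 36, 35, 34, 33, 34, 35, 36, 37, 38, 39]
print(lowest_point(curve, alpha=1.0))
print(rising_since_minimum(curve, alpha=1.0, patience=5))
```

The checkpoint nearest the reported index would be the first candidate to listen to; the values themselves would have to be exported from TensorBoard by whatever means you prefer.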
Help I can't seem to download okada the online way
Wrong channel
RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models
Wokada = uses RVC for realtime inference
Tell ur PC GPU in #🔍│help-w-okada first
I'll try to resume training, i just put it to 250 epochs
next time you can try putting 1000, so you can see better the lowest point of the tensorboard
and when the overtrain actually starts
(you know, g/total rising up forever)
Yeah… im using local training because i dont have money for colab pro, and i dont have time for not being afk kicked by colab
5 minutes per epoch
training for more may not bring any improvement and can actually make things worse
I think so, my dataset is ≈1h
you're right, it causes overtraining
what i want to suggest the user is train 1000 epochs, but stop when g/total actually overtrains
g/total may not even go up
but the model can still go to shit and lose all the knowledge from the pretrain
yea on long datasets overtraining is rare to see
5min audio, 1700e - lost all ability to sing
you mean with the og pretrain?
because of overtraining, right?
1700 epochs is crazy for a 5 minute dataset
50k and g/total didn't die
maybe the model will sound shit at some point
no, it was fine for speaking
oh nice then, maybe the dataset is just speaking
that's why it may be bad at singing
it could sing at 500e, but could not at 1700
the trained model retains pretrain features, overtraining in most common sense is training a model so much it forgets the previous training
so more epochs = less ability to sing?
more epoch = more of learning from the dataset, more of losing previous knowledge
so more overtraining makes the model forget about previous information/knowledge
and that's why models lose ability to sing, right?
pretty much. the ability to generate higher harmonics was pushed out by some other realignment
interesting about the realignment thing in the harmonics
wdym by realignment anyways?
I dont know specific parts of the model that change this way, there are many millions of parameters responsible for the waveform generation after all
inference uses speaker latents and noise to generate a predicted spectrogram
so with overtraining it fails to make one with higher harmonics
For W-Okada, go to #🔍│help-w-okada . This channel #✨│ai-help isn't the place to ask where to download a working W-Okada program.
Why is it changed in Weights to get songs on YouTube??? It is way easier and idk how to get songs in audio files 💔
More likely they had problems trying to get YouTube downloader to work again. But most of audio files that have been downloaded from YouTube before the removal are still there in their database.
Is it gonna come back?
I don't know, but better look out for more information in Weights' Discord server. Although you won't be able to submit any YouTube link there on Weights, you can still do this with an already AI converted track that used an audio from YouTube to convert.
These two tracks were made after the removal of the YouTube link feature on Weights, but you can still see the YouTube icon on both.
literally my ai voice rn
I'll create another dataset then
Hey everyone,
I think my model is overtraining because my loss/g/total is increasing sharply after 46k steps.
Should I stop training now, or is there something I can do to fix this?
Also, if someone could help me understand this better, I would really appreciate it because I'm a bit lost.
Here’s my TensorBoard screenshot for reference.
go to Scalars tab and show the whole chart
increase sharply is not the increase from 34 to 34.5
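To put a number on that: going from 34 to 34.5 is a relative change of under 2%. A tiny sketch (the function name is just illustrative):

```python
def relative_increase(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


print(round(relative_increase(34.0, 34.5), 2))  # 1.47
```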
Hey @Noobies,
Here’s the full chart from the Scalars tab. Does this confirm overtraining, or is there something else I should look at?
how hard is to click 'SCALARS' ???
That doesn't look like what the Scalars tab should be. The button is supposed to be orange, not gray.
SORRY 😂
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
how big is the dataset?
My dataset is about 34 minutes long.
batch size?
My batch size is 20
lower that to 8 and try training again
20 is way too high
Okay, I'll set the batch size to 8 and restart training. Thanks!
that fm is pretty normal imo, usually it goes down for several epochs then goes up infinitely
doesn't go that up for me unless im overtraining it
its like a slow rising
I'll try that, thanks for your help!
I just optimized my dataset and the time per epoch went from 4¿ minutes to 2 minutes
Why is there this error?
-gui
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
what?
do you think rvc could have loras?
ai image help is obviously in #🔍│help-w-okada
guid
Realtime voice changer for calls?
this is the wrong channel then
RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models
Wokada = uses RVC for realtime inference
@autumn viper tell your PC GPU in #🔍│help-w-okada
so, you just want to use it on pre-recorded audios right?
then let's use this channel, what's your pc gpu?
that's too weak. it could run locally on CPU if you've got enough RAM and a good enough CPU, but it would be extremely slow anyway, so it's not worth running locally
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not many hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI, no guide as of right now)
- Applio by Shirou (UI, no guide as of right now)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest free way, use https://weights.gg, which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.gg: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio (ui)
Via cloud, you will run it on a remote good pc
eval is the folder for logs
@warm glacier #🔍│help-w-okada message
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
thanks for the information
Yw and lmk
Hey, everyone, can I ask for help on an error here or is it on the making-models channel?
It happened when trying to reload a model in a new session, in colab applio
@jovial pollen if you need something, ask here
My apologies, I wanted to know if you had any test model I could test on the new realtime
For refinegan
Since the other ones don’t work due to being old
Of course if u don’t have anything im sorry for disturbing then
Lord thank you
hmmm just wanted to know this is a test model right ? Not made to sound good ?
overall it sound amazing!
on the low and the high the voice doesn't break anymore
it is a model made with a very small dataset, 5:30s
doesn't break, do you have a female one by chance?
(maybe too much to ask 🙏)
no
aight welp thanks a lot!
Is there any way to make the AI voice sound more crisp by doing some adjustments to your mic?
no
also voice changer questions belong in the voice changer help channel -> #🔍│help-w-okada
graphs aren't very accurate in showing overtraining, you just have to hear the epochs
it's possible for a model to start overtraining even when the g/total graph is still going down
the model forgets most of what it learned from the pretrain if you train it for too long
overtrained models are pretty obvious: the model sounds robotic/distorted and struggles to run inference on any audio. tho every model overtrains differently, some overtrained models are still able to do some stuff despite forgetting things
Does this mean we should try making lower epoch models so the model doesn't forget stuff the pretrain taught it
no idea, im actually interested in trying this
i did it with the jeff model
after e64 the model forgot how to sing
well, mainly dont run 1000+ epochs
Well, i’ll test it later
Hello, I want a voice similar to children chorus. Is there any?
Hey all, been outta the loop since the initial ai boom and need to brush up my knowledge.
Is the RVC client by w-okada still the best tool for the job, or have people moved to something else by now?
Thanks.
Ah, I see there's guides sections in here, I'll poke around a bit.
please go to #🔍│help-w-okada and read the pinned guide there
guys, is there any working rvc training colab?
It's prohibited to share models about kids.
First, did you try checking your PC GPU in case it's good enough?

Hello,
I set the batch size to 4 as recommended, but now the graphs seem to be stagnating. Here are the charts:
I’m not sure if this is normal or if it indicates a problem. Could someone tell me if this is okay or if I need to adjust something else?
Thanks for your help!
click this damn thing
Alright, thanks! I’ll try that right now.
am i geeked or is this graph absolutely fucked
Yeah, it doesn’t look good. Any advice on how to fix it?
what does your dataset look like? is it nice and clean, and how long is it?
Yes, it was processed with UVR to remove noise, and it’s an MP3 file.
some basic questions
- what's the learning rate
- what pretrain (cuz that's a thing now)
that is some INSANE levels of mode collapse
I will be doing this voice as Meggy Spletzer. How can I adjust it in the best way that is realistic and not robotic? Or can you send it to me?
Is there a tutorial somewhere instead?
you're not actually training anything
i mean.. 2 steps per epoch?
you f'd up preprocess/extract features
there's no way it's 0.1 steps per epoch, perhaps it means it reached 12k epochs
the default learning rate is 1e-4 and there's no reason to touch it for most cases in rvc
why?
I don't think so, it could possibly happen if the loss disc keeps approaching zero, for example
i remember the original Ilaria RVC had that as an option
also how come we can't change activation functions?
rvc has not aged well
Mode collapses aren’t really a thing, those large jumps are just rvc learning silence which is good
it's proven that changing it may make things worse or take more resources/vram
check noobies' comment on this #🔊│ai-development message
no access
activate the ai testing role here
interesting
How can I make Meggy's voice realistic?
go to #🔍│help-w-okada and read the pinned guide there for some troubleshooting
There's no way for tensorboard to show such X axis... unless you're training on 2 mute files and all your unsliced audios were tossed out
wrong channel, tell me ur browser in #🔍│help-w-okada
I did these as examples, but none of them turned out as good as I wanted.
How do I download RVC V2?
- what's ur pc gpu
- what do you want to do?
4060 ti and i just wanna troll with a female voice
troll in calls, meaning realtime voice changer for calls?
Yes
then this is the wrong program and channel
RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models
Wokada = uses RVC for realtime inference
@remote karma ur pc is good, I will ping u in #🔍│help-w-okada
Sounds good thanks
...?
is applio inference not working?
It works but the settings are not correct and the voice sounds like a robot.
does anyone know how to create your own RVC model?
Hey everyone,
I finally managed to get the training to work! However, I’m having trouble locating the checkpoint files in RVC. I want to use the best checkpoint I found, but I can’t seem to find where it’s saved.
Does anyone know the exact folder or path where RVC saves the checkpoint files? I checked a few places but no luck so far. Any help would be appreciated!
assets/weights
for mainline RVC
#✨│ai-help #🔍│help-ai-art Hey I was trying to use voice model to change my ai assistant voice but couldn't do it bcz of some error is there anything who can help?
I didn’t activate this.
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
heya, what does crepe hop do for RVC f0 method crepe?
Hi guys quick question, is this overtraining / converging? From the point I think it's the lowest it has started going kind of flat. Ty
https://ibb.co/RT4S1HLY
check other charts, fm and mel
Hi! Is there an RVC that works for conversion voice only?
help
it is called RVC (Retrieval-based-Voice-Conversion)
ilaria rvc
Can anyone help me on this
So im training a model that has
a dataset with about 10-15 mins
and its taking 40 steps per epoch
on a batch size of 8
is this a good sign or a bad sign?
using refinegan btw
using noobies base train 44k sample rate
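For context, the step count follows from how many training segments preprocessing produced and the batch size. A minimal sketch, assuming one step per batch and a partial last batch counted as a step (how a given RVC fork actually buckets or pads batches may differ; all function names here are hypothetical):

```python
import math


def steps_per_epoch(num_segments, batch_size):
    """One optimizer step per batch; a partial final batch counts as a step."""
    return math.ceil(num_segments / batch_size)


def max_segments(steps, batch_size):
    """Upper bound on the number of segments implied by a step count."""
    return steps * batch_size


def avg_segment_seconds(dataset_seconds, num_segments):
    """Average segment length if the whole dataset ended up in the segments."""
    return dataset_seconds / num_segments


segments = max_segments(40, 8)
print(segments)                             # 320
print(steps_per_epoch(segments, 8))         # 40
print(avg_segment_seconds(600, segments))   # 1.875  (10-minute dataset)
print(avg_segment_seconds(900, segments))   # 2.8125 (15-minute dataset)
```

Under those assumptions, 40 steps at batch size 8 means roughly 320 segments of a couple of seconds each, which is in a plausible range for 10 to 15 minutes of audio. It also matches the "2k steps for every 50 epochs" figure mentioned later (2000 / 50 = 40). This ignores silence trimming and any overlap between slices.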
sorry to ask, i'm not a helper but you're doing a refinegan model as we speak ?
Yeah its all good, im currently using refinegan right now
its training
do you plan on making it public or no if u don't mind telling me ? I wanted to see how well refinegan performs on realtime
im planning on posting it yes
its an artist
that im training on rn
english ?
yep all english dataset
i can ping you when results are ready
probably needs another hour or so
because of the steps its taking a while (to make progress)
well right now i am actually at 2k steps for every 50 epochs
let me see if i can get a sample right now
many thanks!
hmmm seems like it needs more training but im still gonna send you a sample (this is a rapper model / lil uzi vert) and you can hear some progress but i heard that it needs like way more epochs to get an actual really good result
aighty thanks!
yeah weird
its only 50 epochs rn
that explains haha
heard you need to reach 200-500 to hear any noticeable difference lmao
since this one is more advanced
lets see around that then
alright i will ping you when ready then 
bet
might be a stupid question, but how do I have applio generate the index file at a specific epoch once i've found the lowest g/loss value on the tensorboard, or do I have to restart training and stop at that epoch to generate the index?
Hello anyone have problems with w-okada like gpu goes to 100% usage and freeze totally the game you playing?
cheers
this is the wrong channel
RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models
Wokada = uses RVC for realtime inference
elaborate:
- your pc gpu
- what guide link did you follow
- a screenshot of your wokada
that's very likely ur dataset messing with the learning process. are you using studio recordings or are you using uvr isolated vocals
im using uvr isolated vocals
cleanest though
its a lil uzi vert model in the making rn
i can send you a sample
send a 3 sec sample here
of the dataset
yeah
minor double vocal (but not included in all dataset)
i heard it just cleans it out its very minimal
shit quality
whys that? its ai isolated
bad isolation
well there are so many issues with this
whys that ? as long as there clarity in the vocal
thats what should matter right?
you see
you have backing vocals in there
thats bad
causes vocal doubling and confuses the model lots
2nd theres instrumental bleed
also very bad practice for training
and just to mention i dont have izotope 11 so i cant clean it as well as using ai isolation
nor do i want to crack it bc of viruses and stuff
so what would be best in ai isolation
to clean it
a lot of frequencies between 3000 and 6000 were butchered, which may impair performance and training stability
refinegan is also very sensitive to dataset
izotope isnt whats really needed here, you can get away with other plugins just fine
gans arent really the main issue here, ideally you will need to effectively clean out your dataset first b4 anything
aaaa aight aight
what can be used then? and whats an alternative to izotope to keep my dataset clean?
you even have left over reverb residues which is bad ofc
i used mel roformer dereverb by anvuew since its more aggressive and better
heard its better than the old fox joy model
yeah but still u can have some of it left over so you will need to delete it urself
how can i do that tho?
with a noise gate
and by manually silencing that part
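For anyone wondering what a noise gate does to the audio, here is a bare-bones sketch: it cuts the signal into frames and silences every frame whose RMS level falls below a threshold. Real gates in Audacity or a DAW add attack/release smoothing so the cut is not abrupt. The function name, the -40 dB threshold and the 512-sample frame size are arbitrary illustration values, not recommended settings.

```python
import math


def noise_gate(samples, threshold_db=-40.0, frame_size=512):
    """Zero out frames whose RMS is below threshold_db.

    `samples` is a list of floats in [-1.0, 1.0]; the threshold is in dB
    relative to full scale. No attack/release smoothing is applied.
    """
    threshold = 10 ** (threshold_db / 20.0)  # dB -> linear amplitude
    gated = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms >= threshold:
            gated.extend(frame)                # loud enough: keep as-is
        else:
            gated.extend([0.0] * len(frame))   # quiet residue: silence it
    return gated


# Synthetic example: one loud frame followed by one frame of faint residue.
audio = [0.5] * 512 + [0.001] * 512
cleaned = noise_gate(audio)
print(cleaned[0], cleaned[-1])  # 0.5 0.0
```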
i do hear oddly some clipping tho is there any way to fix those artifacts
i get that sometimes
like that mild clipping
which audio are you talking abt
im hearing it in this audio you sent me / has some sort of clipping in it like bleed?
kind of like a distortion
to your sample
like very low
then i think thats the term , is it even possible to fix the distortion
and this type of distortion is usually due to the loudness of the vocals + the instrumental removal which damages the vocals
not really no
any way you can walk me through my cleaning process just quickly to see if there is anything wrong? i use mel roformer for vocals, then dereverb with either fox joy or the mel roformer dereverb, de-echo if needed, denoise, and then if there are any backing vocals i use melband karaoke. then i open audacity, apply a noise gate/truncate silences and then normalize
what you seen (the sample i sent) was what i used and did
Anvuew mel dereverb v2 is the best btw
should i get izotope btw
is that recommended
if i want super clean vocals
yes since its a great audio repair tool
it'll help with annoying clicks, left over reverb residues, noise removal and more
do you need the "funds" to get it?
i think bro needs the "funds"
hook my mans up
fuck yes the "funds" 🙏
sent you the "funds"
getting it
how do i input audio for voice training in weights.com?
is it broken? because there's nowhere to import voiceclips
you have to press next
oh
also, I wouldn't recommend training on weights, you can set up rvc in the cloud by following this tutorial if your computer can't handle it: https://docs.aihub.gg/rvc/cloud/applio-kaggle/
by the way do ensemble models like bs roformer + mel roformer combined
do better?
than the normal model itself?
but it has a better sdr and should perform better tho? dont you think
since its combined with the best models
sdr is heavily inaccurate
especially for creating models
mel band has a lower sdr than bs roformer but it isnt always better
always look at the spectrograms and compare
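As background, SDR (signal-to-distortion ratio) in its basic form is 10·log10 of the reference energy divided by the error energy, summed over the whole clip. A minimal sketch of that textbook definition follows; separation leaderboards may use variants of it, and the function name is just illustrative. Because everything is collapsed into one number, it says nothing about where in the spectrum the damage sits, which is one reason to compare spectrograms as well.

```python
import math


def sdr_db(reference, estimate):
    """Basic signal-to-distortion ratio in dB for two equal-length signals."""
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0:
        raise ValueError("reference is silent, SDR is undefined")
    if error_energy == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)


# Toy signals: the estimate is the reference scaled by 0.9.
reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]
print(round(sdr_db(reference, estimate), 3))  # 20.0
```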
i tried to install but i gave up halfway through because the tutorial is not very clear
or maybe i'm just autistic and couldn't understand it
how does this sound
ill unsend
its just a sample of what i have
it does have minor echo residue
@simple ore btw does Applio work with Python 3.12.3? asking because I'm updating the termux guide for ubuntu24.04 which doesn't support 3.10 anymore, I'm git cloning and running the run-install.sh
seems like it works until it gets to numpy 1.23.5, so I feel like there's a dependency problem
do you have to use a vocal only version of a song for an ai cover in weights.com?
you can always use virtual environment
pyenv
yeah that's right, but I was asking if you ever tested it yourself on python3.12 since you're one of the devs, just to check if it's an issue on my end lol
doesn't run-install already make a virtual environment btw?
you can use either, they automatically separate vocals and instrumentals, or if you want upload your vocal only file
oh i already found it out
not testing, no plans for that
alright, thanks
there are some differences, not sure what needs to change
gonna use pyenv
the one difference I can confirm is that numpy needs upgrading
after that the installation didn't work so I dunno if other packages would need other changes too
should work fine with numpy 1.26.4, <2.0
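A quick way to check whether an installed numpy falls in that range is sketched below. Reading the message as "at least 1.26.4 and below 2.0" is my assumption, and both helper names are hypothetical; nothing here is part of Applio's installer.

```python
def parse_version(text):
    """Turn '1.26.4' or '2.1.0rc1' into a (major, minor, patch) tuple of ints."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def numpy_in_suggested_range(version_text):
    """True for 1.26.4 <= version < 2.0 (assumed reading of the advice above)."""
    return (1, 26, 4) <= parse_version(version_text) < (2, 0, 0)


print(numpy_in_suggested_range("1.23.5"))  # False
print(numpy_in_suggested_range("1.26.4"))  # True
print(numpy_in_suggested_range("2.0.0"))   # False
```

In a real environment the string would come from `numpy.__version__`.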
when i try to use it on discord then test it i cant hear myself can someone help?
I created a song with my own voice a long time ago but I forgot it, it's been a long time, how can I do it now?
where can i train my model
Hello, i have trouble with the voice training, every time i try to use it there is an error : AttributeError: 'FigureCanvasAgg' object has no attribute 'tostring_rgb'
Can someone help me with it?
guys how to add model +tts
how to trainnn
use updated colab/local app
Wrong channel
RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models
Wokada = uses RVC for realtime inference
elaborate:
- ur PC GPU
- the guide link u are using
- a screenshot of ur discord and wokada settings
in #🔍│help-w-okada
What's ur PC GPU
ryzen 5 5500
Elaborate ur PC GPU and what guides link u are using
right
There are different Text To Speech (TTS) AIs:
GPT-SoVITS: RVC isn't as good as GPT-SoVITS for TTS, but GPT-SoVITS (few-shot TTS, meaning it needs just a little training for models) can't use RVC models (and vice versa), and it's limited to English, Chinese & Japanese. If you wanna check out GPT-SoVITS instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/
Freemium 11labs: An easy way to do TTS is https://elevenlabs.io/, you can't use RVC model on this but its a mostly premium easy way for good quality TTS
FishSpeech: FishSpeech is a 0 shot (no explicit training needed) TTS, if you got a good pc you can use it locally else use their site
With RVC Models:
RVC is natively for Speech To Speech, but forks such as ilaria rvc mainline & applio have built-in tts: they use Microsoft Edge TTS to generate a tts audio and then convert it with rvc. I suggest choosing a tts voice with the same gender and language as the rvc model you wanna use.
If you wanna do tts locally with RVC Voice Models (if you got a good pc):
If you don't got a good pc you can do tts with RVC Voice Models on cloud:
- Ilaria RVC Zero (running on an A100 GPU, the fastest free RVC on cloud) and the guide
- Use Applio UI Colab (with Google Colab's free daily-limited T4 GPU)
- if you don't wanna use edge tts, you could try another tts ai from our tts index and use the output as an input in rvc
rx 6600
What's your PC GPU
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Applio (AMD Windows) : A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
- Mainline (AMD Linux/Windows) : The original RVC
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
I would suggest you Applio with Zluda
thanks
Yw and lmk
with 6600 you'll be able to train 30-45m dataset model overnight
Damn I thought it would be faster
Python moment 
I forgot everything, I need to learn it again sometime. Is there a video you can recommend on the internet that would make it easier?
Nope all videos are extremely outdated
The only updated guides are the written ones, which I have sent you the link
if the video is 6+ month old, it is likely outdated
sad
https://docs.aihub.gg/essentials/how-to-make-voice-models/ you could also check this but AMD tutorials aren't included, however it's the same program
In the context of RVC, the dataset is an audio file containing the voice the model will replicate. It can be either speaking or singing.
I want to make a live2d model using reactjs and tts but it doesn't run locally. Specifically, when I text, the model will move and speak after I finish, I will deploy it using the web so the phone can use it, but the problem is that reactjs can't do tts or I don't know how to do it, can you show me?
My computer is good enough but I want to do it on the web only.
How does the TTS even work with Live2D?
https://www.youtube.com/watch?v=oCFm-rXI6HU&t=24s I want this but pure code with reactjs and can run on phone
I don't know what else to do
need to find someone who can help me
when i finish i will build it through expo into apk

Don't know about this one. I'm only here for basic RVC the audio changer and W-Okada the realtime audio changer steps.
bro u knw fire base?
or github
cvt to cdn
what if i put the full song on github will it exceed the limit?
I don’t use reactjs
Do you really need the tts model to run locally?
you could maybe try adding edge tts api, It works on python so should too on js
javascript like that
yeah I don’t use js
GitHub is a place to upload anything about code, not a music service like SoundCloud.
so you want to make an apk that has live2d with custom 3d vtuber, tts and rvc locally?
u use python?
YES
Of course.
that’s gonna run slow
call too much api bro
it won’t work for realtime
why bro
it would run on the phone CPU literally
Don't expect any AI to run that fast on mobile smartphone locally.
Yes it would work on cpu, but it’s going to take time
maybe try deploy on web and run?
But if you host the service on cloud, and make it as a hybrid-web application, that would work.
I used RVC Applio on Termux on my honor 90 lite, it took 69 secs to inference 8 seconds
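Put as a real-time factor (processing time divided by audio length), that measurement works out as follows. The function name is illustrative; a value at or below 1.0 would be needed just to keep up with live audio, before counting any extra latency.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """How many seconds of compute each second of audio costs."""
    return processing_seconds / audio_seconds


rtf = real_time_factor(69, 8)
print(rtf)         # 8.625
print(rtf <= 1.0)  # False
```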
make it into a web and display it on your phone=))

so, you mean making a website that phones can use? Meaning the AI would run on cloud
yes bro
if it’s going to run on cloud (remote good pc), rather than the phone’s power, then yeah it would work
how to use idk bro
But you'll have to pay the cloud service for that.
maybe try huggingface?
free
Yes, you can make a space on Hugging Face.
python, html, Jupyter, some assembly x86 and some C
huggingface gives just CPU for free, it wouldn’t be that good for realtime
maybe only reactjs?
But if you wanna have GPU for that, you'll still have to pay for that one.
if this is what you're developing, sorry no one but your own dev team could help you
you can also try asking chatgpt or claude to help fixing some code issues
wdym
No team bro
I mean how could you get anyone's help unless you open source the project?
show the code and will help?
too bad I've lost my interest on the dev shit, but I suppose you could use chatgpt/claude anyway
#✨│ai-help hello is anyone here who is good in setup and use of RVC i was trying to use but i cannot i don't know how can anyone please help me its urgent
Elaborate your issue
in rvc, when i input my sample voice model.pth and index.index it shows an error about a missing hubert file. i downloaded and added it, then it started showing another error like that
What's your PC GPU? What guide/RVC link are you using? Show a screenshot of the error too
Which RVC program are you trying to use? RVC GUI, Applio or anything else?
gpu is nvidia 1650, don't know about the guide. for the error screenshot, can uh text me personally?
Which rvc download link did you use?
don't knw i searched on chat gpt and it gave me a github link and i used that
Also you have permission to send SS, is there a reason why you want to do it in DMS?
No. Never trust anything about RVC from ChatGPT.
ChatGPT can't know much about RVC/Wokada
nahh i will send here i thought i don't have permission
i guess thats right
You should be able to send them since you have permission
okey lemme start rvc again and then send uh
You can screenshot the folder of it.
I can identify the RVC program by folder name.
I mean even the UI is recognizable


ok
Retrieval-based-Voice-Conversion-WebUI-main
Retrieval-based-Voice-Conversion-WebUI-main
-gui
Damn, I guessed it right. The RVC GUI is too old now.
That's not RVC GUI
RVC GUI is another fork made by t1g3r
This is Mainline RVC
fuck man i wasted my night in this shit
Show a screenshot of the error too, also is it a public model which you can send the model download link?
Not the same one as RVC GUI, but still sounds like it.
It's not RVC GUI, it's the original/mainline rvc, it isn't as up to date as Applio (a fork) but it can still be used
uh sent the model remember
Can you re-send it? And also show the SS of the error
tats it
Rmvpe model file is missing?
Are you running RVC from an installed Python and not compiled-portable one?
what do uh mean? first it was saying hubert_base.pt is missing, so i downloaded it from google and added it, then this rmvpe error showed up
That's so far messed up.
ohh belive me i know that very well
can uh guide me to download nd install from start i hope that should do the trick
Typical folder path for the already compiled RVC isn't supposed to be inside an installed Python folder, which is found in C:\Users\"your username"\ or Program Files, unless you're trying to develop a fork RVC by yourself.
I feel like you missed some steps of the manual installation
you can delete that mainline you got
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
the local links I sent are guides for the precompiled version, if you want to do it locally
You can't say you know everything on how to install a Python program when you struggle to get it to work by hands. 
dude can uh guide me step by step?
may b donno
i am not a veteran so i didn't get this at all
Never ask someone to teach you every little step.
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
too slow chat guys =))
Now make it to speak in English. 
I don't know if coqui still supports it guys.
I already gave you all the links with step by step guides, is there a specific issue?
yeah while instaliling man it suckes so much
what's the issue?
trash bro 
I told you to uninstall the one you got, and to choose one of the working links
trash meta
I also explained you the difference between them
are uh available i will install it later if i get any issue can i count on uh?
sure
Sounds like you didn't make the English-speaking variant of it. That's what I know.
Is there a limit to the number of chats in one session?
you can ping me for any issue
I can still see some Vietnamese texts in the code section.
for now can uh tell me which one should i use my goal is to achive a voice that can express emotion perfectly
limit ? =)))
A voice model that can do the most emotional expression? Don't think that's a thing.
any of the ones I said can be good quality, if you train it yourself good
also it has some limits like it can't laugh well
i don't have good voice samples so i have to go with pre trained for now
I'm not sure what you mean exactly with emotional expression though
then the quality depends on the model you use
wait lemme send uh a link my goal is to achive a voice like that
Well, the code you made can do much just that.
Models can laugh you just need enough laughing in the set and maybe it needs it's own only laughing slices
I think that would be possible, might need some voice acting too
use api 🐧
so ?
fair point
so you can choose any
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
local = runs on your pc
don't know how to remove its limit or it has some error ❓
cloud = remote good pc
as you got a gtx 1650, cloud will be faster, but you will have limited gpu time on cloud
cloud will be paid if i use free one i won't get a good one
you will get a good one, it's the same program you would run locally and paid
did't get that
you would either use google colab or kaggle, which are basically remote good pc borrowed by google that run the code
it still runs the same RVC program
the quality doesn't change if you use it on cloud for free or locally
what changes is that if you pay for cloud it will be even faster
how much time limit of this google cloud?
The quality of voice model depends on how you train it. The settings, audio dataset you use.
google colab gives around 4 hours max daily, and the exact amount can be random
kaggle gives 30 hours weekly of better gpus than colab
this is the free limit
4 hours average a day if running with GPU on Google Colab.
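To put those free limits side by side, here is a rough back-of-the-envelope sketch in Python. The 4 hours/day and 30 hours/week figures are the ones quoted above. The minutes-per-epoch value is only an assumed example, and it will differ with your dataset and with the GPU each service gives you.

```python
# Rough comparison of the free GPU budgets quoted in this chat.
COLAB_HOURS_PER_DAY = 4       # approximate and not guaranteed
KAGGLE_HOURS_PER_WEEK = 30


def epochs_in_budget(gpu_hours: float, minutes_per_epoch: float) -> int:
    """How many whole epochs fit into a GPU time budget."""
    return int(gpu_hours * 60 // minutes_per_epoch)


# Best case for Colab if you really got 4 hours every single day.
colab_week = COLAB_HOURS_PER_DAY * 7
print("Colab hours per week (best case):", colab_week)
print("Kaggle hours per week:", KAGGLE_HOURS_PER_WEEK)

# Assumed example: 2 minutes per epoch (hypothetical number).
print("Epochs in one Colab day:", epochs_in_budget(COLAB_HOURS_PER_DAY, 2))
print("Epochs in one Kaggle week:", epochs_in_budget(KAGGLE_HOURS_PER_WEEK, 2))
```

So even in the best case Colab adds up to slightly less weekly GPU time than Kaggle, and Colab sessions can also be cut short.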
if i m gonna use pre trained isnt running locally is a good choice
using hugface and no gpu how long will it take =)))
Even if I had no experience of development in the past, I still know how to install RVC and any other Python related program locally. 
Training will take weeks, inference couple minutes
real time how long?
Despite how weak my laptop is, RVC and Applio even worked, but took hours to finishing a single audio.
Idk depends on chunk and extra
it will work, just won't be as fast as cloud, but it won't be time limited
it's your choice
little speed issue can be compromised but limit is a issue
alright then, if you go with local, I suggest you Applio
so what should i choose Applio or Mainline?
inference = use models
realtime inference could take a while too to the point it's not realtime anymore
okey thanks do uh knw after installation how to integrate it with my ai assistant
I would suggest Applio as it gets more updates and has an easier user interface
@low shard
what AI assistant
ai ? like the one uh saw in video
so will pay for gpu brud so trash
why not free 
that seems like a mix of an LLM + TTS, RVC is STS, the only way to use RVC as a TTS would be to generate an audio with another TTS first then use it as an input in rvc
TTS = Text to Speech
STS = Speech to Speech
LLM = Large Language Model, like chatbots
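The "TTS first, then RVC" idea described above can be sketched as a two-step chain. This is only a minimal illustration: `speak_with_rvc`, `TtsStage` and `RvcStage` are hypothetical names made up for this example, not functions from RVC, Applio or any TTS library. You would plug in whatever TTS and whatever RVC inference call you actually use.

```python
from pathlib import Path
from typing import Callable

# Both stages are passed in as plain functions so the sketch stays tool-agnostic.
TtsStage = Callable[[str, Path], None]   # (text, output wav path)
RvcStage = Callable[[Path, Path], None]  # (input wav path, output wav path)


def speak_with_rvc(text: str, tts: TtsStage, rvc: RvcStage, workdir: Path) -> Path:
    """Run text through a TTS, then feed that audio into RVC as the input."""
    workdir.mkdir(parents=True, exist_ok=True)
    tts_wav = workdir / "tts_raw.wav"    # speech in the TTS's own generic voice
    final_wav = workdir / "rvc_out.wav"  # same speech converted to the target voice
    tts(text, tts_wav)       # step 1: text -> speech (TTS)
    rvc(tts_wav, final_wav)  # step 2: speech -> speech (RVC is STS)
    return final_wav
```

RVC only ever sees audio here, which is why the intonation and emotion of the result depend on what the TTS produced in step 1.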
yes that what i am hoping to achive

Ig the only way for you to do that would be using either SillyTavern or OpenWebUI with Ollama
they have also a Speech to text integration
but you wouldn't be able to use it with RVC models
https://docs.sillytavern.app/extensions/rvc/ actually I think you can with SillyTavern
This guide will walk you through using RVC, a technique that allows transferring voice features from one audio clip to another, enabling voices to...
I don't really use SillyTavern though
https://github.com/jaywalnut310/vits guys how to use this
GPUs are expensive, why would someone give 24/7 unlimited free gpus?
does it work on hugface with free account?
this
yeah why 🤣
that's a TTS, not a STS like RVC
it won't work with rvc models
tts = text to speech
sts = speech to speech
yeah i mean run that tts vits on hugface and let the live2d model speak ok =))
and to stt for another free cpu service?
CPUs are slow, they aren't that good for AI
a good stt is whisper-v3-large
i'm not sure how fast it would be on cpu
with just one model i think it will be ok
whisper-v3-large what is that
it's still just a CPU
read the whole thing
oh see
if made of coqui is it optimized to eat cpu?
I see him using this much capacity with coqui
can uh guide me how i can achive this??
dude can uh share link of this vid?
that is using a GPU
i need help
Sorry but I don't use SillyTavern myself so I can't help you on that
Please, elaborate your help request so helpers can help you
everytime i try to open start_http it wont let me open
for example, tell:
- your pc gpu
- the link of the guide you're following
- what's the issue
oh, you're following an outdated youtube tutorial about wokada for realtime voice changing
that file is only on that program
youtube tuts are old
don't follow them
https://huggingface.co/wok000/vcclient000/tree/main i download from this one
no free =)) trash bro
18a
12 months ago
oh its okey
that's an old version of the original wokada, it has worse performance and worse quality
optimized to eat cpu ❓
also this is the wrong channel
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
pls tell ur pc gpu in #🔍│help-w-okada
i found out myself lol
to eat cpu? you mean optimized to use cpu? I'm not sure how fast that will be, also coqui has shut down its site and their TTS models are discontinued btw
Can you make it to display on the web?
dude i found out vdo that i was asking uh to send link
so sad
yeah but i already foundd it
can you do it =))
what exactly showing 2d model on screen ? or a ai assistant that have 2d model ?
it is just a model
uh can do pleanty with just a mdel if uh how
and they used coqui llma to make it speak and convert text to speech
yeah
exactly that's what i wanted to do but it works on web brud
i think you know how to do
i also aming same thing i didn't quite reached to that state but i do have some knowledge

uh should use VSeeFace + VRM Model
uh can create 3d model with it
I have a live2d model, I don't like 3d very much, now I just don't know how to make tts and status on the web.
live2d and 3d models are different thing, not all vtubers would want to use the latter
yeah
use vtuber studiio
then how to make it speak and convert speech to text brud
search on youtube vtube studio tutorial uh will get many vdos its ezy
I mean I want to be like him bro =))
the lipsync and model rigging part is out of my knowledge, but I think you should try integrating the live2d models and stuffs in a unity project
unity is game design and i want to do web
fun fact: the vtube studio application is actually made using unity
🐧 it will probably take up a lot of ram
yea on a 2008 era laptop
Either way, it has to convert speech to text and vice versa.
do you think this is enough
I only have 1 64g bar left lol
dont even think it is more demanding than genshin impact
Is there an Applio Colab that's able to use the new hifigan pretrains?
u play?
u already got answered in #🧬│ai-chat
Hello, I have a question about using Kaggle for training. If I turn off my PC, will the training process stop, or does Kaggle continue running the notebook in the cloud? Is there any way to keep it running even if I close my computer?
too slow
can someone please help me ?
help?
yeah kinda
please elaborate your request
i dont know how to use these AIs to change my voice into my fav voice model
do you want to do it on realtime or pre-recorded audios
also what's your PC GPU
pre-recorded
damn, I'm guessing you don't have another GPU so your only way is via cloud (remote good pc with a daily gpu limit)
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC: choose a cloud way to use RVC,
  - Google Colabs (4 hours of daily gpu for free, not many hours, but easy to use):
  - Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
    - Mainline (UI)
    - Applio by Vidal (UI, no guide as of right now)
    - Applio by Shirou (UI, no guide as of right now)
  - Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest free way, use https://weights.gg, which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.gg: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio (ui)
for each one, I sent a hyperlink: blue text that redirects u to a link when clicked, in this case a step-by-step guide
you just have to read it
ook
if you just want the easiest way ever possible to do inference, try weights.gg
it's good too, just that you have to manually separate the vocals and instrumentals https://docs.aihub.gg/rvc/resources/dataset-isolation
Last update: Dec 24, 2024
noo
the reason why i need AI is
that
i want to change my pre recorded vocals into some famous rappers or singers
yeah then you can use either, they both will work
okk
hugface gives 100GB for free, so if I upload 100GB of photos and transfer to CDN, will it violate their policy?
hey guys do you have any recommended free site for vocal isolation?
How to use voices zip to convert into ai voice
what to do in Realtime Voice Changer when I speak, my voice is choppy?
Hi,
I’m having issues using custom pre-trained models on Applio with RVC. Default models work fine, but with custom models (like KLM5 RefineGAN), I get this error:
“The parameters of the pretrain model such as the sample rate or architecture do not match the selected model.”
did you use the main branch?
what's ur pc gpu? what do u want to do exactly?
wrong channel
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
elaborate the issue, your pc gpu, and the guide link you followed in #🔍│help-w-okada
Yes, I’m using the main branch and the latest version of Applio (3.2.8 bugfix), but I’m still facing the issue. Default models work perfectly, but with custom pre-trained models like KLM5 RefineGAN, I get the error:
“The parameters of the pretrain model such as the sample rate or architecture do not match the selected model.”
I’m using the main branch and the latest version of Applio (3.2.8 bugfix)
those are 2 different versions
are you doing it locally or on colab ui?
Yes, I’m using the main branch and running it locally on my PC.
Hello, I couldn't find the answer to this question searching through the chat history, so hopefully this isn't a repeat question with a really obvious answer. I'm trying to do a song cover of a female voice singing Fuck Her Gently by Tenacious D. Most the song goes just fine, but at the end when he rises into his falsetto the model becomes staticy and robotic. I'm using the base RVC program, and I've tried messing with the different settings with no improvement. Any advice or guidance is welcome.
how do i train from one of my save points
where do i get rvc
What is your PC GPU? Applio the RVC is one of the easiest RVC program you can install locally, and it also runs with GPU.
Why did you ask like this?

Most of websites for audio separation are paid to use. It would be better to go for Google Colab instead.
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
spek shows the khz in increments of 2
so i cant tell what im actually at
I want to do that to display images on the web
just checked in a diff analyzer mines around 35k when doubled, do i round to 40 or 32
it'd be better to not let the model learn the missing frequencies
so round to 32 kHz instead of 40
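The rounding rule above (double the cutoff, then round down so the model doesn't learn frequencies that aren't in the audio) can be sketched like this. The list of rates is just the three mentioned in this chat (32k, 40k, 48k), and `pick_training_rate` is a hypothetical helper name, not something from RVC or Applio.

```python
SUPPORTED_RATES = (32000, 40000, 48000)  # sample rates mentioned in this chat


def pick_training_rate(cutoff_hz: float, rates=SUPPORTED_RATES) -> int:
    """Pick the largest rate that does not exceed twice the audio's cutoff.

    A sample rate can only represent content up to half its value, so any
    rate above 2 * cutoff would make the model train on empty frequencies.
    """
    needed = 2 * cutoff_hz
    candidates = [r for r in rates if r <= needed]
    # If even the lowest rate is above 2 * cutoff, fall back to the lowest one.
    return max(candidates) if candidates else min(rates)


# Audio whose content stops around 17.5 kHz, i.e. about 35k when doubled.
print(pick_training_rate(17500))  # 32000
```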
Love u 🐧
#✨│ai-help will anyone help me in setup of Applio and then integrate it with ai assistant?
-kaggle
- Applio Notebook, by Vidal Kaggle
- Applio Notebook, by Shirou Kaggle
- Music Source Separation, by Shirou Kaggle
- UVR5 NO UI, by Eddy Kaggle
- Original W-Okada's Voice Changer, Kaggle
- Modified W-Okada's Voice Changer, Kaggle
- 🆕 UVR5 UI, by Eddy, ArisDev & Nick088 Kaggle
- 🆕 RVC AI Cover Maker UI, by Shirou & ArisDev Kaggle
- 📖 How to use RVC Mainline on Kaggle by Cauthess
Note: Kaggle limits GPU usage to 30 hours per week.
Guys I have skill issue, where can I get hugface api?
most speaking audio is around 15-16KHz
but I see an option for 42K where did that come from?
????
You know... just forget about it,
What happens if I increase the gpu batch size ?
i mean.. where did you find 42KHz?
yeah
Yes, that I forgot
as for batch size: the larger the batch size, the larger the stride it takes each step
could reach the goal faster, but can also miss it
I see... so what batch size do you recommend if I save every 10 E for a total of 350E?
depends on the dataset size
as in minute ?
yes
4
you're using some old software

will anyone help me in setup of Applio and then integrate it with ai assistant? @low shard
Why would you need an AI assistant for Applio?
#✨│ai-help message sorry but i dunno how else u could make that other than reading the sillytavern docs
I want to change my ai assistant voice I need a voice that is like a human and can synthesize emotions perfectly
You mean TTS or W-Okada the realtime voice changer?
But uh knw how to do with RVC right so why not use that
Nick told me that it have integrated tts so I can use it for real time
so, you want to know how to use rvc generally? bc i can't help for making your own LLM+STT+STS+TTS
RVC is STS natively, I told you the only way to use it for TTS is by using another TTS like edge tts then using that audio as an input in rvc, like what Applio, which is an RVC fork, does
bro
?
I have tts and stt, now how do I create a chat bot?
Well can uh chk and suggest me some tutorial of applio
you want to do it locally (runs on ur gtx 1650) or on cloud (remote good pc)?
yo
do you need any help?
i need help
where in this would be considered the overtraining point?
im thinking at that third notch before it indefinitely rises but im not sure if im correct or not
set the smoothing to the max
look at the avg graphs if you have those because they are more accurate
it seems that there's much variation in your dataset, and untrimmed silence?
lots of variation yes untrimmed silence not that i know of
show fm and mel charts too
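For anyone wondering what "set the smoothing to the max" does to the curve: as far as I know, the TensorBoard slider applies an exponential moving average, so the sketch below does something similar in spirit and then reports where the smoothed curve is lowest. The loss values are made up for illustration, and `ema_smooth` / `lowest_point` are hypothetical helper names. Treat the result as a candidate checkpoint to listen to, not as proof of where overtraining starts, since g/total may never rise even when the model gets worse.

```python
def ema_smooth(values, weight=0.99):
    """Debiased exponential moving average of a loss curve.

    weight must be in [0, 1): 0 means no smoothing, values near 1 mean heavy smoothing.
    """
    smoothed = []
    last = 0.0
    for i, v in enumerate(values, start=1):
        last = last * weight + (1 - weight) * v
        # Bias correction so early points are not dragged toward zero.
        smoothed.append(last / (1 - weight ** i))
    return smoothed


def lowest_point(values, weight=0.99):
    """Index of the minimum of the smoothed curve."""
    smoothed = ema_smooth(values, weight)
    return min(range(len(smoothed)), key=smoothed.__getitem__)


# Made-up g/total values, one per saved checkpoint: falls, then rises for good.
g_total = [30.0, 28.0, 26.0, 25.0, 26.0, 28.0, 31.0, 33.0]
print("lowest smoothed point at index", lowest_point(g_total, weight=0.5))
```

Note that heavy smoothing lags behind the raw values, so the smoothed minimum tends to show up a bit later than the raw one.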
lmfao i accidentally refreshed my kaggle page so i mightve just botched the training altogether but no big deal tbh
Check out rvc-chat. That was a tech demo I did for a voiced chatbot before gpt-4o was released.
ello. is a batch size of 8 for 15min of data a good place to start?
is there any tool that allows phonetic tts with rvc models?
i would use 4 but 8 works too in your case
how to i get Unwa's big mel roformer beta 4 on uvr?
no, but you can run rvc voice changing on top of tts output
oh you're right lol thanks
text doesn't carry voice features like pitch, intonation, non-verbal things, and rvc does copy all of those, not guess them
an LLM-based TTS model can generally do such things
a non-LLM based TTS model can use phonetics to produce the audio, although it will be relatively bland
unless the model uses both phonetics and an LLM/GPT engine
basically vocaloid/utau, though I think it may be somehow possible or still needs a base voicebank behind it
RVC the voice changer and TTS program are two different things. You can use any TTS program and let RVC to process the audio. Applio has this feature.
One question. Is index file mandatory for IA covers?
@crimson depot no it's not. Sometimes it can help, sometimes it can hurt.
Is the .pth enough?
sometimes
For diferent language
where bro I can't find
-overtrain
Moved to /faq command.
anyone know how to change hz in steel series engine?
While trying to change the voice with Harvest, RVC shuts down, and Python says 'press any key.' When I press a key, it closes without giving any error code.
check event viewer, most likely just ran out of memory
and python crashed
Hi, is there any software like RVC but it can clone a voice just with some seconds of a wav file?
looking for zero shot ?
seed-vc does that
though it's really not that good
I have only 3 minutes of the voice I want to clone 😦
I guess this is what I'm looking for
train on cloud if need be, but otherwise it will be locally
where i can find some tutorial on Applio?
that's the tutorial for applio
hyperlink is a blue text, that when clicked will redirect you to the link
oh oky
if you don't understand what I'm saying, either click the lil blue Applio text, or directly go to https://docs.ai-hub.wtf/rvc/local/applio/
Last update: Apr 01, 2024
nahh i understood
Got it working and the quality is not that bad, probably on par with RVC
Hey everyone,
I’m training a voice AI model with a 35-minute recording, but I heard that using too many Epochs can make the voice sound robotic.
Any advice on how many Epochs I should use to keep it sounding natural? Also, if anyone can explain how Epochs really work, I’d appreciate it!
Thanks!
I guess the audio device doesn't provide any other sample rate
aight goodluck
U should use the tensorboard and high quality audios
Is RVC GUI still viable?
Nope it's really old and outdated
Don't follow yt tuts
What's ur PC GPU and what do u want to do
Hey does rvc runs in the 5000 series ?
not right now, still waiting for windows wheels for pytorch and torchaudio
there's pytorch, but no torchaudio
Hey, when should I stop the training? It looks like it's starting to overfit. Thanks!
yo dudes getting a rtx 5080 card soon, i understand from previous messages here that its not working with the voice changer atm? is thats still the case?
still no updates
50-series gpus have been getting several issues, even some 5080s have also missing ROPs. imo 4080 super is still a better choice for similar spec
-hf
- UVR5 UI, by Eddy and Ilaria Huggingface Spaces
- Ilaria RVC Zero, by thestingerx Huggingface Spaces
- RVC⚡ZERO, by r3gm Huggingface Spaces
- Applio, by IA Hispano Huggingface Spaces
Share a link to the official github of rvc2 and maybe there is some modern tutorial how to install and train the model? I found a tutorial video on youtube but there is a link to github where the last update is 2023.
YouTube tuts are very old
Tell your PC GPU and what you want to do first
I've written a story set in the witcher world, and I'm currently making a youtube video where I voice characters with voices from the game. I have trained models for so-vits-svc, but they work badly, with speech defects. I want to train models for RVC2 in hopes that it will work better. I have a GTX 1080 video card.
Yeah, so-vits-svc was replaced by RVC about 2 years ago
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
  - Ilaria RVC Zero: fastest and simplest that you can get for free
  - Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
  - Applio Colab: max 4 hours daily of GPU, not guaranteed
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
SVC is too old now.
Thank you for your reply. I'm on my way to look into it
Yw and lmk
Hi guys! Do you know how can I create my own voice model for RVC-Gui? Do I have to take a 10 minutes recording and the renaming it with the .pth extension?
rvc gui is outdated
delete it
don’t use youtube tuts
and no model training isn’t just renaming the extension
what’s ur pc gpu
Thank you! My gpu is NVIDIA GeForce RTX 3060
With Applio/Mainline you can do both training and inference on pre-recorded audios, they are more updated RVCs
Thank you!
yw lmk
Hi, where should I put the .index file of a model (using RVC here)
ok I installed applio, dunno where to put the models
logs/modelname, click refresh on inference screen
Voice.ai sucks, don't use it, if you want realtime voice changer, tell ur PC GPU in #🔍│help-w-okada
Never go for the Voice.ai again.
hmm, which sampling rate is best to choose in applio?
Either 40000 or 48000, but 48000 gives better quality.
ok ty. i think i’ll go with 40k
that output looks like a mix of bunch of stuff
32k probably
but you can go with 48 as well
consistency is more desirable, not ideal to mix up from multiple sources that may have different cutoffs, but at least you can go 32k
and if it is from separate recordings there's likely volume / effect differences as well
different room / different reverb
Hi. Can I change the dataset when training a model? For example, I originally had a file 30 minutes long and I want to change it to 1 hour. How do I do that? Do I just change the file in the dataset folder or do I have to do something else?
hey im having a problem that i hadn't run into until today, it seems like my input gets filtered a lot when i convert it into ai, doesnt matter which voice im using. Anyone with some good settings to help me out?
How do you enable the Use RefineGAN options in Applio colab? + KLM 5?
Hi, can you please tell me where the audio file is saved after conversion? In the interface there is an option to download the generated file but it takes too much time, I would like to be able to work directly with the generated audio.
It would be even better if you could specify a place to save the converted file automatically.
Is anyone else having problems with the Applio ColabUI? It cannot find the gradio module. It was working an hour ago
Traceback (most recent call last):
  File "/content/program_ml/app.py", line 1, in <module>
    import gradio as gr
ModuleNotFoundError: No module named 'gradio'
Are you talking about Google Colab, or something else?
yes
What should I do to make it work? How do I "reinstall"?
i guess?
someone else may know for sure, I dont use colab
the error you got is a missing requirement
Oh I did reinstall. Looks like it's a Colab problem. Not sure
I assume when you disconnect from colab it deletes some installed stuff or something
is there any way to make ai covers with my phone
Help
The collab named "CoverGen_NO_UI_v2_en.ipynb" appears with this section: The installation takes about 3 minutes, if it takes much longer ping me at AI HUB
Pitch_Change:
12
What I do?
@cyan hare
uv is a replacement for pip install
Sorry deleted my message, but indeed replacing !uv run by !python works, although I see some other warnings now. But it seems to work
It works for now. So that's good enough for me ^^
The Colab named "CoverGen_NO_UI_v2_en.ipynb" shows up with this section: The installation takes about 3 minutes. If it takes much longer, message me at AI HUB
Pitch change:
12
What do I do?
Same error
I replaced uv run by python in the cell and it worked
Thanks a lot!
When starting to train a model I get an error at 5.3 seconds no matter what I run it on, and I can't seem to get an error code either. To be fair I'm stupid, idk if it has to do with pytorch, and python doesn't work 100% of the time on my PC
colab or local?
anyone getting this on mainline colab? it was working perfectly fine just this morning 😑
i assume local, using RVC WebUI
there should be output in the console window with the error
is colab an easier process, found one error msg AttributeError: 'FigureCanvasAgg' object has no attribute 'tostring_rgb'
old colab
is kaggle or applio no ui better?
kaggle would be better but if you need the latest branch use noui colab
applio isnt working?
if you mean colab, it seem there's an issue with requirements install
aww man
replacing uv with python may work in the install cells
Wdym by latest branch 
Thank you lol 🙏 (Mb for being stupid)
nono dw its fine xd
Hey guys have you used Whisper Tiny yet?
ModuleNotFoundError: No module named 'gradio'
what should I do? it's on Hina's Mod AICoverGen. it was fine last night 😌
From what I understand, the command "uv run" was supposed to call "python", but it does not anymore. So replace all the "uv run" instances by "python" and check if it works
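If a notebook has many cells, doing that replacement by hand gets tedious. Below is a minimal sketch of the same swap done on a cell's text. The example cell contents are hypothetical (I don't know the exact commands in that Colab), and `swap_uv_run` is a made-up helper name. It only touches lines that start with `!uv run` and leaves other `uv` commands alone.

```python
import re


def swap_uv_run(cell_source: str) -> str:
    """Replace a leading '!uv run' shell call with '!python' on each line."""
    return re.sub(r"^(\s*!)uv run\b", r"\1python", cell_source, flags=re.MULTILINE)


# Hypothetical cell contents, only for illustration.
cell = "!uv run app.py --share\n!uv pip install gradio"
print(swap_uv_run(cell))
```

This is a workaround reported to work in this chat, not an official fix, and you may still see other warnings afterwards.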
Is there a solution for Mainline Google colab?
What's the issue?
Can anyone know why in applio, when I start the continuation of model training, the epoch time increases several times? When I start training from the beginning the epoch lasts about a minute, but if I finish training and then continue, the epoch takes 5 minutes.
What error does Applio Colab present at this time?
how do i stop delayed voice
This is the wrong channel
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
Tell your PC GPU, the guide link you're using and a screenshot of your wokada in #🔍│help-w-okada
What kind of Colab notebook Applio error are you talking about?