#✨│ai-help
you want a screenshot of my settings, yea?
Yes
Chunk controls delay
Select 80 for example
easy
4060 ti
and my extra is like 13100 and my chunk is 2400
⠀
Google Colabs 
⠀
AICoverGen-WebUI
Useful for making quick covers, by Hina.
AICoverGen-NoWebUI
Useful for making covers, doesn't include a UI, by Ardha, fixed by Eddy, Hina and Gdr.
RVC Disconnected
To train new voice models, by Kit Lemonfoot.
EasyGUI
The OG interface, by Rejekts.
⠀
⠀
Yeah I can't get the voices to sound like the people they should. I downloaded Robocop and can't get the voice to sound right
If you're using W-Okada, maybe you could try with different settings and also reading the docs below.
-realtime
This interaction has expired, use the command /guides realtime if you wish to see it again.
Check Deiteris' one.
⠀
Okay so I'm confused right now.
The reason no one could answer my questions was that my questions didn't make sense?
That is very confusing.
It's not like I am speaking a different language
The Buffering rises above the threshold of 512 to 1000, and the res goes to 4k plus which makes my voice robotic and hard to hear.
Ayo? @wild vale level 2 !!! 
What part of that is hard?
Do I edit the chunk or something or am I missing something?
When I use a voice in discord chat its fine
But when I use it in game or a heavy game
The Chunk goes from
buf:512
res: 12-128
to
buf: 3x the normal
res: Goes to places higher than 2.7k
since there's no way to prioritize RVC over the game, yeah, that's gonna happen
get another GPU I guess
Chunk could be too low for your gpu, id need to know the ms number instead of the 2400 number
Else download v1
https://rentry.co/voicechangerguide
Github - Blanc-dot
Discord User - https://discord.com/users/824922747423031359
Special thanks to the following people : lusbert, poopmaster, felt, fazemasta, antasma, shadictl, x_hina, sushi
thanks are for anything added to guide, taken from any talks, settings added when previously collecting st...
I dont know who youre flaming here but youre in the wrong channel first of all but ok
You might be running into 100% GPU issues, so you have a few options to try:
- reduce your ingame quality and cap fps to just above your monitors refresh rate
- increase chunk and reduce extra for less gpu and cpu load from the voice changer
- if that didnt work, try out the fork. Has very little resources used and runs better: https://rentry.co/ForkVoiceChangerGuide
If all fails and it turns out youre playing a very high end game that goes to 99% gpu usage either way, then upgrade gpu or get multi pc setup
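For anyone trying to turn a chunk setting into the "ms number" asked about above, here is a rough sketch. It assumes one chunk unit equals 128 samples and that the voice changer runs at 48 kHz. Neither is stated in this chat, so treat both constants as assumptions and check them against what your client displays.

```python
# Assumption (not confirmed in this chat): one chunk unit = 128 samples,
# and the voice changer processes audio at 48 kHz.
SAMPLES_PER_CHUNK_UNIT = 128
SAMPLE_RATE_HZ = 48_000


def chunk_to_ms(chunk: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Convert a chunk setting to an approximate buffer length in milliseconds."""
    samples = chunk * SAMPLES_PER_CHUNK_UNIT
    return samples / sample_rate_hz * 1000.0


if __name__ == "__main__":
    # Print the estimate for a few chunk values mentioned in this channel.
    for chunk in (80, 96, 192):
        print(f"chunk {chunk}: about {chunk_to_ms(chunk):.1f} ms")
```

Under those assumptions a larger chunk means a longer buffer, which is why raising chunk trades extra delay for lower GPU load.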
Guide style is in the same as Blanc_dot's. Thanks Blanc_dot for corrections. Most technical information comes from deiteris.
Last update October 6th, 2024: Multi PC setup explanation added
Translations added for:
German: https://rentry.co/ForkVoiceChangerGuide_de
Turkish: https://rentry.co/ForkVo...
anyone know why the end of some words gets cut out when u say a sentence with rvc
it doesnt cut out, but the ai gets weird and almost ignores the last few letters if you know what i mean
unless you really put emphasis, which just kind of makes it sound unrealistic
If your threshold is too high it might not pick up if you get quieter at the end of the sentence below the threshold
Move the threshold/noise gate to the left if it's on the right
i have it on the lowest thing
like i said it doesnt "cut out"
it just ignores some last letters
most of the time
Extra is both the voice model quality and controls the length of a consistent tone, like if you hold a tone aaaaa you can hold it up to 2.7 before the voice breaks. And in this case, 2.7 is considered the max setting (most models struggle to go above this number, but some are capable of it)
In RVC's GUI, going above 2.7 does more harm than good from my testing
i see, thanks though it fixed basically everything
⠀
Local Forks 🖥️
⠀
Mainline RVC
Original project, suggested for advanced users,
by the RVC-Project team.
Applio
Simplified, suggested for all, by the Applio team.
RVC Studio
Simplified, suggested for all, by SayanoAI.
Mangio-RVC
Simplified, may not be supported anymore, by Mangio621.
AICoverGen
Simple yet great way to make covers, by SociallyIneptWeeb.
Replay
From the creators of weights.gg, excellent product for everyone.
⠀
I want to try Genshin rvc model by HuggingFace but the web i click on doesn't look like the old one 🥹
whats ur pc gpu?
You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
Also, are u looking for ai covers or realtime
Is "Intel(R) UHD Graphics" a GPU? I don't understand 🥹💔
Ayo? @next plinth level 1 !!! 
its the integrated graphics, which is bad, u can't do it locally (on ur pc) but can use cloud (remote good pc)
are you looking for ai covers or realtime for calls
Ai covers 🥹
use ilaria rvc zero, a zerogpu (A100 paid by Ilaria) huggingface (biggest ai platform) space (service they offer to try ai), its the fastest way
Ilaria RVC: CLICK HERE 🤗
Guide on how to use it: CLICK HERE 📝
Don't forget to thank Ilaria if you find it useful! 💖
What do I need to do to get the voices to sound right in voice changer? They always sound off? Am I supposed to be tweaking the voices based on the sound file or voice I'm using?
Probably some models you're using aren't meant to be used on W-Okada.
Ah ok. Are rvc models not universal?
Some voices can work for everything, some don't.
It also depends (i think) how the author made the model. (dataset cleanup and length)
It can also depend on your settings and your own voice.
I have an
Processor Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz 3.80 GHz
Installed RAM 32.0 GB
Device ID 60F634EE-B521-4FCF-A554-CDCB5FDC830E
Product ID 00330-80000-00000-AA684
System type 64-bit operating system, x64-based processor
Pen and touch No pen or touch input is available for this display
RTX 3060Ti
Whats the right channel then
Singing models are very bad at speech
While speech models are mid/bad at singing
and models not sounding like the original voice are undertrained or the dataset had timbre issues
And sometimes it's not such a good idea to mix singing and dialogue in the same dataset, right?
Because I firmly believe it's better to make 2 separate models of the same person/character depending on the purpose you want to give them.
It's better to make two separate models and not mix singing with dialogue

Facts.
It's the same thing I avoid doing.
But not many people know about this.
Ok, can you tell me what game youre running and what settings on wokada, send screenshot
where does ilaria rvc store it's models?
i took a secondary model and placed it in it's root folder where model.pth is but it's not detecting it
ilaria rvc zero?
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- AICoverGen-WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Modified W-Okada's Voice Changer, Google Colab
- 🆕 FaceFusion UI, by Nick088 Google Colab
- 🆕 FaceFusion NO UI, by Nick088 Google Colab
- 🆕 EasyGUI, by Rejekts Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
Ayo? @brittle wing level 6 !!! 
Which option should I choose for a pre-trained model if the dataset is at 44,100 Hz? Titan only supports 32k, 40k, and 48k
First check if it's truly 44.1 kHz, because oftentimes the waveform shows something different. You can check this with a program called "spek"
This is a debated topic: You can use 40k because your dataset has nothing in the 44.1k - 48k range, so a 48k model could get inaccurate. Some say use 48k because you won't hear the difference anyway and get more out of it
Imo it doesnt matter, i would probably still use 40k
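A small sketch of the "use 40k" option from above: read the sample rate from a WAV header, then pick the highest supported pretrain rate that does not exceed it. The function names are made up for illustration. The header only tells you the declared rate, so the spek check above is still worth doing.

```python
import wave

# Pretrain rates named in the question above.
SUPPORTED_RATES = (32000, 40000, 48000)


def read_sample_rate(wav_path: str) -> int:
    """Return the sample rate declared in a PCM WAV file's header."""
    with wave.open(wav_path, "rb") as wav_file:
        return wav_file.getframerate()


def pick_pretrain_rate(dataset_rate: int, supported=SUPPORTED_RATES) -> int:
    """Pick the highest supported rate not above the dataset's rate.

    Falls back to the lowest supported rate if the dataset is below all of them.
    """
    candidates = [rate for rate in supported if rate <= dataset_rate]
    return max(candidates) if candidates else min(supported)


if __name__ == "__main__":
    print(pick_pretrain_rate(44100))
```

If you prefer the "just use 48k" view from the same message, skip the helper and pick 48000 directly.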
Does anyone know how to remove the robotic sound at the end of words?
@jagged hawk im pinging u in the right channel, whats ur pc gpu?
4070 ti
alright good enough
the docs are temporary down so lemme send u the temp ones rq
Ok, thanks
Oh nice
I'm installing Applio rn
Is it intuitive?
Or do u recommend watching a tutorial?
The guide i sent is already a written guide
there is no updated video tutorial
Ohhh
Ayo? @jagged hawk level 2 !!! 
you can always just read the docs
I've experienced something similar. It might be because you still have an active search in the top section. Try clearing it and maybe that will fix it.
python trainset_preprocess_pipeline_print.py "/content/dataset/EVRAART" 40000 2 "/content/Mangio-RVC-Fork/logs/EVRAART" 1
python3: can't open file '/content/Mangio-RVC-Fork/trainset_preprocess_pipeline_print.py': [Errno 2] No such file or directory
python extract_f0_print.py "/content/Mangio-RVC-Fork/logs/EVRAART" 2 rmvpe 64
python3: can't open file '/content/Mangio-RVC-Fork/extract_f0_print.py': [Errno 2] No such file or directory
python extract_feature_print.py "device" 1 0 0 "/content/Mangio-RVC-Fork/logs/EVRAART" v2
python3: can't open file '/content/Mangio-RVC-Fork/extract_feature_print.py': [Errno 2] No such file or directory
i have the same error as this guy, except i did put .wav files, and i already tried installing the dependencies again
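The "[Errno 2] No such file or directory" lines above mean Python could not find the scripts at those paths, so it may be worth checking what is actually in the repo folder before reinstalling dependencies. A minimal sketch, with the script names copied from the log above and a hypothetical helper name:

```python
from pathlib import Path

# Script names copied from the error log above.
EXPECTED_SCRIPTS = (
    "trainset_preprocess_pipeline_print.py",
    "extract_f0_print.py",
    "extract_feature_print.py",
)


def missing_scripts(repo_dir: str) -> list[str]:
    """Return the expected script names that are not present in repo_dir."""
    root = Path(repo_dir)
    return [name for name in EXPECTED_SCRIPTS if not (root / name).is_file()]


if __name__ == "__main__":
    # Path taken from the log above; change it for your own setup.
    for name in missing_scripts("/content/Mangio-RVC-Fork"):
        print(f"missing: {name}")
```

If every script shows up as missing, the clone or download step probably failed, or the notebook is pointing at the wrong folder.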
How much crepe hop length in inference and training
i'm thinking about deleting the whole Mangio-RVC-Fork folder to install it again, is this a good idea?
mangio rvc fork is very outdated and most of the dependencies aren't compatible to each other anymore
for training use mainline (the original rvc) or applio (fork of mainline)
mainline has faster ui and in some cases, training is faster than applio
applio has slower ui but some claim they have better training speed there
both options give the same result in terms of model quality, etc
64 for both
k good thanks
Ayo? @sly sluice level 1 !!! 
Wasn't it 64x2=128?
128 is too inaccurate when i tried training with that value (the model ended up having more voice cracks)
Okie but I'm asking about inference rn...and yes you're correct
for inference try 64 or 32
I remember training a model with 128 hop length and it sounded bad
where do i find the saved models while training?
i cant find them
nevermind
i found them
lol
Bruh why am i getting this error now, applio was working just fine the other day
ive already tried reinstalling the newest compiled version and the previous precompiled versions and it still gave me that error
have u tried updating ur gpu drivers?
yup
.\env\python.exe -m pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --upgrade --index-url https://download.pytorch.org/whl/cu121
try reinstalling the torch dependencies again
still same error
what if u just delete python and everything related to it lols, thats what i do when shit stops working
ive been trying to use rvc webui for model training, but when i click on one-click training, the output information box has been stuck on 'processing data' for the past hour. any suggestions?
Hi did anyone manage to download Mangio on Mac?
mangio is outdated, its better to use applio or mainline, however, mac can only inference (use models) locally (on ur pc), i would suggest to use cloud (remote good pc)
Table Of Contents Introduction (with website link) Model Loader (Download & Upload) Inference (use RVC AI Voice Models) Ilaria TTS Settings (Inference) Vocal Separator (UVR) Troubleshooting “No gpu is available for you for 60s” Introduction (with website link) Ilaria RVC Zero, is an RVC (Re...
Last update: June 15, 2024
For rvc training cloud you can choose between:
- Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
- Applio (ui)
- Mainline (UI)
- RVCDISCONNECTED (no ui)
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI)
- Applio by Shirou (UI, no guide as of right now)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
Does it work on Mac?
Ayo? @twilit kernel level 1 !!! 
if u wanna do it locally, theres a mac installation: https://docs.applio.org/getting-started/installation
But i Would HIGHLY suggest to just use cloud
Documentation for a high-quality, open-source speech conversion ecosystem designed for simplicity and optimized performance
yes, its cloud, it doesnt run on ur pc, it runs on a remote good pc
And it makes good results?
yes, rvc is the best Speech To Speech program
its used by like 90% of ai covers
Thanks I'll try
yw
does the cloud version work as well as locally?
the performance of the local one depends on ur pc, and cloud will run better than your mac
in terms of quality: yes, its just the same program
even if I have m1 pro?
yea, an A100 (The ZeroGPU used by Ilaria RVC Zero) is way better than that
thanks
for training id suggest Kaggle as it gives the most gpu time so u wont lose ur work
For inference i would suggest ilaria rvc zero
Is there a way I can do this on mobile
Inference (use models) or train (make models)
i don’t get it sorry
are you trying to use models for pre-recorded audios like ai covers, or make models?
or are you trying to use models in realtime for voice changing in calls?
use models
for pre-recorded audios right?
yeah, i’m trying to use a donald duck voice for some audios i found but the files won’t work, unless it’s not for mobile to do that
Ayo? @feral plank level 1 !!! 
idk what ur using, but this is RVC Technology
You could technically do it locally on ur phone but its on CPU so slow and not suggested
Its way better u use cloud (remote good pc), use ilaria rvc zero
Table Of Contents Introduction (with website link) Model Loader (Download & Upload) Inference (use RVC AI Voice Models) Ilaria TTS Settings (Inference) Vocal Separator (UVR) Troubleshooting “No gpu is available for you for 60s” Introduction (with website link) Ilaria RVC Zero, is an RVC (Re...
how many epochs should i train my AI model for?
Ayo? @radiant loom level 1 !!! 
I have 9:44 min of audio
Hey, i downloaded my voice model and i cant find files, pls dm me
there isn't a right amount of epochs, see https://docs.ai-hub.wtf/rvc/resources/epochs--tensorboard/
Last update: Feb 10, 2024
Can you dm me?
its better to ask here than dms
I cant send pic here
Ayo? @keen pollen level 1 !!! 
how did u train it? u sure u downloaded the .pth and .index?
I downloaded folder
Oh, sorry but i cant help much about local
tysn
I don't do things locally, i use cloud
I would suggest using a more updated version like mainline or applio tho
maybe this could help https://docs.ai-hub.wtf/rvc/local/mangio/#15-gather-models-files
Last update: Mar 8, 2024
are you doing it locally or using google colab?
did you download mangio rvc on ur pc?
first, whats ur pc gpu?
you ran tensorboard without any logs present?
Ive got these 2 models, both v2 and 40k, but it keeps saying they are different versions
Ayo? @sullen jungle level 2 !!! 
How do I make my ai vocals sound more realistic?Like what kind of lowpass/highpass filter or settings do I use
https://vm.tiktok.com/ZGdJph7aE/
Like this video
could anyone help im not hearing any outputs for the voice changer (rvc google colab) but I am for just regulaur in discord
how do i use a voice model 😦
What system are you using
Ayo? @brittle wing level 1 !!! 
how to make these settings:
Epoch: 620
Steps: 9000+
Pretrain: Snowie V3
in voice changer there is only: Gain, pitch, index, chunk and extra
when your using applio and you finish training a model. how do i export the pth and the index file to my downloads folder or google drive
-local
- 🍏 Applio, by IA Hispano GitHub
- Mangio-RVC-Fork, by Mangio621 Huggingface
- RVC Studio, by SayanoAI Huggingface
- AICoverGen, by SociallyIneptWeeb GitHub
- Replay, by Replay Team Website
- Original RVC, by the RVC-Project team GitHub
- GPT-SoVITS, by RVC-Boss GitHub
Credits to Faze Masta and Antasma for compiling these links.
guys, can i ask why i tried to train but it only process 1 step/epoch ?
you provided unsupported audio, you did not split audio, extract features did nothing, you're training on two mute files
but i splitted it
go to logs/yourmodel and see what files are there
xd i stopped that one
btw, about pretrained
should i use these with high steps or those with low point

use default pretrain
i mean, default pretrain don't support my language so i tried to make my own pretrain xd
you can not make a good pretrain from scratch
so is there anyway to make 1
at least not without using some magical way the original one was made
I doubt the original pretrain had russian language, yet a model trained with it for just 20 steps does fine
i tried to make one with vietnamese
it may take some extra source data to shape it up, but it is better to use an existing pretrain as a base
instead of trying to do it all from the scratch with 100+ hours of audio
ah so, base pretrain + train one with my desired language, then use these D and G to train another voice i want ?
i mean... you can do that too
but I mean use pretrain with 30-60 min of audio in your desired language and voice
if it ends up not good enough, use D/G from it + more audio
you can always build on top of an existing model
it simply adjusts weights
even just doing 5-10 epochs on top of default pretrain you should hear your trained voice, maybe not perfectly speaking certain syllables, but close enough and training longer should fix that
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
Hey! I’m sorry I’m new to this AI stuff, where do I start to start learning to create a voice model? I apologize if this is an inconvenience to some, I’m very new and I’m just really want to learn! Very appreciate any help would be awesome! Thanks anyone that responses!
whats ur pc gpu?
I’m in bed right now so unsure but how does a GPU effect training?
You could run the ai locally (on ur pc), meaning it runs on ur pc
like the same way u need a good gpu for games, ai takes alot of computing
especially training
You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
if you have a push-cart you cant move a mountain, that requires a quarry dump truck
AI training requires a tremendous amount of number crunching with specialized hardware, you can't do it on a cheap laptop with an integrated GPU
woosh
Ayo? @simple ore level 15 !!! 
i need help
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
I’m having trouble finding the right file to download it my cpu is amd ryzen 5 3600 6-core processor
to download what?
are you looking for realtime voice changing for calls, use models on pre recorded audios or make models
and also, i need the gpu
Rtx 2060
Ayo? @runic schooner level 1 !!! 
alr, so what are u looking for?
is it just me or are the aihub docs down?
yes, check #📰│dev-updates message
Bro does NOT wanna answer questions 😭
3 times in a row lmao
Nah nah just seeing u ask him for what he needs 3 times with no answer is funny
😭
HE'S A CATFISHER I TELL YOU
what
REAL
BAN HIM FOR CATFISHING 
Hey nick! / anyone that reads this for this purpose i have a NVIDIA Geforce RTX 3050 Ti
laptop 4 GB variant? nope, not recommended for training, only inference
32 GB Installed Ram
Thanks for responding fast
I am unsure what you are asking, my apologies, I'm really new to stuff like this. I don't want you to waste your time though! I totally understand if you're busy or if I'm too new to understand a lot!
I'm also busy playing around with SD, but not an excuse to not respond to you
Ah thats okay!
Let me know what you need from me and ill figure it out and give it to you!
the memory of the gpu
its suggested to have 8 or more gb of memory of gpu aka vram for training
You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
you should also be able to find the gpu memory
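If Task Manager is awkward to read, the same numbers can be pulled from `nvidia-smi`, which ships with the NVIDIA driver. This sketch is NVIDIA only and returns an empty list when the tool is missing. The 8000 MiB cutoff is a rounded stand-in for the "8 or more gb" suggestion above, since drivers can report slightly less than the nominal size.

```python
import shutil
import subprocess

# Rounded stand-in for the "8 or more gb" suggestion above (an assumption).
MIN_TRAINING_VRAM_MIB = 8000


def parse_gpu_line(line: str) -> tuple[str, int]:
    """Parse one 'name, 12288 MiB' line from nvidia-smi's CSV output."""
    name, memory = line.rsplit(",", 1)
    return name.strip(), int(memory.strip().split()[0])


def list_nvidia_gpus() -> list[tuple[str, int]]:
    """Return (name, VRAM in MiB) for each NVIDIA GPU, or [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]


def enough_for_training(vram_mib: int) -> bool:
    """Apply the rough VRAM rule of thumb from the chat."""
    return vram_mib >= MIN_TRAINING_VRAM_MIB


if __name__ == "__main__":
    for name, vram in list_nvidia_gpus():
        verdict = "ok for training" if enough_for_training(vram) else "inference only"
        print(f"{name}: {vram} MiB ({verdict})")
```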
Also is it a desktop or laptop ?
12 GB
u prob mean desktop 3060
12gb vram should be good
Technically a Desktop but i shoved some parts from a few laptops into it
alr, rtx 3060 12gb is good
for both training and inference
Downloading now
JSONDecodeError Traceback (most recent call last)
<ipython-input-6-75abb3770c40> in <cell line: 31>()
31 if os.path.exists(config_path):
32 # File exists, proceed with creation of creds and client
---> 33 creds = Credentials.from_service_account_file(config_path, scopes=scope)
34 client = gspread.authorize(creds)
35 else:
5 frames
/usr/lib/python3.10/json/decoder.py in raw_decode(self, s, idx)
353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
uh help
Don't use outdated colabs
What's ur PC GPU?
Yt tuts are outdated
You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
Intel(R) UHD Graphics 630 and AMD Radeon RX 6600 XT
problem 2: python3: can't open file '/content/infer-web.py': [Errno 2] No such file or directory
you are using an outdated colab
You could be able to use RVC Locally (on ur pc) via: https://docs.applio.org/getting-started/installation#amd-gpu-support-windows with the AMD GPU
Documentation for a high-quality, open-source speech conversion ecosystem designed for simplicity and optimized performance
Google Colab is a Cloud Computing service (remote good PC), so used for weak PC
Your pc should be able to handle it
Btw, you are looking for using models for pre-recorded audios, or making models, or using models in realtime for voice changing in calls/games?
@tepid atlas bc the one i sent is applio, an rvc fork (modified version) for making and using models for pre-recorded audios
for realtime voice changing for calls theres another program
okay
both pre-recorded audios and making models
yea then stick with Applio locally
okay
Ayo? @tepid atlas level 1 !!! 

What happens if you run inference with a different pitch extraction method than the model was trained on?
Error
Ilaria rvc doesn't work anymore ?
Same
but other zerogpu spaces worked on you?
I have a problem that showed up when i started the Train Model Button
~~
Any ideas or anything would be highly appreciated! Thanks to everyone that's helped me so far! I am just sorta bad at this stuff lol
Put the RVC1006 folder somewhere outside of OneDrive
Buuut tbh I dont see any error messages unless I'm blind
Epoch 1 started training, how long did you wait before sending this txt file?
About a half a hour ish
I let it do its thing and made lunch came back and it came to that
Nothing was moving so I was worried
Would wait for someone else to comment on it then, but would still move it out of OneDrive just in case
Hey guys i was adding a new voice model but have a CKPT file what do i do with this?
Ayo? @brittle wing level 3 !!! 
I got the path file and CKPT but no index, what do i do? 🤔
Table Of Contents Introduction (with website link) Model Loader (Download & Upload) Inference (use RVC AI Voice Models) Ilaria TTS Settings (Inference) Vocal Separator (UVR) Troubleshooting “No gpu is available for you for 60s” Introduction (with website link) Ilaria RVC Zero, is an RVC (Re...
its a limitation of the inference time that it takes for converting in ZeroGPU huggingface spaces
i explained it above better in the guide
@proven hill can't u put the limit back to 300s (5 min) instead of 1 min on Ilaria RVC Zero?
Hey Nick any ideas for my issue?
I just want another opinion on it as it could be just the OneDrive thing but just checking as they are unsure
I don't do local sorry
In some parts of the songs, there are backing vocals; should they be separated?
You can listen to the sample sound below; it is only the vocal.
This example is from a single song.
Do you think I should separate them? Some songs sound different to me.
Yea
Ayo? @low shard level 108 !!! 
With HP KARAOKE 6 of UVR
First, I processed it through this model, then through the other model, and finally used this model to delete the reverb.
Do you think I did it correctly? Will it be of good quality? Also, some parts sound different to me; for example, the audio.
It sounds like the same song but with a different sound. Will it cause problems in training?
I’m curious about this.
what do you do then?
Ayo? @molten relic level 4 !!! 
I have successfully inferred ~3 mins audio, though the GPU task aborted issue also sometimes occured, but I think it should be still less demanding than even Flux1.Dev generation, etc
~~
Slightly different than last time but still no generation, not in OneDrive this time. Anyone have similar problems or know how to fix it?
says not in a onedrive, yet "C:\Users\storm\OneDrive"
but nothing in there is an 'error'
Huhhhhh? I switched the data set onto a external hard drive
logs and everything else is still on onedrive?
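A quick way to double check whether a folder still sits under OneDrive, sketched with a hypothetical helper. The example path combines the `C:\Users\storm\OneDrive` prefix and the RVC1006 folder name mentioned above, so it is illustrative rather than the actual install location.

```python
from pathlib import PureWindowsPath


def is_inside_onedrive(path: str) -> bool:
    """Return True if any folder in the Windows path looks like a OneDrive folder."""
    parts = PureWindowsPath(path).parts
    return any(part.lower().startswith("onedrive") for part in parts)


if __name__ == "__main__":
    # Illustrative paths, not the real ones from the log file.
    for candidate in (r"C:\Users\storm\OneDrive\RVC1006", r"D:\RVC1006"):
        print(candidate, "->", is_inside_onedrive(candidate))
```

Run it on the RVC folder, the logs folder and the dataset folder, since moving only the dataset does not move the rest.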
uninstall onedrive
Son of a nugget I just got this and I'm in bed, I'll resume tomorrow.
Would anything bad happen if I do so?
what
if you don't need that bloatware, why not
I don’t know I thought was it something I needed
The limitation isn't about your audio file length, it's about how much time it takes to inference that audio
If it takes more than 1 min to inference that audio, it gives task GPU aborted
i cant listen to it rn
I do Cloud, i got a bad pc so i use a remote good pc
wokada is for speech to speech, if u want realtime text to speech, u can look at https://docs.google.com/document/d/12hCYJqNCFl6jWKoVvCxtwt2V6nSoilgi5La8dkZa1KY/edit#heading=h.xweoq2pdv4uj or use the tts client https://github.com/w-okada/ttsclient (but cant really help for the second one)
Table Of Contents Introduction Index of the best TTS 1. ElevenLabs/11Labs: 2. Bark TTS: 3. Edge TTS: 4. StyleTTS2: 6. XTTS2: 8. MetaVoice: 9. MeloTTS: 10. GPT-SoVITS: 11. gTTS: Use TTS in Realtime on calls (ONLY PC) Introduction TTS Means Text To Speech! Inference means when you use the TTS. ...
What’s cloud?
remote good pc, basically i run the program in a cloud computing service like google colab, kaggle or lightning.ai, instead of running them on my bad pc
Interesting
its way better to use it locally on a good pc but i got integrated graphics 
.
.
https://docs.aihub.wtf/ doesn't work for me. Trying to find RVC Disconnected guide so I can re-learn stuff again.
does anyone have link?
thank you so much! I appreciate it
could anyone help me i got screenshots
i need help with rvc can anyone help me
@low shard i finally created a voice model, how can i upload it in the channel voice-models?
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
Last update: May 20, 2024
Whatchu need?
It’s giving me an error whenever I click train
Idk why
When I got a 3060Ti Nvidia gpu
😭
Might help if you send the error
Hi
When I log in with new docs, the main menu opens but when I click on the app or any other button, it doesn't work. Why?
@timid olive
its not fixed yet
Are you not talking about the docs guide? the normal applio stuff should still work
Not
Can you send me the collab link?
For AI cover
Thanks
@pastel oak @pastel oak
Look, now I opened "applio" and I couldn't download the voice I wanted. I paste the voice I want into the download box, it says it downloaded in 1 second but it doesn't download, why? Voice model: https://applio.org/models?id=1218683186431660072
Ayo? @remote trellis level 2 !!! 
I dont know applio colab
This
Or
try ilaria rvc zero
Ilaria RVC: CLICK HERE 🤗
Guide on how to use it: CLICK HERE 📝
Don't forget to thank Ilaria if you find it useful! 💖
download model from applio and manual upload it
So where can I find the hugging face of this applio voice model?
didnt worl
Worl
Work
Did you even read what i said
Download the model from applio to YOUR COMPUTER
Open Ilaria RVC Zero
Go to the "Model Loader" tab
Upload the .pth and .index file
Dont know
⠀
what is the difference between FCPE and rmvpe?
Fcpe is faster, rmvpe is better quality
thanks
First of all. What's ur PC GPU?
its fixed now btw
As the main docs are down, the hyperlinks don’t work, so i changed the hyperlinks from the main docs to the temporary docs for now (so from docs.aihub.wtf to docs.ai-hub.wtf)
there were 256 broken hyperlinks
just fixed it with search and replace all yk lol
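For reference, the same search and replace can be sketched in a few lines. The two hostnames come from the message above; the function name and sample text are made up for illustration.

```python
# Hostnames taken from the message above.
OLD_HOST = "docs.aihub.wtf"
NEW_HOST = "docs.ai-hub.wtf"


def repoint_links(text: str) -> tuple[str, int]:
    """Swap the old docs host for the temporary one and count the replacements."""
    count = text.count(OLD_HOST)
    return text.replace(OLD_HOST, NEW_HOST), count


if __name__ == "__main__":
    sample = "See https://docs.aihub.wtf/rvc/ for the guide."
    fixed, replaced = repoint_links(sample)
    print(fixed)
    print(f"{replaced} link(s) updated")
```

Links that already point at the temporary host are left alone, because the old hostname does not occur inside the new one.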
I can't get applio to install on mac. It keeps saying that a java runtime wasn't found. Anyone know how to fix this?
Ayo? @stark wadi level 4 !!! 
⠀
Settings for Nvidia GPUs 
F0 Det.: rmvpe (suggested for all series)
RTX 40-series: 80-96 chunk | +16384 extra
RTX 30-series: 96-112 chunk | +16384 extra
RTX 20-series: 112-128 chunk | +16384 extra
GTX 16-series: 128-192 chunk | +8192 extra
GTX 10-series: 128-192 chunk | +8192 extra
Advanced Settings
Protocol : Sio or Rest
Crossfade: 4096 start 0.2 end 0.8
Trancate: 300
Silencefront: Off
Protect: 0.5
RVC Quality: Low
⠀
⠀
Settings for AMD GPUs 
Don't forget that your models need to be converted to ONNX!
F0 Det.: rmvpe_onnx (suggested for all series)
7xxx XT cards: 112-128 chunk | +16384 extra
6xxx XT cards: 128-192 chunk | +16384 extra
5xxx XT cards: 192-256 chunk | +8192 extra
RX 580: 192-256 chunk | +8192 extra
RX 570: 192-256 chunk | +8192 extra
RX 560: 256-384 chunk | +8192 extra
Advanced Settings
Protocol : Sio or Rest
Crossfade: 4096 start 0.2 end 0.8
Trancate: 300
Silencefront: Off
Protect: 0.5
RVC Quality: Low
⠀
what is the best pretrain for a model, please answer fast
Original
ok thank you
How do I transfer to ONNX?
why can't I mount drive on the colab?
What Google colab are you using and what's the issue specifically
i'm using disconnected, the error is "credential propagation was unsuccessful"
Can you send the Google colab link?
You need to be sure also to always allow the Google drive when you get the popup
I do allow
Yeah that seems the right colab
Try re running it and give it access again
i've done that multiple times already
do you guys know how long would it take to train an rvc model locally
compared to training on colab
Depends on gpu
rtx 2080ti
Thats probably faster than the colab gpu
colab's got t4 tho
lemme check it myself rq
Tesla T4 is worse than 2080ti
just tried it myself, i ran the cell, it asks me permission, then i choose google account and allow everything
works fine no issue
be sure to not modify its permissions
btw what should be the dataset zip structure?
something like
| name of the dataset.zip
| | speaker0
| | | audio files
?
or just
| name of the dataset.zip
| | audio files
or should I not even zip it for local training? 😵💫
Dont zip, put in a random folder and select the folder path in rvc
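Since half the advice in here depends on how many minutes of audio you have, here is a rough sketch for checking that before training. It assumes the dataset is a plain folder of uncompressed .wav files (no zip, no subfolders), and the function name `dataset_minutes` is made up for this example, it is not part of RVC.

```python
import wave
from pathlib import Path


def dataset_minutes(folder):
    """Sum the duration of every .wav file directly inside `folder`, in minutes."""
    total_seconds = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # frames / sample rate = clip length in seconds
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 60.0


# Usage (hypothetical path): print(dataset_minutes("C:/datasets/miku"))
```

Python's `wave` module only reads PCM .wav files, so mp3 or flac clips would need converting first.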
how many epochs should you train a model with a 3 minute dataset?
Everything is measured by tensorboard graphs
but is there a general amount?
No, every run, every dataset is unique
I found this though? https://www.desmos.com/calculator/yeqx4dmcfm?lang=pl
honestly I'm confused now, the calculator says that 20 minute dataset is about 100 epochs
but that's very low
what does the loss stuff mean?
and why is each epoch taking so long to train?
is 1 epoch per minute the normal speed? I haven't used rvc in a while
and I don't think I've ever trained a model locally
Like Shad said, u have to rely on Tensorboard to determine when to stop and which epoch to pick
where is tensorboard
Found this here some time ago
https://drive.google.com/drive/folders/1o1ZZuUHQ6MuclA6B-AtQZlHlC5Uf34pb
I don't understand anything in that tutorial
I'm just gonna test all 7 models (700 epochs, saving every 100) and see which one sound the best
what's more concerning to me is how long it takes to train every epoch
what is the normal speed?
It all depends on ur GPU and batch size, dataset size, etc. My 4060 on 4 batch size, 30min dataset is taking abt 2:00 per epoch
Idk about local but check https://docs.ai-hub.wtf/rvc/resources/epochs-tensorboard/
Last update: Feb 10, 2024
does anyone know how to install on mac m3 pro?
Go to the docs guide Nick linked, download the file, put it in the same rvc folder where you launch it, run it, thats all
It opens another webui
saving every 100 epochs is silly
go every 10
I don't have that much free space on my drive
unless you got hours and hours of sample audio files, using 700 epochs is crazy
then why are people doing 700 epochs for 5 minute datasets 😵💫
they are stupid
or they follow a stupid guide
running 700 epochs on a 5 minute file is trying to squeeze a gallon of juice from one lemon
you can get all you can from 5 minute file in 20-50 epochs
yeah I got it now
You can delete the first 50 epochs safely while saving every 10 epochs, then delete as you go if you hit another OT point etc
another OT point?
shouldn't I stop the training when it starts to OT?
also it's saving every 50 epochs anyway lol
I think the interval is limited to 50
I have tried using this but it says “You cannot currently connect to a GPU due to usage limits in Colab” for the last two days.
What can I do?
mac can only inference (use models) locally (on ur pc) not train (make models)
i dont have a mac but u could check https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/en/README.en.md
tbh id suggest to just use cloud (remote good pc)
wait until tomorrow?
or whenever the limit expires
its 2024 and some people still believes 1000 epochs are better despite not being true at all lmao
just use tensorboard and select the lowest point in your g/total graph, for a 18 minute dataset your model should be done in less than 200 epochs
No idea on how long the limit is there?
you can:
- Use an alt google account
- Use kaggle which gives more gpu time but its harder
- Wait until tomorrow
- Pay for colab pro
btw, whats ur pc gpu?
the free usage period goes down the more you use it
and slowly resets back when you dont
imagine the graph keeps going down on above 1000 epochs for some short dataset
which is why kaggle is on top
lightning.ai is also cool but lower limits (as in gpu time) so 
batch 16 in 4 minute datasets be like

I tried registering another Google account but Google doesn’t allow me.
You mean like it asks u for phone verification when u make another google acc?
Yeah.
also its WAY BETTER to use Kaggle
you can use the Phone Gmail app to make another google acc without phone verification
ofc u need a phone tho
just like open the app, make a new acc and it will do it without needing any verification
but i suggest u way better to just use kaggle, its a bit harder and needs just a single phone verification but gives u 30 hours weekly (yes they refresh)
its WAYYY better than google colab
and u dont have the risk to losing ur stuff for randomly getting disconnected as 30 hours are alot for free
I actually did register another account on my phone, but later, when I went to log in on my laptop, it asked me to verify the account with a phone number! 🤦♂️
kaggle is a bit buggy but works good when decides not to randomly end the session
since when does it randomly end the session? Never happened to me
be sure to use encryption
wtf very weird
u should be able to login on ur pc of the same acc made on ur phone without phone verification
at this point i suggest u to use kaggle or wait
don’t u have even just 1 phone number ?
it does for me lols, everything set up correctly then randomly decides to stop the session when im downloading the dependencies
i fix this by creating another version of the notebook
skill issue ngl
never happened to me
nor heard it from others


-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- AICoverGen-WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Modified W-Okada's Voice Changer, Google Colab
- 🆕 FaceFusion UI, by Nick088 Google Colab
- 🆕 FaceFusion NO UI, by Nick088 Google Colab
- 🆕 EasyGUI, by Rejekts Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
I do, but i already used it so it won’t accept it.
dw, kaggle is a different service than google colab
use kaggle, u will be able to use that phone number
As you dont got a good PC, its better you use cloud (remote good pc) for training an RVC Voice Model:
- Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
- Applio (ui)
- Mainline (UI)
- RVCDISCONNECTED (no ui)
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI)
- Applio by Shirou (UI, no guide as of right now)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
here i sent all the cloud ways, use mainline kaggle
Okay, I’ll have a look and see if I can work it out!
Thanks. I’ll give it a go now.
yw
can some one tell me if this model is good or not and how to know from this pic
its still at epoch 150
18 minutes of training data
is this OT?
what batch size?
what about this
it tells nothing without the metrics
what are they and how do i get them
not grad
just terrible
what is the size of the training set?
18 minutes
you seem to have some terrible quality then
do you want me to send you the drive link for the audio
im making a billie eilish one
bro i used very high quality wdym
@simple ore
did you cut out all the silence?
yes using audacity
i dont know how to do all these stuff
the model needs to have silence gaps in order to learn a separation
so if i send you the raw voice will you make me a training data
just to know what ive been doing wrong
just take your original interview
Why does it say in the guide to truncate silence then?
before you cut the silence gaps out
ill try
the training specifically inject a couple of mute audios for the model to train how to reproduce silence
but it is only 2 3sec files
yeah another question do i need to make my voices cut or whole
you need to have only the voice of the target person
if i do what app do i do it with
obviously
no like one audio file or multipile
Is this just for Applio? What if I use it with another RVC trainer?
here's a model i'm testing right now, there's no weird jumps or craziness
and it is only 10 minute set
RVC mainline does the same
and what pretrain do i use, original or titan or ov2super
I think without the silence gaps the audio becomes too complex to learn
I use original
But not RVC disconnect in Colab?
probably the same.. all the base code should be very similar
Okay.
but I've seen some projects where they do not include silence for some reason
bro if i want to make a rvc model of 21 but his voice is bad when i uvr look
@simple ore
any tips to make it better
you cant unbake a cake
thats true unfortunately
I got to the training point, but when I click train, it says “error.” 🤦♂️
But in Kaggle, I see it working.
So weird.
5
I left it at default
it's at epoch 266
should I stop it or no
you mean this right?
that's not right
show the error in the kaggle output
it has very little to do with GPU memory size
which chart
last page, with fm, mel, kl
Where would I find that?
fm metric is weird and high
in the kaggle site, where it shows the output of the cell u are running
There’s a lot of text there.
It’s currently telling me what epoch it’s on.
g/total is kinda high, but that's probably because of the batch size
there should be a 'traceback' part
show a screenshot
what causes fm to go up anyways? i've never seen a fm graph that goes down lol
sounds pretty normal?
depending on the learning rate, the model may overshoot the target, so fm goes up and down
I had models that sound like static so
large variation of data in the set may result in fm going up and down
do you think is good to change the learning rate everytime we train a dataset? last time we spoke about this we conclude is better to not change the default lr (which is 1e-4 i believe)
First, I’m going to try from the start again. Maybe I missed something.
you can probably set it to 5e-5 (half of default), the training may take longer
this could prevent fm to overfit faster?
possibly... also using FP32 may prevent it too
thanks! i'll try it
this is what the 150e ckpt sounds like
this is all with no index because I'm too lazy to grab it
technically the metrics should go down or at least stabilize due to the learning rate automatically adjusting down
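To put a number on "automatically adjusting down", here is a small sketch of a per-epoch exponential decay. The 1e-4 base is the default mentioned above. The 0.999875 decay factor is an assumption on my side (check the `lr_decay` value in your own training config), and `lr_at_epoch` is just a helper name for this example.

```python
def lr_at_epoch(base_lr, decay, epoch):
    """Learning rate after `epoch` epochs of per-epoch exponential decay."""
    return base_lr * decay ** epoch


BASE_LR = 1e-4    # default learning rate mentioned in the chat
DECAY = 0.999875  # assumed per-epoch decay factor, verify in your config

for epoch in (0, 100, 300, 700):
    print(epoch, lr_at_epoch(BASE_LR, DECAY, epoch))
```

With a factor that close to 1 the learning rate drops by less than 10% over 700 epochs, so if that assumption holds, halving the starting value to 5e-5 changes far more than the scheduler does.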
yeah all of the metrics go down EXCEPT fm for some reason
so what does that mean 😭
ive always believe this happens because the dataset is small but im pretty sure fm goes up even in big datasets
neshi'zzzzzz'te
that's what I hear
at 7 seconds
I mean yes
sibilances are artifacting
SH ch K sounds
and S ofc
u can decrease the artifacting by de-essing the dataset
because fm is not overfitted
type fm in the tensorboard, is a metric
model confuses and starts to learn the same patterns over and over again
so overtraining
similar
not the same?
nop
it started to fluctuate here
so any epoch in that zone have a big chance of having broken S sounds
e50 probably is before that so is not doing it
isn't ot determined by the loss/g/total metric thing
every metric does something
g total is your combined generator loss, which includes:
fm
mel
kl
g total rising means your model starts to degrade and overtrain
it's not rising
fm going up means model overfitted the dataset features like sibilances
yeah its fine thats what i was talking about
fm metric usually overfits very fast
so what am I supposed to do
choose the lowest point in g/total
if u don't have the exact epoch choose the closest
show me g/total with smoothing 0 and "ignore outliers in chart scaling" off
how do I know which one that is
hover your cursor in that low point
and?
check the step number
then find the epoch that is that step number
14.4
or the closest one to that step number
so you need the epoch at step number 14,400 (14.4k)
checked my logs
so that's epoch 150 😐
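The "lowest point, then closest epoch" routine above can be sketched like this. The g/total readings are made up for illustration. The 96 steps per epoch comes from this run (14,400 steps at epoch 150) and the saved epochs assume the save-every-50 interval mentioned earlier. Both function names are hypothetical.

```python
def best_step(g_total):
    """Return the step with the lowest g/total from (step, value) pairs."""
    return min(g_total, key=lambda pair: pair[1])[0]


def closest_saved_epoch(step, steps_per_epoch, saved_epochs):
    """Return the saved checkpoint epoch whose step count is nearest to `step`."""
    return min(saved_epochs, key=lambda e: abs(e * steps_per_epoch - step))


# Made-up readings taken with smoothing at 0 (raw values, not the smoothed curve)
g_total = [(4800, 31.2), (9600, 28.9), (14400, 27.4), (19200, 29.8)]
steps_per_epoch = 96          # 14,400 steps / 150 epochs in this run
saved = [50, 100, 150, 200]   # checkpoints saved every 50 epochs

step = best_step(g_total)
print(step, closest_saved_epoch(step, steps_per_epoch, saved))  # prints "14400 150"
```

This only picks a candidate. As the rest of the conversation shows, you still have to listen to that checkpoint and its neighbours.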
this is default lr fp16?
the one you said sounds bad
yep that is your correct epoch and has broken sibilance
i dont remember, that was a small test set
it doesn't sound bad to me
the epoch that's the best is the most broken one ??
basically
tf you mean
is not bad, just the sibilance are artifacting
nah, artifacting happens randomly
dont worry
is a usable model
I don't want it to be usable I want it to be good
hmm try this epoch and see if the sibilance are artifacting
or if they got better
try that
even in smaller datasets (5 minutes) fm should go down alongside g/total?
way better
dunno about 5 minutes, that's barely enough
yea im aware, i don't have a 10 minute dataset rn lol
don't use this, just use tensorboard, more easy
yeah easy af man
smoothing 0 and "ignore outliers in chart scaling" off helps u choose low points
well, 20 min / 100 epochs is about right, +50 maybe
like what we did now
all depends on the content
yeah epoch 100 sounds fine to me

you used batch 5 instead of 4 like a weirdo
wasnt batch 5? lol
fixed
i notice the breathings are robotic in every epoch, probably the dataset lacked breaths
THAT'S THE DEFAULT
YES BECAUSE IT'S MIKU
MIKU IS A ROBOT
FFS
miku has breaths
WHICH ARE ROBOTIC
it was br1 in vocaloid iirc
and it is not right
nono, miku breath samples in vocaloid sounds like that
she is one of the voicebanks that has broken breathing samples
i mean batch being 5 as default
ah yeah
why does it even matter? what does batch size even mean?
how many random sets of samples it trains in parallel
this is how rvc is gonna learn ur dataset
a bad batch size can lead to unstable training and potentially bad outcomes
so it has to be 4?
for under 1hour use 4
yeah use 4
there's no speed benefit in using more
who makes 1hr datasets
you can make decent voices with 5 minutes, I thought my 19 minutes was overkill
welp model quality is tied with dataset quality, so a 5 minute high quality dataset is gonna sound high quality, just unnatural compared to bigger datasets
miku is not gonna sound natural ever 😭
miku should never sound natural 
alright
yeah and that smaller datasets are always gonna sound like a rvc model rather than a real human
what do you consider a small dataset
refresh it
under 10 minutes? under 30? under an hour 😭 ??
under 30
it starts to get realistic at over 30 minutes
how would that affect miku
if u add more miku samples, she's just gonna sound like her exported vocaloid vocals
like no one is gonna be able to tell it's rvc
so worse?
nope, good
you can't make her realistic with more minutes, you can only make her sound as if the vocals were made in vocaloid rather than rvc
(which is why people prefer to just use vocaloid and not rvc)
for miku
I want to make realistic miku
cause it's also a matter of tuning, note transitions and all
technically she can sound more realistic because with more minutes rvc can do more precise pitch changes
she's not gonna sound like a human but rather like a very well tuned vocaloid exported
hm
30 yeah
your miku samples are her singing random vsqx files?
is there a source of this Miku voice?
she cost real money (unless.... yw)
i just need 30 seconds
i dont have her installed srry, but if u want any audio then check this https://www.youtube.com/watch?v=swqbfMh467A
Remake→https://www.youtube.com/watch?v=naQjypGoOHY
Cool Miku!(`・ω・´)/クールミク!
オフィシャル Official Song: https://www.youtube.com/watch?v=kp-plPYAPq8
オフィシャル Official Channel: https://www.youtube.com/channel/UC9z_ByomNk9DafVViv5sKkA/featured
SEGA Project DIVA Channel: https://www.youtube.com/channel/UC6FTMCuI9X2ggdMiKFpRukw
( doriko VOCALOID ボーカロイド ボカロ...
random vsq/vsqx/vpr with different voicebanks (v3, english, v4, chinese) and also stuff ripped from project diva megamix, de-reverb'ed if needed
only use her japanese one
y
use actual vocaloid exports and nothing isolated
don't worry just let the default tuning
anything that comes with the vsqx
but why only japanese
she'll be able to sing any language in rvc despite being trained only in japanese
no
yep
you can do 2 rvc models, one with her japanese model and the other with the english one
what 4
bootleg miku
why can't I just stuff it together
rvc learns better if there's consistency
quality is a bit bad because I used @prisma grove's song
it is consistent because it's still miku
yes but pronunciation changes
its up to you anyways
wouldn't that be good? make it able to pronounce more stuff accurately?
yes but is also going to get confused sometimes
and pronunciation might be worse
it's not going to know which pronunciation to use
if the source audio pronounces it right then it will too, no?
hit or miss
but higher chance of better pronunciation if all of the dataset has the same language
and that the inferenced audio is also in the same language present in the dataset
well it's still 90% japanese
so with this info you know the model is better at japanese than the rest
and also the model has a bias of japanese pronunciation
index 0 can help avoiding this bias
but not always
index blends original features from the audio and features of the voice model
by mapping original to voice model
if she has a bias of pronouncing "la" like "ra" then she has a 99% chance of doing this even with index 0
um.. no
that's not how the index works
doesnt force the dataset samples over the audio features?
tbh no one explained what index is
bro just use your model if u like it
then if you convert an audio of someone saying the word "Carrot", it's not gonna suddenly say "カロット"
already sounds good for me
ofc no
try inferencing more audio and see if u like the results
at the end of the day what matters is if you like the model
again, that's not how index works.... it takes the source audio features, tries to look up something close enough from the voice model features
then it blends original and voice in selected ratio
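A toy version of that lookup-and-blend, as a sketch only. It does a brute-force nearest-neighbour search with NumPy, while the real pipeline is more involved (a prebuilt index file and weighting across several neighbours, as far as I understand). `blend_with_index`, the array shapes and the `k` parameter are assumptions made for this example.

```python
import numpy as np


def blend_with_index(source_feats, model_feats, index_rate, k=1):
    """Blend each source feature frame with its nearest voice-model features.

    source_feats: (T, D) features extracted from the input audio
    model_feats:  (N, D) features stored from the voice model's dataset
    index_rate:   0.0 keeps the source features, 1.0 uses only retrieved ones
    """
    # Squared Euclidean distance between every source frame and every stored frame
    d2 = ((source_feats[:, None, :] - model_feats[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest stored frames for each source frame
    nearest = np.argsort(d2, axis=1)[:, :k]
    # Average the retrieved frames, then mix with the original in the chosen ratio
    retrieved = model_feats[nearest].mean(axis=1)
    return index_rate * retrieved + (1.0 - index_rate) * source_feats


src = np.array([[0.0, 0.0], [10.0, 10.0]])
bank = np.array([[1.0, 1.0], [9.0, 9.0]])
print(blend_with_index(src, bank, 0.5))  # halfway between each frame and its match
```

At index 0 the source features pass through untouched, which fits the "minimal accent at 0, full accent at 1" observation.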
finally someone explains what index is lols
english audio + french speaker at 0 index has minimal accent, the accent comes in full force when you use index 1
most of the stuff i learned was from trial and error
yea same i noticed this too
i see makes sense now
sucks that when we are starting doing rvc models there's no info about anything in the internet
how are we gonna know what the metrics are? you go to the official rvc github site and there's nothing that tells u what even g total is
😭 its like only a couple of people actually know how this thing actually works
this is the 90% japanese 100e model
it's pronouncing it better than I can with my shitty polish accent lol
dude./.. give me some good wav you're using for training
xD
like, you want a wav of miku's singing converted through miku rvc?
he wants a sample of your dataset
any wav
an audio that u used for training
AI HUB Docs