#✨│ai-help
i think it's just using original since you didn't turn on custom pretrained
perhaps it's better to ask @nocturne mural since he made that notebook
Oh good then. It is better to use the original
if you want to use other pretraining, you must activate the custom_pretrained checkbox, otherwise the original pretraining will be used
I want to use the original anyway
I tried 2 of them and didn't like them
TITAN, Ov2 Super
The model pronounces some letters robotically
I mean, it kinda depends on your dataset's language and length
fixed
#📰│dev-updates message @tawdry spade @unique rock @glass igloo @white bough
Dev announcement about Google Colab just dropped.
like that wasnt the message nick sent literally before yours ? 😭
Here's the English translation of it. https://multimedia.easeus.com/images/multimedia/voice-changer/resources/w-okada-client-server-architecture.jpg
This is the highest quality image I could find.
I don't know what MMVC stands for. But for VC, I think it stands for voice changer.
The name I can think of for it: W-Okada.
realtime for calls? and what's ur pc gpu
RVC = Retrieval-based Voice Conversion, the best speech-to-speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models
Wokada = uses RVC for realtime inference
you did not stop at the actual saved epoch, so it stepped back
Is applio main better or codename fork better?
same base, codename's has some power user features for training
you have to stop the training at the moment rvc saves the epoch
so you can resume from that epoch avoiding that problem
but its just a visual bug
the model itself is fine
how to overcome this
Welcome to ColabMod
Timer: 00:00:23
DEPRECATION: omegaconf 2.0.6 has a non-standard dependency specifier PyYAML>=5.1.. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of omegaconf or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Timer: 00:00:24
warning: The --system flag has no effect, a system Python interpreter is always used in uv venv
Using CPython 3.10.12 interpreter at: /usr/bin/python3.10
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate
Timer: 00:00:45
DEPRECATION: omegaconf 2.0.6 has a non-standard dependency specifier PyYAML>=5.1.. pip 24.0 will enforce this behaviour change. A possible replacement is to upgrade to a newer version of omegaconf or contact the author to suggest that they release a version with a conforming dependency specifiers. Discussion can be found at https://github.com/pypa/pip/issues/12063
Cloning the repository...
oof, i thought that the model died, thx
yes, I already suspected it
as long as you resumed the training using the same batch size, yeah, nothing to worry about
Well, I leave it at 4, since my RTX 3050 only has 4GB of VRAM
Are those features effective? I tried to check this channel, but I couldn’t find much information about them
warmup epochs were proven to give better results than without using it
but only for adamw if i remember well
currently both applio and the fork use radam
and that already does warmup by itself (don't enable warmup epochs in the UI atm, its meant to be used with adamw)
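For anyone wondering what "warmup epochs" actually does: here's a rough sketch of a linear learning-rate warmup in plain Python. The function name, the linear ramp, and the numbers are assumptions for illustration only; the schedule Applio or the fork really uses may differ.

```python
def warmup_lr(epoch: int, base_lr: float, warmup_epochs: int) -> float:
    """Learning rate for a 1-indexed epoch under a linear warmup (illustrative)."""
    if warmup_epochs <= 0 or epoch >= warmup_epochs:
        # Warmup disabled or already finished: use the full learning rate.
        return base_lr
    # Ramp from base_lr / warmup_epochs up to base_lr across the warmup span.
    return base_lr * epoch / warmup_epochs


if __name__ == "__main__":
    # Hypothetical settings: base LR 1e-4, 5 warmup epochs.
    for epoch in range(1, 8):
        print(epoch, warmup_lr(epoch, base_lr=1e-4, warmup_epochs=5))
```

The idea is only that early epochs take smaller steps. Per the messages above, radam already covers this on its own, which is why the UI option is meant for adamw.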
besides that uhh
its just applio
ah and the fork has the mel spectrogram similarity metric, i forgot about that 🦈
yaaa 😔
Hina's rvc on colab errors like this also
Timer: 00:03:28
/content/voice-changer/server/HVoice.py:3: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
  from distutils.util import strtobool
Traceback (most recent call last):
  File "/content/voice-changer/server/HVoice.py", line 10, in <module>
    from downloader.SampleDownloader import downloadInitialSamples
  File "/content/voice-changer/server/downloader/SampleDownloader.py", line 12, in <module>
    from voice_changer.RVC.RVCModelSlotGenerator import RVCModelSlotGenerator
  File "/content/voice-changer/server/voice_changer/RVC/RVCModelSlotGenerator.py", line 4, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'
WARNING:pyngrok.process.ngrok:t=2025-02-27T21:20:22+0000 lvl=warn msg="Stopping forwarder" name=http-46499-3f12fb39-2175-40bc-8140-83858962dbee acceptErr="failed to accept connection: Listener closed"
--------- SERVER STOPPED! ---------
uv is broken on colab, so it doesn't install anything
So y’all are saying that the mainline Colab is not working?
Okay, so I guess I just wait until you guys fixed the mainline Colab y’all
you dont
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
Is there any guide to making ai covers and stuff that’s updated and is there anything I can use for text to speech (I think this is the correct channel, correct if wrong)
thanks bro
what’s ur pc gpu?
Fixed. Old issue on Colab: if you run out of time but it trained enough epochs, how do you download the files? None of them are visible, even to the download script
same here
Okay guys, I'm going to test out RVC Mainline since I saw people saying that the RVC Mainline Colab is not working, so I'm going to try it out for myself. I will keep you guys posted on whether it worked or not
Okay NeverMind
its fixed?
which one is this?
-rvc
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
RVC Mainline Colab
https://rentry.co/RVC-Mainline-Colab
this one?
It hasn't been working, according to one of them on Discord
Yes
check #📰│dev-updates
Also Wokada isn't RVC
Use #🔍│help-w-okada
gt 1030
i want to convert an audio file into someone else voice
i dont need training
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.com: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio UI Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
easy
sample is a bit short though
tts with 0-shot voice cloning
fish speech, f5-tts, xtts v2 from coqui
depends on the language though, most tts are just english and chinese
how do i make the ai voice more expressive in Applio? Cuz now it only reads the text like a robot. Should I use something else?
it's normal, RVC is natively Speech to Speech (STS) not Text to Speech (TTS)
the way applio uses it for TTS is because they actually generate an audio first with Microsoft Edge TTS API, then, use that audio as an input in rvc
edge tts is multilingual and good quality, but not emotional
There are different Text To Speech (TTS) AIs:
GPT So Vits: RVC isn't as good as GPT So Vits for tts, but gpt so vits (few shot tts, which means needs just a lil training for models) can't use rvc models (and viceversa), and its only limited to: english, chinese & japanese, if you wanna check gpt so vits instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/
Freemium 11labs: Easy way to do TTS is https://elevenlabs.io/, you can't use RVC model on this but its a mostly premium easy way for good quality TTS
FishSpeech: FishSpeech is a 0 shot (no explicit training needed) TTS, if you got a good pc you can use it locally else use their site
You can check TTS in our tts index
With RVC Models:
RVC is natively for Speech To Speech, but forks such as ilaria rvc mainline & applio have built in tts (using Microsoft Edge TTS to make a generated tts audio, which i suggest you to choose a tts model that is the same gender and language of the rvc model you wanna use, and then convert it with rvc)
If you wanna do tts locally with RVC Voice Models (if you got a good pc):
If you don't got a good pc you can do tts with RVC Voice Models on cloud:
- Ilaria RVC Zero (running on an A100 GPU, the fastest free RVC on cloud) and the guide
- Use Applio UI Colab (with Google Colab's free T4 GPU, daily limit)
- You could try another tts from our tts index and use the output as an input in rvc
The best way would be using 11labs tbh, but it's paid
else you could give gpt so vits, f5 tts, fish speech, a try
Ok, thanks
i am using edge tts so can it do this
no, edge tts can't do this
I explained it above and also sent a message about tts
check it out
Edge TTS is a screen reader, plain and simple, very neutral, no emotions
RVC S2S
hello guys am i at the right place to ask a question ?
Anybody got restricted by Colab ?
Elaborate:
- ur PC GPU
- what google colab are u using
- what restriction? Show a screenshot
This account has been blocked from accessing Colab runtimes due to suspected abusive activity. This does not impact access to other Google products. If you believe this action was taken in error, review the usage limits and appeal . @low shard
@wispy lodge you have to check your colab code, this is the 2nd report now :nails:
Sorry but I can't do much about it, it's better you don't use wokada deiteris' fork on colab rn
U could try the Kaggle or do it locally if u got a good PC gpu
Also next time use #🔍│help-w-okada
how to fix
```
/content/voice-changer/server/HVoice.py:3: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
  from distutils.util import strtobool
Traceback (most recent call last):
  File "/content/voice-changer/server/HVoice.py", line 10, in <module>
    from downloader.SampleDownloader import downloadInitialSamples
  File "/content/voice-changer/server/downloader/SampleDownloader.py", line 12, in <module>
    from voice_changer.RVC.RVCModelSlotGenerator import RVCModelSlotGenerator
  File "/content/voice-changer/server/voice_changer/RVC/RVCModelSlotGenerator.py", line 4, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'
WARNING:pyngrok.process.ngrok:t=2025-03-01T01:16:00+0000 lvl=warn msg="Stopping forwarder" name=http-40611-85bb3119-0fa0-4dba-a7f3-4e73487e3dc0 acceptErr="failed to accept connection: Listener closed"
--------- SERVER STOPPED! ---------
```
colab is fked
does anyone know how to get one speaker from an audio file? the speakers aren't overlapping but i just want a file with one person talking and don't want to do it manually
who's gonna sort the speakers out if you dont want to do it manually? some magic?
nope i have the same issue and it isnt fixed yet :/
Oh Okay
guys idk which voice changer im using but im assuming its this one
start_http is taking too long to load
anyone know why
do you see also this warning? if you do we cant do anything cus the server owner needs to update the python version if im right
uv is messed up
also there's no compatible version of faiss-cpu for python 3.11
you either need to downgrade the environment to 3.10 or change the version to install to the one supporting 3.11
1.7.4 supports 3.11
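A small sketch of the second option (switching the faiss-cpu version depending on the Python version). The function name and the `pinned_spec` argument are made up for illustration; the only version number taken from the chat is 1.7.4, and the notebook's current pin is left as a placeholder because it isn't stated here.

```python
import sys


def choose_faiss_spec(python_version, pinned_spec, spec_for_311="faiss-cpu==1.7.4"):
    """Pick the faiss-cpu requirement for a (major, minor, ...) Python version.

    pinned_spec is whatever the notebook currently installs (placeholder here).
    """
    if tuple(python_version[:2]) >= (3, 11):
        # Per the chat, 1.7.4 supports Python 3.11.
        return spec_for_311
    return pinned_spec


if __name__ == "__main__":
    # "<current pin>" is a placeholder, not a real version.
    print(choose_faiss_spec(sys.version_info, "faiss-cpu==<current pin>"))
```

The other route mentioned above, downgrading the environment to 3.10, avoids touching the requirement at all.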
@rain urchin
how to start in kaggle
ctrl + k
Unfortunately, your Google account has been blocked from using Google Colab. All you can do is try another cloud service like Kaggle, or wait until you have a better PC that can run it locally.
This is the wrong channel
Tell: your PC GPU, the google colab link you're using in #🔍│help-w-okada
Wrong channel
Tell your PC GPU in #🔍│help-w-okada
A lot of people keep mistaking RVC for a realtime voice changer.
Wrong channel
Tell your PC GPU, and google colab link in #🔍│help-w-okada
And the colabs are made by engineers, it's not like the server owner can own like 20 colabs
RVC IS NOT REALTIME VOICE CHANGING, WOKADA USES RVC FOR REALTIME, SO USE #🔍│help-w-okada
I've told people many times to go to #🔍│help-w-okada if they want to talk about anything W-Okada. Sure, I get it, not everyone knows what RVC and W-Okada (the realtime voice changer) even are. But if they read more than just one line, they should be able to figure it out by themselves.
Have any of guys tried Livekit?
Hi, I'm getting this error "ModuleNotFoundError: No module named 'gradio'" on Chrome browser, how do I fix this?
tell:
your pc gpu
what guide link are u following
and what u want to do
You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
I am using Hina_Mod_AICoverGen_colab on chrome browser windows 10
it's broken like most colabs, check #📰│dev-updates
also I asked for the PC GPU, not the operating system or browser
any replacement of that?
if you got a good pc gpu you could run RVC locally (runs on your pc) instead of cloud (runs on remote good pc) services like colab
I explained them in the channel I linked
read it up
Cool
guys no matter what I keep getting that noise in the end !
I'm using applio rvc inference
what could I be doing wrong
what did you use for training? was it 24-bit audio?
did you clean it up?
did you clean the inference audio?
16 bit
both cleaned up
you can see here that this part got fixed, but the other part broke after changing the pitch extraction algorithm
post the source audio for inference
is it fixable or nope
is it a separated vocal or something?
well, bad separation then
use mvsep
and if the source before separation was mp3 then it is even worse
the source was wav
holes say it was lossy compression
this the original
from where can I view this
on training or inference ?
well, whatever method you used for cleaning did mess it up and added that blip at the end
why only the last part is aligned?
but why would you remove silences for inference?
or what is the best example for good cleaning
I thought this could make static noise
you could just instead generate silence on parts to "delete"
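A rough sketch of that idea in plain Python: instead of cutting the unwanted parts out, zero them so the timing stays the same. It works on a plain list of mono samples; the function name and the toy numbers are just for illustration.

```python
def mute_segments(samples, sample_rate, segments):
    """Return a copy of samples with each (start_s, end_s) span set to silence."""
    out = list(samples)
    for start_s, end_s in segments:
        # Convert seconds to sample indices and clamp to the clip bounds.
        start = max(0, int(start_s * sample_rate))
        end = min(len(out), int(end_s * sample_rate))
        for i in range(start, end):
            out[i] = 0
    return out


if __name__ == "__main__":
    audio = [100] * 8  # eight mono samples at a toy rate of 4 Hz
    muted = mute_segments(audio, sample_rate=4, segments=[(0.5, 1.0)])
    print(muted, len(muted) == len(audio))
```

The clip length doesn't change, so nothing gets shifted or misaligned the way it can when silences are removed.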
where is that ?
denoise_mel_band_roformer_aufr33_aggr_sdr_27.9768.ckpt
I use this repo https://github.com/ZFTurbo/Music-Source-Separation-Training
just place it into models
and then select on UI
how to make it compatible
to make what ?
that is voice.ai, we don't offer support to that, it's literally paywalled wokada
also, this isn't the right channel, I literally explained everything to you 2 mins ago in #🔍│help-w-okada
why do u still use that garbage
where do I find config.yaml for this one
I figured out it was mp3
congrats 😄
almost 2H
yeah done that

idk if there's way to make it use gpu too
do I need to run this clean up on both , training dataset and inference audio ?
so I have to retrain my model in RVC?
for some reason it keeps running on the cpu
check which torch got installed by default
may need to replace it with cuda version
requirements does not have cuda index, so it most likely got 2.0.1 cpu installed
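A quick way to check which torch build got installed, as a sketch assuming an NVIDIA setup. It only uses `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.get_device_name()`; the function name is made up.

```python
import importlib.util


def describe_torch_build() -> str:
    """Summarize whether the installed torch can use a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if torch.version.cuda is None:
        # No CUDA runtime compiled in: most likely the CPU-only wheel.
        return f"torch {torch.__version__} has no CUDA support (likely a CPU-only build)"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__} was built for CUDA {torch.version.cuda} but no usable GPU was found"
    return f"torch {torch.__version__} with CUDA {torch.version.cuda} on {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(describe_torch_build())
```

If it reports no CUDA support, the fix is reinstalling torch from a CUDA wheel index.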
Guys I want anime tts to have a free api, is that possible?
torch installed for denoise
import torch
torch.cuda.is_available()
#True
torch.cuda.device_count()
#1
torch.cuda.current_device()
#0
torch.cuda.get_device_name(0)
conda ?
whatever that is
pip install torch==2.3.1 torchaudio==2.3.1 --upgrade --index-url https://download.pytorch.org/whl/cu121
oh cool then
but now you have cuda torch and torchaudio
yep, it seems it works
cuda:0
thanks a lot for helping
I hope it works and fixes my issues
I've been trying for a long time
does this look better ?
meh same issue still
I'm at https://discord.com/channels/1159260121998827560/1307339969743818852 if someone can help I can share the screen
where can i find pretrains
this is undertrained
you got a better 1 ?
for hifigan the original pretrain or https://discord.com/channels/1159260121998827560/1339155300720054316
hm ... not sure if refinegan is at its current state actually usable. No experience with it
results are comparable to hifigan results, got better harmonic reconstruction and less mirroring but people have been experiencing a buzzing sound in their models
refine has better singing range than hifi
buuuut i don't personally use it because of the electric/metallic sound it gives to models
maybe in the future those kinks will be ironed out.
perhaps not any more 😛
so atm it doesn't have the electric sound anymore ?

interesting

How can I prevent Ngrok from exceeding Data Transfer Out monthly limit using Applio on Kaggle? I'm facing only problems with training my voice models via this
idk how to decrease inbound connection volume without having to upgrade my account plan for additional capacity on ngrok
I think the only ways are either using another tunnel (if the kaggle has the option for) or deleting and registering the account for 'refreshing' ur limit or making another acc
For unknown reasons for me more than 300 MB data transferred out.
And this is just in early March 2025
Yes
btw for those that have a gtx 1650 (like i do): inference will work. very slowly, though.
you can run inference on CPU, slow but it is what it is
at least, the very first time you infer, as it has to load the model. the rest of the times will be fast.
Tbh you can directly use that model on weights.com
You can just click create
Weights.com uses RVC in an easier user interface
hi, i'm training some models but my results are bad. can someone help me? these are my settings. i also tried to edit my pitch but that doesn't make sense.
cual es el mejor voice changer?
no
that RVC is extremely outdated
Mangio Fork is a fork (modified version) of Mainline RVC (the original project) which has been discontinued since 2023
oh
absolutely delete that, and never follow youtube tuts for RVC/Wokada
what's your pc gpu?
what should i use then?
We speak only english for this server, if you mean realtime voice changing, tell your pc in #🔍│help-w-okada
what's your pc gpu?
a nvidia rtx 3060
what is the difference?
between the 2 I said?
yes
Mainline is the original project of RVC
Applio is a fork, with an easier user interface that gets more updates
basically Applio is more maintained
Okay! i will try that! thank you
mainline hasn't gotten a real update in 2 years (recent ones have been dependency fixes and not training-related stuff)
applio on the other hand has new updated training code which can give faster and better results than mainline, and it's constantly getting new updates
a post says that i need to install it on my ssd, why is that?
faster loading times and faster writing times if you want to train
SSDs 🙏
rvc has to write two big files during training; if you train on an HDD it's going to take a couple of seconds to write them (around 4 seconds)
but on an ssd it's almost instant
it doesn't slow the training speed itself but slows down the process a bit (basically the training will pause every time the two big files are overwritten)
bc it has to wait until the files are written
besides that, it works just fine in a hdd
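To put a number on it, a back-of-the-envelope sketch. The ~4 seconds per write is from the messages above; the epoch count, the save interval, and the assumption that the files are written once per save interval are hypothetical.

```python
def total_write_pause(total_epochs: int, save_every: int, seconds_per_save: float) -> float:
    """Total time training sits waiting on checkpoint writes, in seconds."""
    saves = total_epochs // save_every  # number of times the files get overwritten
    return saves * seconds_per_save


if __name__ == "__main__":
    # Hypothetical run: 300 epochs, saving every 10 epochs, ~4 s per write on an HDD.
    print(total_write_pause(total_epochs=300, save_every=10, seconds_per_save=4))
```

With those made-up numbers the waiting adds up to a couple of minutes over the whole run, so an HDD is a minor slowdown rather than a blocker.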
explained poorly 
i got this error after installing successfully
Hi there, so I have trained some voices for TTS in the past, but I was thinking of trying to train the same voices for use in w-okada to use in DnD games. However the guide seems to say you need hours of clean voice samples for it to work... is this still the case?
(Most of my recent decent TTS ones have been done with about a minute of audio or less, mostly cos there isn't more than that available haha)
how do i use kaggle?
Speech to speech (RVC) is more complex than TTS hence why you require more data to have a good model
but anyways, you don't need hours worth of data
for realism you need around 30 minutes to 1 hour max of data
for ok results you need minimum 10 minutes
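If you want to check how much audio you actually have, here's a sketch using only Python's standard `wave` module. It assumes the dataset is a folder of PCM `.wav` files; the function names are made up, and the tiers just mirror the rough numbers quoted above.

```python
import wave
from pathlib import Path


def wav_seconds(path) -> float:
    """Duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def dataset_seconds(folder) -> float:
    """Total duration of all .wav files directly inside folder."""
    return sum(wav_seconds(p) for p in sorted(Path(folder).glob("*.wav")))


def rate_dataset(total_seconds: float) -> str:
    """Map total dataset length onto the rough tiers from the chat."""
    minutes = total_seconds / 60
    if minutes < 10:
        return "below the ~10 minute minimum"
    if minutes < 30:
        return "enough for OK results"
    return "in or above the 30-60 minute range suggested for realism"


if __name__ == "__main__":
    folder = "dataset"  # hypothetical folder name
    if Path(folder).is_dir():
        print(rate_dataset(dataset_seconds(folder)))
```

Length is only part of it; the same messages stress clean audio, which this doesn't measure.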
First of all, be sure to download it on the C drive
Also, you're using the precompiled from the guide I sent right
Which kaggle
thanks, im going through the guides atm so i'll see what I can manage
colab, applio
You mean Kaggle applio UI? Bc u just said Kaggle then now you're saying colab
If so, check https://docs.aihub.gg/rvc/cloud/applio-kaggle/
Last update: Jan 13, 2025
my output doesnt sound like the models. i have a 4070 super and i7 14700k. idk why its doing this
are my specs good enough to run this
why do u blame the specs for that result? it's more of the model itself or pitch setting
120gb in 2025
It was just an example, and 120gb are still used in 2025
Just in phones 
-# more correctly, 128gb on phones
phones dont use ssd, and not even the transfer speed
I was just talking about storage capacity lol
Anyways, it's just a reddit meme I googled 😭
it sounds like mid 10's
Well, I'm using the Kaggle interface, haha, to use Applio within it, this GPU T4x2 thing.
No, around 2020 https://www.reddit.com/r/memes/comments/j3ssp3/ssd_go_brrr/
-# yes I use reddit not only to search random shi
500 GB to 1 TB ssds existed in 2020 (perhaps they were considered high-end)
I just thought of it as a silly meme showing ssds are faster but more expensive while hdds are great for larger memory but slower
btw that means SSD 15GB and HDD 125GB lmao
At a similar price, you'll get either an SSD with lower capacity but faster speed or a hard drive with larger capacity but lower speed.
- Applio Notebook, by Vidal Kaggle
- Applio Notebook, by Shirou Kaggle
- Music Source Separation, by Shirou Kaggle
- UVR5 NO UI, by Eddy Kaggle
- Original W-Okada's Voice Changer, Kaggle
- Modified W-Okada's Voice Changer, Kaggle
- 🆕 UVR5 UI, by Eddy, ArisDev & Nick088 Kaggle
- 🆕 RVC AI Cover Maker UI, by Shirou & ArisDev Kaggle
- 📖 How to use RVC Mainline on Kaggle by Cauthess
Note: Kaggle limits GPU usage to 30 hours per week.
Can anyone help with the update to https://colab.research.google.com/drive/1mHKTGH5e3SAyDSBss1KtiYRbDdQzwSMs#scrollTo=9qpCkSOUCkFr and where I'm supposed to put everything that is required, because I'm genuinely confused lollll
Are you trying to do RVC on Google Colab?
I've been waiting for 6 minutes and it didn't even start training
Yes I guess it hasn’t been updated in a while so I’m trying to use applio but it’s still pretty confusing
If I cancel my annual membership, will I get my money back?
until you realize it said that the sample rate or architecture do not match the model settings applied since the preprocessing step
The samples are at 48K and Datasets are 48000
Does anyone know how to use applio if so can I pm you because I’m a bit confused lol
It's not that, it's the custom_pretrained button; I had to turn off pretrained
Cancel what? I don't think you would get refund for canceling a subscription, but the service would just let you use premium service you once paid in a year until it expires.
pls read carefully the pretrain sample rate and whether using refinegan or the default one
the refinegan one
make sure to use the latest applio repo or if not sure use the hifigan one and pretrain
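The check being described boils down to something like this sketch. `TrainSetup` and `mismatches` are hypothetical names, not anything from Applio; the point is only that the sample rate and the vocoder of the pretrain both have to match what was picked at preprocessing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainSetup:
    sample_rate: int  # e.g. 48000
    vocoder: str      # e.g. "HiFi-GAN" or "RefineGAN"


def mismatches(model: TrainSetup, pretrain: TrainSetup) -> list[str]:
    """List the ways a pretrain disagrees with the model settings."""
    problems = []
    if model.sample_rate != pretrain.sample_rate:
        problems.append(
            f"sample rate: model {model.sample_rate} vs pretrain {pretrain.sample_rate}"
        )
    if model.vocoder.lower() != pretrain.vocoder.lower():
        problems.append(f"vocoder: model {model.vocoder} vs pretrain {pretrain.vocoder}")
    return problems


if __name__ == "__main__":
    # Hypothetical mismatch: RefineGAN model settings with a HiFi-GAN pretrain.
    print(mismatches(TrainSetup(48000, "RefineGAN"), TrainSetup(48000, "HiFi-GAN")))
```

An empty list means the pair is consistent; anything else is the kind of mismatch the error message above complains about.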
Does anyone know how to use applio if so can I pm you because I’m a bit confused lol
You sent your same message for the second time now.
Thought no one saw sorry 😬
Want to talk to the website owner
Okay so I’m trying to upload an existing voice model that I already made to applio so I can make an ai cover of a song but I do not know where to put everything at.
An owner of which website? And who are you asking to talk to the website owner?
Heloooo! I finally made my own female model and it sounds natural and amazing. But then I made a singer's voice and it is metallic. I researched and found out it is because of the sample rate.
There was a website to find out the sample rate of an audio file. Can someone send me the link?
what is good chunk and extra settings
Can anyone guide me how to install the voice model once I have downloaded it from voice-models .com ? I am using tortoise tts btw
Tortoise tts can't use RVC models, those are 2 different type of AIs, RVC is STS
The only thing you could do, is make an audio with tortoise tts, then use that as an input in an RVC like Mainline or Applio
But tortoise is pretty old
What's your PC GPU btw
Wrong channel
Show a screenshot of your WOKADA in #🔍│help-w-okada and be sure to not follow yt tuts
You can just download Spek on your PC
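If you only need the number from the file header, a few lines of Python with the standard `wave` module also work. This is a sketch assuming a PCM `.wav` file; the function name and file name are made up. Note that it reads the declared sample rate only, so it won't reveal audio that was upsampled from a lower rate the way a spectrogram in Spek would.

```python
import wave


def wav_sample_rate(path) -> int:
    """Return the sample rate (Hz) declared in a PCM WAV file's header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getframerate()


if __name__ == "__main__":
    import os

    path = "vocals.wav"  # hypothetical file name
    if os.path.exists(path):
        print(wav_sample_rate(path))
```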
Ilaria RVC mainline colab has been outdated and broken for a year, it won't be fixed
First of all, what's your PC GPU?
Also I just noticed you got a TCOAAL pfp, I played it too lol
Thanks man
Nvidia 4060
Rtx 4060? Yeah you're good then
Yeah
Could I DM you bro? I’m not that well equipped in terms of coding experience so your help would be great
No need for DMs, or even to code
All you would need is just download Applio if u want the easy way
Hi Help RVC Members,
I hope you’re all doing well. My name is Vikha, and I’ve been exploring voice conversion using Retrieval-Based Voice Conversion (RVC). I encountered an issue while trying to merge two PTH models—one trained for 200 epochs and the other for 150 epochs—into a 50-50 balanced blend. However, the resulting audio quality didn’t meet my expectations, and I’m unsure why the quality degraded despite the models being quite close in training epochs.
I’ve been experimenting with various fusion approaches, but I haven’t been able to achieve the desired results. I’m reaching out to you because I came across your profile and noticed your work in this field. I believe your insights could help me understand the potential issues that might be causing the problem in my fusion process.
If you have experience working with similar models or have any suggestions on improving the process, I would be extremely grateful for your guidance. Additionally, any resources, tutorials, or techniques you could share would be invaluable as I continue troubleshooting.
Thank you so much for taking the time to read this message. I hope we can connect!
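For reference, a 50-50 blend of two checkpoints is usually just a per-parameter weighted average. A minimal sketch of that arithmetic, assuming both models share the same parameter names and shapes; `blend_weights` is a hypothetical helper, and plain lists of floats stand in for the real tensors stored in a .pth file:

```python
def blend_weights(a: dict, b: dict, alpha: float = 0.5) -> dict:
    """Return alpha * a + (1 - alpha) * b for every parameter.

    `a` and `b` map parameter names to lists of floats (stand-ins for tensors).
    """
    if a.keys() != b.keys():
        raise ValueError("models do not have the same parameter names")
    blended = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for {name}")
        blended[name] = [
            alpha * x + (1 - alpha) * y for x, y in zip(a[name], b[name])
        ]
    return blended


# alpha = 0.5 gives the 50-50 blend described in the message above
print(blend_weights({"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}))
```

Averaging weights this way does not guarantee that the result sounds like an average of the two voices, so a drop in quality after merging is not surprising on its own.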
Hey bro, I need help with this issue.
When I input VB-Cable into Discord or other apps, my voice becomes choppy and sounds weird.
Does anyone know how to fix it? 😭
I think the recommendation is to use Virtual Cable Lite https://software.muzychenko.net/freeware/vac470lite.zip
this is the wrong channel
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
I replied u there https://discord.com/channels/1159260121998827560/1345791557063802970
I've seen so many tips on how to detect overtraining but I have no idea what is most effective. I've read the tutorials but I just want to be sure yk yk?
hear the epochs
compare them
overtraining is very easy to hear
the model starts to sound robotic
Thats what I used to do but It never really turned out well
why so?
I just don't have a good ear when it comes to listening for overtraining
hi, why does like every rvc model sound so weird when laughing and whats the solution?
thats an rvc limitation
so thats normal and theres no fix?
there's no solution to that
damn cuh thats unfortunate
atm there is no fix
I guess I'll rp as a mentally unstable egirl with missing laugh muscles
lmao
lemme ask chatgpt what the condition is that makes u not able to laugh
Akinetic Mutism yup I got that
thats me
been had that
anyone know why rvc won't launch?:
holy shit you got an ancient version lmfao
mangio rvc fork has been discontinued since 2023
what's your pc gpu?
12GB one is better, but 8GB is serviceable
i have the 12gb one
can train stuff
do you need a realtime voice changer or trainer/voice changer for files?
yea ik im saying what is the new one
that changes voices for calls/discord
the one i have is hella old
is that the official one?
wait it isnt real time
i want a realtime one
@simple ore
you got one or no?
also the one you sent doesnt allow pitch change
great
wokada deiteris fork is realtime
wokada is a program which is better than the mainline rvc realtime, which is better than the mangio fork rvc realtime
and the deiteris fork is better than the original wokada
I replied to u in #🔍│help-w-okada
Everytime I use ApplioNoUI, my storage keeps getting full instantly.
That's because all these G and D files keep duplicating
no, it will work when it will be added as fixed in #📰│dev-updates
tbh just use Applio meanwhile
Umm No Thanks
I just play the waiting game, I guess
There's nothing wrong with applio, it just got a more user friendly interface and more updates
I don't really understand what's wrong with it, but your choice
alright
If I train a singing voice model, will it contain another dataset, the same as RVC2 Disconnected on Applio?
if you mean "can I train a voice model with a pretrain?" the answer is yes
I would have to start from scratch
Hello, can anyone tell me which colab they currently use to make AI COVERS?
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
sorry i keep asking for help, trying to retrain on colab, getting this error
NameError Traceback (most recent call last)
<ipython-input-2-5294ebea29b0> in <cell line: 0>()
26 print('Paste model link and try again!')
27
---> 28 if not os.path.exists(f'/content/sample_data/{Model_Name}.tar.gz'):
29 print("File not found.")
30 else:
NameError: name 'os' is not defined
Imagine there was a way you could train accents with a model
like for different artists
like an ai to train not just voice but another to train accents
above it add import os
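A minimal sketch of that fix, assuming the cell looks like the traceback above. `Model_Name` is given a placeholder value here; in the notebook it comes from the cell's own form field:

```python
import os  # the missing import that caused the NameError

Model_Name = "my_model"  # placeholder, not the notebook's real value

archive_path = f"/content/sample_data/{Model_Name}.tar.gz"

if not os.path.exists(archive_path):
    print("File not found.")
else:
    print(f"Found {archive_path}")
```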
btw I don't recommend using random colabs not listed as above #✨│ai-help message
as it may not be coded or maintained well
that's what the index is for
it doesn't always relate to that problem. the colab author might have forgotten to include import os in the cell, unless a previously run cell that contains import os failed (in that case yeah, it might relate to the problem you describe)
still I suggest he add import os to see if it works or not
I was just saying the colabs that are verified to work
Would you guys recommend to keep whispering in the model or would it mess with the training? I am afraid that if I do that, the whispering tone/voice will come out when it is not supposed to...
Does anyone know what the "assets/hubert/hubert_base.pt" file is? When I run the command "python gui_v1.py " through cmd it writes to me that this file was not found, and it is. It's not there, where can I find it, and what is it?
don't follow youtube tuts
you're using rvc realtime from the mainline/original project, which is worse than wokada, which is worse than the wokada deiteris fork
for realtime voice changer, tell your pc gpu in #🔍│help-w-okada
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
do you have 50+ hour good dataset?
with multiple speakers and some amazing variety of the content?
Yes and no
does 48Khz has more noise than 32Khz ?
can you please tell me whats that?
Ok just learned thats not what its for
the index stores the accent of the model you trained on RVC.
I already explained to you.
May I use a zip file with .index/.npy/.pth(I downloaded somewhere... it's a rvc model zip) to generate modification? Also, on weights.com, if a model isn't exaggerated as I expected, should I use it multiple times to enhance the modification on voice? or is there a better way to make the modification within the website?
somehow this channel doesnt allow me to upload a image😂
better to train 32k anyway
got it
the rvc model zip shouldn't contain .npy, elaborate what you mean with modification and share the model download link
the .npy file is an intermediate file produced during index training in mainline rvc, not the final result, so better remove it to reduce the file size
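If you want to drop the .npy before sharing the zip, something like this would do it. It is only a sketch using Python's standard `zipfile` module; `repack_without_npy` is a made-up helper name, not part of RVC:

```python
import zipfile


def repack_without_npy(src_zip: str, dst_zip: str) -> list[str]:
    """Copy every entry except .npy files into a new zip; return the kept names."""
    kept = []
    with zipfile.ZipFile(src_zip) as src, zipfile.ZipFile(
        dst_zip, "w", zipfile.ZIP_DEFLATED
    ) as dst:
        for info in src.infolist():
            if info.filename.lower().endswith(".npy"):
                continue  # intermediate file, not needed for inference
            dst.writestr(info, src.read(info.filename))
            kept.append(info.filename)
    return kept


# Example (paths are placeholders):
# repack_without_npy("model_original.zip", "model_clean.zip")
```

The .pth and .index entries are copied unchanged, so the new zip should load the same way as the old one.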
thank you
that's just a mirror of https://huggingface.co/QuickWick/Music-AI-Voices
i still dunno what u want to do and mean with modifications
what up lads, is anyone else running RVC on arch using rocm? I'm facing a few strange issues, this is my full log, from start up of the web UI to trying to process a vocals file:
downgrade torch to 2.3.1-rocm
i'll give it a go
ty
yeah that's a problem, I think that version of torch is incompatible with the latest hip runtime (ImportError: libamdhip64.so: cannot enable executable stack as shared object requires: Invalid argument)
i'll try and see if I can get a docker container going
In Applio where do i select refinegan as the architecture to train a model with? i've downloaded some pre-trained refinegan models, matched the sample rate to the dataset but when attempting to train i get: The parameters of the pretrain model such as the sample rate or architecture do not match the selected model.
you need to clone the main repository
thanks. the modification, I meant voice conversion. changing my voice to somebody else
and i just figured out how to use weights.com/... today. earlier it did not change my voice after I uploaded a wav, that's why i kept wondering if there is another alternative for rvc
I think I discovered the issue, my GPU is supported by rocm but only windows, wtf ayymd
amd has pretty shitty and weird support for AI
I am aware, some things never change
https://docs.applio.org/applio/getting-started/installation#alternative-installation-methods maybe you could try Applio, which is a fork (modified version) which got more updates and easier user interface
or try mainline/original project guide https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/en/README.en.md
yeah i tried mainline, i'm trying applio on windows now
not sure if applio got also linux amd support, maybe @simple ore knows since he had an amd gpu
it does but I assume the same problems would be there, the issue seems to reside in the hip runtime just not supporting the 7800 XT properly on linux
I mean hopefully maybe in the next years things will change
-# Or you could use Nvidia GPUs
Nvidia is better than AMD, the only issue is their prices
you need to import the rvc model, then click create
or use one that is on weights.com
what gpu?
7800xt has options - HIP SDK + Zluda + patched cuda torch on Windows, WSL2, ROCM on Linux
I've tried ROCM on linux, but I do get a random segfault when doing the actual conversion, I've found other people who make their own ML projects having random segfaults as well, so I reckon it's an issue with their stuff
yeah, more or less AMD fully supports only their top of the line GPU
can use Zluda on windows, should be fine
I think for linux you gotta use ROCM5.7, then set environment variable HSA_OVERRIDE_GFX_VERSION=11.0.0
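If you launch from Python instead of a shell, the variable has to be set before the ROCm runtime loads, which in practice means before `import torch`. A small sketch, assuming the 11.0.0 value suggested above is the right one for your card (not verified here):

```python
import os

# Set the override before anything imports torch, otherwise the HIP runtime
# has already been initialised and the value is ignored.
os.environ["HSA_OVERRIDE_GFX_VERSION"] = "11.0.0"

# import torch  # import only after the variable is set

print(os.environ["HSA_OVERRIDE_GFX_VERSION"])
```

From a shell the equivalent is to export the variable in the same session that starts the web UI.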
had, not using it any more 🙂
not working like what ?
Yo,
I’m looking for a good way to create a realistic AI voice, but I don’t know what to use or how to set it up to sound natural. Any tips?
hello link rvc?
What's ur PC GPU
What's ur PC GPU and what do u want to do
Rtx 3070
How many epochs for a realistic voice in RVC?
There isn't a right amount
Look at our docs for more info https://docs.aihub.gg/
Last update: Oct 21, 2024
Please how can earn iq points in this channel
what
How can earn points of ai hub
ai hub isn't a point system ?
Okay thanks
you mean levels? you have to chat
Yeah
more of losing your iq points
mainline has not been fixed yet
What does the pitch change do? to change between male and female pitch?
Im using AICoverGen No WebUI on colab: https://colab.research.google.com/drive/1u1brjK8IZt647UsbZuGYfW29oFM2I4tk?usp=sharing
depends on the implementation
but it either raises f0 values by some amount or also nudges them to match regular note frequencies
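Roughly, those two behaviours look like this. It is a sketch under the usual equal-temperament assumptions (12 semitones per octave, A4 = 440 Hz); the function names are made up for illustration and are not taken from any RVC fork:

```python
import math


def shift_f0(f0_hz: list[float], semitones: float) -> list[float]:
    """Scale every voiced f0 value by 2^(semitones/12); 0 marks unvoiced frames."""
    ratio = 2 ** (semitones / 12)
    return [f * ratio if f > 0 else 0.0 for f in f0_hz]


def snap_to_note(f_hz: float, a4: float = 440.0) -> float:
    """Move a frequency to the nearest equal-tempered note (autotune-style)."""
    if f_hz <= 0:
        return 0.0
    midi = round(69 + 12 * math.log2(f_hz / a4))
    return a4 * 2 ** ((midi - 69) / 12)


print(shift_f0([220.0, 0.0], 12))  # +12 semitones is one octave up
print(snap_to_note(445.0))         # slightly sharp A4 is pulled back to A4
```

So +12 doubles the pitch (typical for male to female), and -12 halves it.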
@acoustic scarab can you give me google colab web rvc links?
the helper explicitly says in his display name that he does local inference only
he can’t help much on cloud
first, what’s your pc gpu and what do you want to do? To check if you got a good enough pc
i need only links to access
r5 5600 gtx 1660ti
what do you want to do? Inference or training?
wanna change my voice into another voice
That’s inference, but realtime for calls? Or on pre-recorded audios
both btw i have the realtime one i need only pre recorded
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.com: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio UI Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
gave you both local and cloud ways
tnx man
how can i make models do you have any guide?
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI)
- Applio by Shirou (UI, no guide as of right now)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest way and for free, is using https://weights.com which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.com: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio (ui)
that's look hard
what if i made a model can i make money with that?
you can’t do paid commissions unless you become a model master
it’s AI
i have too much free time in these day's i'll must try
model master means that you first become a model maker by having your first model approved by QC, then you have to make many high quality models
what's QC? it's an organization or something?
The quality checkers. A group of people in AI HUB by Weights who check the quality of models before they are uploaded to #1175430844685484042.
ohh tnx
Quantum Corp.
they are going to believe that
I tried to use mainline kaggle, but it gives me the same issue like in colab.
been trying to figure this out for days now
nvm i got it
i need best rvc
What's your PC GPU and what do you want to do
ryzen 5 4060 need for discord
ram 16gb
So, realtime for calls?
yeah
Wrong channel then
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
I have answered you before #🔍│help-w-okada message
Go to #🔍│help-w-okada
ohh i though it is real time voice changer
actually its opening in my browser
Nope RVC never meant that
And that's completely normal
The great majority of AI programs use web uis
Use that channel
so no application of okada now?
earlier there was one
It runs on your PC, just it uses a Web User interface
Also original wokada was a web user interface, just it made its own window which got removed in the fork as it can cause issues
yeah but still i would prefer app
can i get that?

It's still an app
The only difference is the User Interface
alright
Only the old original wokada, which is way worse, uses the web user interface in its own window
Also use #🔍│help-w-okada
Don't use this channel
it shouldn't affect performance, but has actually better performance than the original one
okay
To get a separate app version of Deiteris' W-Okada, download the precompiled codebase of Deiteris' W-Okada from GitHub, then try coding the GUI yourself and let the author know.
💀
That's what I can say if you want the app version of it.
can you give me real time voice changer cuz mine is not working
mine neither
alright, let's go in #🔍│help-w-okada
tell ur pc gpu in #🔍│help-w-okada
To request someone to make a voice model for you, go to #1159289738314919936. #✨│ai-help isn't the place to ask for that.
i cannot start server using ngrok it allways says "server stopped"
elaborate:
- your pc gpu
- what do you want to do
- what guide link are you using
For W-Okada cloud, go to #🔍│help-w-okada. If you mean something else, tell me.
is there any guide how to run kaggle?
which tool should i use to make a high quality voice model?
What's your PC GPU
rtx 3050
You asked the same question on #🔍│help-w-okada , firstly explain your PC GPU and what you want to do
Laptop?
no no, desktop
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.com: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio UI Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI)
- Applio by Shirou (UI, no guide as of right now)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest way and for free, is using https://weights.com/ which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.com: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio (ui)
what about mangiorvc?
i have that downloaded
That's extremely outdated
Don't follow YouTube tutorials
That program hasn't been maintained since 2023 and is one of the oldest RVCs you could use
so its bad
Yes, very much
We even removed it from our docs, the creator doesn't maintain it nor fixes any bug, don't use it at all
@broken urchin just read what I told you, I gave you all the options
yeah i read it thanks
Yw
the 8 GB variant is the recommended one for that
welp i have the 6gb version
I'm using machine translation. I apologize if the sentences are awkward.
Dear Ai Hub intelligentsia, do you have any guesses as to why people who were once using RVC mainline or Mangio fork feel that Applio sucks after using it? It's hard for me to understand what exactly is wrong with them, because most of the people who make this claim usually treat me as a brainless worshipper of applio, or are so inexperienced with Applio (they're new to it) that they blame the problem on Applio as a whole, rather than on some feature of Applio.
One thing I can be sure of is that they are having an experience that makes them feel that the output from Applio is clearly inferior to the RVC mainline. Does anyone know why? I feel bad for them that they are giving up the conveniences that Applio's developers have worked so hard to create and either going back to the mainline or giving up on using RVC altogether.
How do I train a voice with TITAN? I have RVC training software already installed but the models I have are until rmvpe
applio inference is slightly worse than mainline, yeah
training is better tho
rmvpe is not a model, it's a pitch extraction algorithm
what rvc did you download and what's your pc gpu?
-Colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
@tough fiber #🔍│help-w-okada message that's the latest precompiled mainline rvc, which is from over a year ago
@crude flame is there really no newer precompiled for rvc mainline?
atp shouldn't the docs explain how to do it via source?
@tough fiber if you want you could try applio, which is a more updated fork of mainline rvc
Last update: Apr 01, 2024
oh thanks ill try applio
on local
🤷 i am unaware if there are any newer ones
I downloaded the mangio-rvc one, my GPU is 8GB of VRAM (RTX 3060)
that fork is extremely old and not updated
It's better you get Applio
also, in the docs it will be explained how to use pretrains
yoo i need help
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
#🧬│ai-chat message is this what you're asking for?
Thank you so much for the info, yeah I use mangio RVC locally so I have no problems.
But how do people make TITAN remakes of models? That's my question, and how do I make my own?
mangio rvc is old
you should delete it and get something newer and supported like Applio
about 'remakes', you would just need to retrain your model using the titan pretrain https://docs.aihub.gg/rvc/resources/training/#pretrains
Last update: Dec 24, 2024
Noted. Thank you so much for the info again.
yw
Could someone help me make the models' breathing more natural without that robotic sound?
Sample
add a ton of breathing into the dataset or cope
Should I do these one after the other?
doesnt really matter where the breaths are as long as they are there
Okay, I always remove it because they say it adds noise to the model.
if you don't mind robotic breaths as it's common in expressive talking and singing vocals
How do I get the model master role?
https://www.aihub.gg/en/dashboard/apply apply here (i think)
I'll check it out thanks!
Can I use RVC in python code? I want to automate something using python, I generate text using LLM then TTS using RVC
why does the perf thing not appear on my voice changer client
wrong channel
@dull plume
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
show a screenshot of ur wokada in #🔍│help-w-okada
Yes, you can call inference from python code
https://www.instagram.com/reel/DG0cTHIyd78/?igsh=d2t3N3c1YjB4Zm9r
Is there a voice model of this voice from this reel? If there is mention me please 🙏
You can search rvc ai voice models at:
- #1175430844685484042
- In #🔍│find-models , Do /find with @earnest musk
- https://weights.gg/ (login required)
- https://huggingface.co/models (but watch out cus in hugging face there arent only rvc ai voice models)
- https://voice-models.com/
- https://thevoicemodels.com/ (for Turkish Models, login required with discord and level 2 on their server)
if there isnt one, you can:
- #1159289738314919936
- #1191429836321849435
- make it yourself with our docs guides https://docs.ai-hub.wtf/essentials/how-to-make-voice-models/
:wave: @low shard, How can I help?
Available Commands:
• @weights find <query> or /find <query> - Search for RVC Voice Models
• /create - Create an AI Cover
• /image - Generate an Image
i was afk while my model trained and the runtime disconnected after it finished, where can i find the pth file
Did it disconnect automatically?
You might have lost it, check if there's any files on Google drive
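A quick way to check is to list every .pth under the mounted Drive, newest first. This is a sketch that assumes Drive is mounted at Colab's usual `/content/drive/MyDrive` path; `find_checkpoints` is a made-up helper name:

```python
from pathlib import Path


def find_checkpoints(root: str) -> list[Path]:
    """Return all .pth files under root, newest first (empty if root is missing)."""
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(
        base.rglob("*.pth"), key=lambda p: p.stat().st_mtime, reverse=True
    )


for path in find_checkpoints("/content/drive/MyDrive"):
    print(path)
```

If nothing is printed, the notebook most likely never copied the model to Drive before the runtime was recycled.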
i lost it i will just train again its aight
Help?
Be sure to check it once in a while so it doesn't disconnect
Elaborate:
- ur PC GPU
- what guide are u using
- what u did step by step
- what do u want to do
I'm using Applio (google colab) and trying to upload a wav file for training
To train a new voice model
Please elaborate also your PC GPU
same bro
I also just got a GitHub issue with the same issue for my facefusion online ports....
Welp, new cloud issue, I gotta check this
I will check the error although it is most likely an internal gradio problem.
@night rune btw I asked ur PC GPU because if it's good enough you can do it locally without relying on cloud
That's how I use google colab, it doesn't use my pc as such
But all right, it's a gtx 1650 super.
nah, my pc is trash for this
Ohh, I mean you could technically do it locally on your GPU but it could be kinda slower and limited
try uploading the dataset to applio/assets/datasets through the (imjoy) file manager
I'll wait forever if I try it xd
Do I need to create a new folder for this?
Can anyone tell me which one I download from git? I don't have a video card.
Download what? You asked the same thing in #🔍│help-w-okada
Okay, it worked
Why does my browser say the files are infected?
Probably doesnt understand where the download is coming from in a sense
So it gives a warning in case
ik hina's not working rn, is there a webui thats currently working without many problems? my pc is probably not good enough to do anything on my computer, i use a gtx 1660 ti.
Actually, your PC can run the Wokada deiteris fork
Are you going to use wokada for games? And if so, which?
im creating AI content trying to make a model for transforming instrumentals into beatboxes
i already got the model created now though, i just need to run it on my computer, its trained on applio rvc
looking at the documentation its actually remarkable thinking that i could use this for real-time voice changing. the only thing ive seen like this is that really shitty voice.ai app
ohh you were talking about hina mod mainline rvc ?
You need to elaborate always what you are using, hina created many things
nvm you mean realtime, so you were talking about the hina mod original wokada
then yeah, you got also the wrong channel
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
Let’s talk in #🔍│help-w-okada
after searching up just 'hina' i realized that a little too late haha ^^;; i was talking about the rvc one, just mentioned realtime because i saw it come up in what i was reading 🙏
assuming i cant use wokada for ai covers or changing instrumentals to beatboxes (realtime), would any rvc fork still work with my GTX 1660 Ti? @low shard
for local conversion aka no realtime use applio
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
oh
please elaborate next time
I just tried to guess the most probable one
of course 🙏 my fault original gangster. i got you confused twice in a row because i wasnt explaining it right @_@
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.com: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio UI Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
applio ui works still if you use the localtunnel option instead of gradio #📰│dev-updates
check #📰│dev-updates
lol 😭
alright bet 🙏 thank you again for all the help
Saw it
You’re welcome, and let me know
oh lol
I just needed applio no UI to see the code of one cell
Is it ok if i put most of the dataset normal speech (20-30mins) instead of singing?
bad
damnn applio with my gtx 1660 ti is actually faster than it was in a colab
sad :-(
with applio i gotta separate all the audio first, right? its been a cpl years since ive used a normal fork that doesnt split the instrumental for you
you can upload a long file and applio will slice it for you
oh fr?? thats sick
does it keep the instrumental file?
yes. im using it to turn instrumentals into beatbox

it sounds cooler than you think trust me
i got a herbert the pervert model merged for beatboxing and it goes CRAZY
yes im aware this thing can clone instruments but still lol
guys i wanna start making songs using ai what platform or what should i download to do this and also do i put raw vocals or mixed and instrumentals or not ???
anyone know why my voice changer doesn't work??
don't try to infer unseparated mixtures
im using voice changer client btw
wrong channel
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
@broken crane elaborate:
- ur pc gpu
- what guide did u follow (I hope not those old ass youtube tuts)
- the issue
i got uvr5 and vip models, if you know, which ones would be best to download for splitting instrumentals and vocals? as well as removing echo and reverb
i ended up just getting what i know works from a while ago, de-echo and de-reverb by foxjoy and kim vocal 2

its a LOT faster on my computer than on a colab, i thought itd be slow since i got a gtx 1660 ti ngl but it takes 12 seconds for a 3 minute long audio file
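To put that number in perspective, 12 seconds for a 3 minute file works out to a real-time factor of about 0.067. A tiny sketch of the arithmetic (`realtime_factor` is just an illustrative helper):

```python
def realtime_factor(processing_s: float, audio_s: float) -> float:
    """Processing time divided by audio length; below 1.0 is faster than real time."""
    return processing_s / audio_s


rtf = realtime_factor(12, 3 * 60)
print(f"RTF = {rtf:.3f}, about {1 / rtf:.0f}x faster than real time")
```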
also, is there a local version of the AICoverGen? id like to get that one as well for when im not doing the beatbox thingymabobber
hello, it's just me or aicovergen got error on google colab?
yeah, seems aicovergen is pretty chopped ^^;;
i couldnt even get it working locally
#📰│dev-updates / #✨│announcements currently a lot of colabs and stuff that uses gradio are broken rn tho
yeah, I tried locally too but same result as using colab 😦
i tried another one and this one works locally and has a colab available incase you wanted to try it instead:
it has youtube links working again too ^^ woot woot
Thank you, I might try this one
of course
gl gl
Hi, can you tell me what files I need to upload to share a model I have trained with other users?
the .pth and .index, also look at https://docs.aihub.gg/extra/model-maker-role/
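A quick way to double-check a folder before uploading, as a minimal sketch. It only looks at file extensions; `missing_model_files` is a hypothetical helper and the folder layout is an assumption, so adapt it to wherever your trainer saved the files.

```python
from pathlib import Path

# The two file types named in the answer above.
REQUIRED_SUFFIXES = {".pth", ".index"}


def missing_model_files(folder):
    """Return the required extensions that are not present in the folder."""
    found = {p.suffix.lower() for p in Path(folder).iterdir() if p.is_file()}
    return sorted(REQUIRED_SUFFIXES - found)
```

An empty list means both a `.pth` and a `.index` file are present; otherwise the list names what is still missing.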
Last update: October 20, 2024
the great majority of google colabs are broken, read #📰│dev-updates
can someone help me, when in a call and my voice changer is on, whoever is talking can hear themselves through their voice
@low shard ??
I told you before, this is the wrong channel #✨│ai-help message
Also, you need to elaborate: #🔍│help-w-okada message
You can't expect me to help without any type of info
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
Is it hard to get model master? Because i feel like if I don't get it, it's because of my dataset cleaning process.
Oh also, how will I go about knowing how I did? Like will someone message me?
you only need 3 hq models
idk how it will let you know, it's a new thing
Yeah last time I tried it was through a discord bot lol
i wanna know what is the best pretrain model to train a voice on.
This answer might not satisfy you but, there's no universal winner.
Each was trained a lil differently, different people behind it, different settings and different datasets
So one might fit your model ( or rather, dataset / voice / speaker, whatever ) better
Original pretrains, klm hifigan or those experimental refinegan ones are what I personally can recommend.
Best for you to just try and see
But as always, it's a good habit to start with original ones and only try customs if the results aren't good enough ( and after you made sure you've exhausted your options; aka, it's not user-error )
like i've tried original, ov2super and rigel, since my data has like mixed languages. i tried it on different voices and like it worked but it had tearing or artifacts in it.
i dunno what do i do to make my dataset to sound better
so other than those 3 pretrain are there any which has like low artifact rate and can produce a better voice model quality
and also do I use the rvc disconnected colab (as it says it's outdated) or the mainline colab?
the latter two are old and rigel is a failure overall (unless they could use better training configurations and train more epochs under some powerful H100s)
@kindred eagle u r ignoring this suggestion
🤔
nope nope had tried the orignal as well
will try ov2 then compare
and then do the rest............
Well, you can start right away with ov2 and/or klm if you want
but you see, people at times go for what seem " the best " or is the most recommended
without testing stuff, and AI is, well, it ain't deterministic in that way
However, I wouldn't bother with other models than those I wrote about
Most if not all customs at the time were trained on fp16 with exploded gradients ( simply put, they aren't that stable )
hmmmmm, alright then i will actually compare the 3 models you have mentioned, compare them and will proceed with what i think is the best among them.
Yup, the right approach
ty for the suggestions @glacial pollen
quality needs some decent work, that's just how it is
Yea np man
best of luck and take your time
true
Just to encourage you to not give up, my best model took me a few months ~lmao
Ofc let's not go that drastic, just saying
cause some people train 1 or 2 models and give up, quite sad seeing it happen
yeah i've trained a couple of models
but like some turned out crazy good but some had them annoying artifacts
Yea like, I know it can be exhausting to be going through various batch_sizes and pretrains, but when it works, it's worth it
lots of things contribute into artifacts to be honest
it is worth it tbh
yup
yup i didnt know that you could actually do a lite version of phantom centre extractor in audacity, so the reverb that was there in the audio after the dereverb process actually F'ed up the model quality
oh, well
for de-reverbing I can only really recommend vx's dereverb
tho, yea, it ain't free and is a vst ( ai powered however )
ahh i use the uvr dereverb and denoise it works 75% of the time but yeah that remaining reverb in the audio............aaaaaggghhhhh
Yea, the models aren't the best at certain reverb types, esp those minimal room ones
If you are skilled enough, you can manually yeet them, or at least tame the trails / leftovers
goes like this
yup
and yeah is spectralayers like good or meh
That's rx
not a good idea to directly do center extract, you should first remove post process reverb (many are stereo, some are mono), and then remove the remaining one that's mostly mic room reverb using RX11 dialogue isolate
Spectra layers is decent but, if I had to choose the winner, it's rx
yes i do know the process
Dialogue isolate can damage the audio so, better to be careful
It's far from what I'd call reliable
alright gotcha uninstalling spectralayers
well not really, if you're handy in it, you can use that with no issues
but I just prefer rx
nahh like it crashes 7 times before i can actually work on it. so its better that i uninstall it
specifically, I really like working on spectrograms in rx
change the scale, zoom in, feather if needed, work repeat
oooooooooooooooo
Aaaa
it is mostly some breaths
I've never had any luck with it when it comes to anime type reverb
at best, it'd always castrate the audio
decrease the fullness or screw up the respiratory range
anime reverb? anime has reverb in it? since when or am i dumb to not notice it?
ahh i c the toji model i trained.......... hmm
the stereo ones are quite easy
thats why it was kinda F'ed up but usable
The issue with stereo vs my workflow, is the fact stereo has 2 channels and they are never uniform
I extract stereo, operate on 1 channel, then de-reverb it in mono
100% predictable
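The "extract stereo, operate on 1 channel" step can be sketched with the Python standard library. This is not the workflow's actual tooling (that is done in an editor like rx); `extract_channel` is a hypothetical helper and it assumes a 16-bit PCM WAV.

```python
import array
import wave


def extract_channel(src_path, dst_path, channel=0):
    """Copy one channel of a 16-bit PCM WAV into a new mono WAV.

    Returns the number of frames written.
    """
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        if src.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        if not 0 <= channel < n_channels:
            raise ValueError("channel index out of range")
        rate = src.getframerate()
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))

    # WAV frames are interleaved (L, R, L, R, ...), so a strided slice
    # picks out every sample that belongs to the chosen channel.
    mono = samples[channel::n_channels]

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)
        dst.writeframes(mono.tobytes())
    return len(mono)
```

Keeping a single channel instead of averaging left and right avoids mixing two non-uniform reverb tails together, which is the point made above.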
yup i started doing that recently
Another thing is, vx lets you finetune the de-reverb to your needs
better quality and prediction tbh
- you tame the rest in rx
and the results are perfect
much more perfect than any automation / models can give you
Why is that? because it doesn't get 100% of it, it expects the user to handle a bit of it
the mono dereverb one is quite tougher for me, esp when there are some breaths
Yea the breaths can get damaged sometimes, but you can always just layer the tracks
and manually de-reverb the breath
it's just some feathered selection yeeting / enveloping
But ye, I get that. Everyone has their own workflow they recommend.
That's why I recommend mine, which is vx's de-reverb pro mono + manual polishing in rx

sorry im really late to all this but if someone could point me how to use refinegan pretrains in applio im just stuck here
refinegan is experimental
there’s no stable version for it yet
the only way is via using the main branch source code
where do i download pls 🙏
I think it's on GitHub?
hmm ok
code > download zip > extract > run install
@jaunty iris what’s ur pc gpu tho
oh ok so not the precompiled 3.2.8 got it
i just upgraded everything im on 4070 super 12gb 
it’s the stable release, this is the main branch source code which could be not stable
its experimental
Great
understood
but yea been wanting to try training again and now i can do local 😁
interested in how this goes i was gonna test w KLM5
still not working sadly 💔 @low shard i went back to make sure i had my dataset file at the same sampling rate as the pretrains but now i genuinely dont know what to do 😭
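If you want to verify the sampling rate of a dataset rather than eyeball it, something like this works for plain PCM WAV files. It is a sketch only: `wav_sample_rate` and `mismatched_files` are made-up names, and the expected rate has to come from whichever pretrain you picked. Python's `wave` module cannot read 32-bit float WAVs, so those will raise an error here.

```python
import wave
from pathlib import Path


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a PCM WAV header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getframerate()


def mismatched_files(folder, expected_rate):
    """List (filename, rate) for every .wav whose rate differs from expected_rate."""
    mismatches = []
    for path in sorted(Path(folder).glob("*.wav")):
        rate = wav_sample_rate(path)
        if rate != expected_rate:
            mismatches.append((path.name, rate))
    return mismatches
```

For example, `mismatched_files("dataset", 40000)` returns an empty list when every file matches; the folder name and the 40000 are placeholders.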
you seem to be still using the latest stable release and not the source code one
@jaunty iris did you miss this step?
you’re welcome
I’m currently using RVC for voice cloning, but I’m curious if there are any better apps out there that might do a better job. Sometimes, I feel like the slider values in RVC don’t work as well compared to the online RVC forks.
Please don’t judge me, but my GPU is a 1650 mobile. Any suggestions or experiences you can share would be greatly appreciated!
we gonna make 3.2.9 release with refinegan disabled, just to push other changes
right now even the main branch has it disabled in the repo
huh why’s that?
which rvc are u using
kits.ai uses RVC too, the major difference is it's easier to use and automates processes like separating vocals and instrumentals that u can do urself
they have turned greedy and shady, and even before that they were gatekeeping models trained using the service from being downloaded locally
^
just to make a build