#✨│ai-help
basically the thing Noobies said
oh, what did you remove what you showed me in the image?
--branch is an option to clone a specific branch, you want the main branch, not 3.2.8-bugfix, --single-branch is an option to clone just one branch
hello, i just finished training and am not really sure which step i should use as the final model. it looks sort of different compared to the other graphs of models ive trained.
dataset is small, like 22 minutes
if you want main, you need to remove both options
total being at 48 is somewhat high
take 55k step model and give it a try
thank you!
Sorry, I still don't understand
remove the highlighted part
and after that?
proceed as normal
and where can I load the KLM 5 pretrain and RefineGAN?
i've been making my audios at 44k and i use the 40k rate in the colab, does this heavily affect the model? i tried to change the rate to 40k or even 32k but the audio turned slow and low pitched, does that even matter? do i make the audio match its original speed and pitch or do i need to redo everything again? sorry for these dumb questions, i've been making models from time to time since 2022 and i never stopped to think about this
sorry, looks like the colab code needs to be updated too
not my jam, so cant help with that
are you talking about inference? training?
alright my bad, i didn't explain properly, i mean that i do my audios with audacity and i use 44k by default for everything, the colabs are usually 32k, 40k and 48k, and i want to know if this affects the model, making it bad
and if i simply change the rate here, the audio turns slow and low pitched, and i want to know if that matters, or if i need to pitch the audio back up to its original speed
it is a sampling rate, export audio from audacity and pick the sampling rate there
do not change the sampling rate in audacity itself, it simply changes the playback speed
by default RVC preprocess resamples the input audio to the desired SR
so dont really need to bother with changing it in audacity
oh, i'll wait
I think it should be usable on UI colab
but noUI needs parameters passed over
Yes I want to use the UI, since I want to try out the new things it brings.
What setting are we talking about here?
changing the rate here doesn't resample it, it actually speeds up/slows down the audio and changes the pitch, so changing from 32k to 48k will speed it up to 1.5x and raise the pitch by the same factor
for this case, you should resample it to the target sample rate (40k)
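A rough sketch of the arithmetic behind the two messages above, in case it helps: relabeling the rate (what the track-rate dropdown does) changes speed and pitch by the ratio of the two rates, while a real resample keeps the duration and only changes the sample count. The function names here are made up for illustration and are not part of Audacity or RVC.

```python
import math


def relabel_effect(original_rate_hz: int, new_rate_hz: int) -> tuple[float, float]:
    """Effect of relabeling the sample rate WITHOUT resampling.

    Returns (speed_factor, pitch_shift_in_semitones).
    """
    speed = new_rate_hz / original_rate_hz
    semitones = 12 * math.log2(speed)
    return speed, semitones


def resampled_length(num_samples: int, original_rate_hz: int, target_rate_hz: int) -> int:
    """Sample count after a true resample; the duration in seconds is unchanged."""
    return round(num_samples * target_rate_hz / original_rate_hz)


# The 32k -> 48k case from the chat, and the 44.1k -> 40k case from the question.
for old, new in [(32_000, 48_000), (44_100, 40_000)]:
    speed, semitones = relabel_effect(old, new)
    print(f"{old} Hz relabeled as {new} Hz: {speed:.3f}x speed, {semitones:+.2f} semitones")
```

A factor below 1 (like 44.1k relabeled as 40k) means slower and lower pitched, which matches what was described in the question.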
Anyways i couldn't make the model yet, got a blackout on the first try and on the second try the colab GPU stopped working or something like that, so i guess i'll try tomorrow
The pitch
okey so do i have to use speech to speech for good result cause tts isnt giving me much of a good result
rvc isn’t made for TTS
it’s speech to speech
applio just uses ms edge to make an audio from tts, then uses it as an input in rvc
is it possible to train on colab anymore? i used to do it before, any link?
Stuff that are in #📰│dev-updates should work
thanks but i dont see any tutorial links? it looks so different
(Do i use Applio No UI or with UI for training?)
Hi guys
I have a question
Does Applio have any way to continue the model after the first training?
Because this was in RVC Disconnected, so I would like to know because I always used this function to create my models.
you can resume training
Which UVR model isn't too heavy on lossy audio? Trying to retain the bitrate instead of clean vocals
no it's recommended to use the lossless flac/wav format
I'm afraid I phrased that poorly, but some UVR models tend to knock down the quality of the files a notch and I'm trying to find which ones aren't too rigorous with it
Some guy gave me a few tips but the message was deleted
Does applio no UI pass new vocoder parameters
How can I do this? Could you help me?
where's the deleted message? I didn't even see it logged
@knotty moth Was from a while ago, around this timestamp
just start training again, do not run preprocess/extract features,
add more epochs, use same batch size
they are not on the screen, so no
Guys anybody from tech field I am joining clg next year which course is best
I don't recall that post in 2023, perhaps you mean the lossy source and that none of the separation models are made to reconstruct the missing high end frequencies due to the lossy format
So KLM 4.0 32Hz refinegan?
Okay! ty u
Yes
I'd like to reduce its effects as much as possible, hence my question
Hi
im trying to mix 2 models on applio but i have an error, do you know why ?
it must be two model pth files (the actual model, not the G/D files nor the index files)
what effects to remove? unfortunately don't expect good ones other than the dereverb/echo and denoise models
Thanks
even the audio upscaler/lossy enhancer models are still far from perfect enough for dataset making
Separating vocals/instrumentals as well
I'm unsure which ones don't reduce as much of the audio quality
this gives me a single pth file by mixing the two pth
What do I do next? I need an INDEX file to make a new voice
there's no index mixing
refer to the recommendations https://docs.aihub.gg/rvc/resources/dataset-isolation/#the-best-models-for-uvr-are
Do they ensure the best audio quality?
Ok, in fact I'm trying to create a realistic French woman's voice, but there's not much choice in French voices..
you can use an index from any french model
the index file affects what? intonation?
pronunciation
english audio inferred using a french index will sound like french person speaking
with an accent
Ok i see
does using the original https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI (local-only RVC application) for conversion make any difference compared to other forks?
I did a clone of Emma Stone’s voice using Mangio RVC and another cloning on a paid site. On the paid site I provided only 1 minute of original audio and cloning was done quickly. On Mangio RVC I used 2 minutes of original audio, and 200 epochs. Still, the cloning on the professional site is better. The voice is more similar, and includes the actress' hoarseness. Why would it be?
ok I have a problem with the software the file called RVC GUI.bat does not launch even after a long wait what should I do???
can i ask what link that is? i used to have that link but i forgot it
the link for the website
use applio instead
https://docs.aihub.gg/rvc/local/applio/
why not is it an alternative?
RVC GUI is outdated asf, you prob watched some yt video on how to make ai covers
what's even ur pc gpu and what u want to do
Mangio RVC is abandoned since 2023
that's the mainline /original one, there's applio which is more user friendly and got some other features, but it's ur choice on what to use
Thanks!
are u looking for training models? what's ur pc gpu
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
where i make ai cover with web ui?
Yes. My GPU is kind of old, NVIDIA GeForce GTX 1650.
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
Great! I will take a look at those options.
How do I put a KLM pre-train in Applio UI?
I have the links for huggingface
Should I download and upload them like this? Or can I download them directly from Applio?
in the download section it is not possible right?
Should the folders be separate? I'm already downloading klm 5 32k.pth
G and D have unique names
So I have to upload it separately, I thought they were in a separate folder, and it goes directly as I understand it.
Noobies, where do I upload the route now?
just use the screen
How can I train a ai model with nothing but a MP3 file?
I want a cloud mp3 to rvc if possible
I can pay for this just need it
It won't let me use refineGAN in Applio UI
to do your "mp3 to rvc" sorcery, you should learn everything in https://docs.aihub.gg/essentials/how-to-make-voice-models/
In the context of RVC, the dataset is an audio file containing the voice the model will replicate. It can be either speaking or singing.
How do I transfer the auto backup to another Drive account and continue retraining in the other account?
- make sure you did not accidentally select an empty space with the mouse cursor in the applio window, press Esc a few times
I resolved this and forgot to delete the message, my bad
How do I make AI sing?
What's ur PC GPU
https://discord.com/channels/1159260121998827560/1163571683848900629 , also Colab isn't suggested unless u pay for pro, lightning.ai is better and u can find more info about it on my GitHub,
Also this isn't the right channel to ask in
Hi, is there a way to maintain the original style and specifics of the original voice from the song (effects, chorus, etc.) and change only the sound of it to the one from the trained model?
wdym
Can you elaborate? I am still new in this 😅
wdym is an abbreviation for what do you mean
i don’t understand the “change only the sound of it to the one form of trained model” part
ok, I want to convert the voice in the song using custom trained model using RVC. But, when I do this I lose the original voice qualities and effects, so I am wondering if there is a way to maintain those when converting.
Lalal.ai can do this (there is an option for that) but they don't support training custom voice.
Does this happen after extracting the vocals or only after doing inference
because lalal.ai seems to only extract voices, not doing ai covers
What is the best method to create ai covers locally?
lalal. ai has voice changer: https://www.lalal.ai/voice-changer
I would like to achieve similar effect on RVC with my custom model as the conversion in Lalal.ai with options:
keep original accent
and
keep original tone
what’s ur pc gpu
Nvidia GTX 1060 6gb
I think u have to play around with the pitch, which is the tone, and can depend based on original voice and model
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
Hello, may i know how to get the desktop client? The one that popped out is website client(Fork). Sorry if this is dumb question.
ok, thanks. I Will try
I think you’re talking about wokada, this is the wrong channel, RVC is for inference on prerecorded audios and train models, while wokada uses rvc for realtime inference
there’s 2 main versions of wokada
The original made by wok
The modified version made by Deiteris
Wokada uses a web user interface, it’s all normal it opens ur browser
for other things pls ask in #🔍│help-w-okada
i see, thank you.
applio needs internet locally?
no, only for TTS since it uses ms edge tts api to make an audio, to then use as an input for rvc, since rvc is STS
I need it to separate vocals from audio and make AI covers without using the internet
Applio is for inference (use rvc models) on pre recorded audios and training (making) models
and tts, but that’s only with internet, the other things are offline
about separating vocals and instrumentals, you need another program
Which?
either local uvr5 https://docs.aihub.gg/rvc/resources/dataset-isolation/#local-uvr or @viscid moss ’s UVR5 UI https://github.com/Eddycrack864/UVR5-UI
I am installing this https://github.com/eddycrack864/uvr5-uii, what can I do with this?
separate vocals and instrumentals
perfect
Does anyone know why Applio UI does not generate index?
show the log
you f'd up
It doesn't see the audio
"Preprocess completed in 0.00 seconds on 00:00:00 seconds of audio."
what’s inside ur dataset.zip
what did u use to make it
You mean me?
he was replying to pain's message
Oops wrong ping
I uploaded the audio normally, without compression
check the structure of the dataset folder and files, or maybe try the "dataset maker" tool
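If the dataset really is a zip, a quick way to see whether it contains any audio at all is to list its entries. This is only a sketch: the extension list is an assumption, and `audio_files_in_zip` is a made-up helper, not something Applio provides.

```python
import zipfile
from pathlib import PurePosixPath

# Assumed set of audio extensions; adjust to whatever your tool accepts.
AUDIO_EXTENSIONS = {".wav", ".flac", ".mp3", ".ogg", ".m4a"}


def audio_files_in_zip(zip_source) -> list[str]:
    """Return the names of archive entries that look like audio files.

    zip_source can be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if not info.is_dir()
            and PurePosixPath(info.filename).suffix.lower() in AUDIO_EXTENSIONS
        ]
```

Calling `audio_files_in_zip("dataset.zip")` and getting an empty list back would be consistent with a "00:00:00 seconds of audio" preprocess log, though other causes (wrong folder, unsupported format) are possible too.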
I 've created a voice model in weights.com, but it comes with pth and json files. So I can’t run it in applio. Is an index file absolutely necessary to run a model in applio? Can we get an index file from weights.com?
index is optional, there's no issues running weights.gg model in applio
Thanks, thanks. This is strange. I can run in applio models I download here and models I create in applio itself. But the execution of the one from weights.gg gives an error message. I will try to restart things here.
what error message?
I restarted the computer, and now it's working. Thanks
An error icon appeared in applio. I didn't manage to copy the message in cmd.
-hf
- UVR5 UI, by Eddy and Ilaria Huggingface Spaces
- Ilaria RVC Zero, by thestingerx Huggingface Spaces
- RVC⚡ZERO, by r3gm Huggingface Spaces
- Applio, by IA Hispano Huggingface Spaces
got gtx 1650 and i5 10300H
I use gpu for rvc but the voices are not smooth, they become robotic or break down, any suggestion on how to get it smooth
what tut link are u using btw? and that might depend on how the model was trained
tut link?
yeah like what tutorial are you following / program using?
I tried a few different models and tried changing the software settings (idk any settings, just randomly changing) but it's the same
vcclient win cuda something
that's not RVC
that's an old version of the original wokada, prob from a youtube tutorial, which is outdated
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
@copper terrace are you looking for realtime voice changer for calls right?
I'm looking for vc and mainly recording
so, you need both converting audio files and realtime ?
yes
then you need both RVC and wokada
the recording in the vcclient is also bad / same as realtime
not just wokada, wokada is made for realtime
well... this is confusing lol
RVC = for converting files
Wokada = realtime
Let's start with RVC:
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
I will tell you which Wokada to use in #🔍│help-w-okada , the newer wokada has also better quality and performance
I have gtx 1650, will rvc run good on it?
it will run good, won't be fast as cloud but you won't have any time limit
I have linked you everything you need for RVC in that message
thanks
the model will be the same dw. I will tell u that rq in #🔍│help-w-okada
ya got it thanks
it's recommended to use the non-realtime rvc as you're not doing live performance and want better quality
I was confused that you were replying to me but good thing it was an accident
Whenever I click the start button on the voice changer stuff it says quote "waiting generate pipeline" then it says it fails saying "Pipeline not initialized"
why does the ai smother a cuss word lol
Help
Autobackup Enabled
The tensorboard extension is already loaded. To reload it, use:
%reload_ext
Starting backup loop...
tensorboard
Files are up to date.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1739398664.153193 3622 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1739398664.162762 3622 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
No wav file found.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1739398670.200721 3673 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1739398670.207588 3673 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py:558: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
Starting training...
The parameters of the pretrain model such as the sample rate or architecture do not match the selected model.
An error occurred extracting the index: need at least one array to concatenate
If you are running this code in a virtual environment, make sure you have enough GPU available to generate the Index file.
the only meaningful error is "The parameters of the pretrain model such as the sample rate or architecture do not match the selected model."
RVC is not Wokada
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
Share ur PC GPU, and the link to the tutorial you followed in #🔍│help-w-okada
Ah I apologize
It's fine, share the info I said in that channel
hi im trying to find the most realistic rvc voice models but ones i find online is not always the best quality. Can someone help me?
most people don't know what they are doing. I assume you want to change your microphone voice into another voice. If so, search for help in #🔍│help-w-okada .
Furthermore why am i always being pinged 😂 ?
ok, I'm in the right area
is the client working? I'm stuck on "wait for web server"
I'm a man. The improvement of RVC v2 over original RVC is most likely the quality.
For W-Okada the realtime audio changer, go to #🔍│help-w-okada. This channel #✨│ai-help here is about RVC and RVC programs.
bet ive been using Ilaria RVC for over a year now 😢
Which Ilaria RVC? The one from Hugging face or GitHub?
not a good idea to ping a random person and as said above pls go to #🔍│help-w-okada
You know, you should never ping a random user you don't know.
the huggingface zerogpu one is still good
Is Easygui crashed
On Google Colab?
Yeah
Can you send the screenshot of it?
You know, you should never use a user copy version of a Colab notebook, as it might have been outdated compared to the main and original one.
And original too
What are you gonna do with RVC? Training a model or doing AI cover? There are some better alternative notebooks to this.
Training
if you copy the notebook, you should know how to resolve the package conflicts/errors by yourself
Wait a second
Applio the RVC can do that.
has RVC been made more efficient to train locally with a shitty gpu or do you still need a good one or to use google colab
depends on gpu
6 GB vram as bare minimum, or 8 GB as recommended
GTX 10/16-series gpus may work but RTX 20-series/newer are recommended
ty!
You can train a voice model on GTX 10/16 series GPU. Just don't expect it to finish that fast.
Any GPU that's older than GTX 10/16 GPU, you'd better avoid that for AI.
I tried before I used the copy version
That's the same screenshot you sent to me. Since there's a bug going on with EasyGUI notebook, and you have no idea, we have no idea how to solve it, you may try Applio instead.
Ok
can someone help me figure out which settings are the best to use
Wrong channel
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
Use #🔍│help-w-okada and explain ur PC GPU and CPU and the link of the tutorial you're following
It depends on the model and the voice in the audio, u usually just need to play around with the pitch and index ratio
#📰│dev-updates that colab is broken
- pitch: up to +12 for male to female, or -12 for vice versa, otherwise leave it 0
- index rate: if not sure or want to use the model's accent leave it as is
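For anyone wondering why 12 is the number: the pitch setting is in semitones, and 12 semitones is one octave, which doubles (or halves) the frequency. A tiny sketch, with a made-up helper name:

```python
def pitch_ratio(semitones: float) -> float:
    """Frequency multiplier for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


for shift in (-12, 0, 12):
    print(f"{shift:+d} semitones -> frequency x{pitch_ratio(shift)}")
```

So +12 moves a voice one full octave up and -12 one octave down; values in between are worth trying when the source and target voices are closer in range.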
yoo guys What is the best program/modulator for changing a male voice to a female voice? It doesn't have to be in real-time. I've used Applio before, but maybe there's something better since I haven't kept up with this topic for a while.
RVC V2 is the best
Applio is a fork (modified version) of RVC
So nope there's nothing better, the quality depends on how the model was trained
HELP!
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\DT Ryan\AppData\Local\UVR5-UI\env\lib\site-packages\gradio\queueing.py", line 625, in process_events
response = await route_utils.call_process_api(
File "C:\Users\DT Ryan\AppData\Local\UVR5-UI\env\lib\site-packages\gradio\route_utils.py", line 322, in call_process_api
output = await app.get_blocks().process_api(
File "C:\Users\DT Ryan\AppData\Local\UVR5-UI\env\lib\site-packages\gradio\blocks.py", line 2088, in process_api
result = await self.call_function(
File "C:\Users\DT Ryan\AppData\Local\UVR5-UI\env\lib\site-packages\gradio\blocks.py", line 1635, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "C:\Users\DT Ryan\AppData\Local\UVR5-UI\env\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "C:\Users\DT Ryan\AppData\Local\UVR5-UI\env\lib\site-packages\anyio\_backends\_asyncio.py", line 2461, in run_sync_in_worker_thread
return await future
File "C:\Users\DT Ryan\AppData\Local\UVR5-UI\env\lib\site-packages\anyio\_backends\_asyncio.py", line 962, in run
result = context.run(func, *args)
File "C:\Users\DT Ryan\AppData\Local\UVR5-UI\env\lib\site-packages\gradio\utils.py", line 883, in wrapper
response = f(*args, **kwargs)
File "C:\Users\DT Ryan\AppData\Local\UVR5-UI\env\lib\site-packages\gradio\utils.py", line 883, in wrapper
response = f(*args, **kwargs)
File "C:\Users\DT Ryan\AppData\Local\UVR5-UI\app.py", line 293, in roformer_separator
raise RuntimeError(f"Roformer separation failed: {e}") from e
RuntimeError: Roformer separation failed: [WinError 2] El sistema no puede encontrar el archivo especificado
Local UVR5-UI by @viscid moss
why does using my main microphone input for the https://rentry.co/forkvoicechangerguide#opening-on-windows
decrease my pc audio quality, like during discord call and everything, is there anyway i can prevent this??
and acer purified voice keep saying
"the conference call has started"
everything revert back to normal after i turn it into "none"
weird
hi, please use https://discord.com/channels/1159260121998827560/1159290161683767298 channel for voice changer
no problem
I'm going to switch from applio to e2-f5-tts through pinokio, wondering if my setup is good enough to handle it, CPU: Ryzen 7 4800H, 16 GB RAM, GPU: GTX 1650
What were you trying to do when that came up? I think the problem is where you installed it, leave it in Downloads, not in that AppData folder
can someone help me
im trying to open w-okada but when i clicked start_http.bat it only opens cmd and then immediately closed
Elaborate
Oh you're following an old YouTube tutorial prolly
You're using the original Wokada which has lower quality and performance than the Wokada Deiteris fork
the tutorial is 5 months old...
what's the new one
Also, rvc isn't Wokada
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
First, tell what u want to do, and ur PC GPU and CPU
just doing some AI voice changer, my GPU is RTX2060 and CPU is Ryzen 5 3600
Alright good setup, I'm guessing real-time so we have to talk in #🔍│help-w-okada since RVC ≠ Wokada
ok
Hi, I have a problem generating the index file after training my model. I'm using Applio through Pinokio. After training the model successfully I can't generate the index file, and I've been stuck on this for a few days. I tried reinstalling Applio and retraining my model but it still doesn't generate the index. If someone could help me that would be awesome 🙏 I'm running it locally with a 4080 Super and here is the error message in the logs: An error occurred extracting the index: need at least one array to concatenate
If you are running this code in a virtual environment, make sure you have enough GPU available to generate the Index file.
I didn't have ffmpeg configured in PATH. After configuring it and trying again, a download started, which I assume is the model I chose for the separation
Yep, it downloads the model first and then uses it.
And did it work? Or is the problem still there?
Yes, it works now. I didn't have ffmpeg in PATH
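Since the fix here was ffmpeg missing from PATH, a quick check like the sketch below can rule that out before digging through a traceback. It only uses Python's standard `shutil.which`; the helper names are made up for illustration.

```python
import shutil
from typing import Optional


def find_executable(name: str) -> Optional[str]:
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


def ffmpeg_status() -> str:
    """Human-readable summary of whether ffmpeg is reachable from PATH."""
    path = find_executable("ffmpeg")
    if path is None:
        return "ffmpeg was NOT found on PATH; install it or add its folder to PATH"
    return f"ffmpeg found at {path}"


print(ffmpeg_status())
```

A "[WinError 2]" (file not found) coming from a separation step is at least consistent with a missing ffmpeg, as it turned out to be in this case.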
Nice
you did not train anything... meaning you did not split audio, and all it trained on is two mute files
hey , i did not split audio in the options because i did it already myself , i have my dataset already split and processed so i did not chose any of those options in the applio settings , the model did train however , i have the pth files
how big are the chunks? it does not load anything over 9s and training with 5s+ chunk is a waste of electricity
431 MB, 300 epochs
i'm not asking that
how many files are in sliced_audios
and what is in model_info.json?
"total_dataset_duration":
3959 files (around 6h of audio)
probably too big
you dont need 6 hours for finetune
had it worked it would take ~5m/epoch x 300 epoch = 25 hours to train
did it take 25 hours? or did it take 10 minutes?
No it took around 1.5 hour to train with a batch size of 16 , also i did not put the audio files in sliced_audio
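The back-of-the-envelope estimate above is just minutes per epoch times epochs. A sketch of it, using the ~5 min/epoch figure quoted in the chat (an estimate from the helper, not a measured value):

```python
def estimated_training_hours(minutes_per_epoch: float, epochs: int) -> float:
    """Total training time in hours for a fixed per-epoch cost."""
    return minutes_per_epoch * epochs / 60


expected_hours = estimated_training_hours(5, 300)
actual_hours = 1.5  # reported in the chat
print(f"expected ~{expected_hours:.0f} h, actual {actual_hours} h "
      f"({actual_hours / expected_hours:.0%} of the estimate)")
```

A run that finishes in a small fraction of the expected time is a hint that far less data was actually loaded than the dataset size suggests.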
sliced audios is created automatically by preprocess
anyway, your files were too big to fit into the training buckets
so it discarded most if not all of your data
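To check a pre-sliced dataset for chunks that are too long, something like this sketch works for plain PCM WAV files. The 9 second limit is taken from the message above and treated as an assumption here; the helper names are made up, and the standard `wave` module will raise an error on non-PCM (for example 32-bit float) WAVs.

```python
import wave
from pathlib import Path

# Limit quoted in the chat ("does not load anything over 9s"); an assumption, not a verified constant.
MAX_CHUNK_SECONDS = 9.0


def wav_duration_seconds(path) -> float:
    """Duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def overlong_wavs(folder, max_seconds: float = MAX_CHUNK_SECONDS) -> list[tuple[str, float]]:
    """Return (file name, duration) for every .wav in folder longer than max_seconds."""
    results = []
    for path in sorted(Path(folder).glob("*.wav")):
        duration = wav_duration_seconds(path)
        if duration > max_seconds:
            results.append((path.name, duration))
    return results
```

Running `overlong_wavs("path/to/dataset")` on the folder before training lists the files that may be dropped; letting the built-in slicer cut the audio avoids the problem entirely.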
I see , thanks for helping man , so what amount of audio is enough to have a good model ? i also have a metadata.csv in my data set but im not sure its needed with rvc ?
The thing is that the slicer doesn't cut the audio properly and it cuts in the middle of sentences
does not matter, it does overlaps
30-60 minutes requires only batch 4 .. 8
may require some experimenting depending on the dataset
All right , so i cut the big 6 hours audio file to lets say 45 minutes and then just let applio slice the files , it does not need the .csv with the transcripted text right ?
Ok im going to try that , thanks for helping man 🙏
Is it ok to have the 45/60 min audio cut into 2 or 3 audio files or is it better if its a single one ?
Thanks mate
i can't find RMVPE help me
Holy shiet you're running an ancient version
Probably from YouTube tutorials
Tell ur PC GPU and what u want to do
First, tell ur PC GPU and what u want to do
4070 and learning voice
Sorry if I'm reviving an old post, but the discussion topic currently interests me. Let's assume I want to use the D/G weights from one of my models as pretrain for another (same vocal, but different LR), but I only have the latest D/G weights saved in my log folder. Is there a risk of overtraining? Or should I save all the D/G weights during the first training to use the D/G weights that correspond to the epoch that seems most qualitative to me as pretrain?
@low shard NVidia RTX 4060 Ti
Alright good enough to make models
Also take a look at https://docs.aihub.gg/essentials/how-to-make-voice-models/
usually not a good idea as the model loses generalization
Can you elaborate, please?
you take a pretrain that has male and female speaking and singing, you train it on just male speaking
it retains some ability to sing high notes
but it forgets some of the training
it is more specialized now
then you re-train it on another male voice
this model would likely not be able to sing high notes at all
I am aware of the potential issues you mention. However, in my case, the second training would concern the same voice, with a different LR for the training, so the problem of lost training parts doesn't really arise.
if you resume training, it would use the previously saved LR
all you can do is give it a try and report back
I would not expect anything good coming out of it
Not necessarily: hence the approach of using the D and G from the previous training as pretrain.
But perhaps it may not be the right path to follow...
the more you train the model, the more it forgets the previous capabilities
in the end it would only be able to produce the source dataset and nothing else
Does anyone have any experience in training voices with heavy SFX? If I clean these kinds of datasets too much it will ruin the voice. Some robotic characters work well but others sound bad or don’t train well. I have had 1 train very well, 1 train ok, and 3 train very badly but can’t figure out why. I have decently sized datasets for 3 of them (10-20 mins) and large datasets for the 2 others (40-50 mins) yet that doesn’t seem to change how well or poorly they train. For the 2 large datasets models, one trained very well while the other trained very poorly. Both were distorted heavily by effects, yet had very different results. Anyone have any ideas?
ensemble means the combination of models? i was using MVSEP
Hi guys
I need help with Applio
I was going to create a model but his voice came out completely different from the dataset.
I don't know what I did wrong...😕
is this normal?
Uh, no
@simple ore Hey man , just passing by to say THANK YOU , i did what you recommend me to do and everything works , i have been able to train my model and continue working on my project i was stuck on for days , big thanks to you , God bless you 🙏
-gui
Of course, RVC GUI is very outdated. The last update for this one is from October 2023.
RVC and any other Python program work best locally if there's a GPU. What is your PC GPU?
a 4070
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
Of the RVC forks, Applio works best.
More like a whole GUI, while there's a console window background showing the status of program.
still it doesn't have rmvpe, even mangio had that in 2023
wait it says the file is 3000kb is that right?
That file size doesn't sound right.
typo?
nope
oh nvm i downloaded the wrong file
i found the actual file
which file was it?
The file size of Applio zip should be 3.9GB.
3.9 GB sounds right
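A quick way to catch a wrong download like the one above is to compare the size on disk with the size you expect. This is only a sketch: the 3.9 GB figure is the one quoted in the chat, and the function name and 10% tolerance are made up for illustration.

```python
import os

# ~3.9 GB, the size quoted in the chat (treated here as GiB, which is an assumption)
EXPECTED_BYTES = int(3.9 * 1024**3)

def looks_complete(path, expected_bytes=EXPECTED_BYTES, tolerance=0.10):
    """Return True if the file size is within `tolerance` of the expected size."""
    actual = os.path.getsize(path)
    return abs(actual - expected_bytes) <= tolerance * expected_bytes
```

A file of about 3000 KB would fail this check by a wide margin, which usually means the wrong file (for example the source-code zip) was downloaded.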
hello, um sorta new to rvc but I just downloaded https://github.com/litagin02/rvc-tts-webui?tab=readme-ov-file this one and I don't quite know how to launch it.
the activate bat doesn't do anything, nor does the activate file in the script
do i just need to install something else? i'm not sure how to install the tts web ui 
that is outdated
literally archived and hasn't had an update in 2 years
don’t watch youtube tutorials for this
and tell what’s ur pc gpu and what u want to do
RVC is for STS natively, it’s not TTS, another RVC fork that “does tts” is applio, which does the same thing, using edge tts api to make an audio then use it as an input in rvc
my gpu is a Nvidia rtx 3080 and i'd like to do tts or voice to voice if possible with my rig
so i'd be looking for applio then?
STS =Speech to Speech
alr yw
Application error: a client-side exception has occurred (see the browser console for more information) .
Hi sorry for the ping but i got an error while installing \Miniconda3\Scripts\conda-script.py" shell.cmd.exe activate "C:\AI\applio\Applio\Applio\env" Do i need to redownload miniconda?
tell me ur pc gpu and what tutorial link u are following
did u download the precompiled, unzip it, then run applo.bat?
the compiled for windows has been deleted i got the zip and unzipped it and ran the install.bat
i got the that zip from here. https://github.com/IAHispano/Applio
u used https://huggingface.co/IAHispano/Applio/resolve/main/Compiled/Windows/ApplioV3.2.8-bugfix.zip And not the source code zip, right?
think i used the source code zip
I think i'll download what you sent and let you know how it goes
the release says to use the specific compiled zip
I see where I went wrong I thought the windows was deleted and mistook the bug fix as simply that a bug fix. my mistake thanks again i'll let you know if all is good and running following the install guide on the doc
alright yw
got it to work now thanks again for all your help!
you’re welcome, for any issues let me know
help pls:
/content/FIX
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
/bin/bash: line 1: .venv/bin/activate: No such file or directory
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-8-60c07a872b64> in <cell line: 0>()
55 # Open a subprocess and capture its output
56 get_ipython().system('source .venv/bin/activate')
---> 57 process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, universal_newlines=True)
58
59 # Print the output in real-time
1 frames
/usr/lib/python3.11/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, gid, gids, uid, umask, start_new_session, process_group)
1953 err_msg = os.strerror(errno_num)
1954 if err_filename is not None:
-> 1955 raise child_exception_type(errno_num, err_msg, err_filename)
1956 else:
1957 raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: '/content/FIX/.venv/bin/python'
elaborate:
- ur pc gpu
- what u want to do
- what guide tutorial / google colab link u are using
Hey a 4060 RTX 8GB good enough for training voice models ?
So applio is better ?
- nvidia 1650
- i want to make an ai cover
- here's the colab link https://colab.research.google.com/drive/1u1brjK8IZt647UsbZuGYfW29oFM2I4tk?usp=sharing#scrollTo=7SsCCgb9Ycqs
i clicked the buttons one by one, put the path to an mp3 file in "song input" from my google drive and tried to click generate cover, i got these errors :((
I’ve used them both and I honestly prefer mainline
I don’t get the applio hype
it gets more updates and it’s more userfriendly, but it’s ur choice
because the author of mainline left the project to work on a new one (gptsovits), and since then he hasn't let any other dev add changes to mainline (it's been 1 year)
applio is mainline but updated basically
last update was 3 months ago
last applio update was a couple of days ago
ye
more updates and user-friendly
said update was fixing onnx exporting
which no one uses anymore bc the fork allows amd users to use pth
@viscid moss maybe u should check this
no real code update
also, ur pc gpu is good enough to do inference locally
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
But u can also just wait for eddy response if u want
I do not have permission to write on the pre-train models channel, but it sounds to me that KLM5.0 does not have enough English in the data. The results sounds great but they don't sound 100% fluent in English. Anyone else finding this?
oh, thank you so much!!
i also wanted to ask what program i could use to compile vocal and instrumental files?
use my base then
Yours is great actually. It is in comparison with your RefineGAN model that I made that comment 😉
Hmmm lemme try, I update it yesterday and it was working
I think you meant separate
Last update: Dec 24, 2024
I think it was a colab weird bug, can u try again? It's working for me
Will using UVR give me better audio? (I wanna run inference but my recorded audio is so shit)
I used Anevew Aggressive DeEcho Dereverb to remove reverb, and while the reverb is gone, the echo is still there.
I need to remove the echo. Which option would be better?
UVR's DeEcho – it seems to cut off at 17.5 kHz, so I'm not sure if I should use it
SUCIAL's DeEcho models – There are several of them, but I don’t know which one is the best. Do you have any recommendations?
Hello, any news about the compatibilty with RTX 5080 card ?
try manual installation but with torch upgraded to 2.8 nightly
might conflict with some other packages so you'd have to fix it
Does anyone know how Google Colab's Compute Units are calculated? Like for how many hours can it last on training voice model or for how many Epochs?
T4 GPU consumes 1.76 compute units per hour
that's from 2 years ago
not sure about now
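The compute-unit question comes down to simple division. A rough sketch, assuming the 1.76 units/hour figure for a T4 quoted above still holds (it was reported as about 2 years old, so check the current rate); the function names and the 100-unit balance are just for illustration:

```python
# Figure quoted in the chat for a T4; it may be out of date.
T4_UNITS_PER_HOUR = 1.76

def gpu_hours(compute_units, units_per_hour=T4_UNITS_PER_HOUR):
    """Hours of GPU time a given compute-unit balance would last."""
    if units_per_hour <= 0:
        raise ValueError("units_per_hour must be positive")
    return compute_units / units_per_hour

def epochs_in_budget(compute_units, minutes_per_epoch, units_per_hour=T4_UNITS_PER_HOUR):
    """Whole epochs that fit in the budget, given a measured time per epoch."""
    total_minutes = gpu_hours(compute_units, units_per_hour) * 60
    return int(total_minutes // minutes_per_epoch)

# Hypothetical balance of 100 units and 2 minutes per epoch (measure your own):
print(round(gpu_hours(100), 1))   # about 56.8 hours
print(epochs_in_budget(100, 2))   # 1704 epochs
```

Time per epoch depends on dataset length and batch size, so it has to be measured on your own run before the epoch estimate means anything.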
Google colab GPU time is random
It can be max 4 hours daily tho
It can depend on how much you're using it, how often you use it
Kaggle gives 30 hours weekly of better GPUs instead
How about a GTX 1650
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
or do you want realtime?
can a GTX 1650 run Applio to train a model smoothly?
eh, could train models but will be limited in batch size, dataset length and will be slower
I would suggest cloud
You can train a voice model locally with a GeForce GTX 10/16-series GPU. Just don't expect it to train very fast.
I forgot, when selecting the PTH file do I select the closest one to the overtrain point or the overtrain point itself? I read in the Applio guide that it says the closest, while the TensorBoard guide uses the overtrain points themselves.
it is really hard to reach overtraining
Hey does anyone know why RVC is stuttering occasionally when playing Marvel Rivals specifically? This is the only game I've ever had this issue with. The playback will go from sounding very crisp & smooth, to occasionally stuttering a lot or cutting out entirely.
I've run RVC on max settings in BO6, so I don't think it's a performance issue. I've also made sure that Marvel Rivals is the only game using my audio input (I use Voicemeeter with RVC). I've set my audiodg to High & affinity to CPU 2, but still same issue. This is the only game I have issues with. Has anyone who has encountered this found a possible solution?
Try some epochs around the ot point and keep the one that sounds best
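To make "some epochs around the ot point" concrete, you can shortlist the saved checkpoints nearest to the overtraining point and then compare them by ear. A small sketch; the function name and the epoch numbers are made up, and it only picks candidates, it does not judge quality:

```python
def checkpoints_near(saved_epochs, overtrain_epoch, count=3):
    """Return the `count` saved epochs closest to the overtraining point, ascending."""
    # Sort by distance to the overtraining point; ties go to the earlier epoch.
    nearest = sorted(saved_epochs, key=lambda e: (abs(e - overtrain_epoch), e))[:count]
    return sorted(nearest)

# Hypothetical run saved every 50 epochs, with the graph suggesting overtraining near 180:
print(checkpoints_near([50, 100, 150, 200, 250, 300], 180))  # [150, 200, 250]
```

Run inference with each shortlisted checkpoint on the same test audio and keep the one that sounds best.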
use voice changer fork
limit fps in the game
marvel rivals is a gpu heavy game
the nvidia driver chooses to give all of the gpu resources to the game, leaving almost nothing to the voice changer
I've tried capping from 240 to 144, but it still seems to have the same issue and at the same capacity. (The stutters are not better/worse regardless of which cap I try)
What's voice changer fork though?
aight thanks
- go to #🔍│help-w-okada and get the fork one
- fps cap to 60/120, lower game settings, enable DLSS/lossless scaling (without frame generation)
limit fps to 60, choose 1080p
all graphics to the lowest possible
that game is just very unoptimized
high graphics at 1080p ask for 8gb of vram 😭
Thanks everyone--I'll try each solution 🙂
Even with this if strange opens a portal you might drop to 40-30 fps 
Though, I'm not sure if I need to cap as low as 60. I've played some very heavy games at 140+ with great results (currently running a 4090), but I'll try to figure out if that's the issue--Marvel Rivals is the only one thats been tricky to get this working with
LMAO yeah 😭 
your gpu can do all of this, the thing is that the nvidia driver decides to leave all of the 4090 resources to the game
Sweet, so if I can find a way to prevent it from giving all its resources, I can potentially keep the fps higher than 60 (maybe) 😊
you can't, thats how the driver works
Oh.. rip.
But also yes, that game is very poorly optimized right now--every time I close it manually it tells me it has had a GPU crash, and it will tell me my GPU doesn't have 10 GB of RAM to compile shaders 1/4 times I launch the game lol
i love that game but jesus the optimization is ass 😭
Who's your main?
jeff
just reached plat yesterday
Jeff such a goober
How are plat lobbies?
Rip
Every rank feels the same for some reason in that game
Lol, do people comm in plat?
fr
rarely lol
Gm 2+ is quite different
I'm GM2+ and I feel like some games are a coinflip still 😭
I don't understand the ranking system at all because I'm seeing mistakes people make in Plat all the way in Celestial--consistently too
It's too generous imo
I don't feel like I deserve gm 2 but here I am
3 weeks ago, I solo q'd from bronze to GM in 1 sitting with only 4 losses. But at the same time, my main was stuck in Diamond 1--I played 16 hours on my main and just lost -14 points in total from the whooole session
Maybe I got boosted by my godly tank duo
Ranked in this game makes no sense and I'ma lose my mind 😭
Wild
I once went on a 21 game win streak
Btw the right channel is #🔍│help-w-okada @twilit bridge
Mhm
Ohh, sorry. I wasn't sure where to post--I had posted on there previously but no one had responded. I'll also stop yapping about Rivals here too 😆
im plat 2 in rct3
That's heat
you dont even know rct3 🚬
Roller coaster tycoon
based and tycoon pilled
It's fine, Wokada is for realtime, while RVC is for audios and model making
I wonder what happened to the Remote Code Execution exploit to marvel rivals tho
i ate it
nom
Didn't even know there was an exploit in it
WHAT
https://www.youtube.com/watch?v=ydQKPBgWKsI
@crude flame
Shalzuth Video - https://www.youtube.com/watch?v=sSXoH1xYIcE
In this video I showcase a potential security hole in Marvel Rivals and provide some advice on how you can avoid this issue, along with how developers can avoid this mistake. Not checking SSL is a...
The launcher is written in python 💔
😭 😭 😭 😭 😭 😭 😭
Once you said that, that's the video I was already watching haha
uninstalling rn
Huh
Oh so it's useless
oh
The game doesn't even verify that it's connected to the real game server, and the game runs on admin privileges bc anti cheat
From what I know it's only for people on the same network
Still really weird
i hope they don't find a way to make it global 
@twilit bridge
Still weird the game has an unsigned dll tho

They're gonna hack the game, likely because they think Marvel's recent movies aren't too good.
Hina Mod AICoverGen gives me an error, it used to work just fine
How do I fix "'NoneType' object has no attribute 'setdefault'"
Id suggest you use something thats not this outdated. Do you want recommendations
yep
is there something similar to hina?
model that separates vocals and allows yt links?
Colab blocks YT downloads, u need to upload the audio manually
iirc all google colab block yt links
bc it goes against yt tos
youtube is owned by google, and google colab is owned by google
soooooo
Why tf didnt it happen before
because it's a google moment
how tf do you sign up to youtube, I have my fucking google account on both youtube and colab
The only way is that you manually download the video yourself and upload it
when you use google colab you use a remote good pc owned by google
oh ok
just google any "youtube mp4 downloader site" or use yt-dlp on github
Is google colab fixed
I recommend:
Train (make) RVC Models on cloud:
- Applio (ui)
- Mainline (UI)
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI, no guide as of right now)
- Applio by Shirou (UI, no guide as of right now)
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.gg: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio (ui)
Most, not all #📰│dev-updates message
Thanks
cant set cable output as output
When easy gui will be fixed
the creator isn't active in rvc anymore, I already told him like a month ago, don't expect that colab to ever be fixed or come back any time soon
just use smt else
wrong channel, rvc isn't for realtime, we already talked in #🔍│help-w-okada
oh shit mb
We'll wait
Yo guys I am new to AI and I wanna train my model (I got an rtx 4070 ti), is there any recommendation for software to use?
Cool
another question is
like what is training your model
what do you need to train a model
because im a little confused
I think everything you need is in https://docs.aihub.gg/essentials/how-to-make-voice-models/
Our Docs got everything
Which version of Applio ? 3.2.8-Bugfix?
that's the main branch
not officially released yet
Ok thx
wait can english models speak other languages
How GPU intensive is making an RVC model locally?
I'm asking because I am planning on doing a model based off of Michael Jackson in the Bad era
I haven't had much luck with training on colabs
For reference, I have a Gigabyte 3060
if you use an index, there may be some accent, but other than that it is just a voice model
VRAM matters, 8GB is good enough
gotcha, i dunno why its doing this but it sounds completely messed up
especially if you get to like 0:50
seems like the source audio is not quite what the model was trained to do
are there seperate music models?
I did not mean that
this is russian audio with english model
and this is the same without index
so if you're using an english model with an index and you're trying to infer a russian song that's sung with a weird voice, the index search may not produce an adequate result
ah gotcha, not sure how i got this then, it was from a different place but this website did fine with both english and non english songs
i couldn't access that site anymore/it got paywalled so i tried applio but dunno how id go about getting similar results
any ways i can work this or do i need a completely different model?
turn off the index
do you happen to know where that setting is in applio by any chance?
or remove
Just wanna make sure I don't burn out my gpu
yeah it has 8 gb of vram iirc
it wont
do you know what causes the seizures?
oh yeah youre right
and you need to de-echo it
gotcha
this is not the type of audio that can be inferred
is it possible to do in applio or should i use a different software?
which voice changer software is the best according to yall, I used to have one but i forgot
i genuinely have no idea what im looking at in here im so lost
so, i was training my model using refineGAN vocoder, but the result is so robotic, idk why.
i guess it's about pretrain right? i was trying to use the original one. I have checked your refineGAN pretrains. Should i download the 200 epoch version or 300 epoch version? @simple ore
-rt
Interaction has expired, use the command again for a new interaction.
what even is that,, i feel brain dead
i dont know any of these words i feel like a literal ape someone please help me
I think 32k 300e is messed up, so perhaps 44k 150 or 32k 200
ok, is this pretrain good for singing?
How am I supposed to explain to you again? W-Okada is the AI realtime audio/voice changer. RVC is the AI audio changer.
Links from Automaze:
- the first link is a guide for Detris' W-Okada. Download links for NVIDIA and AMD/Intel GPU versions are there, alongside the installation and recommended settings for your GPU. This version of W-Okada runs better than the original, especially performance-wise.
- the second link is the original W-Okada. Almost everything is just like the first link. The difference from Detris' fork is that the original version uses more resources, and it's outdated, with no more updates coming.
- I don't know about the third link, but I think it's a realtime mode in RVC GUI, and that program is way outdated and performs even worse than W-Okada.
- the fourth and final link is about how to use TTS with RVC.
Guys! I joined this server for voice models but I don't know how to make AI covers with these models. (I watched some videos but some methods or programs don't work.)
cuz most of the videos are outdated, just like the software they explain
And... how to make AI covers now?
you should check out this guide https://docs.aihub.gg/essentials/how-to-make-ai-cover/
Have the audio file of your song ready, & let's extract the vocals from it with an audio isolation software.
OK
I'll read it
Thank you mister sir

-collab
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
-kaggle
- Applio Notebook, by Vidal Kaggle
- Applio Notebook, by Shirou Kaggle
- Music Source Separation, by Shirou Kaggle
- UVR5 NO UI, by Eddy Kaggle
- Original W-Okada's Voice Changer, Kaggle
- Modified W-Okada's Voice Changer, Kaggle
- 🆕 UVR5 UI, by Eddy, ArisDev & Nick088 Kaggle
- 🆕 RVC AI Cover Maker UI, by Shirou & ArisDev Kaggle
- 📖 How to use RVC Mainline on Kaggle by Cauthess
Note: Kaggle limits GPU usage to 30 hours per week.
Hi, what are the ideal models for real-time Russian female voices? Many models crackle for some reason. What should I look at—epochs, dataset length, or something else?
wrong channel
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI models (on v2); it does inference (using models) on pre-recorded audio (ai covers) and training (making models)
Wokada = uses RVC for realtime inference
I will ping in #🔍│help-w-okada
LMAO i just told him to ask in here 😭
Cause of epochs, dataset etc. questions
Installing requirements...
error: File not found: `requirements.txt
Help, I get this error when installing applio UI
guys does anyone have a google colab generator mine's not working anymore
chatgpt could generate some colab for you
really? i'll try that
what guide link are u following and what's ur pc gpu
it's just writing code
I use colab, and I only loaded the backup to start the training since I got disconnected on the other account and can't connect anymore. That installed well, I connected to drive, uploaded the backup and loaded it, but when installing the dependencies it gave me that error, and in start applio it said directory/files not found (error 2), something like that
be sure to start it from the start, and use the same settings and load backup
I did it in this order:
mount Drive, load backup, and start backup, then install and start
Can you help me download pre trains faster? I upload them manually but it takes a long time
you need to first do install, then mount drive, then load back up, then start
like custom ones that aren't one of the applio custom pretrains?
bc there are already some custom pretrains u can download in applio
No, I have some from KLM from this year that I want to use.
KLM 5?
No, one called KLM (Kpop AI Universe) just came out, by the same creator, but manually it is very slow.
I think the only ways are either u upload them or modify here the code so that it will use google servers to download directly on the machine
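If the pretrain is hosted on Hugging Face, one way to skip the slow manual upload is to have the Colab machine fetch it directly. A minimal sketch, assuming the files are public and follow the same `resolve/main` URL shape as the Applio link shared earlier; the repo id, file names and destination below are placeholders, not a real pretrain:

```python
import urllib.request
from urllib.parse import quote

def hf_resolve_url(repo_id, filename, revision="main"):
    """Build a direct-download URL in the same shape as the Applio zip link above."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{quote(filename)}"

def download_pretrain(repo_id, filename, dest_path):
    """Fetch one file straight onto the machine running this code (e.g. the Colab VM)."""
    urllib.request.urlretrieve(hf_resolve_url(repo_id, filename), dest_path)
    return dest_path

# Placeholder names; replace with the real repo and the G/D files of your pretrain:
# download_pretrain("some-user/some-pretrain", "G_pretrain.pth", "/content/G_pretrain.pth")
# download_pretrain("some-user/some-pretrain", "D_pretrain.pth", "/content/D_pretrain.pth")
```

Where the notebook expects custom pretrains to live is not covered here, so the destination path has to match whatever the training cell is pointed at. Private or gated repos would also need authentication, which this sketch does not handle.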
hello guys i have a question is it good if i use ai rvc if i want to match between a certain vocalist to confirm if she's featured in a song? i have the files of the supposed vocalist in flac 24bit 96kHZ and the vocals in the hook in 256kbps
i have 25 songs of the supposed vocalist
So how do I put it?
wdym
modifying the code adding the download links as displayed
i have a chorus of an unknown female vocalist that i want to identify who it is
right
and i have the name of the supposed vocalist and i have 25 songs of the supposed vocalist in 24bit 96khz quality
the chorus is in 256kbps quality
idk if @nocturne mural maybe knows an easier way to add custom pretrains in applio
to increase the links and put their names as the others are?
also, be sure that that pretrain is hifigan, if it's refine gan, you will have to use the main branch applio instead of the 3.2.8-bugfix
you want rvc to know who sang that song ?
should i put all of the vocal files in 1 giant .wav file?
yes
i want to see if the vocals match or not
its for a search of unidentified media
rvc doesn't have a database that can identify vocals
it doesn't work like that
rvc can do inference (use models) on pre-recorded audios, and make you train (make) your own models
i can't use rvc to sing over an existing audio?
but if u just give it a random audio, it can't tell you who was the person talking in the audio
Thanks, if you could give that option in the Download section to download them from huggingface.co it would be great and easier!
I don't maintain that colab, you would have to ask vidal or someone in iahispano
Thanks
you said you wanted RVC to identify who was in that song
not to make an ai cover
you're welcome
i misworded it
you mean chorus is more than one ppl or harmonies singing together?
no its just 1 person
i want to hear by ear if it matches or not
sorry english is not my first language @low shard
it's fine, I just don't understand your request of help
do you still not understand it?
I heard some paper about speaker identification almost a decade ago, but would like to see if you find some decent ones
do i need to compile everything into 1 giant audio file? the supposed female vocalist's vocals?
can i ping?
i'll wait for other help then
what is pitch guys? does it affect the result of the cover?
if what matches to what
to the 256kbps vocals
yes, it's the tone of the voice
higher = more feminine, higher
lower = more masculine, deeper
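For a feel of what the pitch number does, assuming the setting is counted in semitones (common in RVC UIs, but check yours): each semitone multiplies the frequency by 2^(1/12), so +12 doubles it and -12 halves it. The function names here are just for illustration.

```python
def pitch_ratio(semitones):
    """Frequency multiplier for a shift of `semitones` in equal temperament."""
    return 2 ** (semitones / 12)

def shifted_hz(base_hz, semitones):
    """Frequency after shifting `base_hz` by `semitones`."""
    return base_hz * pitch_ratio(semitones)

print(pitch_ratio(12))       # 2.0, one octave up
print(pitch_ratio(-12))      # 0.5, one octave down
print(shifted_hz(220, 12))   # 440.0
```

So moving a deep voice toward a higher one (or the reverse) often means a shift of up to about an octave in either direction, then fine-tuning by ear.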
so, you want rvc to match and compare those 2 different audio files ?
@low shard (sorry for ping it's just that i asked this question twice already)
sorry, but RVC can't compare 2 files to see how they match
Hey everyone! What’s the best way to create a voice? Any recommendations?
what's ur pc gpu
3070
Can i make high quality ai covers and which rvc gui should i use, help would be appreciated, here's the specs
The RAM is 3200 Mhz
And my gpu is gtx 1080 8 GB
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
the quality of the ai cover will mostly depend on how good the model you're using is though, and to check that you can only listen to samples or test it yourself
@low shard this one i was looking for
yeah that one is old asf
don't use it at all
alright
there's better software now
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours daily, not granted, of GPU
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
I would personally suggest Applio
you're welcome, let me know
Does Applio really need to be in C:/
Can't i just put it in my 1 TB hard drive.
it can have issues when used in other external drives
But it would take all of my storage then, i only have really small amount of storage on my ssd.
Any alternatives that i can put on my hdd?
Does Applio support RefineGAN
U can try mainline but it's less user friendly and has fewer features
Only the main branch
Okay but is it possible to inference refinegan models on the applio colab
Only if you modify the cell code to use the main branch
Okay how do I do that?Is it also for training?
if you're using Applio UI Colab, you can change the cell that clones the repo and pick the main branch instead
yes, you just need to remove the blue part
Do i have to replace something?
Do I have to replace something?
other than the code above, no
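The actual Colab cell isn't shown in the chat, so this is only a sketch of the idea, assuming the highlighted part is the `--branch 3.2.8-bugfix --single-branch` options described earlier in the channel. The helper name is made up; without those options `git clone` checks out the repo's default branch.

```python
def clone_args(repo_url, branch=None):
    """Build a `git clone` argument list; with no branch, git uses the default branch."""
    args = ["git", "clone"]
    if branch is not None:
        # --branch picks a specific branch, --single-branch fetches only that one
        args += ["--branch", branch, "--single-branch"]
    args.append(repo_url)
    return args

# Pinned to the bugfix release (what the cell is assumed to do now):
print(clone_args("https://github.com/IAHispano/Applio", branch="3.2.8-bugfix"))
# Main branch (both options removed):
print(clone_args("https://github.com/IAHispano/Applio"))
```

Either list can be passed to `subprocess.run(..., check=True)` to actually run the clone.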
I need help
i downloaded applio and it's like that
@low shard
Nevermind
i got it to work
oh I didn't see the msg, yes
my bad was eating
All good
only remove the part I told you
I did once but sometimes it fails at dataset preprocessing
show how's ur dataset and tell what u used to make it
Why what
Does it matter
of course, I need to check you used the right file formats and how you made it
How I made it? wdym
I used audio separation models
what did you put in the dataset path?
The drive folder path. Why did it work when I didn't clone any repository?
it works for me
be sure you put the dataset path without the .zip
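A small check like this can catch the `.zip` mistake before preprocessing fails. It is a sketch under assumptions: that the notebook wants the extracted folder rather than the archive, and that the audio extensions listed are the ones it accepts. All names are made up for illustration.

```python
from pathlib import Path

# Assumed set of accepted formats; check what your notebook actually supports.
AUDIO_EXTS = {".wav", ".flac", ".mp3"}

def dataset_folder(path_text):
    """Drop a trailing .zip so the path points at the extracted folder."""
    path = Path(path_text.strip())
    if path.suffix.lower() == ".zip":
        path = path.with_suffix("")
    return path

def audio_files(folder):
    """List the audio files directly inside the dataset folder."""
    return sorted(p for p in Path(folder).iterdir() if p.suffix.lower() in AUDIO_EXTS)

# Example with a placeholder Drive path:
print(dataset_folder("/content/drive/MyDrive/dataset.zip"))
```

If `audio_files(...)` comes back empty for the folder you entered, the path is wrong or the files are in an unsupported format.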
I use applio no ui & I upload the audios in a folder & inferencing is what im asking about now
so, you're having that issue still on applio no ui?
or is that issue fixed and you're asking about another one ?
im asking if models trained w refinegan pretrains are compatible with applio colab, inference
Inference
Understand
I'm asking
models trained with refinegan are compatible only with Applio Main Branch
hmm so I cant inference?
you can, but only if you remove that part of the code i told you about earlier, because doing so makes the colab use the applio main branch
also is that training issue dataset path fixed now that I told you to remove the .zip?
im not training anything rn but i ll come back to training soon
alright
@low shard i'm wondering if you or anyone who knows could help me. So i started making some ai songs and they are okay, but i came across this guy and in his songs ai voices are sounding amazing. My question is what kind of software is he using that is making Arthur and others sound so good, they're as close to in game version as you can get. Are they using some RVC that i'm not aware about, are they training their own models or is there a different reason. I'd appreciate any insight that would help. Here is the vid i'm referring to: https://youtu.be/2oE6plJb7f4?si=ALKuaEQN-L8gglWj
It mostly depends on how good the model has been trained
That was my initial thought, cause every model i tried wasn't it, i mean it was okay but nothing insanely good, so it could be the model then
Currently i'm using one that is 700 epochs, but i think the data that the model was trained with is more important no?
Epochs don't mean more quality
The dataset length and, most importantly, its quality are what matter
Yeah well thanks a lot. Any idea where i might find some good models, i tried one on weights and it is okay.
You can search rvc ai voice models at:
- #1175430844685484042
- In #🔍│find-models , Do /find with @earnest musk
- https://weights.gg/ (login required)
- https://huggingface.co/models (but watch out cus in hugging face there arent only rvc ai voice models)
- https://voice-models.com/
- https://thevoicemodels.com/ (for Turkish Models, login required with discord and level 2 on their server)
if there isnt one, you can:
- #1159289738314919936
- #1191429836321849435
- make it yourself with our docs guides https://docs.ai-hub.wtf/essentials/how-to-make-voice-models/
:wave: @low shard, How can I help?
Available Commands:
• @weights find <query> or /find <query> - Search for RVC Voice Models
• /create - Create an AI Cover
• /image - Generate an Image
Thanks a lot for help!
Yw
Found this on hugging face, there are 2 .pth files but no index one, what am i supposed to download beside this 2 .pth files
Seems like 2 different epochs of the same model?
Maybe try the 1st
I'm downloading both just in case, will try them out, listened to example audio files and they sound extremely good, well that's all from me, thanks for your time, much appreciated.
Hey everyone! For those using Weights.gg, do you know how many epochs are used for training vocal models? I couldn’t find the info on their site. Thanks in advance!
Hello, I need some help. Is Applio or Mainline better for AI covers? Which one is better supported by my graphics card (2070 Super)?
They both support your GPU and both are good. I would suggest Applio because it gets more updates and is more user-friendly
im trying to train locally but the training isnt showing up in the scalars on tensorboard
Hi guys, I want to have my own voice for the TTS model but I want it to be in Indonesian language, I have dataset and RVC ready, have 8GB RTX 4070 VRAM, can I just train without doing any setup using my language?
you need tts that can do indonesian language
Applio can do that?
applio is for voice conversion
there's a built-in TTS (Microsoft Edge screen reader)
it may have Indonesian language
Text -> Microsoft Edge Screen Reader (TTS) -> tts output.wav -> Voice Changer -> output.wav
Thanks for this, I will try it
When I download the ai I don't get the voice of a girl or anything, I use amd and I'm a laptop who knows why
!howtoask
How To Troubleshoot
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
Good day everyone. Please, I need help connecting my RVC software in realtime to any social platform.
Not a good day for me, but sure that's my job teaching people how to install RVC/W-Okada, I guess.
If you mean an RVC program that can convert audio in realtime, you might be looking for W-Okada.
My problem is that I download the AI and put my laptop microphone and I'm from AMD and no one listens to me in a deep voice oh girl just with my voxz and I need a help someone who knows that can help me good night
For W-Okada, go to #🔍│help-w-okada. This channel #✨│ai-help is about RVC and related RVC programs.
If you mean the realtime mode found in RVC GUI, that one is too old now; don't use it.
I have W-Okada installed on my PC already and I've been using it to change my voice in realtime for some months now, but I haven't been able to set up a connection for it to work on any social platform. Please do help, I would truly appreciate it. Thank you
Let me tell y'all again, for W-Okada, let's move to #🔍│help-w-okada. #✨│ai-help is all about RVC and any related RVC program.
ok lemme go there now...
Hi sir, I'm sorry for jumping in, do you know any RVC that can support Indonesian language?
Ah nvm, I think I will try with RVC base V2, I found some Indonesians who created models using it
Hehe I'm so sorry 🙇♂️ I'm just gonna try everything first
I don't know if there's any Indonesian model that can be used for RVC TTS on Applio, but this is all I can find.
There are three English-Indonesian TTS models available on Applio, and that's pretty much it.
Oh, ok, I got it. The language code for Indonesian is id.
Ardi Neural and Gadis Neural are Indonesian voices
there are a few more multilingual ones
not all of them are available for Edge TTS
aaah thanks so much @hallow thistle & @simple ore , I really appreciate it !
not only that there are also sundanese and javanese models
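A rough sketch of the first half of that Text -> Edge TTS -> voice changer chain, assuming the third-party `edge-tts` package (`pip install edge-tts`). The voice IDs `id-ID-ArdiNeural` and `id-ID-GadisNeural` are assumptions based on the Edge voice list, so check them against the list your version reports. `pick_voice` and `synthesize` are hypothetical helper names, and the RVC conversion step is left out:

```python
# Voice IDs are assumptions taken from the Edge TTS voice list;
# verify them against what your edge-tts version reports.
VOICES = {
    "id": ("id-ID-ArdiNeural", "id-ID-GadisNeural"),  # Indonesian
}


def pick_voice(language_code: str, index: int = 0) -> str:
    """Return an Edge TTS voice ID for a language code such as "id"."""
    if language_code not in VOICES:
        raise ValueError(f"no voice configured for language {language_code!r}")
    return VOICES[language_code][index]


async def synthesize(text: str, language_code: str, out_path: str = "tts_output.mp3") -> str:
    """Write TTS audio for `text` to `out_path` and return the path."""
    import edge_tts  # imported lazily so the helpers above work without it

    communicate = edge_tts.Communicate(text, pick_voice(language_code))
    await communicate.save(out_path)  # edge-tts writes MP3 data by default
    return out_path


# Usage (needs network access and edge-tts installed):
#   import asyncio
#   asyncio.run(synthesize("Selamat pagi", "id"))
```

The resulting file is what you would then feed to the voice conversion step as the input audio.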
does anyone know how to reduce reverb while making a cover?
my output vocal has a bit of reverb when I set the pitch low
well shit my model didn't work well
it now sounds robotic as well
unlucky
i used weights.gg
i guess its not good
W-Okada or RVC?
rvc
i think i know why
but i give up already
if you want, you can help me by making a model for me, but I doubt anyone would
I think this looks a lot better
i tried both mirrors
that isn't my internet speed
there's no way that's the speed
wtf is wrong with ur wifi
it's not my wifi
try using vpn
is there any other mirror
I think a VPN would slow it down tho, even if it's true that he may be far from wherever the server is
#✦│chat message tbf with such an old GPU I would suggest Cloud
I'm not sure how good it will be running the best models with that gpu locally
not always so
I mean depends on the VPN he uses, and most free VPNs aren't that good
yeah the GTX 1080 is kinda old, so that's why I'm saying I'm not sure how much time it will take to use the latest models
It runs for sure, just saying idk how good the speed will be
i can't find the cloud one
Cloud would be faster, tho your choice
everything at the right of local
it can still run UVR, although may not be faster than 2060/2070
yeah that's for sure
Tysm!
yw
Is that spinel in ur pfp
I remember when steven universe was still running
go above
best free rvc converter?
how can i make local applio link public?
i want to let my friend do stuff while my pc runs
edit the run-applio.bat and add '--share'
Yo everyone,
I’m having an issue with MMVCServerSIO when trying to import a voice model I created. Here’s the error I’m getting:
File "voice_changer/RVC/RVCModelSlotGenerator.py", line 42, in load_model
slotInfo = cls._setInfoByPytorch(modelPath, slotInfo)
File "voice_changer/RVC/RVCModelSlotGenerator.py", line 58, in _setInfoByPytorch
config_len = len(cpt["config"])
^^^^^^^^^^
KeyError: 'config'
I’m on Windows 11 with an RTX 3080, using the latest version of the software. My .pth file is there, but it seems like something is missing in the config.
Here’s what I’ve tried so far:
✅ Checked if a config.json is required (not sure if it’s mandatory or auto-generated)
✅ Tried recreating the config, but not sure if I did it correctly
✅ Tested another model, and that one works fine
If anyone knows where the issue is coming from and how to fix it, I’d really appreciate the help. Thanks in advance! 🙏
Wokada/MMVCServerSIO related issues in #🔍│help-w-okada
Are you using the alpha/beta version of wokada?
why is the quality of the voice on this app much worse than when using the Applio web app?
I already added all the filters that appear on the front end but it's still much worse than what I get on Applio web
I can't send files '-'
Hey! Thanks for the reply. I don’t think it’s related to the version of Wokada. Other voice models work fine, but the one I created is throwing this ‘KeyError: config’.
Wokada and MMVCServerSIO are the same thing
its just a label thing
Nvm i just understood what you meant, you were not questioning that
I am still asking because we are currently recommending the fork wokada instead of the vcclient since it currently runs better. So was checking in if you have the fork or not
I think the missing filter is the one that forces the voice to sound more like the RVC one, using 1 as value
I’m using the forked version of Wokada. The issue only happens with my custom model, not with other models. Do you think it’s a formatting issue with my config.json or .pth file?
idk what's the code name for this filter on python, do you know @pastel oak ?
With what software did you train your custom model?
Mainline, Applio, Codenames Fork, etc.
I dont know what app/software youre using
applio
And youre saying running applio local sounds worse than applio colab?
yeah, I developed an automation but the same voice sounds better on applio web version
I trained my model using RVC. Here is a screenshot of my model files
I dont think I can help, sorry. Wait for someone else or ask in their discord
https://discord.com/invite/urxFjYmYYh
That is not a voice model. You uploaded the (D)iscriminator that people would need to use if they want to continue training a model, alongside the G
ok
Check your weights folder for the voice model, it should be named <YOUR MODEL NAME>_eXXX_sXXX.pth , probably
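A small sketch for telling the two kinds of .pth apart before loading one into the voice changer. The traceback above shows the loader reading `cpt["config"]`, so the check looks for that key; the `D_<number>.pth` / `G_<number>.pth` pattern and both helper names are assumptions for illustration:

```python
import re

# Naming patterns: the weight pattern follows the message above;
# the D_/G_ pattern is an assumption about how training checkpoints are named.
WEIGHT_NAME = re.compile(r".+_e\d+_s\d+\.pth$")
TRAINING_STATE = re.compile(r"^[DG]_\d+\.pth$")


def classify(filename: str) -> str:
    """Guess what a .pth file is from its name alone."""
    if TRAINING_STATE.match(filename):
        return "training checkpoint"
    if WEIGHT_NAME.match(filename):
        return "voice model"
    return "unknown"


def looks_like_voice_model(checkpoint) -> bool:
    """True if a loaded checkpoint has the "config" entry the loader reads."""
    return isinstance(checkpoint, dict) and "config" in checkpoint


# Usage with PyTorch installed (not run here):
#   import torch
#   cpt = torch.load("MyVoice_e300_s14700.pth", map_location="cpu")
#   print(looks_like_voice_model(cpt))
print(classify("D_2333333.pth"))
print(classify("MyVoice_e300_s14700.pth"))
```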
I checked my files, and here are the ones generated during my model training.
I want to make sure I’m using the right file for MMVCServerSIO. I see multiple .pth files, but I’m not sure which one is the correct voice model.
Can you tell me which file format or naming convention I should look for? I trained my model using RVC. Thanks for the help!
If you are using Applio, its in the same folder, like this in the first screenshot
If you are using Mainline, you are in the wrong folder. Go back to the start of the folder where you also find the start to the program. Then go to assets, weights and look for the name in there (2nd screenshot)
And btw you can delete all of those D_ and G_ files except the most recent ones (the ones with the highest number) to save space
Keep the latest ones if you ever plan on continuing to train the model
cause theyre 837mb for just one file 😭
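A hedged sketch of that cleanup as a dry run: it only lists the older D_/G_ files and deletes nothing. It assumes the files are named `D_<number>.pth` and `G_<number>.pth`; `stale_checkpoints` is a hypothetical helper:

```python
import re

# Assumed naming: D_<step>.pth and G_<step>.pth
CHECKPOINT = re.compile(r"^([DG])_(\d+)\.pth$")


def stale_checkpoints(filenames):
    """Return D_/G_ checkpoint names older than the newest of each kind."""
    newest = {}
    parsed = []
    for name in filenames:
        match = CHECKPOINT.match(name)
        if not match:
            continue  # voice models and other files are never listed
        kind, step = match.group(1), int(match.group(2))
        parsed.append((kind, step, name))
        newest[kind] = max(newest.get(kind, -1), step)
    return sorted(name for kind, step, name in parsed if step < newest[kind])


files = ["D_100.pth", "G_100.pth", "D_200.pth", "G_200.pth", "Voice_e10_s490.pth"]
print(stale_checkpoints(files))
```

Review the list before deleting anything by hand; the newest D_/G_ pair stays so training can be resumed.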
can someone recommend some good pydub/pytorch post processing filters to increase VC quality?
Thanks, it works! I was in the wrong folder actually.
someone help
when i import a model like gojo it tells me that i cant upload files of that type
@pastel oak rlly sorry but pls help
What program are you using and whats the gojo file youre trying to upload, send screenshots or something
looking in the wrong place
!give-media-perms 30m @fair ivy
How do I use https://discord.com/channels/1159260121998827560/1255206706598772796 ? Which one do I select?
RVC?
you download D and G files and use them as custom pretrain weights
Im using voice ai
My laptop is dead so maybe tomorrow
Hi guys. Someone said the lowest point of loss/g/total in TensorBoard isn't reliable. He said average/loss/g is the right graph to examine. But in the Colab version of Applio, I couldn't find any "average"-related graph.
Can you tell me who this "someone" is
we dont offer support for voice.ai on this server
those have not been released yet
Ok then what app?
i can give you link, can you first tell me what gpu you have?
Is it kind of experimental then? Can I stick with the old graphs if they don't differ too much?
I guess @simple ore, already replied to me
Intel graphics
thats not good enough to run the voice changer locally, so you can do colab/kaggle which is online based. its all for free
For Realtime Voice Changing for Calls on Cloud (remote good pc for those who don't have a good one, YOU CANT DO THIS ON MOBILE):
- Google Colabs (4 hours daily of free T4 GPU, easy to use, requires only a Google account):
  - How to use Hina's Modified Original W-Okada's Realtime Voice Changer Google Colab (has a Guide)
  - W-Okada's Deiteris Fork Realtime Voice Changer Google Colab (no guide but explained in the cells)
- Kaggles (30 hours weekly of better GPUs, T4x2 & P100, harder to use, requires an account and a phone number):
  - Hina's Modified Original W-Okada's Realtime Voice Changer Kaggle (no guide but explained in the cells)
  - W-Okada's Deiteris Fork Realtime Voice Changer Kaggle (no guide but explained in the cells)
old total charts provide a bad picture. they may be useable if your epochs are not too long
49 steps per epoch, 16 minute dataset. Did you mean the total epoch count until overtraining?
i mean 49s/e is small enough
total loss values are from a random batch from the epoch, but they are short enough, so the smoothed charts are somewhat okay
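To illustrate the difference between a per-epoch average and a smoothed single-sample curve, here is a small sketch. It is not how Applio or TensorBoard compute their charts; `epoch_averages` and `ema` are hypothetical helpers and the loss numbers are made up:

```python
from statistics import mean


def epoch_averages(step_losses, steps_per_epoch):
    """Average the loss over each complete epoch of steps."""
    last_start = len(step_losses) - steps_per_epoch
    return [
        mean(step_losses[start:start + steps_per_epoch])
        for start in range(0, last_start + 1, steps_per_epoch)
    ]


def ema(values, weight=0.9):
    """Exponential moving average, similar in spirit to a chart smoothing slider."""
    smoothed = []
    last = None
    for value in values:
        last = value if last is None else weight * last + (1 - weight) * value
        smoothed.append(last)
    return smoothed


# Made-up per-step losses, 2 steps per epoch for brevity (the chat has 49).
losses = [1.0, 3.0, 2.0, 4.0]
print(epoch_averages(losses, 2))
print(ema(losses, weight=0.5))
```

With short epochs, one sampled batch per epoch plus smoothing tracks the average reasonably well, which matches the point made above; with long epochs a single batch says much less about the epoch as a whole.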
thank you very much
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
Does anyone know which file holds the index rate setting for the Realtime Voice Changer client? Thanks!
is this overtraining?
Hello everyone, I recently encountered a problem, I used this neural network about 3 months ago and now I downloaded it again to my PC, nothing has changed, but at the same time my voice lags and resembles a robot (on all models), the delay and so on do not help at all.
rvc is not wokada
use the voice changer help channel 👉 #🔍│help-w-okada
