#✨│ai-help
so if it goes up for more than 15 - 30 mins then i stop it
alr
but if you aren't sure just tell me
all the credit goes to him, atp i been just watching
after you noticed the g/total rises forever you choose the lowest point BEFORE the rising
thats your best model
i see
also can someone describe what an epoch is in simple terms, i wanna explain it to my friend but i don't understand it well enough to explain it
epochs are snapshots of the model
thats actually a pretty good way to put it
rvc saves these calculations in epochs
lmaoo
there are bad calculations and good ones
the bad calculations are when the graph rises
and the good ones are when it goes down
in one point of the training rvc gets very smart
and vomits a very good epoch
(the lowest point in the graph)
and that epoch stores the very good calculations and predictions
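For explaining it to a friend: an epoch is usually defined as one full pass over the training set, and the files RVC saves along the way are checkpoints written every few epochs. A rough sketch of that arithmetic follows. The clip count, batch size, and save frequency are made-up numbers, not values from this chat, and the function names are hypothetical.

```python
import math


def steps_per_epoch(num_clips: int, batch_size: int) -> int:
    """Number of training steps needed to see every clip once."""
    return math.ceil(num_clips / batch_size)


def saved_epochs(total_epochs: int, save_every: int) -> list[int]:
    """Epochs at which a checkpoint (a 'snapshot') would be written."""
    return list(range(save_every, total_epochs + 1, save_every))


# Hypothetical run: 400 sliced clips, batch size 8, 500 epochs, save every 50.
per_epoch = steps_per_epoch(400, 8)
snapshots = saved_epochs(500, 50)
print(per_epoch)                           # steps in one epoch
print(snapshots)                           # epochs that produce a snapshot
print([e * per_epoch for e in snapshots])  # the matching step numbers
```

The step numbers in the last line are what the TensorBoard x-axis counts, which is why the graph is read in thousands of steps while the saved files are named by epoch.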
Hey all! So, I'm trying to use Applio for the first time. I followed all the instructions, buuut.. When I try to do some TTS using a voice model, I get an error. My GPU is an RX 9070, which I'm assuming is the problem tbh.. So far, I've had no real luck in using it for playing with AI things. Anyway, here's the error, if it matters.
```
rocBLAS error: Could not initialize Tensile host:
regex_error(error_backref): The expression contained an invalid back reference.
```
zluda doesnt support 9000 series yet as far as i know
Annnd that explains it! I didn't know it was zluda as a whole that didn't support them. I'll be looking forward to it getting some support.
Thank you.
i thought u said rtx 9070 and thought u were a time traveller for a sec
😭
I wish!! But no. I probably should have said "AMD 9070".
lmao nah i understood
i was considering getting amd but decided nvidia
glad i did because literally everything python requires nvidia graphics card 💀
@analog obsidian uh the graph hasn't changed
a bit
o i forgot to tell you something
o
WOAHHH
XD
things changed
it's a flat line now 😭
nah it's still beginning basically
i gotta wait
click the third box
or something more ez
just press f5

no joking btw, this does indeed update the graphs xD
Description: add (optional) support for the DirectML backend. Problem: none. Proposed solution: https://pypi.org/project/torch-directml/ Alternatives considered: none, to my knowledge.
for 9070
i think imma continue it tmrw
for resuming the training put the model's name
don't extract nor preprocess again (important)
put the same batch size and save epoch frequency
and don't use fresh training
if done correctly the graphs are gonna be fine
and they will continue logging as usual
alr
🦈
also what time is it for u rn
if i were to guess, 8pm
the previous day
am i correct
Ooo, thank you! I'll try this out later tonight, or tomorrow.
you may need to amend the patch zluda .bat to download the latest version
i dont have AMD card in my PC any more, so I cant test the latest zluda and the patch it requires
rvc\lib\zluda.py
I think I understand. I'm downloading Rocm 6.2 now. Once that's done, I'll start trying zluda patches. Though, I'm getting pulled away for now.
there are some hijacks in that file, but they may not be required with latest zluda
The latest version works fine for me and I didn't change any code
I got a 6700xt
zluda 3.9.1?
Yeah
I think Lee made some changes like enabling CUDNN and other stuff
so there's no need to disable them
In order to enable cuDNN acceleration on supported device, you should download and unpack HIP SDK extension upon your existing HIP SDK 6.2 installation.
HIP SDK extension: DOWNLOAD
is there any google colab active?
been trying since yesterday
tensorboard doesnt work
make sure numpy==1.26.4 gets installed
make a new code cell and pip install it
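In a Colab cell the install itself is normally just `!pip install numpy==1.26.4`. To check whether the pin actually took effect afterwards, something like the sketch below could be run in another cell. The helper name `numpy_pin_status` is made up for illustration; only the version number comes from the message above.

```python
from importlib import metadata

PINNED = "1.26.4"  # the version named in the chat


def numpy_pin_status(pinned: str = PINNED) -> str:
    """Report whether the installed numpy matches the pinned version."""
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return f"numpy is not installed; run: pip install numpy=={pinned}"
    if installed == pinned:
        return f"numpy {installed} matches the pin"
    return f"numpy {installed} found; run: pip install numpy=={pinned}"


print(numpy_pin_status())
```

If a different version is reported after installing, the runtime probably needs a restart so the new version is picked up.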
So, peeking in at the zluda patch... And forgive my ignorance ahead of time!
But in the zluda patch, it has a line that says
curl -s -L, and then it points to a GitHub ZLUDA release. Should I update that link to https://github.com/lshqqytiger/ZLUDA/releases/download/rel.ae0540beb129ffd140226ce956b386619b38f84c/ZLUDA-nightly-windows-rocm6-amd64.zip
? That's the newest ZLUDA release.
And I'll leave the stuff like copy zluda\cublas.dll env\Lib\site-packages\torch\lib\cublas64_11.dll /y alone, I assume? (There are 3 lines similar to that one, all with different .dll files.)
you can download the .zip manually
unzip it to zluda folder
and just use the part of .bat that copies the .dlls into site packages
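The manual route above (unzip, then copy the .dlls) can be sketched in Python as well. This is only an illustration of the copy step, not the actual patch script. The one mapping shown is the one quoted in the chat; the other two lines of the .bat are not shown here, so they are left out rather than guessed. `copy_zluda_dlls` is a hypothetical helper name.

```python
import shutil
from pathlib import Path

# Only this mapping is quoted in the chat. The .bat has two more lines like it;
# add those from your own copy of the file instead of guessing the names.
DLL_MAP = {
    "cublas.dll": "cublas64_11.dll",
}


def copy_zluda_dlls(zluda_dir, torch_lib_dir, dll_map=DLL_MAP):
    """Copy each ZLUDA DLL over the torch DLL it stands in for."""
    zluda_dir, torch_lib_dir = Path(zluda_dir), Path(torch_lib_dir)
    copied = []
    for src_name, dst_name in dll_map.items():
        src = zluda_dir / src_name
        if not src.is_file():
            raise FileNotFoundError(f"missing {src}; was the zip unpacked here?")
        shutil.copyfile(src, torch_lib_dir / dst_name)
        copied.append(dst_name)
    return copied


# Example call from inside the Applio folder (paths as in the quoted .bat line):
# copy_zluda_dlls("zluda", Path("env") / "Lib" / "site-packages" / "torch" / "lib")
```

Raising on a missing source file is deliberate: a silent skip would leave torch with its original DLL and the failure would only show up later at inference time.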
Ah, ok. Thank you!
does anyone use software like voicemeter [i think] to modify the mic settings before it gets sent to RVC? i use a usb mic and theres no noise cancelation or anything on it to change settings
with nvidia you can use broadcast app
mic -> app -> voice changer
yeah thats true, i tried it and didn't really get better results than whats in rvc. do you use that personally?
it is amazing
alright bet ill run it again. i really do appreciate the advice
that was 5 years ago
thats wild lmao, i used it for a while on normal streams but i thought it made the audio quality rough compared to just the noise cancelation in obs
@bleak nymph fyi fresh training is to start over training, deleting the G & D files and tfevents file (tensorboard log)
by not selecting that option, leaving those files that means training will resume
fresh training = i want to clean up the training attempt and start over
it does not delete sliced audios and other things, just D/G and events
yea
if you have D/G files in the folder the training will resume as long as the number of epoch is larger than saved
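The resume rules described above can be written down as a small decision function. This is a sketch of the logic as explained in the chat, not Applio's code. The `G_*.pth` / `D_*.pth` file patterns and the function name are assumptions for illustration, and the saved epoch is passed in by hand because it is stored inside the checkpoint.

```python
from pathlib import Path


def training_action(model_dir, saved_epoch, total_epochs, fresh_training=False):
    """Return 'start over', 'resume', or 'nothing to do' for a model folder."""
    model_dir = Path(model_dir)
    # Assumed naming: generator and discriminator checkpoints as G_*.pth / D_*.pth.
    has_checkpoints = any(model_dir.glob("G_*.pth")) and any(model_dir.glob("D_*.pth"))
    if fresh_training or not has_checkpoints:
        # Fresh training wipes G/D and the tensorboard events, not the sliced audio.
        return "start over"
    if total_epochs <= saved_epoch:
        # The target epoch count must be larger than what is already saved.
        return "nothing to do"
    return "resume"
```

So stopping at epoch 200 and coming back with the total set to 500 resumes, while leaving the total at 200 does nothing.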
can’t wait to continue it tmrw
good call about nvidia broadcast, its def better than what we have in obs
So is there a way to separate a part of the voices in this reverb audio? I need to get that since it was in the lyrics
Same thing with this too
no
it is hard to unbake a cake into a flour and sugar
But what if I use flour and sugar but different techique to bake a cake?
find original stems, master them differently, sure
So this one works but I cannot remember if UVR-DeEcho-DeReverb used to be the best model for dereverb
Is a 3060ti 8GB TI LHR good enough to have a realistic sounding voice? I've gotten some models alright but ALL models give me static during long sentences.
I'm using MMVC
For W-Okada, go to #🔍│help-w-okada. This channel here #✨│ai-help is about RVC programs, not W-Okada.
In this context, RVC doesn't stand for realtime voice changer. RVC is a different program, used to make AI voice covers.
If I change the batch size to 3 instead of 4, will the quality be better?
@hallow thistle
Do you have any information about this subject?
less batch size, more instability in the training process
Duration is not important, will the quality increase?
depends on the set
small batch = smaller steps, may get stuck in a local minimum
very large batch = big steps, may overshoot the target
it is all about experimenting
ok
Hey which rvc i should use for training voice? I have RTX 3060 12gb
should i go for mainline or applio pls?
Applio.
Is it better quality?
More like it's easier to use, but it should give better quality at the same time.
Ok ty👍
May I ask about UVR5-UI on local?
Do I have to download the model manually?
If u re using UVR5 u need to import the new models manually, But on UVR5 UI models are downloaded automatically, pls refer to this thread
https://discord.com/channels/1159260121998827560/1343297771104501871
Also, when I use De-Reverb by anvuew, it detects some parts of the lyrics as reverb and separate it. Do you know what model should I use then?
Hmmm try the Sucial ones, and maybe the anvuew mono de-reverb. Ur songs are kinda hard to separate 
because it is not centered and gets rid of
Similar result when using the Sucial ones
Have u already separate the lead vocals and backvocals?
I don't know, I just separate the vocal and music first then de-reverb
Btw, when I use the UVR-DeEcho-DeReverb model, this works okay but idk if this model is the best option
Use the karaoke model first and then the dereveb model
Well, that model is kinda old
sucial ones are worse, and most lead vocals & speech under normal mics are centered so it should be easy to separate
the remaining one might be room reverb which can be removed using dereverb mono or RX 11 dialogue isolate
it still works for mono one or backing vocals but has 17.5khz cutoff
I see
uvr de-echo can also work more aggressively
though note that it basically noise gates weak signals
The Karaoke model took the same parts too
Damn
So any solutions for this now?
As far I know, nope 
Damn, guess I have to use UVR-DeEcho-DeReverb and then De-Reverb by anvuew
is this clean enough for conversion?
it takes a min for my local installation to start and i dont got enough resources to run it with UVR at the same time so im fishing for opinions before i try
inferring that file?
garbage in, garbage out
using it as a training set? double garbage
it is a repeated nonsense
If u don't have a good pc for UVR use it on Colab or HF Spaces
not using it as training or anything, literally just putting annoying orange voices over it to mock it
i get that its bad. thats why im putting shitty ai over a shit song. MYOB and answer my question next time instead of scrolling a help channel to start an argument
wait... he's mathematically 
XD
alr thanks
ive been avoiding colabs because most of them are kinda broken ATM but my computers just barely good enough to use ultimate-rvc to convert songs in abt 40sec
Well, u can also use HF Spaces it's easier than Colab.
Max audio length is 20 mins btw, and it takes like 1 min to separate
-hf
- UVR5 UI, by Eddy and Ilaria Huggingface Spaces
- Ilaria RVC Zero, by thestingerx Huggingface Spaces
- RVC⚡ZERO, by r3gm Huggingface Spaces
- Applio, by IA Hispano Huggingface Spaces
sweeet, tysm ^^
Ur welcome
hi guys
i've been training for a few hrs
my model is at 355 epoch
but the tensorboard line is still flat
idk what that means
it's still improving a lot? or no
doesn't seem to be going up
it's set to finish at 500 epochs so i'm wondering if it's not enough if i need to train it more than that
reduce smoothing to 0.6 and turn on ignore outliers
how do i do that
o
on tensorboard? 😭
im not good at it
OH ON THE SIDE
i'm dumb
sorry
both are already set to that
restart the tensorboard site
and press the third square button thats below the graph
wait
hold on
you should be looking at the scalars tab
go there
gray graphs = not good
oh 😭
this one right
it's still improving ig
395 epoch almost 400
it seems if it continues at this rate, even 500 epoch will not be enough (i think)
its not improving lmao
so the model stop improving after 10k?
around there
i cant tell
need a more smooth graph
anything above 15k is rising and overtraining
oh i see
it isn’t?
oh 😭
wait i thought lower was better
actually, isn't this the best one?
idk how this really works
reduce smoothing to 0 and turn off ignore outliers
choose the lowest point before 20k
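Why smoothing matters here: the smoothing slider in TensorBoard behaves like an exponential moving average, so a high value can hide or shift the real lowest point. The sketch below approximates that idea with a debiased moving average; it is not TensorBoard's own code, and the loss values are invented.

```python
def smooth(values, weight):
    """Debiased exponential moving average; weight 0 returns the raw values."""
    smoothed = []
    last = 0.0
    for n, value in enumerate(values, start=1):
        last = last * weight + (1 - weight) * value
        # Early points are biased toward 0, so rescale by how much weight has accumulated.
        debias = 1 - weight ** n if weight != 1 else 1
        smoothed.append(last / debias)
    return smoothed


# Invented g/total readings: noisy, dipping, then rising again.
raw = [30.0, 28.0, 29.0, 25.0, 27.0, 24.0, 26.0, 31.0, 33.0]
print(smooth(raw, 0.0))  # smoothing 0: exactly the raw curve
print(smooth(raw, 0.6))  # smoothing 0.6: the dip is flattened and lags behind
```

With smoothing at 0 the minimum read off the graph is the true logged value, which is the reason for turning it down before picking a point.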
i stopped the training for now
also show me the graph bc im not sure when it started to rise
should i open it back up
oh alr
started to die around 15-20k but i want to be sure
which graph should i show u
avg 50 g/total
you still havent disabled ignore outliers bro
wait it's cut off 😭
i did
no???
i aint the sharpest tool in the shed
the bright colour or the faded colour
does that matter
u mean this point, right?
yuh
find the epoch closer to that number
this one, right?
yup
alr sick
bruh, you're such a noob here compared to you in the audio separation server 😭
its always the next epoch after that value just in case
if its 12.4k you choose 12420 ish, never one prior like 13800
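The picking rule (lowest point before the rise, then the first saved checkpoint at or after it) can be sketched like this. All numbers below are invented to mirror the 12.4k example, and both function names are hypothetical.

```python
def lowest_point(steps, losses, before_step=None):
    """Step with the lowest loss, optionally only looking before a cutoff step."""
    candidates = [
        (loss, step)
        for step, loss in zip(steps, losses)
        if before_step is None or step < before_step
    ]
    if not candidates:
        raise ValueError("no data points before the cutoff")
    return min(candidates)[1]


def next_checkpoint(target_step, checkpoint_steps):
    """First saved checkpoint at or after target_step, or None if none exists."""
    later = [s for s in sorted(checkpoint_steps) if s >= target_step]
    return later[0] if later else None


# Invented readings from the g/total graph and invented saved checkpoint steps.
steps = [4000, 8000, 12400, 16000, 20000, 24000]
losses = [31.0, 27.5, 24.8, 26.1, 28.9, 30.2]
saved = [11960, 12420, 12880, 13340]

best = lowest_point(steps, losses, before_step=20000)
print(best)                          # 12400
print(next_checkpoint(best, saved))  # 12420
```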
LMFAO so true 😭🙏
we all start from somewhere tbf
is there a guide somewhere on how to get rvc the newest version running? im a bit confused
real
well its a noisy dataset
also like noobies said
every loss actually matters
not just that one exactly
G loss may go down like that, but if g norm is in 50000+ range, it is likely messed up
then there are things like this
disc is getting better, so fm loss is slowly going up and norm g also going up because of that
but as long as it is not a sharp rise from 2 to 10 it is fine
is possible for a small model reach that range tho?
I saw that with finetunes when I used a very different dataset comparing to the pretrain.. .like singing dataset vs speaking pretrain
i wonder if someday this will get fixed
idk how seoul managed to keep fm going down
loool yea
lemme check that
🦈 👍
👀
next test is default pretrain
nice
its fine, fm rises pretty quickly
um
chat
my model trained on me speaking only works better on singing than it does on actual speaking
xD
😭
i tried putting in a file of speaking but it sounded nothing like me because it copied the range of the original speaker (way higher than me) so i set it to -12st and it became so raspy and robotic and cursed
it never usually does this
well your dataset is noisy and uncleaned
really? usually an octave works fine but alr then
🦈
thing is i'm a bass he's a tenor
soooo
yeah
i tried -6 still weird and dont sound like me
-3 then
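For reference on the numbers being tried: 12 semitones is one octave, which halves or doubles the pitch. A small sketch for turning a shift into a frequency ratio, and for estimating a starting shift from two typical pitches. The Hz values are placeholders, not measurements of anyone in this chat, and the function names are hypothetical.

```python
import math


def semitones_to_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift given in semitones."""
    return 2 ** (semitones / 12)


def suggested_shift(source_hz: float, target_hz: float) -> int:
    """Whole-semitone shift that moves source_hz closest to target_hz."""
    return round(12 * math.log2(target_hz / source_hz))


print(semitones_to_ratio(-12))      # one octave down halves the frequency
print(semitones_to_ratio(-6))       # half an octave down
print(suggested_shift(220.0, 110.0))  # placeholder pitches exactly one octave apart
```

This only gives a starting point; the best-sounding value still has to be found by ear.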
the thing that happen at 0:05 is when the dataset lacks vocal fry
vocal fry? i don't know techniques that much
aka raspy voice
u mean like raspiness right
ah
my voice is usually rough and raspy which is the interesting part
🦈
for reference ig
i just got off an hour long vc lol
and the audio used for inference?
that voice is way more raspier than yours
fitness gram pacer test lol
yeah true
in interviews it's a lot lot more
when singing, not raspy at all i don't think
woah
you just need to have raspiness in the set to inference raspy voices
yeah seems like it
also voice similarity depends on various complex factors
not in here tho
consistency matters more
so a consistent accent?
consistent voice
so not mimicking someone's else voice
well i wasn't mimicking anyones voice in my dataset
thing is i'm a complete noobie at singing too so i copy other people's voices for some reason 😭 it's so weird and idk why i do that
i feel like if i added singing to the dataset it'd be worse
you were laughing a lot in the dataset?
but i think what affects the most is when the set is too monotone
i feel that's good tho, more variety
that usually causes the voice to not sound exactly like the original one
I noticed this when training game character models, if I infer voices very different from the character's, the result will be weird (deepl translate XD)
that's interesting, i would assume that monotone voices are easier to make ai's out of than expressive voices, right? but no apparently not
monotone voices are easier to train
oh
but the least accurate ones
yeah that's what i meant
i would assume it'd be the more accurate ones but nah
the apple one is ridiculously accurate though it's crazy
too bad it's just text to speech
try to infer voices closer to yours
what do you mean by that
don't try to inference voices too high or too deep
yeah no it wasn't
it was a regular VC i was in and i recorded myself in audacity
i didn't sing or go higher than i usually go
or lower
my model always sound like the original person regardless of what i try to inference because the dataset has every pitch he can do


skill issue
how the model went?
i think that's why it's harder to train ai's on me
still testing
good when he inference monotone voices
bad when he tries something too different from his
due to his dataset lacking pitches
also known as monotone voice
gotta get a dataset of u playing a fun game💀
LMAO
it was
it was a dataset of me talking to my friend and playing lots of games
no need to go crazy in the set in order to have near perfect voice similarity
just have recordings of your whole voice range
yeah nahhhhh
but it was my whole vocal range
my model was trained using regular conversation, nothing crazy
me straining my voice isn't a part of my vocal range
i strain when i sing
i don't strain when i talk
so the dataset was me talking comfortably and i wasn't straining
speech models arent good at inferencing singing i already told u that
then why'd it work better on singing than talking?
imma try different speaking audio
fitness gram pacer test
thats what u believe
also the original pretrain was trained using very monotone speech
oh
hm
my highest note i've hit was in the 3rd octave
that's quite low
is it not
i think i just need to get better and more expressive
def more of a me fault than ai fault
something happened
rip 😭
it was too long i think
i cut it down and it work
maybe too high
Ok, so. I did all this (properly, I think?)
I can now run Applio. The only error I got when starting it, was
```
INFO: Could not find files for the given pattern(s).
```
Which is odd, but I assume not problematic.
I'm trying to run an AMD 9070 with Applio. Noobies was trying to help me get it set up yesterday, which I'm grateful for.
Buuut I do have a problem. Which is that, when I try to use inference, or TTS, pretty much nothing happens. TTS gave me a driver crash every time I try it. My GPU usage doesn't move on either. On inference, my CPU will use about 15-20%. The Applio cmd window says "Compilation is in progress. Please wait...", so I waited for 15 minutes, and nothing seemed to happen. I've tried different voice models and audio clips. Just for the sake of testing it, I cut an audio clip down to 2 seconds. That's the one I waited 15 minutes on, with no result.
you did not read the installation note for amd
Oh! I did read that, but I thought that was with the first startup of Applio. So the first time I started it up using the run-applio-amd.bat, I just went afk for about an hour, and came back. My apologies!
no, it only does it when you run inference and training
inference has a small subset of computing tasks, but anyway.. should take ~20 min
index is a lookup table that translates voice features.. like english audio 'the' into german voice model's 'ze'
use index search > 0
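A toy version of the lookup-table idea: for each feature from the input audio, find the closest feature stored in the index and blend toward it by the index rate. Real RVC builds the index with a vector-search library and uses several neighbours, so this single-nearest-neighbour version in plain Python is only a simplified illustration with hypothetical names and made-up vectors.

```python
def nearest(query, index_vectors):
    """Stored vector with the smallest squared distance to the query."""
    return min(
        index_vectors,
        key=lambda v: sum((a - b) ** 2 for a, b in zip(query, v)),
    )


def apply_index(feature, index_vectors, index_rate):
    """Blend the input feature with its nearest neighbour from the index.

    index_rate 0 keeps the input untouched; 1 replaces it with the match.
    """
    match = nearest(feature, index_vectors)
    return [index_rate * m + (1 - index_rate) * f for m, f in zip(match, feature)]


# Made-up 2-D features standing in for the model's voice features.
index_vectors = [[1.0, 0.0], [0.0, 1.0]]
print(apply_index([0.9, 0.1], index_vectors, 0.0))  # input unchanged
print(apply_index([0.9, 0.1], index_vectors, 0.5))  # halfway toward the match
print(apply_index([0.9, 0.1], index_vectors, 1.0))  # fully the stored feature
```

With the rate at 0 the index does nothing, which is why the advice is to keep index search above 0.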
Okay. I got the first Zluda thing done. Now it's just taking a couple of seconds on inference and TTS. However, audio is completely blank. On TTS, there's just a Pop! sound at the start, and end. On inference, it's purely silent the entire time. I opened it in Audacity to double-check, and yeah, it's just a flat line.
Sorry for being a pain. I am trying to search if there's a solution to my problems, before I ask.
why cant i send pic here idk how to describe the problem
inference empty too?
close applio, edit app.py and comment out line
import rvc.lib.zluda
with #
then try again
The place where I input audio is working. But the output (Bottom right of Convert), has no audio being output.
I did as you said, so now app.py has
```
# import rvc.lib.zluda
```
Now, as soon as I click Convert, the Applio command prompt gives:
```
  File "C:\Applio\env\lib\site-packages\torch\functional.py", line 665, in stft
    return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore[attr-defined]
RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR
```
Oops, that's only the last line. It starts with
An error occurred during audio conversion: cuFFT error: CUFFT_INTERNAL_ERROR
And then it shows a total of 8 files. I can paste it all in here, if you like. And again, thank you for trying to help me with this! I do appreciate it.
okay, revert back
I want to ask, I want to dub my own animation using rvc for female voice. If I use rvc will I get into trouble like copyright. How do I know which rvc model doesn't have that problem?
I'm trying it in fp32, using Edge now instead of Firefox... Juuust in case that matters!
Annd it just finished. Same thing. No audio.
And I did restart Applio entirely, after changing to fp32.
Is rvc/infer/infer.py The only part I need? Or should I be using everything in that page?
just that for inference
I did try to use just the infer.py changes (By opening infer.py, and then copy/pasting everything from the new one on github. If there was a better way to do this, that just shows how ignorant in all this I am!)
Annd I get yet another error. It ends with ModuleNotFoundError: No module named 'soxr'
But starts with Traceback (most recent call last):, then lists 12 files. Last file being C:\Applio\rvc\infer\infer.py
It is one thing after another with me, I'm so sorry!
RuntimeError: expected scalar type Half but found Float
I tried on fp16 and fp32. Same thing either way.
where?
only means you missed some .float()
hm
or perhaps need to also change lib/predictors
```
  File "C:\Applio\rvc\infer\infer.py", line 602, in convert_audio
    audio_opt = self.vc.pipeline(
-------
  File "C:\Applio\env\lib\site-packages\torch\nn\modules\conv.py", line 306, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: expected scalar type Half but found Float
```
I cut out a *bunch* in between, so it'd fit here.
There were like 10 or so other files.
okay, change rvc/infer/pipeline
and those two lib/predictors
or alternatively, just clone the whole repo
... Ok, forgive my noobishness 😛 (Yet again)
I'll open cmd in Applio's folder, and do
git clone https://github.com/IAHispano/Applio/commit/837b945c69d3c057e5de452ef5af7d252ca8293b
Correct? I don't wanna mess things up.
(With that github link, being the one you sent a bit ago)
I have to step away for 30 mins or so. Once again, thank you so much for taking your time to help me!
can someone help me make a stupid ai song cover i needed it for a video but i tried half an hour yesterday trying to make one with rvc and it always shows some error
nobody can help if you just say "always shows some error"
wdymm?? i know nothing about this and i cant send image here😭😭 i have no clue
we do not possess a ouija board that would tell us what error you saw
no telepathy either
well uhm sorry for bothering then i actually have no clue how to say the error
by providing a screenshot?
okay okay i will thanks 😭😭
Surprise, surprise!.. It's me again >.>
I backed up my Applio folder first. Then downloaded the file you linked here, updated my current install with this new main.zip.
And I'm greeted with a new error, sadly.
```
  File "C:\Applio\env\lib\site-packages\gradio\queueing.py", line 624, in process_events
    response = await route_utils.call_process_api(
--- (lots more files here, ending with:)
ModuleNotFoundError: No module named 'rvc.lib.algorithm.generators.hifigan_mrf'; 'rvc.lib.algorithm.generators' is not a package
```
I went ahead and tried to do `env\python -m pip install`, along with the two names of the things listed, since this error seemed the same as the `soxr` one from before. But both times I got an error saying `Could not find a version that satisfies the requirement rvc.lib.algorithm.generators.hifigan_mrf (from versions: none)`
I ran the error through ChatGPT (for better or worse?), and it said to put a blank file called __init__.py in rvc/lib/algorithm/generators/ (assuming the hifigan_mrf.py file exists there, which it does)
That got rid of the error, but I'm back at step 1: No audio comes out from anything I do in Applio.
Yup! The file is there. But as soon as I click the Convert button in Applio, it gives me the error
ModuleNotFoundError: No module named 'rvc.lib.algorithm.generators.hifigan_mrf'; 'rvc.lib.algorithm.generators' is not a package
Well, it has the whole
```
  File "C:\Applio\env\lib\site-packages\gradio\queueing.py", line 624, in process_events
    response = await route_utils.call_process_api(
--- (lots more files here)
```
Stuff above that.
that's on you.. i've just installed that main.zip, and everything works
so perhaps you did not unzip things properly, did not move all files over
clone from the repo as fresh installation, not doing in-place update
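On the "is not a package" error: Python reports that when it finds a plain module where it expected a folder. One possible cause after an in-place update (an assumption, nothing in this chat confirms it) is an old `generators.py` left over next to the newer `generators/` folder, since overwriting files never deletes ones that were removed upstream. A sketch that lists such clashes; `shadowing_modules` is a hypothetical helper, not part of Applio.

```python
from pathlib import Path


def shadowing_modules(root):
    """Find name.py files that sit next to a name/ directory under root."""
    clashes = []
    for py_file in Path(root).rglob("*.py"):
        # generators.py clashes with a sibling folder called generators/
        if py_file.with_suffix("").is_dir():
            clashes.append(py_file)
    return sorted(clashes)


# Example: run from the Applio folder and inspect anything it prints.
# for clash in shadowing_modules("rvc"):
#     print(clash)
```

A fresh clone avoids the problem entirely because there are no leftovers to clash with, which fits the advice above.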
Just to double check, I redownloaded it. Extracted using 7zip. Opened the newly downloaded Applio-main folder, where I then see all the files (Such as run-applio.bat).
I select all files, then go to my main install folder of Applio (C:\Applio), and move all the files there. It asks to overwrite, I click yes. All the files move over. Then I get that error again.
also the main repo is the most recent compared to the precompiled one
I'll probably have to go through all my previous steps again, to fix all the issues I've been having, from step one. I will try this though.
well, the only thing you need to do is run the install and then replace torch and patch zluda
That's what I tried the first time around, and now we're here! I have no luck with this, it seems. I'll try it all again, but probably tomorrow.
Thank you for all the help though. Sorry if I've been a pain! I've been trying, but am waaay out of my element with all this.

try asking the voice owner for permission
Hi, just wanted to ask about how people uploading their models have those vocal only files to show them off? Does anyone know what website they use to produce those?
You mean by a voice model, right? Most of model makers here use an RVC program to train voice model.
Yeah. For a project im doing I need the vocals standalone so i can edit them properly myself, so I was wondering what program they use
Or programs*
Guys i'm having trouble making the VC client work. The only thing that works is Beatrice T^T Also can someone send me or at least message me a tutorial on downloading the VC client properly? I'm struggling so hard 😭 Also i have Python and PyTorch and CUDA installed, just tell me if Python and CUDA and PyTorch ain't needed, but overall the question is I want help installing the VC client properly T^T. Also I'm using a PnP speaker and mic, not a cable jack or something
I just want some help T^T
Also Please?
I'm gonna wait.
For W-Okada, go to #🔍│help-w-okada. This channel #✨│ai-help is about RVC programs, not realtime voice changer.
Thanks!
it's model files (pth & index) to show, not the audio as dataset
is there a tutorial how to record something?
LIke a song or something with the realtime voice changer
I tried setting up virtual audio cable but im confused about how to route it and where to
rvc stands for retrieval-based-voice-conversion and not realtime voice changer
rvc is its own thing, w-okada allows the usage of rvc models in realtime
anyways, you have to use rvc to clone pre-recorded audio
place your model in the logs folder
then for conversion follow the instructions of the gui
ai-based voice cloning software
does not stand for realtime voice changer
but retrieval based voice conversion
i see
so this discord is things like that mostly? was hoping it was big on music
the discord is about doing rvc models (training)
and using them (inference, realtime inference)
ok thank you
please go to #🔍│help-w-okada for voice changer and read the pinned guide there
but for recording, it's recommended to use applio rvc on prerecorded audios
What is this cell for & how do I use it?
The phase Fixer cell
There are options like "transfer magnitude" "transfer phase"?
what r u talking about?
phase fixer is for instrumentals
it's not related to the rvc I was referring
Okay but how do I use that cell
On what kind of files
it's just an additional tool, read the notes at the bottom for the use case
Uhm
It's for noise removal okay what is "transfer magnitude" for?
go read in wikipedia to understand about magnitude & phase in fast fourier transform
Okay
well not the right way to ping me cuz I thought you referred that topic with yours that is completely different
Hmmm
where's can I read the guide about applio rvc? I didn't find anything in pinned
https://applio.org/
is this the correct site?
elaborate:
- your pc gpu
- what you want to do
i use a 3080ti
all i want to do is record the voice changer really into audio files so i can use it to do stuff
though i use prerecorded audio
RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2); it runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models
Wokada = uses RVC for realtime inference
so, you need RVC or wokada?
also your gpu is good
btw, Applio is an RVC Fork (modified version)
i see
UHm well
I'm not sure what i need?
I just have pre recorded audio clips which i need to turn into a funny voice..
I already have the models
they are rvc models or something
I'm guessing RVC, wokada is a realtime version for VCs basically
what's your OS?
alright you can just follow https://docs.aihub.gg/ which gives RVC info, like using and downloading it
Last update: Oct 21, 2024
this is our documentation
applio also got their own documentation https://docs.applio.org/applio
Thanks so much!
RVC and W-Okada are two different programs. In this context, RVC isn't an abbreviation of realtime voice changer.
where do i find the pitch
Transpose?
maybe?
there is this extended pitch slider
so I assume there is an option for this like volume
what program
it's general settings in voice ai
rtx 2060s
Thanks vtarcelia for corrections, Nick088 for contributions. Most technical information comes from deiteris.
Latest Version b2332 from December 2024
RTX 5000 series support is here, but not integrated into w-okada itself, it is a stand-alone release. You can get it from here
Translations (outdate...
any other questions in #🔍│help-w-okada
thanks man big help
how do i get to the google colab for applio
Okay, thanks for clarifying
first of all, tell:
- your pc gpu
- what you want to do
I'm getting an error on Applio that says "No module named 'tqdm.auto'". Any help, please?
elaborate:
- your pc gpu
- what u want to do
- what guide link are u following
which fork/guide am i supposed to install/follow to run local inference on apple silicon?
latest applio would use CPU for inference
ah that sucks
MPS torch is not there yet
for how much they push the AI capabilities of their arm chips you really cannot run much on them besides LLMs
at least UVR is fast on it
omp_num_threads = 1, otherwise it just hangs after 1st inference
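The setting mentioned above is presumably the standard OpenMP thread limit, usually set through the `OMP_NUM_THREADS` environment variable. That reading is an assumption; where exactly Applio applies it is not shown in this chat. A minimal sketch of setting it from Python:

```python
import os

# Set this before torch or numpy are imported; libraries typically read it
# once at load time, so setting it later may have no effect.
os.environ["OMP_NUM_THREADS"] = "1"

print(os.environ["OMP_NUM_THREADS"])
```

Limiting to one thread trades some CPU speed for not hanging, so it only makes sense on machines that show the hang.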
apple has its own AI model, it is tricky to convert RVC to it
Core ML
Anyone have any idea why AttributeError: 'NoneType' object has no attribute 'pipeline'
Is showing up every time I try to synthesize on Applio?
huh wasn't it there but then removed bc of macs overheating?
elaborate:
- your pc gpu
- what guide link are u using
- what u want to do
- a screenshot of the error
you get that error when you did not select a voice model
it just buggy, hangs up after 1st inference
cpu works fine
since I dont have mac I cant do anything about it
im guessing any overheating reports would come from macbook air owners since those are fanless 🤔
and that guide fixes that issue?
I checked it fast and it seems enabling mps
I think it was made before we nuked it 🙂
so 3.2.7 should work?
technically it can be enabled back
rvc/configs/config.py
but I dont recommend
how do i increase the volume, I am using this atm
volume increase nowhere to be found
wrong channel again
use #🔍│help-w-okada

oh
can someone tell me why the upload button doesn't work while uploading models
@low shard
i'm kinda curious to see what it'd sound like if i made an ai of me talking and singing for 3 hrs 😭
it's 3 hr long audio file of me
insane
probably would be better than my current one
it loads the entire audio to memory like what UVR does
unless the audio is split at most 30 mins each
depending on ram capacity
is applio good for mac if I want to use ai voice model to say a specific word?
it may run in cpu mode, or in mps (rather experimental, only for apple silicon M series)
Ok.
Intel or Apple Silicon Mac? Applio will sure work with only CPU.
it's still plenty enough for most tasks but splitting is always a good thing to do for streams/podcasts. I usually did it around 5-15 minutes each or per session.
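Splitting a long recording into shorter files can be done with Python's standard `wave` module for plain PCM WAV files. The sketch below reads one chunk at a time, so the whole 3 hour file never has to sit in memory. The output naming scheme and the function name are made up for illustration, and compressed formats like mp3 would need a different tool.

```python
import wave
from pathlib import Path


def split_wav(path, chunk_seconds):
    """Write consecutive chunks of at most chunk_seconds next to the input file."""
    path = Path(path)
    outputs = []
    with wave.open(str(path), "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = path.with_name(f"{path.stem}_{index:03d}.wav")
            with wave.open(str(out_path), "wb") as dst:
                dst.setparams(params)    # same channels, sample width, rate
                dst.writeframes(frames)  # header length is corrected on write
            outputs.append(out_path)
            index += 1
    return outputs


# Example: 15 minute pieces, in line with the 5-15 minute range mentioned above.
# split_wav("my_long_recording.wav", 15 * 60)
```

The cuts land at fixed times, so a word may be split across two files; cutting at silences by hand or in an editor gives cleaner pieces.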
where do you download the voice changer (not models)
thanks
Not sure if this is the right place and I don't have the software... but what software do I need to mess around with songs so that I can convert one voice to another?
are u still having issues? elaborate the program and the issue deeply
RVC = Retrieval-based-Voice-Conversion, the best speech-to-speech AI models (on v2); it runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models
Wokada = uses RVC for realtime inference
I'm guessing you want RVC
Ahh... thank you.
And is there a specfic website to grab this software?
what's your pc gpu?
Nvida, I think.
that's just the brand, nvidia made a lot of gpus
Oh...
some nvidia gpus could be too old, some others are good, some others are faster
I can't know if your gpu is good enough by just the brand
GeForce RTX 3060 Ti
there's no such thing as the rtx 1060 ti, rtx started with the 20 series, before that it was gtx
ohh 3060
nice
I made a typo!
Thank you.
yw and lmk for issues
I am blind as a bat. So I'm on the huggingface.co site, and I don't see the download button on the right hand side. Do I need to sign up?
nope you don't need to sign up, open the link on the guide, go to windows (guessing you're on windows), select the only .zip file, then do as said in the pic
Doesn't look like that on my end, but I'm gonna guess it's the ApplioV3.2.8-bugfix.zip
yep, u gotta click that
Thank you. I got it now. 🙂
I found a setting I like, but how do I make it lose the autotune in the voice on some of the higher notes?
it might depend on how the model was trained
Oh.
you can either try to lower the index (which contains the accent), or use another one
looks like its a model problem
or using crepe with a high hop length can cause that too
autotuned voices could feel less natural
fair...
does anyone know how many epochs i should set for like 5mins of audio?
no its fine
im just tryna get epoches for a 25 second dataset 
the guy im doing barely talks
So I am using RVC v2 Disconnected, I have also done Feature Extraction and got this error. What did I do wrong?
/content/Mangio-RVC-Fork/logs/Kilixa_v1/3_feature768
Exception Traceback (most recent call last)
<ipython-input-38-86df4775937e> in <cell line: 0>()
29 listdir_res = list(os.listdir(feature_dir))
30 if len(listdir_res) == 0:
---> 31 raise Exception("No features exist for this model yet. Did you run Feature Extraction?")
32
33 try:
Exception: No features exist for this model yet. Did you run Feature Extraction?
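The traceback above is raised when the feature folder is empty. A quick way to confirm that before running the cell is sketched below; it assumes the extracted features are saved as `.npy` files in that folder, and `count_feature_files` is a hypothetical helper, not something from the notebook.

```python
from pathlib import Path


def count_feature_files(feature_dir: str) -> int:
    """Return how many .npy files are in feature_dir (0 if the folder is missing)."""
    path = Path(feature_dir)
    if not path.is_dir():
        return 0
    return sum(1 for item in path.iterdir() if item.suffix == ".npy")


if __name__ == "__main__":
    # Path taken from the error message above.
    folder = "/content/Mangio-RVC-Fork/logs/Kilixa_v1/3_feature768"
    print(count_feature_files(folder), "feature files found")
```

If this prints 0, feature extraction either did not finish or wrote to a folder with a different model name.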
For RVC fandubbing projects
Problem solved
RVC v2 is outdated and will NEVER come back, do NOT use it
you shouldnt use it at all
it's based on mangio fork, an rvc fork abandoned since 2023
its slower than mainline and other rvc forks
and is more unstable
@livid cosmos #📰│dev-updates message , do not use rvc disconnected
whats your pc gpu and what do you want to do
Gigabyte GeForce RTX 3060 Gaming 12GB, and I want to train a voice model for myself
It's crazy to think that people are still using RVC v2 Disconnected in 2025, even though Mainline and Applio are available.
That GPU should be good enough for voice model training, with that VRAM.
Just give the local program a try.
Which is the best way to do it locally?
you don't even need to use google colab, google colab is a cloud (remote good pc service) for people with a bad pc
use Applio
locally
Applio is easier to use. It has web UI.
I got used to it before I didn't have a good GPU/PC xd

But yeah, I will try with Applio then
you gotta always adapt to updates 🔥
do these look fine
ive ignored any tensorboard talk since it makes my head burst so idfk atp
@austere nexus ma'am what the hell r u doing here
LMAO WHY CAN'T I
Why did you tag my msg that's like 16 years ago and LMAO at me 😭😭💀
it's the most recent one you sent 😭
for only 2k steps it is fine
she sounds like you put the saturator on max in fl but whatever
gonna let it cook fo a bit 
yo how can I change the effect of a voice model? i.e. i wanna sound to 25% like juice, but still recognize my voice, dm me if you can help
Can I find some modules with a Russian accent here? Because using an English one causes some issues
you can't like partially apply the voice, but you can adjust the index rate
also if you're talking about realtime voice changer, that's wokada, use #🔍│help-w-okada
be sure to not use yt tuts
u mean models?
You can search rvc ai voice models at:
- #1175430844685484042
- In #🔍│find-models , Do /find with @earnest musk
- https://weights.com/ (login required)
- https://huggingface.co/models (but watch out cus in hugging face there arent only rvc ai voice models)
- https://voice-models.com/
- https://thevoicemodels.com/ (for Turkish Models, login required with discord and level 2 on their server)
if there isnt one, you can:
- #1159289738314919936
- #1191429836321849435
- make it yourself with our docs guides https://docs.aihub.gg/essentials/how-to-make-voice-models/
:wave: @low shard, How can I help?
Available Commands:
• @weights find <query> or /find <query> - Search for RVC Voice Models
• /create - Create an AI Cover
• /image - Generate an Image
alright what do i do when im n this screen
maybe help instead of being a dickhead
which one are u using
tbh I would suggest you to use applio since its more updated
there is a written guide
i have applio
had it for a while
be sure u got the latest version
for some reason whenever i do the run applio bat thing nothing opens in my browser
delete the one you got, read the guide I sent, and download it from there
alright Ill get back to you shortly
this the download right @low shard
yes
can anyone help me? i download a model in Applio the model does not show up
nevermind
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
Hola alguien tiene link de rvc? [Spanish: "Hi, does anyone have an RVC link?"]
This server is English only
Also elaborate ur PC GPU and what u want to do
U sure u extracted it before
Go on the top bar, type CMD and enter, then type run-applio.bat and show the output
alright it worked
whenever i upload my index i get this error
Did u upload the pth first, wait for it to upload, then the index?
Also share which index file download link
Can someone help me with this?
FileExistsError Traceback (most recent call last)
<ipython-input-12-e49906b94ca3> in <cell line: 0>()
17 if os.path.exists(time_):
18 shutil.copy(time_, time__)
---> 19 shutil.copytree(source_path, destination_path)
20 print("Model backup loaded successfully.")
2 frames
/usr/lib/python3.11/os.py in makedirs(name, mode, exist_ok)
FileExistsError: [Errno 17] File exists: '/content/program_ml/logs/Ahyeon
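The `FileExistsError` above comes from `shutil.copytree`, which by default refuses to copy into a folder that already exists. Since Python 3.8 it accepts `dirs_exist_ok=True` to merge into the existing folder instead. The snippet below is a general sketch of that behaviour, not a patch for the Colab cell; `restore_backup` is a hypothetical name.

```python
import shutil


def restore_backup(source_path: str, destination_path: str) -> None:
    """Copy a backup folder into place, merging into the folder if it exists."""
    # dirs_exist_ok=True (Python 3.8+) avoids FileExistsError when the
    # destination folder is already there; same-named files are overwritten.
    shutil.copytree(source_path, destination_path, dirs_exist_ok=True)
```

Another option, if the existing folder in the session is not needed, is to delete it before loading the backup again.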
dms
There's no need, u can share the download link here
Elaborate:
- ur PC GPU
- what u want to do
- what are u using
- what did u do
Applio colab
You need to elaborate everything I said
Not just one single part
Also there isn't only 1 applio colab, there's the UI version and the no UI one
Hello, its possible to make a release pre-compiled of Applio included the latest torchaudio for RTX 5000 users ?
How to (unofficially) use Applio with RTX 50 series cards
Download it as described in https://docs.aihub.gg/rvc/local/applio/
After you extract the precompiled version, go to its folder in Windows Explorer, type "CMD" in the top bar and press Enter, then in CMD run env\python -m pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
If you get a "requirement already satisfied" message, run env\python -m pip uninstall torch torchvision torchaudio, then run the install command above again
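To check whether the reinstall took effect, something like the following could be saved as a file (the name `check_torch.py` is just an example) and run with `env\python check_torch.py`. It is a sketch that only uses standard PyTorch calls; `describe_torch` is a hypothetical helper and not part of Applio.

```python
def describe_torch() -> dict:
    """Collect basic facts about the installed PyTorch build and visible GPU."""
    info = {"torch_installed": False}
    try:
        import torch
    except ImportError:
        return info
    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    info["built_for_cuda"] = torch.version.cuda  # None on CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        info["gpu_name"] = torch.cuda.get_device_name(0)
        info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info


if __name__ == "__main__":
    for key, value in describe_torch().items():
        print(f"{key}: {value}")
```

If `cuda_available` is False, or the CUDA version shown does not match the cu128 build that was requested, the old torch is probably still installed.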
Last update: Apr 01, 2024
I use Applio UI Collab and when I run the Load Auto Backup cell I got this:
Ok thank you Nick
Yw and lmk
Also, what's your pc GPU? And tell what you did step by step, you might have skipped a step
I installed the latest version of Applio in queue, then mounted the Drive, then connected the backup, loaded the model and got that error, and the files and name were correct (for retraining) and finally downloaded the custom pretrain to run Applio finally but I can't do it because of that error
he was asking for your gpu, if you have at least 3060 or 8 gb card, it's recommended to train locally
I use mobile haha it worked fine until now
just put pth and index into /logs/mymodel
That still doesn't answer what he asked for, but close enough to where you don't even have a PC.
I've been training on my cell phone for two months. I always made my models on my phone. It never gave me any mistakes, only on that model.

then explain the issue and error message
where do i change voice for singer?
it sounds so fucked at the higher pitches but who gaf
How can I fix this?
use an up to date colab
@viscid moss check this out
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
Fixed
Nah, some dependency was broken so I removed it.
oh lol
RVC-AI-Cover-Maker-UI
kaggle
only happens when trying to switch the Dereverb Model
Same with switching the Deecho one
@viscid moss 
Hi, does anyone have a link to EasyGUI google colab with the UI interface?
EasyGUI has been outdated for months, don’t use it
the creator is busy irl
what’s your pc gpu and what do you want to do?
uh really? what can I use?
I've used that a while back to train 3 voices from audio and it was pretty easy to use
I want to run on colabs not local possibly
ai progresses fast, things change a lot, never stick with one program for many many years
I’m asking because local is actually better than cloud (remote good pc), you won’t have a random limit of max 4 hours daily of gpu
however that depends on how good is your gpu
i mean if you really don’t wanna tell it’s fine, i’m just trying to help out
yeah, better on cloud if there's any option
btw i would still need to know what you need to do, so i know what colab to give you
there’s colab only for inference, others only for training, others that are more automated for ai covers, others for both training and inference yk
ah yeah, I want to train a voice model from an audio recording so that it gives me back a pth and a index file
in order to use the model in applio
I need only training
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest free way, use https://weights.com/, which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.com: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio UI Colab: RVC Fork with some extra features like TTS
- RVC AI Cover Maker UI: Automatically Separates the Vocals and Instrumentals, converts the voice and mixes them back
you should try another working colab, i.e. applio
https://colab.research.google.com/github/iahispano/applio/blob/master/assets/Applio.ipynb
can I train in applio too?
for example kaggle gives more time and better gpu than colab but its harder to setup
yeah, tbh i would suggest the kaggle version of applio for training
kaggle is an alternative of colabs right?
Colab: random daily gpu of max 4 hours
Kaggle: 30 hours of free gpu weekly granted, with better gpus, but needs phone number and is harder
Both are owned by google
yes, they are both cloud (remote good pc)
is it complex to train from audio?
i mean u would just need to follow the guide
sure I will
From your experience any recommendation for settings to get a better result?
the secret of good result is trying a lot and failing a lot until you figure what works best for your specific dataset
Hmmmm I'll try to fix it
guys
how can i use the huggingface models in rvc
can someone give like a tutorial please
@long forge
@acoustic scarab
on Pinokio
I'm pretty sure that u are using an outdated version or u are using the outdated link from Shiromiya
Here's the new link:
https://www.kaggle.com/code/eddycrack864/rvc-ai-cover-maker-ui
-kaggle
- Applio Notebook, by Vidal Kaggle
- Applio Notebook, by Shirou Kaggle
- Music Source Separation, by Shirou Kaggle
- UVR5 NO UI, by Eddy Kaggle
- Original W-Okada's Voice Changer, Kaggle
- Modified W-Okada's Voice Changer, Kaggle
- 🆕 UVR5 UI, by Eddy, ArisDev & Nick088 Kaggle
- 🆕 RVC AI Cover Maker UI, by Shirou & ArisDev Kaggle
- 📖 How to use RVC Mainline on Kaggle by Cauthess
Note: Kaggle limits GPU usage to 30 hours per week.
Kaggle is a Cloud (Remote Good PC) Service that offers 30 hours of GPU weekly, but needs a phone number verification
@low shard Kaggle links need to be updated
the sapphire ones are the updated ones, I'm working on it
oh right u took ownership of rvc ai cover maker
gonna fix it soon
What are some good websites to find rvc voice models?
You can search rvc ai voice models at:
- #1175430844685484042
- In #🔍│find-models , Do /find with @earnest musk
- https://weights.com/ (login required)
- https://huggingface.co/models (but watch out cus in hugging face there arent only rvc ai voice models)
- https://voice-models.com/
- https://thevoicemodels.com/ (for Turkish Models, login required with discord and level 2 on their server)
if there isnt one, you can:
- #1159289738314919936
- #1191429836321849435
- make it yourself with our docs guides https://docs.aihub.gg/essentials/how-to-make-voice-models/
Can anyone make a new RVC V2 api on replicate?
I don't think anyone is going to do that, we mostly use free stuff, like colab, kaggle, lightning.ai and ofc hosting things locally
Is there a video tutorial on lightning.ai about RVC V2?
youtube tutorials are old, don't use them at all
the only updated guides are written ones
by the way, what's your pc gpu and what do you want to do?
I have an intel gpu, nothing else
oh, I'm guessing an integrated intel gpu in your laptop?
well, yeah you could run it locally but would be slow as hell, cloud (remote good pc) is way better in your case
just a reminder:
RVC = Retrieval-based Voice Conversion, the best speech-to-speech AI models (on v2); it runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models
Wokada = uses RVC for realtime inference
I can give you cloud links, but I need to know what program you need to give you them
I use this Kaggle one and it still does the same thing
Hmmmm
I tested with normal deecho and bs roformer dereverb, lemme try those ones
cause the aggressive one is too aggressive and bs roformer de-reverb is better than MDX23C
it only happens when I try to switch it to something like bs roformer de-reverb or change the Deecho Model to Aggressive
The normal settings works fine. It’s just that somehow switching it causes the error.
wait, you have the problem even with the BS Roformer Dereverb?
🧐
weird
got an error but not the same error as u
mdx23c dereverb seems to not be working
ok found it, if u choose MDX23C dereverb it gets bugged, so if u change to roformer again it will not work
Yep, MDX23C dereverb and Deecho Agg aren't working. The rest of the models are working
Yeah I reset and chose just roformer for dereverb and it's working
Just Deecho Agg and the other Dereverb are not working
Ye, for now use it like that. I'll improve it when the CoverMaker rework is ready
@low shard Hey Nick,could you look at the discord private dm please?
the newer anvuew mel is better
though it doesnt remove delay echoes
ye but haven't added that model yet
It will come with the rework of CoverMaker
hello beautiful people, if there is anyone out there that knows how to create ai voice models for rvc and has a bit of time to show me how, a dm or mention would be very greatly appreciated <3
In the context of RVC, the dataset is an audio file containing the voice the model will replicate. It can be either speaking or singing.
if you're training the same model on different days, do you need to run the "EXTRACT FEATURES" section again the 2nd or 3rd time in google colab?
Hey everyone, I’m running into an issue with TensorBoard. Here’s the error I’m getting:
tensorboard_venv already exists, skipping creation and activation...
Launching TensorBoard...
'tensorboard' is not recognized as an internal or external command,
operable program or batch file.
Keeping the command prompt open...
Press any key to continue . . .
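That message means Windows could not find a `tensorboard` executable on PATH; the log also says the venv activation was skipped, which may be related. A small diagnostic is sketched below. `tensorboard_status` is a hypothetical helper, and it should be run with the same Python the launcher script uses.

```python
import importlib.util
import shutil


def tensorboard_status() -> dict:
    """Report whether TensorBoard is reachable as a command and as a package."""
    return {
        # Full path of the tensorboard executable, or None if not on PATH.
        "on_path": shutil.which("tensorboard"),
        # True if this Python interpreter can import the tensorboard package.
        "package_installed": importlib.util.find_spec("tensorboard") is not None,
    }


if __name__ == "__main__":
    print(tensorboard_status())
```

If `package_installed` is True but `on_path` is None, launching it as a module with that interpreter (`python -m tensorboard.main --logdir <your logs folder>`) is one thing to try; if both are missing, TensorBoard is simply not installed in that environment.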
Alr
What google colab link? What's your PC GPU?
I really hope you're not using rvc disconnected or other old yt tuts
What RVC program are you using? Can you give the name of it?
RVC1006Nvidia https://docs.aihub.gg/rvc/local/mainline/
And what is your PC GPU?
Because the latest update for this RVC program is from 2023. There's a better one available.
rtx 3060 12gb
where pls
There's Applio. The UI overall looks similar to "mainline" RVC, but it's easier to use, and it has TTS built in it. https://docs.applio.org/applio
ty
and btw how many minutes of audio data do you recommend I collect? Also, where can I find good quality datasets pls?
?
30 - 60 minutes of combined dataset audio is enough, if you mean for training a voice model. I'm not sure where a good source of audio datasets is; it can be YouTube videos containing spoken audio.
bcs I want to make a model to use with Okada, so it needs to be the highest quality possible. Do you think 60 minutes is enough?
Yes, but the audio quality is also what makes a voice model to sound good.
While I can only give some basic ideas on how to train a voice model, some more advanced things go for some other helpers instead. If you have any problem training a voice model, you can go to #📑│making-models.
but not from different sources
being shorter but sticking to one source is better
sacrifice quantity before consistency
5 minutes of dataset from a single good-quality source is better than a longer one that mixes multiple sources or is less consistent
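To see how many minutes a dataset folder actually adds up to, a sketch like this can help. It assumes the clips are plain PCM `.wav` files sitting directly in one folder (the standard `wave` module cannot read other formats); `dataset_minutes` is a hypothetical helper.

```python
import wave
from pathlib import Path


def dataset_minutes(folder: str) -> float:
    """Return the combined length, in minutes, of all .wav files in folder."""
    total_seconds = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as reader:
            # Duration of one clip = number of frames / frames per second.
            total_seconds += reader.getnframes() / reader.getframerate()
    return total_seconds / 60
```

This only measures length; it says nothing about whether the clips come from one consistent source, which is the part the messages above stress.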
I've tried a couple times to make a cover, but I keep getting a runtime error saying I don't have enough memory when I know I do. Is there something I'm doing wrong?
[Voice Changer] MMVC_SocketIOApp initializing... done. how do i fix this
I don't know why it keeps happening
you may want to start with explaining what's your GPU, what error message you're getting
I don't have a GPU, it reverts to CPU and the message is
RuntimeError: [enforce fail at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 379509352 bytes.
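For scale, the number of bytes in that error can be converted to MiB. The arithmetic below is exact; reading it as a sign that free RAM was nearly used up at that moment is only a guess.

```python
def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 ** 2)


requested = 379509352  # value from the error message above
print(f"{bytes_to_mib(requested):.1f} MiB")  # prints 361.9 MiB
```

So the allocation that failed was only about 362 MiB, which points more toward memory already being full (other programs, a long audio file loaded whole) than toward one huge request.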
which application are u using?
ye you should use cpu mode
I am using CPU
RVC Gui, or do you mean something different?
it's too old



