#✨│ai-help
Does anyone have a good voice changer recommendation i should use? i have a 5060ti 16 gig. and i need to use certain ai voice models.
it runs wayyyy better
I don't understand this "fronted" thing
if you're using old models most likely it will not change when using a different voice changer
from how long ago?
i'd suggest retraining it with applio, using the legacy core v1.5 pretrain
it's never recommended to have non-ASCII characters in the names
have u tried vonovox?
fair enough
regular rmvpe is better

if you're on amd ur only choice is onnx
so I assume regular one is better
hey @analog obsidian , I want your opinion on this person arguing that old version is better as well as those other points
please correct anything dumb I've said lately
onnx only exists for compatibility, regular rmvpe is better
for nvidia use .pth, onnx uses more cpu and might affect pitch accuracy
also take a look at the whole convo
same, the f0 accuracy gets affected
it's better on amd
but it tracks pitch worse
@analog obsidian also that argument, where 131k extra on old wokada is equivalent to the maxed extra and 16k extra is equivalent to 2.7s on deiteris/tg-develop fork
max extra technically is better since it's closer to local inference (not realtime) but adds more delay
also convince them that the performance of 3060 or 4060 isn't that bad, except probably for some competitive gaming purpose where a single digit ms delay difference could determine between winning or losing the game
as a 4060 user im not happy with the performance of realtime with this thing lol
yea better gpus have less delay
no, none of AIB 4070 models have x8 lane, only x16
there was a discussion about this in another server, yea actually pcie speed affects realtime delay too, crazy
you prob mean 4060 Ti
I dont think any normal users would deliberately hamper the pcie performance from x16 gen 4 to x8 gen 3 or 2, except some crazy reviewers
even the older Z490 motherboards and the AMD equivalent i forgot have pcie 3 x16 lane. forgot how much the performance hit but it'd be mostly cpu bottleneck rather than pcie (that happens on having not enough vram and forcing to load texture data from system ram in gaming)
depend on what pcie gen
it seems to make sense that pcie bandwidth difference has more impact on ai workloads than just gaming
ye someone said upgrading their pcie 3 to pcie 4 decreased their realtime delay by about 30ms
random obscure fact (?

and even both X870 motherboard and RTX 5070/higher have pcie 5 x16 support, though not sure if it is fully utilized yet
slower gpus are less impacted by pcie difference, e.g. 4060 Ti has some little impact by x8 gen 4 and any slower gpus are less impacted
Many people here have "Hide extensions for known file types" enabled in Windows Explorer, which hides each file's extension for a cleaner look. When they try to open the .zip.001 file (or the apparent .zip in the first screenshot, which is actually .zip.001) with 7-Zip or WinRAR, errors often occur, including the corrupted archive error, which causes a lot of confusion. The fix for the root cause is to open Explorer, go to Settings, open the "View" tab, find "Hide extensions for known file types", uncheck it, then click Apply and OK. Check Explorer again and every file will show its real extension, including .001. If a file still doesn't show its extension, press F5 to refresh. This not only clears up the ".zip.001" confusion, it also makes it easier to tell file types apart (for example, a fake .pdf file that actually ends with .pdf.exe), though some may find the look a bit messier.
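The extension point above can be sketched in a few lines of Python. This is only an illustration: the helper names and the short list of executable extensions are assumptions made for the example, not something from the thread.

```python
from pathlib import PurePath

# Illustrative, non-exhaustive list of extensions Windows will run (assumption).
EXECUTABLE_SUFFIXES = {".exe", ".bat", ".cmd", ".scr", ".com"}


def real_extension(filename: str) -> str:
    """Return the last extension, the one the system actually acts on."""
    return PurePath(filename).suffix.lower()


def is_split_part(filename: str) -> bool:
    """True for numbered volumes such as 'archive.zip.001'."""
    ext = real_extension(filename)
    return len(ext) == 4 and ext[1:].isdigit()


def looks_disguised(filename: str) -> bool:
    """True when an executable extension follows another extension."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    return len(suffixes) >= 2 and suffixes[-1] in EXECUTABLE_SUFFIXES


if __name__ == "__main__":
    for name in ("voice-changer-windows-amd64-cuda.zip.001", "invoice.pdf.exe"):
        print(name, real_extension(name), is_split_part(name), looks_disguised(name))
```

With extensions hidden, both example names would show up in Explorer without their final part, which is why the first looks like a plain .zip and the second like a plain .pdf.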
For a more pronounced performance difference, there's the NVIDIA GeForce RTX 4080. The performance loss can be small or unnoticeable for typical users if either GPU is downgraded to PCIe 3.0 x16 (or x8), but anything below x4, down to x1 (on any PCIe version at or below 4.0), will cause a noticeable to significant performance drop.
hello i have a vps and i want to run rvc on it how do it do it is there anyt tutorial out there
What is the very best option for a 5090, I want very accurate and real time voice changing for use on voice chat
how can i get it to work on a 5060
Try Tg Develop's W-Okada fork. https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/ https://github.com/tg-develop/voice-changer/releases/tag/b2397 Or Vonovox. https://github.com/dr87/Vonovox/releases/tag/v1.6.9 Older versions like b2332 and v.1.5.3.18a are not compiled for RTX 50 series.
@lusty holly you too as well.
Are some websites blocked in your country's internet? What is your PC GPU? And are you looking for Applio RVC or W-Okada voice changer?
thank you ✌️
vonovox says "Cannot start voice conversation - No GPU available. You can still access settings and configuration." even though i have a 5060
try updating the Nvidia driver and make sure you're using the 1.6.9 version, or you can also try the manual setup https://docs.aihub.gg/realtime-voice-changer/local/vonovox/#precompiled-setup-nvidia-on-windows
Last update: November 21, 2025
ok
also the prerequisite as well
Deepwoken player spotted
!howtoask
- Check Docs & Guides: Your answer may already be in the AI Hub Docs or the https://discord.com/channels/1159260121998827560/1159513888199540817 channel.
- Search the https://discord.com/channels/1159260121998827560/1192011222023950368 : Look for existing posts that solve your issue. Do not invade someone else's post.
Tell your:
- Full GPU Name: (e.g., NVIDIA RTX 4060 8gb vram desktop)
- Operating System: (e.g., Windows 11)
- Detailed Description: What were you trying to do and what went wrong?
- Tutorial Used: Link to the guide you were following.
- Screenshot: A picture of the full error message.
To maintain a legal, safe & ethical community, we will NOT provide help for:
- ANY illegal activities.
- NSFW/Porn.
Requests for these topics may be ignored, not helped and result in moderation action.
- Be Polite & Patient: Our helpers are volunteers. You may ping the Helpers role once.
- English Only: Please keep all conversations in English.
- Don't Ask To Ask.
There are options. Weights.com can train a voice model, but the result is usually lower quality than one trained in Applio RVC, and that's really it. What is your PC GPU?
Inference with Applio 3.6.0 on windows 10. I get errors in console
The .wav file producing that error was 60 seconds long. If I use only a 10-second long .wav file, there is no error. How can I tweak Applio settings, or Windows settings, in order to work with longer files?
why is the voice changer app not opening?
what is epocs
See #🧬│ai-chat message.
Bro
?
hey uhh.. is downloading isolated converted vocal isn't free anymore or is it a bug?- a week ago it was until i come back again-
Never suggest weights for training it's bad
hey can someone help me whit the voice changer set up? and where to download
-rvc
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
by IA Hispano
Google Colab
by Hina
Google Colab
by Eddy
Google Colab
by Eddy
Google Colab
by Deiteris & Hina
Google Colab
by Shiro & Eddy
Google Colab
by Nick088
Google Colab
by Nick088
Google Colab
by Jarredou & Makidanye
Google Colab
-train
whats all that?, are these voice changers?
what gpu do u have
AMD Ryzen 5 5500
u can use Wokada tg fork, it's what I use rn it's pretty cool
where do i download that
i download both?
the first one is to have the voice changer connect to games and discord ect, second is the voice changer
ye
thanks can i use these voice models then?
u can use any of them except ones made with refinegan, there aren't many of those at all tho
did u figure out how it works btw or do u need further help
I’m downloading right now if u need help I will tell u
oki
can the audio ‘watashi no koi wa’ also be used as a benchmark for male models?
That audio is mainly used to see how well the model can generalize high notes. For a male voice, I don't think it's very useful
You could make an inference using pitch -6 with that audio if you want, but I would recommend looking for better audio samples to see how well the model generalizes
what's that
this is good for testing a models range of pitch bc it has both a female singer in it and a male singer
If you need urgent help, please checkout our AI Hub Docs or ask for help here following the [Guidelines](#1402790586028789830 message)
aaa okay okay, thanks man
the audio that rvc devs often use to test the models they’ve made (cmiiw)
thanks, i’ll use this audio to test my model
I need the audio to listen to bc I don't have the best memory
aaa i see, wait a sec
np I have plenty of random samples in a folder if u want
ohh yea that one
whoaa, i’ll let you know if i need more samples to use for testing
For benchmarking, I recommend trying different audio files. For example, your model may sound good with the watashi audio file, but it may have other problems. The most common one I see is when they glitch when inferring a sound they didn't learn very well
is testing the weights one by one only for finding a sound that’s not too harsh on the ears, or can it also be used to find results with better pronunciation?
yea can be used for both, i had cases where one early epoch glitched in a part where the latest epoch didnt
there will be a point where the pronunciation is going to eventually stop improving tho
is there a way i can make an ai voice model out of drum toms
and make them actually pitch correct
why not just using a daw rather than making a rvc model
because im curious
so is there actually a shortcut to finding the best epoch? like jumping back 10 or 100 epochs. or is the only way really to check them one by one? i want to look at the tensorboard graph, but i’m training a singing model with batch size 4, so the graph is super messy
i can do it in a daw yeah but im curious on if that works
it works
is there an example of this or do you know itll work because they have a f0
them having a f0 is the reason why they works
i see
this guy makes drums models
i wonder how these drums are made because theyre not "singing"
theyre atonal just like drums
sadly there is no way to tell, i dont like rvc graphs
rvc graphs are not really that useful
to me the most important one is fm, you dont want that one to have high values
whats fm?
haha same, people keep telling me to pay attention to this graph and that graph, but in the end i still end up testing the epochs one by one lol
but the only way to decrease the fm value is to have a big dataset so.. yea
ah and lowering the batch size decreases fm iirc
from loss_avg or loss_avg_50?
imo they're really a waste of time
unironically i use the mainline graphs lol
idk maybe use avg 50 (?
offtopic but i wonder if rvc is ever gonna be continued
we have rvc v2 but nothing of v3 i think
?
idk why, but fm from avg 50 is more noisy than the regular loss
rvc is abandoned, development seems to be ceased
oh
so like its just up to the users to "continue" it
both have a very bad trend

high values and rising
ye, basically, applio have their own "v3" but i did not liked it at all
whats different about it
a new vocoder and a fusion between mpd and mrd
dataset issue?
what about it audibly?
does it pitch weirdly, weird consonants
noise
@analog obsidian I loved reading your LegacyCore thread in voice-models, it was like a "Making of" documentary.
how big is the batch size?
4
xD
and 18 minutes dataset
sometimes i type a lot oops
graphs don't matter anyway, does the model sounds good? thats what matter the most
can we just ignore the fact rvc graphs exists?
they just add fear
lol
i only use them when im training pretrains to monitor stability
So I had to jump in and ask you a few questions, if you don't mind...
wat
this sounds good to me, yea ignore the loss lol
fm raising and having values of 10 is not good tho but at the end the model still sounds good
i'm using your 1.5 pretrain btw
having values of 10 in a pretrain is very bad, but for finetuning i dont think it really matters
the power of legacy core 
no one will know why they chose the worst dataset ever for the og pretrain
lol
m4singer was already a thing in 2023
I read in the guide that in TensorBoard if I filter by g/total, I can watch til it starts to trend up and then stop training. But I just saw you say that you don't rely on those graphs. Does it mean I should continue training and manually check the epochs after the graph starts going up?
g/total is such a stupid metric because it adds the FM metric, which is not a generator metric
😭
it's a discriminator thing
OK, maybe that's why I wasn't happy with the results I got, I probably stopped too early
yea dont watch the graphs, they just add confusion
overtraining is something you hear, not something you see in a graph
g/total raises when fm raises, and fm raises when the discriminator got too good
yea, train the model and watch netflix (forget about monitoring the graphs) 
im sorry for all of the misinformation out there
fm going up and having values over 10 is not good... but at the end you can still have a good model despite having such crazy graphs
My first training session was with the original pretrain cos I wanted to use it as a quality baseline before I try your improved LC2.5. I stopped training after about 230 epochs cos the graph told me too😢 and the pronunciation wasn't as crisp as I hoped. From now on, I'll listen to all the epochs
The most stable og pretrain is the 32k, i know because i have done multiple tests with all of them
Darn,I used 40k cos that was my dataset. I thought, the higher the better
it is better (quality wise) but the og pretrains have critical flaws
honestly just checking is the best, check around after 100 tho
40k and 48k do not work properly with nsf hifigan, they have sibilant and breath problems
found it already ✌️ 😁
but if you want to try high sr i'd recommend legacy core 1.5 48k pretrain (even if you have a 40k dataset)
yk, i’m often torn between choosing 32k or 40k for my singing model. but for some reason, after trying the 32k pretrain, i’m really starting to doubt training a 40k model lol
40k is fine, the quality is there, but it's noisy and the breaths are just worse
Is LC1.5 still out there? Didn't see it on your HF. I read in that thread that 2.5 is the best?
how? won’t the results be messed up if the dataset isn’t 48k?
yep, click the link i sent, the pretrain is there
about which one is the best im not sure... every dataset reacts different to a pretrain
no, who told you that?
Yep, I thought if you use 48 pretrain, the code auto resamples your dataset to 48?
the last time i did that was with the og pretrain, but i don’t know how the results would be with your pretrain
back in 2023 ig
the model just learns the frequency cutoff of the dataset
the sibilants are noisy and the breaths are weird but thats just because hifigan is bad at 48k
lol
ooo i see
the quality is good (assuming you have a high sr dataset)
what is sr? something like sdr?
Some of my dataset is higher quality (highs up to 20 kHz according to Spek) but the rest tapers off at around 16k, but I feel it's still good enough for singing cos I usually roll off those harsh highs anyway.
is this a good way to add non voice stuff to a dataset without them just being in there, the coughing was cut and put somewhere where it sounds natural instead of being on its own
@analog obsidian what do u think? is editing the audio to be more natural going to give better results for human voices?
what batch size should i use for 17 minutes of 44.1khz dataset
should I do 6 (3 on kaggle)
The experts seem to have left, I've read in the docs.aihub guide that if your dataset is less than 30 mins, then go for the batch size of 4. But, the experts here have been debunking that guide all day, so maybe it's not the definitive source I thought it was...🤔
MJ forever, anyway👍
i dont think nsf hifigan can generate natural sounding coughs, let me check
edit: it can generate coughs but eh, its not that good
it will not decrease the quality of the model, just in case
also, you're doing it wrong, it should be like this:
cough voice
the sound needs to be closer to the voice, you put the sound too far away
batch 4
i did
I downloaded Wokada voice changer but the files :
- voice-changer-windows-amd64-cuda.zip.001
- voice-changer-windows-amd64-cuda.zip.002
these files are corrupt, what happened?
oh you're mixing 32k data and 40k data in the same dataset and want to train 40k? hmm, I don't know how “good” this is, but I know someone else tried it and had good results. It's not something I would personally do, though xD
I know that for pretrains it's fine because someone else already tried training a 40k pretrain with a dataset that have some speakers with 32k data
I did the same for the 48k version of 1.5
sid0 is 48k, rest varies between 32k, 40k and 48k
https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/ this is what most use nowadays i think
Last update: November 22, 2025
that and vonovox
Perhaps the result will be a model that sounds like 32k despite having a sample rate of 40k (? idk im just guessing
I simply didn't have enough high quality singing samples. I only had around 13mins which resulted in a subpar first voice model. So I prepared more samples, but they sound like they "lack the high end" and you could see on Spek that they don't have anything over 15k. So, yeah,I was hoping to train my next model using both sets of samples (of course converted to one final sample rate (40k). But still the underlying samples are of different quality. It makes me sad that you think it might not work. I didn't even think of that. And you recommend 15mins minimum dataset for your pretrain.😢
I guess, I could convert ALL samples to 32k and train on your 32k pretrain.
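Before converting everything to one rate, it can help to see which sample rates the dataset files actually declare. A minimal sketch, assuming the dataset is a folder of PCM .wav files (the function names are made up for this example, and Python's `wave` module cannot read float-format WAVs):

```python
import wave
from collections import Counter
from pathlib import Path


def sample_rate_summary(dataset_dir):
    """Count .wav files per declared sample rate (Hz) under dataset_dir."""
    rates = Counter()
    for path in sorted(Path(dataset_dir).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            rates[wav.getframerate()] += 1
    return rates


def nyquist_hz(sample_rate):
    """Highest frequency a file at this sample rate can contain."""
    return sample_rate / 2


if __name__ == "__main__":
    for rate, count in sorted(sample_rate_summary(".").items()):
        print(f"{rate} Hz: {count} file(s), content limit {nyquist_hz(rate):.0f} Hz")
```

The declared rate is only an upper bound. A file resampled up to 40k still contains only the highs it had before, so a spectrogram tool like Spek remains the way to see the real cutoff.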
It will work, hmm but since most of the data is 32k maybe the model is going to sound mostly "32k", but i have no clue, i can't predict it
You can try to experiment and train using the 40k pretrain
doesn't hurt to try
Yep, experimenting is key, I guess. The good thing is that you recommend so few epochs with your pretrain. It won't take too long!😀
also dont worry you can train a dataset below 15 min, however you may have some problems with stability and pronunciation, it depends on how many phonemes there are in your dataset. If there is a wide variety of words, there may not be such serious problems
For example, it is entirely possible to have problems with pronunciation even in a larger dataset
My high quality dataset (13mins) didn't have enough variety unfortunately. That's why the model struggled with some phenoms.
Phenoms! lol. Phonemes!
My recommendation is 40-100 epochs, your best model will be around that area
Of course, that also varies. There are cases where others have found their best epoch to be 200
That's another bad thing about rvc, we can't guess when the model will sound “right." 
it just randomly happens
That's great. Oh, wow. Ok, I'll run it for 300 epochs then just in case. And then will manually check.
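Since the advice is to listen to the saved epochs rather than read the graphs, a small helper can narrow down which files to try first. This is a sketch only: the `name_40e_1200s.pth` naming pattern and the function name are assumptions for illustration, so adjust the pattern to whatever your training tool actually writes.

```python
import re

# Hypothetical checkpoint naming, e.g. 'voice_40e_1200s.pth' (assumption).
EPOCH_PATTERN = re.compile(r"_(\d+)e_")


def checkpoints_to_audition(filenames, first=40, last=100):
    """Return checkpoint names whose epoch is in [first, last], sorted by epoch."""
    picked = []
    for name in filenames:
        match = EPOCH_PATTERN.search(name)
        if match and first <= int(match.group(1)) <= last:
            picked.append((int(match.group(1)), name))
    return [name for _, name in sorted(picked)]


if __name__ == "__main__":
    files = ["voice_20e_600s.pth", "voice_100e_3000s.pth", "voice_40e_1200s.pth"]
    print(checkpoints_to_audition(files))
```

The default range of 40 to 100 follows the recommendation above; widening `last` to 200 or 300 covers the cases where the best epoch lands later.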
yeah it's all experimentation
Also, do you recommend including speech, for a singing dataset. And how much?
If the speech is good (not monotonous), add everything you got
I feel, most speech ends up being monotonous
Not theatrical enough, if you see what I mean
Typical interviews and such sound monotonous
hmm you could add some monotone data to help the model with the pronunciation
I don't know how much would be the “right” amount
Gemini 3 says 40%, speech for RVC, but do you think it knows about this topic?
it has some knowledge of the inner workings but it doesn't know anything about training models*
It mentioned that you need speech for pronunciation (just like you confirmed). It said it's very important because sustained singing doesn't really gives that
true, singing alone can't provide good pronunciation
What RVC does is average the audio files of your dataset (I think Gemini knows this too) and create an unique voice model
Funny story. When I read about your LegacyCore pretrain, I eventually got to the part where someone said (YourLocalWorm) that you are not going to make any more public updates. I was bummed. So I checked your Hugging Face. Got the 2.5 and saw that there was a model you uploaded 1 day ago. (It was a couple of weeks back) The model didn't have a name, but I still downloaded that hoping it's some fresh new thing you've been working on. Sorry for stalking. 😀
I changed my mind lol, but i wont be releasing any new pretrains for the default rvc architecture because is old and bad, but i'll be training a new pretrain for a new architecture
Is that the next big thing for singing?
imo seems thats the case, the model generalizes singing very well
also 48k reproduction is way better than nsf hifigan
Are you one of the first who jumped on it? I haven't heard it mentioned anywhere
it's not here, is from another rvc server
I see, let us know when you have something cool Please.
sure
Would you say your internal tests blow everything else you had done out of the water?
btw we better gotta talk in #🧬│ai-chat coz we're kinda spamming this channel lol
Cool. I think I've asked you all the questions I could remember for now. Thanks. Let's switch channels
I just downloaded from that link too. I chose "Download NVIDIA on Windows" and then downloaded the files I mentioned before. I can't extract them, it tells me the files are corrupted (sorry for my bad eng)
i cant install it or cant open start_http because the smart app controll is blocking it but i dont want to turn it off because i cant turn it on isnt there another way to get it ?
i am trying to install the the voice changer
isnt there another copie of it or another way to install it ?
old voice changer
what gpu do u have
a good rtx one
MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.18a
this is the latest one no ?
do u know the number s of it orr
nah that's Ancient
ye that is good
you should try vonovox
-rt
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Tg-Develop Fork, with extra features, but it supports only Nvidia GPUs on Windows 10/11 unlike other options that have wider support. and without cloud options GUIDE
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project. GUIDE
Deiteris' fork (modified version) of wokada that doesn't get updates anymore. GUIDE
For Windows Nvidia, Both Wokada Tg-Develop fork and Vonovox have similar performance & quality. Users should read the pros and cons for both and choose based on their differences, such as UI and Vonovox's paid effects.
Read Wokada Tg-Develop Fork Pros&Cons & Vonovox Pros&Cons
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
yeah but a laptop one tho its not as much as powerful lol
oh
ohhhhhh didnt know
:(
is it made by the same people ?
yeah 🙁 wbu what u got ?
completely different person, but it's a huge upgrade compared to that one u have
still realtime, still ai voice changer
oh
yo chat so ive done everything correctly and the voice changer works but when i try to speak w the voice changer it just cuts everything out and only makes one sound and it has huge delay
it exist in here so i think its legit ? and safe ?
it is yes
did you get it off a yt tutorial?
yeah
alr thx
i use intel
😔
intel is the one thing that just cannot run ai at all
btw
gotta spend for nvidia if u want the best
yh i use a old laptop
what guid i should use there is 3 here lol
i dont got cash
the guid gave me a complete diffrent download which one i should be using ?
the one u gave me isnt the same ig ?
also it says its not real time voice changer ?
RVC does NOT mean realtime voice changer. RVC means Retrieval-based-Voice-Conversion.
@viral mason can u help ?
explaining
the vonovox one
the one I gav u is the most recent one
someone can suggest me what voice changer should i use?
okay thx i will give it a try
even if there isnt really a tutorial
BRUH
same issue
its blocked
your pc is weird
what gpu do u have
smart app control is the weird
Hi
but it can be annoying sometimes
i think its blocking it because its a bat file from unkown source
however once you turn it off you cant turn it on until you redo your windows
so yeah
like restart your pc?
nope reinstalling windows
rtx 5070 laptop
naaaah bro this exists in every pc that have windows how do you dont know about it ?
I don't use it ig
its just stupid microsoft policy
how to use it?
but its excellent tho it works on kernel level technically the best but can be annoying cuz it can block stuff that are not bad
it doesnt get talked about enough tbh
what is it?
connects voice changer with games and discord
I never used it tho, didn't know it even existed until now
setup or setup64?
anyone know how to get exclusive mode to work on vonovox
lol good for u ig
its not known that much ig but it always works in the background
unless its off
didnt know much about it too but yeah since it caused me that issue i may let it go too
idk why its on anyway it should be in the normal mode which isnt that much stricted
anyway thx
64
Did u pay?
no
What do u mean then by exclusive mode
Probably should tbh if it's causing you so much issues
it didnt until now tbh
i finally manged to start the setup file
however
what should i change in Vonovox settings?
its not downloading just stuck on
- 0.1/2.9 GB 1.3 MB/s eta 0:36:49
its so slow damn
and idk if i can do the same with the start file or not
i hope this actually worth it
can i just cancel it btw ?
Awesome
No you have to run setup first then start
i mean about the blocking thing
also its stuck
okay
ayo
how much storage does it take ?
its taking forever to install it
@viral mason finally the setup is done but the start isnt doing anything for me ???
its been installing for an hour
and now ive got this shit when i try to run start
C:\Windows\System32>runtime\python.exe launcher.py
The system cannot find the path specified.
C:\Windows\System32>pause
Press any key to continue . . .
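The prompt in that log shows the script running from C:\Windows\System32, so the relative path `runtime\python.exe` was most likely looked up there instead of in the extracted folder. That is a reading of the log, not a confirmed fix from the thread. A minimal sketch of the general idea, resolving against the launcher's own folder rather than the current directory (the function name is hypothetical):

```python
from pathlib import Path


def resolve_from_launcher(launcher_dir, relative="runtime/python.exe"):
    """Resolve a bundled file against the launcher's folder, not the current directory."""
    candidate = Path(launcher_dir) / relative
    if not candidate.is_file():
        raise FileNotFoundError(f"{candidate} not found; is this the extracted program folder?")
    return candidate


if __name__ == "__main__":
    # The folder containing this script, regardless of where it was started from.
    here = Path(__file__).resolve().parent
    try:
        print(resolve_from_launcher(here))
    except FileNotFoundError as err:
        print(err)
```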
I'm not sure how to fix any errors so u should join the official server for vonovox
okay
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
by IA Hispano
Google Colab
by Hina
Google Colab
by Eddy
Google Colab
by Eddy
Google Colab
by Deiteris & Hina
Google Colab
by Shiro & Eddy
Google Colab
by Nick088
Google Colab
by Nick088
Google Colab
by Jarredou & Makidanye
Google Colab
why cant i upload models
@viral mason hey so I now am in the browser on the voice changer but it doesn’t pass through to discord I don’t even know if It’s working at all
could u send screenshot?
@viral mason so i installed everything, the vc audio cables, and the models and the ai doesnt change my voice at all
I send them to u private cause I can’t send pictures here
did u get it off youtube
Get what off of youtube
I did exactly what the video said yeah
the voice changer
I came here and got the model
change cpu to gpu
it's old then, what gpu do u have
I did now but I still can’t hear anything when I go on dc voice test
u and him should use vonovox, well actually what gpu does he have
not sure about him, but hes 0 help
0 help?
should ask him, but what u got is outdated, should use smth better
to use this just run setup, then run start
What setting do I have to use on the vc are mine right?
is iit the same process to setup?
run setup, then run start
ok how do i add models
there's some slots at the top u add the pth and then add index at the bottom where it says index
then i add vb cable?
yup, regular mic input and output as vb cable
u got good settings?
crossfade at 0.15 is really all u need to change tbh
"Exclusive mode" is an audio mode of WASAPI and ASIO. WASAPI has both shared and exclusive modes, while ASIO only has exclusive mode. When you have "exclusive mode" on, only one program (like Vonovox in this case) can output audio at a time, while other programs (like Discord) will be muted if they are on the same audio system (e.g. Realtek HD Audio). This mode is not ideal if you want to hear other programs at the same time when they are all on the same audio system. When the audio API is set to "WASAPI" and you have "exclusive mode" off, the program will use "shared mode" instead.
One good use of "exclusive mode" is when you're making music in a DAW and you prioritize bit-perfect audio over going through the system's mixer (like the one in Windows) and letting other programs output audio at the same time.
You're making an assumption. I was just answering threads, where I found answers where none were given. I'm not interested in asking others to do it for now.
There was no need to drag my attention twice either
Multiple people will use search and check threads, not just the OP
Because of that having info in the right place is handy, so I thought
When you go into someone's thread that was solved long ago and post anything there, it can be considered "invading someone's thread" or potentially "necroposting". Since these older threads have already been solved, you'd better wait for a newer one and answer there with your information, or at least talk with staff member(s) who might open up possibilities for Beatrice voice models in "AI Hub by Weights". If you are confused about why this is happening, you'd better ask a mod or admin to clarify. I'm a Helper and it's not my duty to abuse or change the guidelines into something else.
I could say the same, as in; apparently you are not sure whether or not it is actually considered as such, but you didn't ask the rest of the team before immediately going hostile.
And likely it is not.
The rule reads: Search the Forum: See if a solution to your problem already exists. Be sure to not invade their post.
This is not about sharing information, instead it is about taking over the question with your own.
I should also note that, the website reads:
Contributions
We'll appreciate any feedback, big or small.
Just saying that I do not mean what I said as hostile, simply as feedback.
If it is unwelcome, then I will not continue.
but at the same time, it means that the rules aren't clear enough and require clarity.
can anyone help me download the voice changer idk how to i search youtube and it aint working
dm me
What is your PC GPU?
Try Tg Develop's W-Okada. https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/ https://github.com/tg-develop/voice-changer/releases/tag/b2397
Click "voice-changer-windows-amd64-cuda.zip.001" and "voice-changer-windows-amd64-cuda.zip.002".
where is that
GitHub.
oh but it says amd here tho
Don't get confused with the term "AMD64"; it's a CPU architecture where it's the same thing as x86-64 (Intel).
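The naming point can be shown in code. Python's `platform.machine()` reports `AMD64` on 64-bit Windows on both Intel and AMD CPUs. The alias table and function name below are assumptions for illustration:

```python
import platform

# Common spellings of the same CPU architectures (illustrative, not exhaustive).
ARCH_ALIASES = {
    "amd64": "x86-64",
    "x86_64": "x86-64",
    "x64": "x86-64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def normalize_arch(name: str) -> str:
    """Map vendor-flavoured architecture names onto one label."""
    return ARCH_ALIASES.get(name.lower(), name.lower())


if __name__ == "__main__":
    print(platform.machine(), "->", normalize_arch(platform.machine()))
```

So "amd64" in the download name describes the CPU instruction set. The GPU requirement is the "cuda" part of the name.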
oh ok
so i need both ?
Must.
ok
These two files are split parts of a single zip. Check if "Hide extensions for known file types" is disabled in your Explorer. Use WinRAR or 7-Zip to open and extract the .zip.001 one.
.zip.002 is another part of .zip.001, you don't need to open it. Just open .001 one and the program will temporarily combine them into one.
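What 7-Zip or WinRAR does with the parts can be sketched as a plain byte-for-byte join, assuming the parts are simple numbered splits of one zip (which is how `.zip.001`/`.zip.002` volumes are normally produced). The function name is made up for this example, and using 7-Zip as described above remains the simpler route.

```python
import shutil
from pathlib import Path


def join_split_archive(first_part, output):
    """Concatenate name.zip.001, name.zip.002, ... into output; return the part count."""
    first = Path(first_part)
    if not first.suffix[1:].isdigit():
        raise ValueError("expected a numbered part such as 'name.zip.001'")
    base = first.name[: -len(first.suffix)]  # e.g. 'name.zip'

    # Collect every sibling file named base + '.NNN', keyed by its number.
    numbered = {}
    for path in first.parent.iterdir():
        digits = path.suffix[1:]
        if digits.isdigit() and path.name == base + path.suffix:
            numbered[int(digits)] = path

    expected = list(range(1, len(numbered) + 1))
    if not numbered or sorted(numbered) != expected:
        raise FileNotFoundError("a part is missing; all parts must sit in the same folder")

    with open(output, "wb") as out:
        for number in expected:
            with open(numbered[number], "rb") as part:
                shutil.copyfileobj(part, out)
    return len(expected)
```

The missing-part check is the reason both downloads are required: with only `.001` present next to a later part, or with `.002` absent, the joined file would be truncated and would open as a corrupted archive.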
but do i need to extract it

huh
When you ask for every single step, progress gets too slow or ends up in a mess. I can focus on the settings in the program, not these.
These steps are so simple to follow. You simply extract to somewhere like "D:\MMVCServerSIO" or "C:\Users\your username\Downloads\MMVCServerSIO".
how do i start it i extracted it
Oh, really?
Go inside "MMVCServerSIO" folder, you'll see a file named "MMVCServerSIO.exe". That's the program itself, you simply double click on it to launch the program.
this ?
Always make sure you read the guide and its steps carefully before you start asking something too simple.
ok
so i added a model to it but how do i connect and use it with discord and other platforms
For how to download a voice model from #1175430844685484042, go to #1175430844685484042, go to a post/thread there, spot a link (usually being Hugging Face) and then click it.
Because another person here asked how to download a voice model, yet they deleted their message because the query might sound stupid.
ik how to download it
but how do i get it to work
This https://cdn.discordapp.com/attachments/1159290139609137264/1456249974525526185/image.png?ex=6959a839&is=695856b9&hm=2fcbbfb436b8185706268db3f3e1b48f3733828e2be264ebb8a053a36daf62a9& and this https://cdn.discordapp.com/attachments/1159290139609137264/1455527099539521568/image.png?ex=6959a9fe&is=6958587e&hm=9df1ee2b38711b83d2ef6259f8829acd3206a5b4c0e8d3711775fa5ed55e4d63& might give you some ideas.
Because you didn't download and install Virtual Audio Cable. Go download the program from this link. https://software.muzychenko.net/freeware/vac470lite.zip
so after i extract it is should work
?
That's not how it works.
uh
Unlike "MMVCServerSIO", Virtual Audio Cable needs to be installed in a traditional way. Inside "vac470lite" folder, double click on "setup64.exe" to install the program.
ok
ok i installed it and it shows up in discord but i dont hear anything when i speak
when i do the mic test
Screenshot your voice changer.
Chunk: around 320 - 400 ms
Extra: 2.1 or 2.7 s
GPU: NVIDIA GeForce GTX 1070
Pitch extraction: rmvpe (not rmvpe_onnx)
Input: Microphone
Output: Line 1 (Virtual Audio Cable)
Monitor: your speakers/headphones.
Always make sure to check the perf number at the top right of the voice changer; if the perf number is red or yellow, try increasing the chunk value until the perf number turns green. https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/#finding-my-own-settings-for-chunk-size-and-extra-processing-time
Last update: November 22, 2025
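The "raise chunk until perf is green" advice can be sketched roughly like this in Python. It assumes the perf number is the time in ms needed to process one chunk, so it has to stay below the chunk length for audio to keep up. The 0.8 headroom factor stands in for the colour indicator (its real thresholds aren't given here), and the function name and example numbers are made up.

```python
from typing import Dict, Optional


def smallest_stable_chunk(perf_by_chunk_ms: Dict[int, float],
                          headroom: float = 0.8) -> Optional[int]:
    """Return the smallest chunk size (ms) whose processing time fits comfortably.

    perf_by_chunk_ms maps a chunk size in ms to the perf number you observed
    at that chunk size. headroom is an assumed safety margin, not a value
    taken from the voice changer itself.
    """
    for chunk_ms in sorted(perf_by_chunk_ms):
        if perf_by_chunk_ms[chunk_ms] <= chunk_ms * headroom:
            return chunk_ms
    return None  # nothing keeps up; try a bigger chunk or a lighter setup


if __name__ == "__main__":
    # Hypothetical readings written down while trying different chunk sizes.
    readings = {60: 75.0, 120: 110.0, 320: 150.0}
    print(smallest_stable_chunk(readings))
```

Smaller chunks mean less delay, so the idea is to pick the smallest chunk that still keeps up rather than just maxing it out.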
i set everything as u said and its still not working
so weird
hey i also tried to follow along does it matter which version i download on amd gpu?
yes
but idk what u needd for amd
@hallow thistle pls help :(
What do you mean it's not working? You sure you have clicked on green "start" button already?
What is your AMD Radeon GPU?
im fucking stupid
9060xt
Try Tg Develop's W-Okada fork. https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork https://github.com/tg-develop/voice-changer/releases/download/b2397/voice-changer-windows-amd64-dml.zip
how do i decrease the ping
See #✨│ai-help message again.
what do i do now? i downloaded it and it brought me to rvc voice changer
nvm
i got it, it was being weird it gave me two options of my mic and wasnt working with first one
RVC (retrieval-based voice conversion) and W-Okada voice changer are two different programs, where the latter primarily uses RVC voice models.
Here's your settings if you haven't set anything:
Chunk: around 60 - 70 ms
Extra: 2.7 s
GPU: AMD Radeon RX 9060 XT
Pitch extraction: rmvpe_onnx
Input: your microphone
Output: Line 1 (Virtual Audio Cable)
Monitor: your speakers
@dire drum
helo
Alright I need a voice changer for local running
This time I have a good pc with rtx 5060 ti and 16gb (ram goes brrr)
Try Vonovox.
Uh how to get it
RUN SETUP.BAT AGAIN TO USE SWIFT!!!!
1.6.9
Small Algo update for Swift-F0
Swift-F0 has been added as a pitch extractor option. More info here
https://github.com/lars76/swift-f0/tree/main
https://git...
Uhhh
This has no tutorial in YouTube
It's because you won't need it. There's a guide doc. https://docs.aihub.gg/realtime-voice-changer/local/vonovox/
Last update: November 21, 2025
I have Applio working, but for inference it can only handle .wav files less than 30 seconds long. Are there any tweaks to allow longer files? I don't mind if the tweaks slow the inferencing down a little.
How do I configure Applio to run in CPU mode? and how do I configure RVC-WebUI to run in CPU mode?
The current recommended method is using Tg-Develop's fork: https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/
Note: the instructions are partly incomplete and partly confusing, but it should get you set up at least. The process is basically very simple:
Download zip, extract zip, run server, click stuff in the browser window that opens up later.
rvc isnt working but beatrice is
Hey everyone 👋
I work mostly on the pre-automation side of things helping spot when a workflow or automation is about to amplify a broken decision instead of fixing it.
If you’re dealing with automations that “work” but still create problems, I’m happy to help think it through or look at assumptions before you build.
What is your PC GPU?
What is your PC GPU? And did you follow any tutorial or guide?
!howtoask
- Check Docs & Guides: Your answer may already be in the AI Hub Docs or the https://discord.com/channels/1159260121998827560/1159513888199540817 channel.
- Search the https://discord.com/channels/1159260121998827560/1192011222023950368 : Look for existing posts that solve your issue. Do not invade someone else's post.
Tell your:
- Full GPU Name: (e.g., NVIDIA RTX 4060 8gb vram desktop)
- Operating System: (e.g., Windows 11)
- Detailed Description: What were you trying to do and what went wrong?
- Tutorial Used: Link to the guide you were following.
- Screenshot: A picture of the full error message.
To maintain a legal, safe & ethical community, we will NOT provide help for:
- ANY illegal activities.
- NSFW/Porn.
Requests for these topics may be ignored, not helped and result in moderation action.
- Be Polite & Patient: Our helpers are volunteers. You may ping the Helpers role once.
- English Only: Please keep all conversations in English.
- Don't Ask To Ask.
my gpu is radeon 580 rx series
yes i followed the guide
This AMD Radeon RX 580 would struggle on any W-Okada version. You can either try the DirectML build of Tg Develop's W-Okada fork (b2397) https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/ https://github.com/tg-develop/voice-changer/releases/download/b2397/voice-changer-windows-amd64-dml.zip , or go for an online option like Kaggle for more performance. https://docs.aihub.gg/realtime-voice-changer/cloud/tg-develops-w-okada-fork-kaggle/
Last update: September 6, 2025
bruh

Your voice changer is older than the ancient Egyptians, what gpu does your computer have?
?
best ai to use for hyper realistic models?
What even is that
???
im tryna run applio on dione
idk what dione is but applio can be run on kaggle, it's very simple
Which do you recommend I use, Legacy Core 1.5 or 2.5?
having an issue trying to setup microphone and such
cant send pic.
when I need to
can someone help me.. my voice changer isn't working for some reason..! if someone helps me ill give them nitro
your voice changer is super old, what gpu do u have
dw I just disconnected my headset and now it works I setup new just now so all is good
thx for the help tho.
what gpu do u have?
anyway since you wanna know my GPU, 4060 Mobile.
mobile?
I'm confused
nvidia
u know the numbers orr
3060
How do I set its parameters?
Batch Size: 8
Dataset Length: 30 Minutes
Hop Length: 64
Pretrain: DMR V1
Precision: FP32
Sample Rate: 32K
@patent trellis
Those are settings it was trained on you can't set those, also sapphire is a bot
How can I use the same on the application?
On what application?
Voice Changer
i ping you in ai help forum
Hello everyone, how are you? I’m new here, even though I’ve been around for some time looking for RVC models and pretrains. I finally decided to join in order to clear up some questions I have about this, and I’d like to thank in advance anyone who can help clarify things for me
What things would you like clarification for?
I use Applio to train my models. At the very beginning, I used it with Colab until I was able to invest in a machine with a GPU. Since then, I’ve become more familiar with the process, using different pretrains, and yesterday I was able to test a model that I retrained using SPIN v2 as the embedder, and the results are quite noticeable in terms of speech clarity and overall quality. In this case, I used Legacy Core 1.5, which has a version designed to work with SPIN v2. My question is: do I necessarily need a pretrain specific to SPIN v2 or any other pretrain I might use, or will it work if I simply choose SPIN v2 instead of ContentVec?
@analog obsidian did you make a version of legacy core that works with spinv2 I don't remember this
yes
Ok was just checking thx for the quick response 👍
If you choose a pretrain made to work with spinv2 it will only work with spinv2 and vice versa
This is why original pretrain doesn't have a spinv2 model as it wasn't made to work with spinv2
I'm looking for a Spaniard who knows how to generate ads with ai
It’s just that I thought pretrain models and embedding models were different things and that one didn’t depend on the other.
spanish?
they're different things but with a small dataset you cannot correctly finetune embeddings from a different embedder
so your model wont be able to speak unless you train like a 44 hour dataset minimum
but yes it's possible to train spinv2 embeddings with a cvec pretrain, its just that you need a really big dataset for the model to be able to speak
tldr just use a spinv2 pretrain if you want to train spinv2
Certainly, it really was a question I had, since pretrains for SPIN v2 have only started to appear now, and I thought I could use the pretrains that use ContentVec. Thank you for the clarification.
Your pretrains are very good. Congratulations
I used Legacy Core 2.5 before, and I already liked it a lot
?
its been waiting for web server
for a while
and it's not loading much
the VCC client
whats the best way to use realtimevoicechangerclient with vb virtual audio cable on a laptop that only has an i3 and integrated graphics?
i dont mean voice settings i mean settings so that it runs as smooth as possible even if its delayed by 1 minute or smthn
any way to make vonovox more sensitive?
can someone help me the app wont open it keep saying wait web server 0-9999 and then it shows my ip and port
yo is there ever a real time thing so i can talk while i play games?
hey bro how to make models
I've asked you what your PC GPU is before, though I doubt if you still remember it. 
no i dont remember🙏😭
and my gpu is so bad
gt 610🙏😭
we can do with cloud right

An NVIDIA GeForce GT 610 is very unlikely to handle inference or training locally. It's best to go with a cloud/online service.
-rvc
idk what to read
If you don't want to pick one there, see https://docs.aihub.gg/rvc/cloud/applio-kaggle/.
Last update: September 30, 2025
While I can focus on installation and some settings, for preparing dataset it's best to ask a fellow voice model maker.
hmm
is it hard to make a voice model
and need lots of coding knowledge?
Training one model won't be too hard if you aren't lazy.
hmmmmmmm like what
I have a possibly helpful tip to share about using RVC WebUI, I have a very weak laptop and Windows 10. RVC kept crashing my system after a few minutes, so I tried this advice:
Increase TDR timeout: Open Registry Editor (regedit), navigate to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers. Create/add DWORD values: TdrDelay (set to 60 decimal) and TdrDdiDelay (set to 60 decimal). Restart PC. This extends the timeout to 60 seconds.
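For reference, the same tweak can be written out as .reg file text. This is a minimal Python sketch that only builds the text; it does not touch the registry. The function name `tdr_reg_file` is made up, and 60 decimal comes out as hex 3c in .reg DWORD notation.

```python
# Registry key named in the tip above.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_reg_file(seconds: int = 60) -> str:
    """Build .reg file text that sets TdrDelay and TdrDdiDelay to `seconds`."""
    value = f"dword:{seconds:08x}"  # .reg DWORDs are 8 hex digits; 60 -> 0000003c
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY}]",
        f'"TdrDelay"={value}',
        f'"TdrDdiDelay"={value}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    print(tdr_reg_file(60))
```

Editing the values by hand in regedit as described gives the same result; either way a restart is needed, and it's worth backing up the key first.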
mine doesnt crash
How can I answer your query? That's an awkward one.
Increasing the TDR timeout worked! Now I can do inferencing on files up to 5 minutes long with no crashes, and the laptop doesn't heat up too much either
I was going to put my laptop in the refrigerator but that's a bad idea, condensation will cause water to appear inside the laptop
This bro's laptop has NVIDIA GeForce MX, by the way.
so take it from me boys, don't put your laptop in the fridge
i mean which part of it needs me to be not lazy
First of all, go create a Kaggle account.
thanks I thought to book some tickets to antarctica and live there
not colab?
why no colab twin
Last update: August 9, 2025
yes i also wish to know why Kaggle is preferred over Colab
william shakespeare of today💔🥀
it requires a phone number?? Jaysus
yes kaggle requires a phone number which it sucks tbh
oh i forgot i already have an account in kaggle
well then kaggle can suck my balls
Gay.
eh atleast u can use two gpus at once
two balls💔😭🙏
yes that's worth a few balls I suppose
If you don't like Kaggle or a cloud server and refuse to use one, just try training something on your old laptop and tell me how you feel about it.
two gpus with 16 gigaballs of vram
I'm not doing training, I'm only doing inferencing with existing models
Lie.
I tried training once with Applio and the laptop got so hot I was sweating
and then the laptop gave bro some backshots
Enough of off-topic topics, how about your progress on Applio RVC on Kaggle?
You asked for it, but you don't seem to do anything.
is ur laptop have a intergrated gpu or u have an actual laptop gpu?
mb mb mb im watching some anime wait sometime twin😭💔
tbh I don't know if it's integrated gpu or not, i bought the laptop used
oh wait i remember now ur gpu is a nvidia geforce 940mx
how can I tell? let me pull up hwinfo
NVIDIA GeForce MX, still less powerful than those GeForce RTX Mobile GPUs.
yes geforce 940mx -- can that be "integrated" or is it actually a separate component?
yea ik its only good for low end games or minecraft
its separate
ok
Practically every NVIDIA GeForce GPU you'll find in a PC today is a dedicated GPU. The integrated GPU is something like Intel UHD Graphics.
Hwinfo says the geforce has 4GB graphics memory, but I read somewhere that might only be the equivalent of 2GB because 2GB might be "shared" ??
ur mobile gpu has only 2gb of vram sooooo i suggest u shouldn't use applio locally ;-;
Yes, there's W-Okada voice changer. What is your PC GPU? There's a specific version you might like to try.
I can get Applio to do inferencing but it heats up my machine MUCH more than RVC WebUi
cuz ur reading the shared memory if u look at dedicated gpu memory its 2gb
i can't find anything about shared and dedicated gpu memory in this hwinfo utility, can you suggest another way i can see this information?
task manager
yes
so it's 4gb dedicated?
yes
whoo hoo I am God
but the geforce 940mx is a ddr3 gpu so its gonna be slow at inferencing
not bad for a second hand laptop = )
ddr3 is outdated and old ;-;
I am also outdated and old
Applio RVC does detect your laptop GPU, but even with the 4GB variant of the GeForce MX, the program still struggles to do anything there because almost no GeForce MX was made for AI. Not only that, GeForce MX has fewer CUDA cores than GTX and RTX GPUs have.
its ddr3
that's also terrible
I may get myself a reasonably good gaming desktop PC this year, what's a suitable NVIDIA card that isn't the most expensive?
my budget is only moderate
namari's circuits are overloading with my complex demands
gtx 1080 ti
its old but still good
yes that looks affordable
There are a few variants of the GeForce 940MX that use either GDDR5 or DDR3. I'm surprised there's an entry-level GPU that uses the same type of memory as the main PC RAM (DDR) for its VRAM. Either way, it's still much slower.
there's also the rtx 4060 which its only 300-370 usd
.
I assume these suggestions are good for AI as well as gaming?
cool
I got my 5060 for just 320€ it's amazing for ai and gaming. (I only use it for AI workloads on my server lol)
when I test inference on the same file in Applio vs. RVC WebUi, Applio heats up the machine much more. Could this be better programming in RVC Webui?
damn and hello yui
depends on what runs faster. It could also be better programming on the applio side for using your whole PC
ah yes, perhaps Applio would outperform RVC if I had a proper machine and GPU
For much better performance, a desktop is always better than a laptop. GeForce RTX Mobile GPUs can be a step down from their desktop counterparts (for example, the desktop "GeForce RTX 4080" has 16 GB GDDR6X while the mobile one has 12 GB GDDR6). Gaming laptops can train a model if they have a dedicated GPU, but thermals can be a problem.
found this on a car resale website recently. Apparently it still runs
that's me
what CPU/GPU do u have?
intel hd graphics 620, nvidia geforce 940mx (4gb vram, not shared)
7th gen Intel Core. 
good enough for Minecraft
heheh
ayo my old PC (still one of my servers because it runs) has a 4th gen i5 without any dedicated gpu
4 cores without hyperthreading go crazy
with low end shaders 🗿
I think I destroyed my last used laptop by putting in the fridge quite a few times while it ran heavy stuff
wasn't talking about shaders. I had a laptop once that couldn't even run Minecraft java without anything
I'd rather look for a laptop that uses an Intel Core Ultra CPU or a traditional Intel Core i5.
condensation in the fridge goes crazy, otherwise good idea
no inten
AMD
Intel is so far behind in terms of power and efficiency
poor old Intel, they are crumbling
fr
top 13 most efficient CPUs are AMD lol
both in terms of fps/watt and fps/dollar
some billionnaires should prop up Intel, competition is good, if AMD becomes the only game in town, prices will go up and quality will decline
What about AMD Ryzen AI? This specific CPU competes with Intel Core Ultra.
also the new core ultra bullshitters are worse than 14th gen intel CPUs. they literally went backwards 😭😭😭
Talking shit about Intel CPUs won't make me switch to AMD either. I don't need top-tier performance or efficiency, I simply go for what works or what's currently available in shops.
haven't had it myself and haven't seen benchmarks so can't say. But imo still better since Intel is literally struggling with everything except stuff that says "more cores = better, idgaf about how good they are"
im just saying :p
guys when i try to hear myself it lags very badly have any of u had that problem
u have a bad gpu
we should all hope that Intel recovers
valid. Can't say anything against that. Also Intel will probably get cheaper since they can't compete I think.
what gpu would u need to run it
which one do you have?
one sec
anything above a 1060 means ur config is probably broken
is training more resource-intensive than inferencing?
always
What is your PC GPU?
that's what i though, just checking
checking rn
well yo can i just dm u cuz its bunz tbh
u can just say it here -_-
To check your PC GPU, open Task Manager, go to the Performance tab, look at GPU 0 or GPU 1 and tell me the name shown there.
absolutley not bro majority of the people here got amd nvidea
k bet
lemme guess intel?
yes bro..
Intel UHD Graphics or Intel Arc?
arc
o not bad
You can't always assume anything like that.
exact model? A series or B580?
when was i assuming my guy i was guessing
Is it Intel Arc A/B (dedicated) or simply Intel Arc Graphics (integrated)?
uhh where do i check that?
in your GPU package or task manager
note that the arc A series may be less optimized than B580
Check this #✨│ai-help message again, but you click on GPU 0 or GPU 1 to reveal its full name in the right panel.
if you're using the recommended tg-develop or deiteris fork instead of the original wokada, there's no need to be confused with gpu 0 or gpu 1
This is not Intel Arc. This is Intel HD Graphics, an integrated GPU.
Intel Arc Graphics (integrated GPU) is typically found in certain Intel Core Ultra systems. Intel Core CPU systems from the 8th generation onwards generally use Intel UHD Graphics, while 7th gen and older use Intel HD Graphics, though some desktop CPU models don't include an integrated GPU at all. Intel Arc "Axxx" or "Bxxx" cards are dedicated GPUs, which is a different type of GPU.
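The naming rule above can be sketched as a small Python helper that takes the GPU name shown in Task Manager. The function name `classify_intel_gpu` and the exact matching patterns are illustrative assumptions; it only covers the name families mentioned in this chat.

```python
import re


def classify_intel_gpu(name: str) -> str:
    """Classify an Intel GPU name as 'dedicated', 'integrated', 'unknown' or 'not intel'."""
    # Drop the (R) / (TM) marks Task Manager shows, then normalise spacing and case.
    cleaned = re.sub(r"\((r|tm)\)", "", name, flags=re.IGNORECASE)
    cleaned = re.sub(r"\s+", " ", cleaned).strip().lower()
    if "intel" not in cleaned:
        return "not intel"
    # Arc followed by a model like A770 or B580 is a dedicated card.
    if re.search(r"\barc\b.*\b[ab]\d{3}m?\b", cleaned):
        return "dedicated"
    # HD Graphics, UHD Graphics, or plain "Arc Graphics" are integrated.
    if "hd graphics" in cleaned or "arc graphics" in cleaned:
        return "integrated"
    return "unknown"


if __name__ == "__main__":
    for gpu in ("Intel(R) Arc (TM) A770 graphics",
                "Intel(R) HD Graphics 620",
                "Intel(R) Arc(TM) Graphics"):
        print(gpu, "->", classify_intel_gpu(gpu))
```

The check for "hd graphics" also matches "UHD Graphics", since one contains the other.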

hi why are my voice so laggy and buggy
most likely your chunk size being too small
What is your PC GPU? And did you follow any tutorial or guide before?
You have been asking for the voice changer since last year. Did you follow any tutorial or guide before? Which W-Okada (MMVCServerSIO) version are you trying to use? And what is your PC GPU?
NVIDIA GeForce RTX 4060 Mobile, basically a dedicated laptop GPU. Historically, the terms "notebook" and "laptop" referred to different types of computers, but today they refer to the same type (a clamshell portable PC with the keyboard attached).
whats your gpu
What are you looking for?
rtx 1650
the voice changer:)
how do I use the voice models for a voice changer I downloaded, it only lets me use it for musics and stuff
For better-working voice changer versions on an NVIDIA GeForce RTX GPU, consider Tg Develop's W-Okada fork (b2397) or Vonovox. If you meant an older "AMD Radeon HD 7770", that one is unlikely to be suitable for any AI program.
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Tg-Develop Fork, with extra features, but it supports only Nvidia GPUs on Windows 10/11, unlike other options that have wider support, and it has no cloud options. GUIDE
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project. GUIDE
Deiteris' fork (modified version) of wokada that doesn't get updates anymore. GUIDE
For Windows Nvidia, Both Wokada Tg-Develop fork and Vonovox have similar performance & quality. Users should read the pros and cons for both and choose based on their differences, such as UI and Vonovox's paid effects.
Read Wokada Tg-Develop Fork Pros&Cons & Vonovox Pros&Cons
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
Ok
Try Tg Develop's W-Okada fork. https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/ https://github.com/tg-develop/voice-changer/releases/tag/b2397
no its more like, I search the voice changer I want, then I go to the link and it puts me on this RVC page where I can only use the voice changer for things like singing music and stuff but not to download it and move it to W okada. or at least idk how to do that
yea but how do I do what I want
Deiteris W-Okada fork (b2332), released in December 2024, is slightly older than b2397 W-Okada, so it should only be used if b2397 W-Okada doesn't work.
RVC (retrieval-based voice conversion; like Applio RVC) and the realtime voice changer (like W-Okada) are two different programs, even if they both use RVC voice models.
someone should prolly update the ai docs to show the latest one
A staff member who's responsible for docs on AI Hub is "Nick", though I think some admins or related mods might have access to them as well.
amd something like that im not a pc pro and yeah i watch many tutorials
To check your AMD Radeon GPU name, open Task Manager, go to Performance tab, check GPU 0 or GPU 1.
radeon rx 5700
gpu 0
Try Tg Develop's W-Okada fork https://github.com/tg-develop/voice-changer/releases/download/b2397/voice-changer-windows-amd64-dml.zip and its guide https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/.
okay thx
I haven’t used Google Colab for a while, and now I’ve come back to use it. I want to create a voice model with RVC v2 disconnected, but when I run the first cell (dependencies), I get an error during the code execution at “installing fairseq”.
Is anyone else experiencing this issue, and is there any solution?
RVC Disconnected was discontinued long ago, see #📰│dev-updates message. It's better to go for a more recent RVC fork like Applio RVC instead.
-rvc
Running the batch file directly inside Applio RVC folder would be better than running within Dione Launcher environment. The native batch file would show error logs within terminal at one place, while Dione might not. The latest version for Applio RVC is 3.6.0.
where do i get voice models
okay
Can you send me it's colab notebook for training a model plz?
For no-UI one, there's this notebook. https://colab.research.google.com/github/IAHispano/Applio/blob/main/assets/Applio_NoUI.ipynb
It seems that not all resources are being used properly. I remember that when I used RVC disconnected, almost all resources were utilized fully, although it took a much longer time to train the model.
If Applio is using only a little VRAM, try raising the "batch size".
I don't train voice models, but this usually indicates that the batch size is set too low, which is why it uses so little VRAM (3.3 of 15.0 GB). Training and inference speed also depend on GPU clock speed, not just the program's batch size or the GPU's VRAM size.
Hm, how do I build Hybrid production-grade architecture with constrained scope based on current AI knowledge?
Something like: improved transformer, learning loop, memory systems, limitations / rules / reasoning, interaction layer, etc.
Any good guides / papers / frameworks / tips?
Hello! sorry for the late reply.
My GPU is RTX 3060 Laptop.
I was wondering what is it that he is using, like if there is any new application, because if i recall correctly, AI Voice changers cannot do cries/coughs realistically..?
And, i am sorry if i was being unclear about anything!🙏
im using applio
im on mac
onnxcpu nocuda
is the one im using
how do i train my corpse husband model do i just keep talking with it on or what
Does your Mac use an Intel CPU or Apple Silicon (M2, M3)?
There's a newer W-Okada fork made by Tg Develop. https://github.com/tg-develop/voice-changer/releases/tag/b2397 https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/
Like RVC voice models, the voice model can do some crying and coughing noises, if the model was trained with such dataset audio, but that's about it.
french voice or french serv
after extraktin the model i only gives me the pth and not pth and index
can some1 help me
Uh, it failed
What to do
some1 pls help when i open start http batch file it dosnt launch it opens and closes in 0.1s pls helpp
yo can someone help me with the voice changer when im speaking it talks like 2 or 3 times
@tame oracle
how to make own voice models?
How do i get and use vonovox?
hi
is there a good lightweight voice changer for linux?
i use rx 570 and i3-8100(yes ik very bad lol)
i tried reinstalling applio,
Traceback (most recent call last):
  File "C:\apps\Applio\applio\app.py", line 6, in <module>
    import gradio as gr
  File "C:\apps\Applio\applio\env\Lib\site-packages\gradio\__init__.py", line 3, in <module>
    import gradio._simple_templates
  File "C:\apps\Applio\applio\env\Lib\site-packages\gradio\_simple_templates\__init__.py", line 1, in <module>
    from .simpledropdown import SimpleDropdown
  File "C:\apps\Applio\applio\env\Lib\site-packages\gradio\_simple_templates\simpledropdown.py", line 7, in <module>
    from gradio.components.base import Component, FormComponent
  File "C:\apps\Applio\applio\env\Lib\site-packages\gradio\components\__init__.py", line 1, in <module>
    from gradio.components.annotated_image import AnnotatedImage
  File "C:\apps\Applio\applio\env\Lib\site-packages\gradio\components\annotated_image.py", line 15, in <module>
    from gradio.components.base import Component
  File "C:\apps\Applio\applio\env\Lib\site-packages\gradio\components\base.py", line 21, in <module>
    from gradio.blocks import Block, BlockContext
ModuleNotFoundError: No module named 'gradio.blocks'
help?
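One way to narrow this kind of error down is to check which modules the environment can actually import. Below is a minimal Python sketch; it assumes you run it with the Python interpreter inside Applio's own env folder (the one shown in the traceback paths), and the helper name `can_import` is made up.

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if the module imports cleanly in the current environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:  # ModuleNotFoundError is a subclass of ImportError
        return False
    return True


if __name__ == "__main__":
    # Module names taken from the traceback above.
    for name in ("gradio", "gradio.blocks"):
        print(name, "->", "ok" if can_import(name) else "missing or broken")
```

If both come back as missing or broken, that points at the gradio package inside the env rather than at Applio's own code.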
I'm new to this myself, but I think you'd need to set your Output Device and Monitor Output Device to None.
-uvr
is UVR still used? like, is it still the best for vocal isolation locally?
last thing I've heard from it, was that is no longer being updated
when i put output to none it doesnt talk
can someone help me when i talk there is like a echo or sum and its speaking the sentence i said
@craggy bough @silk stirrup @viscid moss please help
i wanna train this corpse husband model but i dont know where to start does anyone got any realistic ones
you're trying to use v1 model, there's a small bug with that
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
Google Colab, by IA Hispano
Google Colab, by Hina
Google Colab, by Eddy
Google Colab, by Eddy
Google Colab, by Deiteris & Hina
Google Colab, by Shiro & Eddy
Google Colab, by Nick088
Google Colab, by Nick088
Google Colab, by Jarredou & Makidanye
can you help me when i talk there is like a echo or sum and its speaking the sentence i said
dont ping mods for stuff like this
then maybe here should be more helpers or stuff to help people
can u help me or not
wow what a fucking beatiful staff that doesnt even help lol
we are all volunteers, be rude and you wont get any help
for what are u even here
to moderate the server
ur not even looking here u just came rn
yeah cuz you pinged the entire staff team???
yes
its a dead server lol
keep deleting my messages
😂 good staff that doesnt even look in the server and doesnt help
and comes because i pinged him and its his biggest worry
in vonovox, what is "import" and "export" effects?
why did you make an alt account just to fake ask for help then argue with staff

wdym alt lol
its my normal acc dude
i asked for help but like i can see the staff is bad
import is u upload a file and export is u download or save a file
well yea but what type of file
like what effects
i dunno i dont use vonovox ¯\_(ツ)_/¯
@viral mason
Can i ask what is the best ai for make a cover song?
no clue I use fl studio for voice effects
ur lit a mod so u know how the changer works so u can help others to fix it or not? a mod does help too?????????
how are people like that even real lmao
could anyone help me? why does my mic cut out like crazy? i have a rec of it but cant send in this channel
did u find fix mine is doing the same
must be settings
no idea either, im assuming its an alt from his recent creation and join date
eh who knows
but how can you tell if another account matches his🤔
using linguistic patterns?
i would never reply to my main acc if i was on my alt
then again not everyone is me so 
how do i fix

I have run another inferencing test with RVC WebUi, this time a 4.5 minute wave file, on my weak little old windows 10 laptop with 4GB VRAM GPU, after making a suggested registry tweak (increase Timeout Detection and Recovery to 60 seconds). Again success!
But now I would like to experiment with some Python code bits to add to the RVC scripts which are supposed to cause the cpu/gpu to run a bit cooler -- does anyone have any experience in this regard?
omg this looks complicated -- if anyone is interested, see this: https://www.perplexity.ai/search/i-found-some-python-suggestion-LZEbgL_GS3Sr5ajoYytZcw#1
I wonder if I should try to communicate with the people who actually wrote the RVC WebUi python scripts?
Hello, I am training an RVC model using 15 minutes' worth of audio, and the dataset is clean as it was generated using my cloned voice (text to speech). Now I am creating a speech to speech model but the results are so bad. I am using the settings mentioned in the official documentation with the normal pretrained weights for an epoch of 4, then when I try to train using epoch 8 the results are worse compared to batch size 4 (I thought it would be better).
Any suggestions?
what settings do i change for it to stop being a little glitchy
can someone help me improve an RVC model? like idk how to make it sound better / less "lag"?
any super realistic deep voice changers? every single one i use sounds very bad
Which models are used to create the images with the public figures? OpenAI, Gemini etc don't allow it but I see lots of pics edited / modified involving celebrities and politicians.
Im using intel as my gpu and cpu so what do i download
AMD Radeon(TM), likely the integrated GPU of an "AMD Ryzen AI" chip, is still not recommended for running AI programs like W-Okada voice changer anyway. While an integrated NPU does exist in AMD Ryzen AI systems, it's not optimized for running more intensive AI, and the voice changer isn't compiled for it either. Your best bet is to look for a dedicated GPU like an AMD Radeon RX, stay on CPU-only (which might be faster than on a traditional "AMD Ryzen"), or go for an online cloud option instead.
Does your PC have Intel UHD Graphics, Intel Arc Graphics, Intel Arc Axxx/Bxxx or something?
See #1175430844685484042.
How do i check
To check your PC's GPU, open Task Manager, go to the Performance tab, and look at GPU 0 or GPU 1.
It says Intel(R) Arc (TM) A770 graphics
This Intel Arc is a dedicated GPU, so W-Okada voice changer DirectML should work with that one. Try Tg Develop's W-Okada fork. https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/ https://github.com/tg-develop/voice-changer/releases/download/b2397/voice-changer-windows-amd64-dml.zip
Last update: November 22, 2025
I also have voicemod installed will that cause a problem
You can still have that one installed, and while it's better to not use that as your main voice changer, Voicemod can still be useful if you wanna use it for specific audio effects.
do i need to download the VAC (virtual audio cable)
What is your PC GPU? And are you following any tutorial or guide?
Theres one on the site should i downnload that instead
If you mean the one on the official Muzychenko website, look for the "lite" one. If you mean VB-Cable, it's a completely different program.
im talking about the site you said to go download the voicechanger
What site? There are three.
i found the lite version its made by muzychenko
this site
The VAC Lite on the AI Hub doc is the exact same one as https://software.muzychenko.net/freeware/vac470lite.zip.
so it doesnt matter which one i download, and another question do i NEED to download them
If I say "sure", it means you should download the program, so sure. No need to confirm another time.
Do you really think "all" moderators and admins here are supposed to know about W-Okada voice changer? Those staff members with "Helper" role sure know about voice changer (and also RVC) as basics, and one of those Helpers is me. 
i downloaded it what now
Use 7-Zip or WinRAR to extract "vac470lite.zip" into a directory. After extracting, go inside the vac470lite folder, find "setup64.exe" and double-click it.
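If 7-Zip or WinRAR is confusing, the same extraction step can be done with a few lines of Python using the standard `zipfile` module. This is just a sketch: `extract_zip` is a made-up helper name, and the paths in the usage comment are examples, not where your download actually is.

```python
import zipfile
from pathlib import Path


def extract_zip(zip_path, dest_dir):
    """Extract every file in zip_path into dest_dir and return the file names."""
    dest = Path(dest_dir)
    # Create the target folder (the "directory") if it doesn't exist yet
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


# Example usage (paths are assumptions, adjust to your Downloads folder):
# names = extract_zip("vac470lite.zip", "vac470lite")
# print(names)  # look for setup64.exe in this list, then double-click it in Explorer
```

A "directory" is just a folder, so any empty folder you create works as the destination.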
Wait. How do i extract it into a "directory"
Make sure to read guide before you ask anything.
I put it in WinRAR but im not too sure how to double it
A member here once voiced the opinion that "Helper members" aren't supposed to spell out too many steps for other members. But most of the time some members don't even read the guide doc as a reference, so when I have to spell things out it feels more like a "forced" strategy. I kind of agree with that opinion, but having it doesn't fix anything.
the guide isnt helping me unless im not looking hard enough
Alright,
I need a Voice changer besides Vonovox
Mainly a voice changer that has a way to train any voice in
Besides TTS, I just want to upload audio and change it into a different voice
Real time voice changer doesn't seem necessary for me,
I need voice actors
W-Okada voice changer doesn't train a voice model. Applio RVC can. Your queries look conflicting, but if you're looking for a local voice actor, you can try hiring or commissioning one.
To request a voice model, there's #1159289738314919936, and if you want to train one from a user's thread in #1159289738314919936 you might have to get the "model maker" role first.

