#✨│ai-help
You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
Do I use Mainline or the other one
use the 1st link, wokada deiteris fork
be sure to read it all
if you skip parts, you're going to have issues ofcourse
@timber bramble let me know
There's no such thing as "Mainline" in W-Okada. The actual mainline of W-Okada is basically the original version, which isn't recommended to download or use.
@timber bramble I meant this btw
If you got confused by "RVC > Local > Mainline" in the left sidebar, that's the regular RVC program where you make AI covers, not the RVC "Voice Changer".
It says Intel(R) Iris(R) Xe Graphics
Sorry for late response internet was out
I'm on laptop
only cpu mode is viable
cooked, weak integrated graphics
are u sure u dont got other gpus?
Damn
if you can't afford the processing time, there are cloud solutions like Hugging Face or Colab to look into
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colab (easy to use; at most 4 hours a day of a T4 16 GB GPU on the free tier, and even that isn't guaranteed, so not many hours for training; there's a paid tier):
- Kaggle (a bit harder to use and needs a phone number, but gives 30 hours weekly of better GPUs, either 2x T4 with 16 GB each or a P100 16 GB; free only):
- Lightning.ai (kind of hard, needs a login, no issues with web UIs or anything, but only 15 free credits monthly; there's a paid tier):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you're looking for the easiest free way, use https://weights.com/, which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.com: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio UI Colab: RVC Fork with some extra features like TTS
- RVC AI Cover Maker UI: Automatically Separates the Vocals and Instrumentals, converts the voice and mixes them back
so, either:
- use cloud (MOST SUGGESTED) which is using a remote good pc for limited free time
- buy a better pc
- wait a shit ton of time to run it on cpu
just use cloud, it's free
u can't expect ai to run on poor hardware
chatgpt doesn't run on your pc, it runs on cloud
Google?
Where do I get dat
Sorry gang not a big pc guy
I gave you every single link above #✨│ai-help message
everything works except mainline
That's an integrated GPU, which is never supposed to be used in AIs.
does anyone know how to train a separation model? i want to make one so i can isolate this specific artists music cleanly by training a model off their official instrumentals and vocals
A custom stem separation model? I know UVR5, but I've not seen anyone here ever do that.
There are plenty of UVR5 models available for extracting audio.
i just have a very specific type of music that a lot of models arent exactly made for
so i get very poor isolations
edit 05.05.25 deton24’s Instrumental and vocal & stems separation & mastering (UVR 5 GUI: VR/MDX-Net/MDX23C/Demucs 1-4, and BS/Mel-Roformer in beta MVSEP-MDX23-Colab/KaraFan/drumsep/LarsNet/SCNet x-minus.pro (uvronline.app)/mvsep.com/ GSEP/Dango.ai/Audioshake/Music.ai) General reading advice | D...
It's hard
i figured it would be
In that guide u can find how to make a separation model
training separation models ideally needs a huge dataset
I recommend u to train a Mel Band Roformer one
i have about 3 hours
of data
we've got good enough voc-inst models, so no more
ive been using the one on mvsep,
however you could consider trying to get some guitar dataset and train on it
how to import a voice model to the server?
You trained your own voice model?
im using a premade
Do you wish to upload it to #1175430844685484042 or something else?
no i mean like upload it to the voice model server
Sorry, but if it's not #1175430844685484042 in this server, then I'm not sure what you mean by "voice model server" then.
i mean like uh
uploading the voice model to the ngrok server thingy to run on cloud
Do you mean like W-Okada? At least just remember and say the name of the notebook you're using.
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
ok so basically im using ngrok right and i cant figure out how to upload a voice model to it
How many things don't you know? ngrok is a website that gives you a token from your account to use in a cloud service project, so that the program can display its UI on a separate ngrok page.
Well, guess I have to tell you how to "upload a voice model into W-Okada" then. Telling every step by step is crazy.
i js need a quick tutorial on where to upload the index and pth file
On W-Okada. There's a red arrow pointing to edit button, this is where you upload RVC voice models.
alright
My guy, this is the quickest tutorial I can do.
You click on edit button, there's this popup. At the red arrow, there's an upload button.
Oh, before you upload a voice model, make sure you have extracted the one or two files (.pth, .index) from the voice model zip into a folder. Upload these files, and click the "upload" button.
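For anyone who'd rather script the "extract the .pth and .index first" step, here's a minimal sketch using only Python's standard library. The function name `extract_model_files` and the output folder are made up for illustration, and it assumes the model zip has the usual .pth/.index files somewhere inside it (possibly in a subfolder).

```python
import zipfile
from pathlib import Path

# File types the W-Okada upload dialog asks for (per the message above).
MODEL_SUFFIXES = {".pth", ".index"}


def extract_model_files(zip_path, out_dir):
    """Extract only the .pth/.index files from a voice model zip.

    Files are flattened into out_dir (folders inside the zip are dropped).
    Returns the list of extracted paths. Hypothetical helper, not part of
    W-Okada itself.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            name = Path(info.filename).name  # drop any folders inside the zip
            if Path(name).suffix.lower() in MODEL_SUFFIXES:
                target = out / name
                target.write_bytes(zf.read(info))
                extracted.append(target)
    return extracted
```

Usage would be something like `extract_model_files("my_model.zip", "my_model")`, then uploading whatever lands in that folder through the edit popup.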
got it
The instructions for W-Okada are pretty much the same locally and on a cloud service. The difference is where you run the program from.
whats the audio ai upscaler link the google colab version
I got a question for u
Wdym
What's your PC GPU? What do you want to do
Btw lmk since I told ya
I'm getting wildly different and unusable responses from AnythingLLM compared to just using LM Studio
the temperature is the same
@low shard Sorry for the ping, just letting you know that I got my preferred result with the "TTS" library!
As for the robotic voice i don't have to integrate RVC at all, I just used some voice filter with "signal" library to make it robotic and the client really liked it.
Thanks for the help yesterday, I really appreciate it 🙂
Yw
tell me the settings that will make the voice as real as possible. And tell me which model is good for Russian or Ukrainian language?
Russian I would say
go
pls help
Idk what u doing sorry
I thought you were asking if Russian or Ukrainian is better
@low shard help tell me the settings that will make the voice as real as possible. And tell me which model is good for Russian or Ukrainian language?
uncheck sup1
uninstall vb audio cable
get vac lite from the same program guide
then get a good model and play with the pitch
Can you please tell me where I can get a quality model? You know the kind, maybe streamers are famous.
the only way is to check the sample and test them yourself
Last update: May 5, 2025
yes, but uninstall vb audio cable
???
show a screenshot of ur wokada
read it
I'm not quite sure why I'm doing this and that it's probably a translation difficulty. I will now find information on how to remove the cable and all that is left is to find a quality model
I downloaded several models, including ones originally in Russian, but the quality still leaves much to be desired. Besides pitch, should I touch something else?
is it starting to over train yet
No
you can try turning on force fp32 mode in advanced settings, but it really depends on how well the model was trained + the way you play with the pitch
any ai tools to transcribe a video for me? I need the full script of what I said in it
@glacial pollen is there any guide on finetuning melroformer models
You'd have to ask on audio separation discord really
they are the masterminds and lots of people there who actually trained most known models
thanks a lot
!howtoask
Hey, I have a question. I use live voice conversion on Windows with an NVIDIA GPU and it's working perfectly. Windows uses more resources than Linux, so if I switch to Linux, could I get less delay, since Linux needs less compute to run the OS on the same NVIDIA card and specs? In short: with the same specs and the same NVIDIA GPU, will RVC run faster on Linux than on Windows?
please response....
@low shard
AMD rx 6600 and i want to use the voice changer in real time just for communication
"! C:\Users\Blick\Downloads\voice-changer-windows-amd64-cuda.zip.001: Unexpected end of archive"
when i try to open the wokada deiteris fork file
corrupted archive
also try to unlock the archive
( right click, unlock zip or however it's named there for you )
Can someone walk me through how to download this i'm trying to download this and catch predators
💀
I actually do that shit can you help me or nah?
Uah, nobody asked for a hero ™️
We provide support for AI, we don't ask you for ideals or reasons you need it working
so spare that for yourself if you may
Cool
-rt
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
Ps, it doesn't hurt to say " thank you " my dude ✨
In case of troubles, @ nick008 and others can help
Nah after that hero shit nah
It's facts.
-realtime
1st link
I hate the direction rvc is taking in most people's eyes
i downloaded everything it just sounds so robotic and idk which settings are good for my pc
Well idk what other intentions its made for lol
Have patience
Show a screenshot of ur wokada
!give-media-perms 1h @austere siren
got you
fun and entertainment, not catfishing or predators hunting.
Not in my eyes, at least
take it or hate it, I won't change my attitude
Well we are doing something better with it rather then dwelling and chatting with strangers
Shit should be left to proper institutions and police
Well technically Linux uses less resources and could result in less delay, tho I don't think there's any helper using Linux there
Fair enough, but for future, legit spare the details
Ah I see why
it's not like it's gonna increase your chances of getting help
Nah gfy
why every time i use virtual cap to use rvc on discord my laptop loses sound
virtual cable?
Yeh
well, did you route it properly?
voice changer -> vb -> whatever uses it
main line of sound should stay unchanged
Chunk: 192
Extra: 2.7
If you're planning to use Server to have less delay, use wasapi https://rentry.co/LessDelayWasapi
Also be sure you got a good model and play with the pitch
so ig unless you screwed up the routing or chain, idk
Vb? Vb audio cable gives issues on windows
It's reported by users that it randomly stops working, so we always suggest vac lite instead
then vac
ye I use em both actually ( work just fine
my b
but perhaps cause of my drivers being a lil tweaked
Elaborate more:
- your PC GPU
- what tutorial link did u use
- what's exactly the issue
- a screenshot of ur wokada
Im not sure either let me check again :v
!give-media-perms 1h @modern swallow
In any case, make sure you use what Nick wrote, the vac lite
then you wanna route the voice changer's output to:
And be sure to be using Wokada deiteris fork
and if you wanna hear yourself
Not some random YouTube tutorial 
you wanna get this lil helpful thingy
You don't need that to hear yourself
it's for lowest latency
There's an option called monitor in wokada deiteris fork
You can set it to your headphones
is it also ultra-low latency?
cause it gives nearly live-monitoring delay ( idk, in mics / interfaces
will do thank you
It's pretty good, have you tried Wokada deiteris fork?
I personally never had any delay issues with wokada deiteris fork monitor option
by proper tweaks
tho ye u right
if simpler solution exists, better recommend that
I can't run server file its a .spec
Lemme ask in staff chat if other staffers ever felt like monitor had delay issues though because I personally never felt that
it ain't about issues tho
but rather actual delay
latency
I'm just very sensitive to it, that's really all there is to it
( am an ex osu player so audio latency is important as hell for me
Yeah it's better to ask if others tried that program too and felt it had less delay
Probs not
not a lot of people know of it
quite obscure on the web, in ' monitoring ' purposes like that
this helped so much thank you
anyway, you can guys test it out and see if you like it
( careful, you can potentially trade cpu for latency
Anyone able to help me
Elaborate
I downloaded the file extracted I see no exe to run
- what's your PC GPU
- what do you want to do
- what tutorial link are you using
- what's the issue
There are thousands of ai programs and files you could be talking about right now
You need to elaborate
@magic belfry please reply to all this
I know that's the right tutorial link for realtime voice changing, Wokada deiteris fork, but I need to know all info to help you out
Never played OSU honestly
Nor geometry dash
3080
Have a working girl voicechanger
https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/
Can't find the exe to run or setup the thing
Alright, did you download the windows Nvidia version and use WinRAR/7-Zip/PeaZip to extract it?
Yes
Once it's extracted, there will be 2 .bat files and a MMVCSERVERSIO folder. Open that folder (the only one there) and inside there will be a MMVCSERVERSIO.exe
I see 4
Did you git clone the repository
I feel like you went on the GitHub repository and either git cloned it or clicked the download as zip
Delete what you got, read the guide link you sent, there will be the already compiled version there
Remember to not skip steps
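If you're not sure whether what you extracted is the precompiled build or a source checkout, a quick way to tell is to search the extracted folder for the exe. This is only a sketch: `find_server_exe` is a made-up helper, the exe name is taken from the message above, and the match is case-insensitive because the exact capitalization of the file name isn't confirmed here.

```python
from pathlib import Path


def find_server_exe(extract_root, exe_name="MMVCSERVERSIO.exe"):
    """Return the path of the server exe under extract_root, or None.

    Hypothetical helper. The comparison ignores case on purpose, since the
    exact capitalization of the exe name is an assumption.
    """
    root = Path(extract_root)
    wanted = exe_name.lower()
    for path in root.rglob("*"):
        if path.is_file() and path.name.lower() == wanted:
            return path
    return None
```

If this returns None, you most likely have the GitHub source (git clone or "download as zip") and not the precompiled version linked in the guide, which is the situation described above.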
btw nick, if you can give this dude a warning, I'd be grateful ^
Ain't happy seeing some kids going around gfy'ing around
I just joined after seems you were a tad rude when he is just trying to help people and you made a side comment.
Nah, I was actually quite reasonable
- ^ to extend my view on this clown-fest
I don't want software I care about and I work on to be associated with such practices
rvc / applio should be associated with fun and creativity, not self-claimed justice heroes
And no shitty " mr go f yourself " gonna make me think any different.
If you ask for help, you ask for help, we ain't asking you why or for what purpose.
If one needs to show their ego on how important is their goal, then that's seriously fked up.
( cause that's the kind of vibes I got from the dude )
And most importantly, regardless of what's going on, there's no excuse for going as far as the dude allowed himself to.
saying its corrupt
I think bro is sensitive
agree a tad
And I think bro's full of himself sending his minions on us
sad but you do you kiddie and be nice
it works pretty well on discord with me calling to another device but my laptop itself has no sound :v
? I need help i'm not with no one
whatev, redownload the archive and try again
Rude
Wsp with you saying Kidde so much and saying all this hero stuff and how it ain't cool is there something going on?
Cause you def act like a kiddie
talking 'bout being sensitive ay
You like talking to kids or sum?
You're legit out of mind 
Not everyone is a kid pal
👐
You surely act like one by going all sensitive on my opinion on what you do
Lol
and yet spitting out " sensitive " into my face, fkin hypocrite-act there
fr
I'm 24 yo and too old for that kind of shit. Get out of my face man. do your stuff, topic's done
Its saying its corrupt still
Perhaps try to use 7-zip instead
also, where did you download the archive
a
You gave me a good chuckle :thumb:
AI hub is a server
😌 🍷
oh, now I see what you did
because these are parts
you need both, then unpack from 001 ( as in, open up that archive and unpack
whenever you see an archive with " 001 " and such
means it's parted and you need all to unpack it properly
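A small sketch of how you could check that every part is downloaded before unpacking from .001. The helper name `missing_parts` is made up, and the number of parts is something you have to read off the download page yourself; the script can't know it.

```python
from pathlib import Path


def missing_parts(first_part, expected_parts):
    """List the names of split-archive parts that are not on disk.

    first_part: path of the .001 file.
    expected_parts: how many parts the download page lists (an input you
    supply; this hypothetical helper cannot guess it).
    """
    first = Path(first_part)
    if first.suffix != ".001":
        raise ValueError("expected the path of the .001 part")
    base = first.with_suffix("")  # e.g. "something.zip" without ".001"
    missing = []
    for n in range(1, expected_parts + 1):
        part = base.parent / f"{base.name}.{n:03d}"
        if not part.is_file():
            missing.append(part.name)
    return missing
```

An empty list means all parts sit next to each other and you can open the .001 file with 7-Zip or WinRAR. It doesn't check for partially downloaded parts, so an "Unexpected end of archive" error can still mean one of them needs to be redownloaded.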
ah
Yeah, happens, dw
Aw thanks
Np, hope it goes well for ya
Can I use a voice meeter so that the sound can still go through the virtual cable and I can still hear it?
does it work in vietnamese voice?
same problem
never skip steps
@austere siren @modern swallow uninstall vb audio cable and read https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/#virtual-audio-cable
Hi! How to make voice not so robot and say the exact word?
Elaborate
- what's ur PC GPU
- what tut link are u talking about
- what is the exact issue
There's thousands of ai programs
hello guys,
what are the best techniques to improve the accuracy of a classification model (tabular data with a lot of categorical variables)
Hello, i want to know which f0 det is good, rmvpe,crepe or fcpe?
Is there any way i can make my own ai voice model
by just using a clip of them talking
Yeah, what's your PC GPU?
rtx 4070
also what should my input and output device be
Rmvpe is the most robust to noise and good quality, crepe is least robust to noise but soft, fcpe is not heavy and soft but not as accurate as rmvpe
it is more than enough for training
In wokada, input should be microphone and output should be line 1,
In other programs, input should be line 1 and output ur headphones
mostly you'd need to train (finetune) on a single speaker
oh ight
whats finetune
applio would be better right?
Yes
Also, be sure to use wokada deiteris fork, and not video tutorials
For best quality and performance
training using a base pretrained model
the voice changer is working well but as i finish my sentence it sounds robotic and strange
Elaborate:
- your PC GPU
- what tut link did u use
- a screenshot of ur wokada
!give-media-perms 1h @vague edge
Lemme guess, you used a video tutorial?
yeah lmao
Because you're using original wokada which is worse in quality and performance
i just found the link of the github on their
And the settings are completely wrong
i thought so because on other peoples screenshots their ui is different
sorry
Video tutorials are extremely outdated for wokada and rvc
Uninstall all you got off it
-realtime
Dw
Yw
Chunk controls the delay, extra controls the quality
what setting should i use
Share a screenshot of ur wokada
And tell ur PC GPU
i just did bro
I meant a screenshot of the program
To be sure you're using wokada deiteris fork
And not some old YouTube tutorial for original wokada
!give-media-perms 1h @paper lotus
Also so I can check all your settings
And be sure to give the best ones
try these settings https://rentry.co/forkvoicechangerguide#known-working-settings-for-chunk-and-extra
We need to check before that they are even using wokada deiteris fork 😭
Else the settings would be different
Just quickly making sure is this the deiteris one? https://github.com/deiteris/voice-changer for nvidia?
well I mean the version being used like^
i was just testing out the one thats hosted in web
its acting weird
Yes, be sure to get the download link from the guide which is precompiled, because the one on GitHub is divided in 3 zip parts because of the file size limit
f0: rmvpe without onnx
Extra: 2.7 for not having cutoff issues
Chunk: 128
You can optionally use server for less delay which is explained in the guide
Also you can optionally use force fp32 mode on in advanced settings for more delay but better quality and stability
Yes
@low shard it still sounds a bit robotic at the end, heres my settings.
Install vac lite from the same guide
Set extra to 2.7
Chunk: 192
web version or mcvv version which is better? When I speak, it's like sometimes it's silent, sometimes it's noisy, sometimes it makes some kind of sound. Even if I hear the sound of the keyboard, it still accepts it (I speak Vietnamese).
☠️
It isn't called web version/mcvv, it's commonly called wokada deiteris fork here
Share a screenshot of the program
ok
!give-media-perms 1h @strange mountain
you mean the program settings right?
The whole program yes
Be sure to get a better model and play with the pitch, enable sup2
You can optionally use force fp32 mode on in advanced settings for better quality and stability at cost of more delay
oh ok
can i ask ? thank you, also is it coming from my mic (to the quality of the sound)
A decent microphone should be fine for wokada
Unless you have a 10 cents microphone off Temu
I am trying to merge two checkpoints using Supermerger in Automatic1111 or ForgeUI, and I discovered that one of the checkpoints, which I want to merge, does not have any metadata so the merge errors out. Is there a way to merge checkpoints with no metadata?
it really works but the sound is very low is there any way to fix it?
ah okie
than i will try that and if theres any good changes appear i will inform you guys thank you
Elaborate
- your PC GPU
- what u want to do
- a screenshot of ur wokada
Yw
So guys
Does the gtx 1650 work on the ai voice thingy
While playing
Like does it play it so good
While dc is open
what is the best way to clean audio for a model
and whats the most effective model creation tool
Hi, I feel that the sound is unrealistic and not beautiful in FORK
Arabic dialect
@low shard
What should I do, please? When I enter into a discussion or talk with someone, I feel that the voice seems unreal and not beautiful.
Are there any settings I should change to make it look better?
chunk is wrong for 4070 i think
Why how much should I put it?
@low shard
I did that and the sound got worse and faster
for what/
like this
mostly light games like roblox
the fork version should be capable for that https://rentry.co/forkvoicechangerguide
im on the ai hub doc
ohh thank you for the upload instructions
move to Colab/Kaggle or local if still
literally am not 😭
out of quota?
lemme test
wait no I didn't
and.... it's working for me
same model and output format as u btw
I think is kinda bugged
I've seen that problem before
DM
sure
is there a fix?
nope, just wait and see
try logging out and then logging in again
try without logging too
alr
i got a question that i feel like is probably asked alot
i managed to learn how to set up the realtime rvc voice but like how do i get it to connect to other applications like a recorder, i want the voice changer to work while i voice chat and game
as in i want the voice changer to work realtime as im playing aswell, where the voice changer can be heard by other people
generally we recommend fork wokada over the rvc realtime
i am now more confused, where do i get the fork version?, from the github or from the guide website
No need to ping again, have patience
Force fp32 mode on
Try a better model
Play with the pitch
Not all models are good
Btw don't share that link, we are using the docs link now
As of right now it should have the same things but not sure if it will change
What's your PC GPU?
amd radeon
Which
ryzen 7000
That's a CPU
mistook that
You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
The GPU is extremely important
Since it's needed for all heavy tasks like AI
amd radeon (TM) graphics
well I was too away from checking the updated docs
That's cooked, weak integrated graphics
Do you have any other GPUs?
Like GPU 0, GPU 1
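Since CPU names and integrated graphics keep getting reported as "the GPU" in this channel, here's a rough sketch of the check helpers are doing in their heads when they read the name from Task Manager. It's a keyword heuristic only: `classify_device_name` and its keyword lists are assumptions for illustration and will misjudge names they weren't written for.

```python
import re


def classify_device_name(name):
    """Very rough guess at what a device name from Task Manager refers to.

    Returns "cpu", "dedicated", "integrated" or "unknown". The keyword
    lists are illustrative assumptions, not an authoritative database.
    """
    words = set(re.findall(r"[a-z0-9]+", name.lower()))
    if words & {"ryzen", "core"}:
        return "cpu"  # e.g. Ryzen or Intel Core: that's a processor
    if words & {"rtx", "gtx", "rx"}:
        return "dedicated"  # NVIDIA RTX/GTX or AMD RX cards
    if words & {"iris", "uhd"} or ("radeon" in words and "graphics" in words):
        return "integrated"  # Intel Iris/UHD or plain "Radeon Graphics"
    return "unknown"


for example in ("Intel(R) Iris(R) Xe Graphics", "AMD Radeon(TM) Graphics",
                "ryzen 7000", "rtx 4070", "AMD rx 6600"):
    print(example, "->", classify_device_name(example))
```

Anything that comes back "integrated" or "cpu" means the options are the ones listed in this channel: cloud, CPU mode, or different hardware.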
It's the first link in -realtime
none at all unfortunately
Then your PC is far too weak to run it
Your only options are:
- buy a better PC
- try CPU mode which is unstable
- use cloud (remote good PC) with limited free time
@hollow beacon Kaggle is a google cloud computing service that offers 30 hours weekly of good GPU if you verify your phone number, that should be enough free time for you right?
ill find other plans, for now thanks for letting me know
Wdym?
It's free
can you suggest a good linux os for using live voice conversion with an nvidia gpu card? 🙂 @low shard
i tested a few OSes where the nvidia drivers give an error on download
I don't really use Linux, maybe Ubuntu or Mint
I wouldn't really change my whole OS just for trying to get lower delay on a single program
i dont have any problem i use my spare pc for this and i just wana check so suggest me ubuntu or mint?
They are pretty easy to use iirc yeah
he's trying to get it working with Nvidia gpu driver under linux
yess
no worries let me try in ubuntu
it works but it's small, my gpu is still working fine, i just want to fix that. 💀
hey im having some serious issues with this same version of the rvc voice changer. why is it that 90% of the time when i launch it it just doesnt work.
when i press start. nothing happens.
im abt to lose my mind
it only works when i set the chunk to max
before it worked fine
like the ms line just doesnt even start
its like i never hit start
see its started but nothing is happening
which is why i’m asking to you to elaborate
to check ur settings
F0: rmvpe onnx
Output: line 1
what program are u using it with
set extra to 2.7, uninstall vb audio cable
uhm so every time i speak the sound still works fine yea and the sound is very low i tested it several times and read the Docs (still cant fix)
yes im asking which program are you using along the program, like discord vc or a game
Try to set in and out to a higher value, if it still doesn’t work, show a screenshot
i'm doing this withougt games rn but its fucking up somehow
check the triangle
on amd gpus you always have to check it every time you change settings
the triangle will save ya 🙏
this?
yes
i'll try to do it
you have to always do that when u change settings since ur on amd
same
A VAC (Virtual Audio Cable) makes a fake audio device, used to re-route the audio of different programs
In Wokada context, it's used to get the output of wokada as the input in other programs
TL;DR: yes
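The routing described in this channel (wokada: mic in, Line 1 out; other programs: Line 1 in, headphones out) can be written down as a tiny checklist. This is just a sketch of that rule: the `Routing` class, `routing_problems` and the device names are made up for illustration, and nothing here talks to real audio devices.

```python
from dataclasses import dataclass


@dataclass
class Routing:
    changer_input: str   # what wokada listens to, e.g. your real microphone
    changer_output: str  # where wokada sends the converted voice
    app_input: str       # input device picked in Discord / the game
    app_output: str      # where that app plays other people's audio


def routing_problems(r, cable="Line 1"):
    """Return human-readable problems with a routing setup.

    "Line 1" stands for the virtual cable device; the name is an assumption
    and may differ on your system.
    """
    problems = []
    if r.changer_output != cable:
        problems.append("voice changer output should be the virtual cable")
    if r.app_input != cable:
        problems.append("the app's input should be the virtual cable")
    if r.changer_input == cable:
        problems.append("voice changer input should be your microphone, not the cable")
    if r.app_output == cable:
        problems.append("the app's output goes into the cable, so you won't hear it")
    return problems


setup = Routing("Microphone", "Line 1", "Line 1", "Headphones")
print(routing_problems(setup) or "routing looks fine")
```

The last check is the usual cause of "my laptop loses sound": the normal playback path should stay on your speakers or headphones and only the voice changer output should go through the cable.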
alr thanks
share a screenshot of ur wokada pls
cant 💀 drop
!give-media
!give-media-perms 1h @strange mountain
idk whats wrong anymo
set chunk to 200, f0 to rmvpe with onnx
click stop if it doesn’t let u edit it
extra?
how much could my pc handle without it bugging out
it should be able to handle it
max?
set also the in a bit louder
output right?
i mean the in too
hey where do i download the client?
this version
i cant find the download
god damn 😭
gonna be here till 2029
the error that was causing it not to work has to be here
for server mode, set the sample rate to 48000
unfortunately thats what its at :/
also in control panel perhaps
yeah its all on 48000 i have no clue what could be wrong
it was working fine yesterday
ok i saw that it was my headphones that were breaking it
i just took my headphones out of the output and it booted up
wtf
i cant hear it now but at least its working man
is ur audio output low only when using it in a call?
what program are u using it with?
because some calls like whatsapp reduce the media volume
<@&1159293204038955078>
no all
no i didnt use anything reduce
i turn off all
they do that by default
so i think i will turn off do nothing in line 1
else, try increasing in and out again
it’s a thing done by the other call programs sometimes
Does the gtx 1650 work on the ai voice thingy While playing Like does it play it so good While dc is open and roblox
uhm what should i chose
where can i find any other voices for the voice changer?
❤️
why is my res so high, like the voice changer starts being delayed even more when i try to play games
Hello.
Could I please have a link to the current version of AI Hub?
Thank you
hey just a quick question, i want to train a text-to-image model without using any pretrained models like SD, anyone know a good repo that helps out with that?
i've prepared a dataset for it
Would you help @sharp monolith ?
I'm not sure what you mean by "current version of AI hub"
Need a program for Voice Modulation, currently I have some very old Voice Changer Client Demo :/.
And I would like the current newer version , look at the screenshots of other people above
such as:
My version is: v.1.5.3.18a
that’s outdated asf
over a year old version
ai hub is not a program
And would you help me to have the latest version ?
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models. Technically, Mainline RVC does have a go-realtime.bat (aka RVC-GUI), but it's pretty messy and outdated, so it's strongly not recommended for realtime.
Wokada = uses RVC for realtime inference. There's 2 main versions, Original made by Wok, and the most suggested one is Deiteris Fork (modified version)
you need a realtime voice changer?
tell your pc gpu
all video tutorials are outdated
It needs this program what the above people in the screenshots have, could you help?
on this screenshot
That’s a realtime voice changer for calls, specifically wokada deiteris fork
13th Gen Intel(R) Core(TM) i5-13600KF 3.50 GHz
is that what u are looking for
that’s a cpu
Sry, GTX 4070 TI
i think you mean rtx, yeah it would be good enough
delete everything you got off youtube
-realtime
you don’t see any settings about the device sound reduction? You sure you followed that guide link i sent
should
show a screenshot of ur wokada
!give-media-perms 1h @formal ermine
And in this program as above, you can insert a voice from the voice -models category and it will work properly ???
it will work as a realtime voice changer for calls/games, not pre-recorded audios
Is that what u are looking for
read this to understand it better
Okey that's what I'm looking for
That it would change the voice in real time based on just these voice models
then yes, read the guide i sent
Okey
Currently I have installed Python 3.10.11 and Okada but the program does not start :/.
you don’t need python
Delete that python version + the wokada you got off youtube
forget everything you got off youtube
Read the guide, it’s everything there
no pretrained model to use means you'll ideally need all the image data around the internet to get acceptable results
Can you send me a direct link on priv message? Not lying I am green in this and just learning
that's the reason to do finetuning mostly
In short, how do I download a new version of this? So that I click and it turns on and that's all it needs ^^.
Oh I'm training it on something pretty vague, I don't really want other images as data for the model
you need to read the 1st link up there
and what do I download from it ?
that’s the only way to get it
the pretrained model provides building blocks for everything you need
the vac lite and wokada deiteris fork for nvidia windows, read it all and don’t skip important steps
it isn’t a 1 click installer or as easy as chatgpt, but all you need is inside that guide
Okey and now when I went into the guide I have the VC Client version
vcclient is the old original wokada
delete it
delete all you got off youtube
Hmm... Idk I'm training on some pretty slow hardware and I'm tryna make this as lightweight as possible
you need only the things in the guide
lora training is an approach to do besides the regular finetuning
Okay but I'm not some kind of professional either and this version is totally enough for me just the key word on what to go into to download it.
this from a screenshot
please don’t keep replying that message since it keeps pinging the user, to have that version, you have to read the guide
there’s no other way to use the program if you don’t read the guide unless you want to do a mess
that other user read the guide too
it isn’t as easy as just downloading it and clicking start
Well, I'm off to read the instructions
This is the version I currently have
yeah and it’s bad
m
a year ago dfhfjksddfhjksadg
uninstall that along with everything you downloaded off youtube
yeah it’s very old
a year is prehistoric in the ai field
and the new wokada looks ?
so ill just download the new one and that'll solve it?
yes
read the new wokada deiteris fork guide
it seems unlikely that anything will fit in less than 8 GB of VRAM, so if you think your hardware isn't capable of that, consider some cloud/rental solutions
uninstall all the things you got before
ok thanks a lot
yeah, this is an open source ai project, it isn’t chatgpt
yw and lmk
Yeah I'm running on 2GB of VRAM I might have a friend who has an A1000
2gb vram is cooked
8gb vram is barely considered the minimum nowadays
that seems like in the era of GTX 460 and some radeon HD
ofcourse the A100 would be better
@low shard Thank you for mentioning the fork, I hadn’t heard of it yet, I’ve been using the original.
note that colab has T4 (15 GB) for free plan
the wokada deiteris fork improved A LOT the performance, and helps with quality because it offers force fp32 (inference) mode
you’re welcome and lmk
will do 👍
Wdym with should😭
Hey guys, I am looking for a way to edit .json files with an llm that I can install locally, can you point me into the right direction? I am very new to AI so every bit of advice would be appreciated 🙏
1) have a problem 2) decide to use LLM to solve it 3) have two problems
So you are saying that it is not a good idea? 😅 I found the results from chat gpt very helpful and I was wondering if I can get that to run locally
remind me how big is the chatgpt's model ? 🙂
you can run relatively small quantized models locally, they are not good at predicting large texts
Should I be more specific about my project?
I dont want to annoy anyone with my nerd projects xD
python is great at manipulating json data, should not be too hard to program what you actually want to edit
Yeh the editing part is not that hard, but I really suck at that, so I resorted to chatgpt and was surprised how efficient and also accurate it was. I am trying to add a list of Item IDs to an already existing loot pool on my minecraft server
does microsoft word has an embed AI?
have*
because editing and aligning is so tedious manually in the word
What's the point of this?
Also, do I put 0.15 instead of 0.1?
How?
Like what? For the Arabic dialect
I have heard about something like text to json in this regard, maybe that describes it a little better
so what is the latest version? 💀
ok so i downloaded the new ver already the res is still kinda spiking whenever i try to go into a game
What are you trying to say
it is easy to edit
I can manually edit them yes but I suck at that and it took me a long time to do, so I tried to make it faster, I also have a large amount of files coming up, so I wondered if I could automate the process, just drop the files, tell the LLM the changes I want to have made and have it done
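For a large batch of files, a short script may be more predictable than an LLM. A rough sketch, assuming the files follow the vanilla Minecraft loot table layout (a top-level `pools` list whose items each hold an `entries` list); the function names, pool index, folder name and item IDs are made-up placeholders:

```python
import json
from pathlib import Path


def add_items_to_pool(table: dict, item_ids: list[str], pool_index: int = 0) -> int:
    """Append item entries to one pool of a loot table; return how many were added."""
    entries = table["pools"][pool_index].setdefault("entries", [])
    existing = {entry.get("name") for entry in entries}
    added = 0
    for item_id in item_ids:
        if item_id not in existing:  # skip IDs that are already in the pool
            entries.append({"type": "minecraft:item", "name": item_id})
            existing.add(item_id)
            added += 1
    return added


def patch_folder(folder: str, item_ids: list[str]) -> None:
    """Apply the same edit to every .json file directly inside folder."""
    for path in Path(folder).glob("*.json"):
        table = json.loads(path.read_text(encoding="utf-8"))
        count = add_items_to_pool(table, item_ids)
        path.write_text(json.dumps(table, indent=2), encoding="utf-8")
        print(f"{path.name}: added {count} entries")
```

Calling something like `patch_folder("loot_tables", ["minecraft:diamond"])` would then rewrite every file in that (hypothetical) folder, so keep a backup copy before trying it.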
On Kaggle, I train a voice using Applio. The train time of an epoch seems to be quite long compared to the RVC Mainline fork. Is this because of the vocoders? For example, an epoch takes 15 seconds in Mainline RVC, while in Applio it takes about 40 seconds. It is annoying that it is so slow despite T4X2.
both on kaggle?
Yes, both on Kaggle.
Oh, didn't know about that. Is there much difference?
Applio is two times slower than Mainline RVC
0 chance of model exploding with fp32, with fp16 it may or may not happen.. personally half of my attempts blew up
You mean it's waste of time then ? Maybe i can try to reach epoch 100 to test the model quality..
fp16 is less stable
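One common reason fp16 is less stable: half precision cannot hold values above 65504, so a large intermediate value overflows where fp32 would not. A small standard-library illustration of that range limit (this only shows the number format, it is not RVC's or Applio's training code, and the helper names are made up):

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_in_fp16(value: float) -> bool:
    """Return True if value can be packed as a finite 16-bit float."""
    try:
        struct.pack("<e", value)  # "e" is the half-precision format code
        return True
    except OverflowError:
        return False


def fits_in_fp32(value: float) -> bool:
    """Return True if value can be packed as a finite 32-bit float."""
    try:
        struct.pack("<f", value)  # "f" is the single-precision format code
        return True
    except OverflowError:
        return False


for value in (FP16_MAX, 70000.0):
    print(value, "fp16:", fits_in_fp16(value), "fp32:", fits_in_fp32(value))
```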
Thanks. My dataset length is nearly 20 minutes. I guess that's why it takes so much time.
there are also some things different, but it should not be 2x slower
i can actually test, I guess
just not on kaggle
Yes, it would be very good to test it. I'm sure I'm using the same settings for Mainline and Applio.
put game graphics to the minimum, tell the game name, close other programs in the background and show a screenshot of the wokada
the wokada deiteris fork b2332 #✨│ai-help message, you have it already dw
better stability and quality at the cost of the delay
lower value = deeper 'masculine' voice
higher value = higher 'feminine' voice
just play with the value till it sounds good
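For a feel of what the number does: assuming the pitch value is measured in semitones (the usual convention, but check the guide for your build), each step multiplies the voice's frequency by 2^(1/12), so +12 is one octave up and -12 is one octave down. The 120 Hz voice below is only an illustrative number:

```python
def pitch_ratio(semitones: float) -> float:
    """Frequency multiplier for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def shifted_hz(base_hz: float, semitones: float) -> float:
    """Frequency after shifting base_hz by the given number of semitones."""
    return base_hz * pitch_ratio(semitones)


# Illustrative only: a 120 Hz voice shifted down an octave, unchanged, up an octave
for shift in (-12, 0, 12):
    print(shift, round(shifted_hz(120.0, shift), 1))
```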
yeah find a better model, not all models are good quality
so no, mainline is not faster all things considered
Its 14
It's a paid model that I have
Do I put it on and the quality will be better, but it will be delayed?
And Also, do I put 0.15 instead of 0.1?
when did you buy the model and from who?
A year and a half ago
Mrm0z
ohh, was just asking since paid commissions are banned now
well yeah because it will need more power
there isn't a right value for everyone, it depends on your own voice and the voice model pitch
I have 4070
i7-14700
.?
Are there any other improvements I should make that improve the sound quality and make it real?
0.1 for fastest voice, 0.15 for improved quality but increases delay by 50 ms
yeah that's good, still means it will need more power and have a bit more delay
Do I put it 0-15?
your choice if you want better quality
Okay, the summary of this speech is on put 0-15 so that it becomes a better quality
And the sound will become realistic
I didn't understand this delay for every 50 ms how
0.15 yes
it uses more power, so more delay
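About the 50 ms: assuming the setting is a length in seconds (which matches the numbers quoted above), it is simply the gap between the two values. 0.15 s minus 0.1 s is 0.05 s, so about 50 ms gets added to the total delay. A tiny sketch of that arithmetic, with a made-up helper name:

```python
def extra_delay_ms(old_seconds: float, new_seconds: float) -> int:
    """Added latency in milliseconds when a length setting (in seconds) is raised."""
    return round((new_seconds - old_seconds) * 1000)


print(extra_delay_ms(0.1, 0.15))
```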
Okay, is there a huge difference in delay or small?
And how best in FORK
SIO - REST .?
what?
-rvc
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
by IA Hispano
Google Colab
by Hina
Google Colab
by Eddy
Google Colab
by Eddy
Google Colab
by Deiteris & Hina
Google Colab
by Shiro & Eddy
Google Colab
by Nick088
Google Colab
by Nick088
Google Colab
by Jarredou & Makidanye
Google Colab
Why does only UVR-Deecho-Dereverb work? The rest of the models in the Dereverb tab just give an error
Sorry, it will be fixed
leave it at rest
hey someone can you give me linux tutorial how to run that if i double click MMvcServerSIO
the vcclient.log says 2025-05-10 13:24:03,148 INFO [main] Python: 3.10.15 (main, Sep 9 2024, 03:02:42) [GCC 9.4.0]
2025-05-10 13:24:03,148 INFO [main] Voice changer version: b2332 NVIDIA-CUDA
2025-05-10 13:24:03,148 INFO [WeightDownloader] Loading weights.
2025-05-10 13:24:03,168 INFO [Downloader] Verified pretrain/crepe_onnx_full.onnx
2025-05-10 13:24:03,168 INFO [Downloader] Verified pretrain/crepe_onnx_tiny.onnx
2025-05-10 13:24:03,181 INFO [Downloader] Verified pretrain/crepe_full.pth
2025-05-10 13:24:03,182 INFO [Downloader] Verified pretrain/crepe_tiny.pth
2025-05-10 13:24:03,238 INFO [Downloader] Verified pretrain/content_vec_500.onnx
2025-05-10 13:24:03,270 INFO [Downloader] Verified pretrain/rmvpe.pt
2025-05-10 13:24:03,337 INFO [Downloader] Verified pretrain/rmvpe.onnx
2025-05-10 13:24:03,346 INFO [Downloader] Verified pretrain/fcpe.pt
2025-05-10 13:24:03,354 INFO [Downloader] Verified pretrain/fcpe.onnx
2025-05-10 13:24:03,354 INFO [WeightDownloader] All weights are loaded!
2025-05-10 13:24:03,355 INFO [main] protocol: HTTP
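Everything in that excerpt is INFO and the weights all show as verified, so the log itself looks clean up to that point. For a longer vcclient.log, a sketch like this can pull out anything above INFO. It assumes every line follows the `date time,ms LEVEL [component] message` shape seen above, and the helper names are made up:

```python
import re

LINE_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) \[(?P<source>[^\]]+)\] (?P<message>.*)$"
)


def parse_line(line: str):
    """Split one log line into its parts, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def problem_lines(lines):
    """Return the parsed lines whose level is not INFO or DEBUG."""
    parsed = (parse_line(line) for line in lines)
    return [p for p in parsed if p and p["level"] not in ("INFO", "DEBUG")]


sample = [
    "2025-05-10 13:24:03,148 INFO [main] Voice changer version: b2332 NVIDIA-CUDA",
    "2025-05-10 13:24:03,270 INFO [Downloader] Verified pretrain/rmvpe.pt",
]
print(problem_lines(sample))  # empty list: nothing above INFO in the sample
```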
is best sio?
kinda pitch shift
How much should be put
experiment and see what you get
Force FP32 mode: on (THIS IS OFF BY DEFAULT!) Turning this on improves stability, significantly reduces glitching/artifacting, increases VRAM usage by 200 MB.
I put it on will it improve the sound quality?
it works pretty well tbh
speak only english here
How much should I put it in to improve the sound?
I will show you the settings that I modified, okay my friend?
yeah they are okay
Guys, please help! What should I choose here so that it works for me?
yes they are fine
either use client (can use echo and noise suppression) or use server with wasapi/ASIO for less delay https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/#reduce-more-delay
close everything in the background
are you using it with a game?
Last update: May 5, 2025
if you want you can lower it a bit for less delay, but remember to keep it a little higher than the perf value
No! How do I set these settings here? I just forgot
what program are you using the wokada deiteris fork with?
:_)?
How much should I put please?
it doesn't need to be precise
just a bit lower if u want less delay
?
okay thx
Just tell me please where I need to select the microphone in this program. in input or output? and where I need to select the virtual cable here
there's 2 different settings
client (more delay but easier, and can use echo and noise suppression):
- input: microphone
- output: line 1
- monitor: headphones
or you can use server with wasapi/ASIO, which is harder and can't use echo/noise suppression but helps a lot with delay https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/#reduce-more-delay
it's your choice
also
you seem to be lagging a lot in the pic
which is why i'm asking what program are you using the wokada with
I have windows 11
you need to choose either if to use server or client
what do you need?
they provide different things
as i explained you
and yes you seem lagging
i need to use voice changer
yes.. and i explained you the 2 different audio setups with their perks #✨│ai-help message
you need to choose what fits you
i explained you the pros and cons of each
the program isn't as easy as like chatgpt
it's open source ai
thanks
cpu
No eligible .pth files found in ./models\volkan_konak_model2 but I have a .pth file, can anyone help me
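A quick check of what is actually in that folder, as a sketch. It only lists files by extension and does not reproduce whatever rules the program uses to decide that a .pth is "eligible". The folder path is copied from the error message (with forward slashes) and the helper name is made up:

```python
from pathlib import Path


def list_pth_files(folder: str) -> list[str]:
    """Return the names of .pth files directly inside folder (case-insensitive)."""
    root = Path(folder)
    if not root.is_dir():
        return []  # folder missing or not a directory
    return sorted(
        p.name for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".pth"
    )


print(list_pth_files("./models/volkan_konak_model2"))
```

An empty list would suggest the file is in a different folder or has a different extension than expected.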
rvc gui
rvc gui outdated asf
don't use video tutorials for rvc
what's ur pc gpu and what u want to do
rtx 4070 mobile I do an rvc model but I cant use that
can you help me
@low shard
do you want to do inference or training
Hey there, I'd like to ask how I could "pretrain" a voice dataset (like just the raw voice itself) I came across a model which said either "pretrained with Hubert" or with Titan.
hey I came first
min graphics, tried it on roblox for now, no programs, i couldn't send a screenie rn cuz perms (slr)
inference = use model
training = make models
!give-media-perms 1h @formal ermine
here
pretrains are just the base used for training other models
I closed the game already so res is normal and i haven't even started it
use rmvpe without onnx, show a screenshot while in game
tried while in game
use chrome or firefox, might help since operagx is known by other users to give issue especially for its fancy effects eating more resources
lower extra to 0.4
be sure roblox settings are to the minimum
how can I use applio
much better thanksss so much
read the guide
yw
@still crystal
this setting is meant to decrease the amount of robotic breathing and sibilants
at a value of 0.5 this is disabled, anything lower than 0.5 enables it
original rvc devs used a value of 0.33 by default for inference
i havent noticed a big difference of 0.5 vs 0.33 to be honest
they sound exactly the same
it's supposed to do something but ehh, a good model has natural sibilants and breathing by default
yea it only affects index accuracy
they're slightly better using 0.33
at 0.5 they artifacted a bit more
what's the recommend amount of epochs for an hour of audio? or is an hour of audio just overkill?
There is no way to know how many epochs is enough. You will have to use TensorBoard and listen to the epochs to find the best one. An hour of audio is fine, but anything over that doesn't make a difference.
I can, let's say, do 500 epochs of training, save the model, and then do the other 500 epochs tomorrow though, right?
so I don't have to do it all in one shot
you can save and resume yea
alr let me cook up
using index and protect voiceless consonants seems to improve breathing a bit
0:07 breathing is better at protect 0.1 and index 0.75
f0 used for training too, fcpe has different breathing compared to rmvpe
but at the end robotic breathing/sibilants occur randomly, so while protect voiceless help, eventually you'll get some robotic breathing/sibilants due to limitations in the tech
hey guys is there an ai that generates music based off of other music? Im tryna change genres of a song
That explains why the first model I train there was kinda off compared to RVC
Not related, but I'm gonna try RVC then. It ran extremely slowly on my laptop, but even with 50 epochs it generated better ones.
Sorry to butt in.
Simple question: what is better for text to speech, Applio or the ElevenLabs free tier?
?
how come the voice changer does not work on roblox?
i changed my input and everything
hii can someone help me i dont know how to download it or where to even start
virtual audio cable not workin
okay weird question, but how would i go about making a voice changer ai of my friend's voice? like i want to be able to speak in their voice. (i do have written permission to do so)
stop spamming
I'm just asking, does anyone use the sites???
so you should ask it once and wait for a reply, not spam it across 4 channels
doesn't matter, the main thing is that I get my answer
i just switched over to the fork web version of okada after using the old one for so long i closed the software and to open it again do i just open MMVCServerSIO?
when uploading a model, how do i add a photo?
I'm on epoch 60 rn and I totally forgot about your message, can I still install this tensor board thingy while training the model?
yup
do I just look it up and download it or is there something special I have to do in rvc?
do you have applio?
no
idk what that is
this is my first time training a model
it says that my rvc folder can't contain any spaces in the file path, however mine does, and I can't change it because it says the folder is in use. Should I just ignore it and move on?
Hey y'all the MP3 download for my song cover isn't as high quality as when I listen to it & record it myself. Do I have to purchase weights to get the highest quality version?
If so I was hoping someone who does pay for weights could just shoot me a download for it because I'm trying to make an AMV with it
sorry to keep pinging you, but I seemingly have TensorBoard fully installed, however it doesn't open anything in my browser and when I copy the address it just opens a white page. Any advice on how to fix it?
Just copy the local host part
Not the whole link
I'm fairly sure I'm doing that. Could I have something in my browser or internet setting misconfigured?
Try a different browser
Let's keep support issues here and not dms so if others have this issue they can find the answer here
Tbh I've never seen that issue before
Or I just don't remember
if I inspect element it tells me this
so wtf is mime type checking
Did you happen to close the cmd line
no
yes
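If the page stays blank, one thing worth ruling out is that nothing is listening on the port anymore, for example because the command window that launched TensorBoard was closed. A small sketch with a made-up helper name; 6006 is TensorBoard's default port, so use whatever port your own link shows:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or host not reachable
        return False


print(port_is_open("localhost", 6006))
```

If this prints False while training is running, TensorBoard itself is probably not running. If it prints True, the server is up and the problem is more likely on the browser side.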
I have 20 min of clean data set audio. How many epochs should i train the model to. Its a low male speaking model
How feasible would it be to train a model to clean up manga pages after removing the text? Or are there any already?
how can I put the metadata.json in voicechanger mmcv
hello how do i use the voice
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
Hello, I would like to know how to use the voice changer, what app to download, how to set it up, etc. Thanks
is this better dawg
kaggle only allows 40 minutes of use right?
you can't save and transfer the whole session state
for rvc training, download the whole model's logs folder in order to resume later in anywhere
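One way to grab that folder as a single file before the session ends, as a sketch. The `logs/my_model` path and the helper name are placeholders, so point it at wherever your RVC fork keeps the model's logs:

```python
import shutil
from pathlib import Path


def zip_logs(logs_dir: str, out_name: str) -> str:
    """Zip logs_dir into out_name.zip and return the archive path."""
    src = Path(logs_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"No such folder: {src}")
    # root_dir/base_dir keep the model folder name inside the archive
    return shutil.make_archive(out_name, "zip",
                               root_dir=src.parent, base_dir=src.name)


# Example call (hypothetical paths):
# zip_logs("logs/my_model", "my_model_backup")
```

Unzipping the archive into the logs directory of another session should put the files back under the same model folder name.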
Is de-reverb & deecho by sucial a good model option?
im trying to use VoiceAi and im stuck at the payment section. I heard this is a free app.
they want to grab more money from you, while there's open source alternative
oh, is there any other app that i can use?
could anyone here pleaseeee help me with male/female vocal separation? 🙏
Hey everyone! 👋
I’ve recently seen some examples of Gemini Flash 2.0 Experimental being used for image generation inside tools like ComfyUI, possibly through custom nodes or API integrations.
Does anyone know if there’s an official API, beta access, or any documentation on how to set this up?
Or maybe how to connect it via Vertex AI as part of a programmatic workflow?
Would really appreciate any insights or experiences you can share. 🙏
Thanks in advance!
MMVCServerSIO how do i open this
Elaborate
- what's ur PC GPU
- what u want to do
- what's the issue
Elaborate what I asked u
i want to use the voice changer deiteris and MMVCServerSIO doesnt open i cant send ss here
This is only for realtime btw, like for calls not pre-recorded audios
Also what's ur PC GPU
yea ik
3080
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
Did u read the 1st link guide?
which ones the realtime then
i got this downloaded it doesnt change my voice
idk why
Don't follow video tutorials
Those are old
That's original womada
Wokada*
i did bro
That's not where the guide directs you at all
ok what do i do then
You have to read it and get the Nvidia windows download
Forget everything off youtube
sec
?



