#✨│ai-help

I only extracted 001 tho
and they weren't in the same folder
Why are you still not changing your sample rate to 48000 Hz?
the tg-develop fork may have a bug that reverts it back to 44.1k in server mode
no idea but just set it again to 48k
is adding an index important or only the path file?
When more than 2 people are asking about the voice changer in here at the same time, it gets messy for me to provide an answer.
Close or Ctrl + C the voice changer's terminal, go to your MMVCServerSIO folder, open stored_setting.json file with Notepad, edit every sample rate to "48000", click save, and relaunch the program again.
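The manual edit above could also be scripted. A minimal sketch, assuming stored_setting.json is plain JSON and that the relevant keys have "sample rate" somewhere in their name (that naming is an assumption, the real key names aren't given here). `force_sample_rate` and `patch_settings` are made-up helper names, not part of the voice changer. Close the voice changer before running it.

```python
import json
from pathlib import Path


def force_sample_rate(node, rate=48000):
    """Recursively set every key that looks like a sample rate; return how many were changed."""
    changed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            # Assumption: sample-rate keys contain "samplerate" or "sample_rate" in their name.
            if "samplerate" in key.lower().replace("_", ""):
                # Keep the original type: a quoted value stays a string, a number stays a number.
                node[key] = str(rate) if isinstance(value, str) else rate
                changed += 1
            else:
                changed += force_sample_rate(value, rate)
    elif isinstance(node, list):
        for item in node:
            changed += force_sample_rate(item, rate)
    return changed


def patch_settings(path, rate=48000):
    """Load the settings file, force every sample rate to `rate`, and save it back."""
    path = Path(path)
    settings = json.loads(path.read_text(encoding="utf-8"))
    changed = force_sample_rate(settings, rate)
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return changed
```

Something like `patch_settings("MMVCServerSIO/stored_setting.json")` would then do the same as the Notepad edit; keep a backup copy of the file first.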
When you said "last question", this one is not a last question. 
still getting the same error
sample rate is set and staying on 48000hz now tho
does the same thing on both
Try another web browser.
if i do passthrough on it works for both 44100hz and 48000hz
My turn?
tried edge and vivaldi browsers
I'll get chrome and try on chrome now but I don't think it's a browser issue
am I missing files?
```
    result, vol = self.process_audio(audio_in)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "voice_changer\VoiceChangerV2.py", line 133, in process_audio
    audio, vol = self.vcmodel.inference(audio_in)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "voice_changer\RVC\RVCr2.py", line 226, in inference
    raise PipelineNotInitializedException()
Exceptions.PipelineNotInitializedException: 'Pipeline is not initialized.'
```
well I'm sorry I'm tryna figure this out
I got no voice_changer folder
dude I did let Toxic go and then asked when you told me to
I'll ask someone else
while you're busy
the voice sounds pretty laggy, how can I fix it to become more stable?
fish
then imma wait
What do i do here
okay so you select a microphone
can you send me a screenshot of your folder?
look at the input device tab
like this
on the right side
there
I tested it myself: the program worked for me even though my laptop is old and has no GPU. Of course the perf number is over 30000 ms, which is crazy. I haven't seen any "pipeline not initialized" error so far.
yes correct
weird
is this for me mate?
oh it's because you're doing client
I had the same thing so I just did server
server is not available..
The "pipeline not initialized" error is sometimes caused by a non-RVC voice model (like a Beatrice voice model) being loaded into a W-Okada voice changer fork (which only supports RVC voice models), by an outdated GPU driver, or by something else.
also
send me a voice model that should work I'll try it
when you get an answer, tell me
lowkey missing out on my favourite show setting this up 😭
The question is: why does the program work for me on the first try while y'all keep struggling?
it feels like im doing damage on my windows
you're using neither client or server?
nah it won't mate, don't sweat
alr
With "server" audio mode:
idk
oh yeah I see why it looks unchecked now
it happens when you start server
I was using RMVPE ONNX
do I not?
ask chatgpt maybe
if I knew I would help man
you sure you can't use server
Yeah
Here's your settings:
Chunk: around 130 ms
Extra: 2.7 s
Pitch extraction: rmvpe (not rmvpe_onnx)
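For a feel of what those numbers mean in audio terms, here is a small worked conversion from milliseconds to samples at 48000 Hz. It is plain arithmetic and says nothing about how the voice changer buffers audio internally; `ms_to_samples` is just an illustrative helper name.

```python
def ms_to_samples(ms, sample_rate=48000):
    """Convert a duration in milliseconds to a whole number of audio samples."""
    return round(ms * sample_rate / 1000)


chunk_samples = ms_to_samples(130)    # 130 ms chunk  -> 6240 samples
extra_samples = ms_to_samples(2700)   # 2.7 s extra   -> 129600 samples
print(chunk_samples, extra_samples)
```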
do you know why?
No
Refresh your tab.
done
still wont budge
the hell why isn't the settings saving
nevermind it was just the sample rate
keeps going back to the default after I change it to 48000
weirdo
Try switching the input device from "microphone" to "Line 1" and then back to "microphone".
sounds laggy a little and robotic
hello. i did everything correctly i think but i hear my changed voice how do i close it

Doesn't work..
On W-Okada, set "monitor" device to none.
Check if your browser gives microphone permission to your current tab (voice changer).
ah I changed the pitch, a little better now
do you think getting spin helps
idk rip i'll just use main outdated one
this ain't working
sounds so robotic
HUH
bro I refreshed my page
and I can't use server anymore
the hell
oh
tf
it's normal now
/:
why can't I enable noise suppression?
idk what to do anymore bro
my voice sounds so robotic, a little choppy and so weird, any way to fix it to make it more realistic?
its not letting me use gpu i can only use cpu why?
after that? just unzip normal??
.
or before unzip i must do any? @hallow thistle
Use WinRAR or 7-Zip to open or extract the .zip.001 one.
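Some background on why the .002 part can't be opened by itself: it is only the second half of one archive. A rough sketch of joining the parts by hand, assuming they are plain byte-wise splits (which is how 7-Zip style .001/.002 volumes normally work) and that all parts sit in the same folder. `join_parts` is a made-up helper name; WinRAR or 7-Zip do the same thing for you when you open the .001 file.

```python
import shutil
from pathlib import Path


def join_parts(first_part, output=None):
    """Concatenate name.zip.001, name.zip.002, ... into name.zip; return the part names used."""
    first = Path(first_part)
    if first.suffix != ".001":
        raise ValueError("pass the part that ends in .001")
    base = first.with_suffix("")                 # e.g. name.zip.001 -> name.zip
    target = Path(output) if output else base
    # All parts must be in the same folder as the .001 file.
    parts = sorted(first.parent.glob(base.name + ".[0-9][0-9][0-9]"))
    with open(target, "wb") as out:
        for part in parts:                       # sorted, so .001 comes before .002
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return [p.name for p in parts]
```

The joined .zip can then be extracted like any normal archive.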
What is your PC GPU? And are you following Tg Develop's W-Okada fork?
im using winrar then use click Extract to voice changer windows amd64 cuda, then 1 folder appear
in that folder, appear MMVCServerSIO + 2 file Force GPU Clocks
after that?
Go inside MMVCServerSIO folder, there's a program of the same name, double click the program to launch the voice changer.
with blue mic icon?
ok
rmvpe auto download?
show the spec in task manager, if it is integrated intel or AMD radeon graphics, it falls back to cpu mode
it is am d radeon
awmd
amd
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'set_sampling_rate'
why this appear on black window
everytime i upload a voice
it dont sound right
it sounds nothing like the audio samples people post
so idk whats going on
Help me , I have voices in my head
I'm trying to get the best quality and least delay using RVC Client. I've seen the guide showing recommended settings for xx50/xx60/xx60TI/xx70 cards etc.. Does it also matter which series these are? 2060, 3060, 4060? Could I get a 2060 card and get the same result as a 3060 or 4060? The card is not being used for anything else
Im using 1.7 and it doesnt have the settings like in ur pic
Could it be that the author of Vonovox removed features instead of adding in the latest update?
any help to humanize ai text
does anyone know how to fix the thing where when your using your mic
all of your audio sounds like your under water
it seems to work but it doesnt detect my mic at all even though i put my correct microphone, and every single app hears me besides okada
"humanizers" are just plain BS
i kinda know but what do i do
what purpose is it exactly? to outsmart the AI detectors or even real humans' common sense being savvy enough?
to outsmart the detectors
so it could help you do the school/uni projects or pass the job interview despite your real skill incompetence?
sorry but we couldn't support for such abusive purposes
its just literature so it doesnt require any skill, its not that big lol
ur talking like im a med student
in your head, zombie
Real professional from you
Repxic: can someone help
seiso💖: you're incompetent
I saw you seem like a friend of that bob guy, could you tell me what's wrong with me so he repeatedly insulted me as a clown?
ModuleNotFoundError: No module named 'pyngrok'
""COLAB""
Please provide more details. That error line doesn’t specify anything and it won’t help to deduce what your solution might be. For example, you could tell us which Colab you are using, what you did that led to that error in that Colab, and for what purpose you went to that Colab.
Colab link = https://github.com/w-okada/voice-changer/blob/master/Realtime_Voice_Changer_on_Colab.ipynb
Error happened when I ran it
I went to that Colab because My gpu is amd "weak one" r7 200 series (2gb)
the way you make that invalid conclusion makes you look nothing but a random immature person I've ever seen across average discord servers
Have you executed the cells in order? Because the error you describe means that the cell may not have been executed or may have failed during installation.
Yes
full error
/content/voice-changer/server
ModuleNotFoundError Traceback (most recent call last)
/tmp/ipython-input-341872215.py in <cell line: 0>()
22 get_ipython().run_line_magic('cd', '/content/voice-changer/server')
23
---> 24 from pyngrok import conf, ngrok
25 MyConfig = conf.PyngrokConfig()
26 MyConfig.auth_token = Token
ModuleNotFoundError: No module named 'pyngrok'
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.
To view examples of installing some common dependencies, click the
"Open Examples" button below.
I think solution may be by adding !pip install pyngrok
to notebook
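`!pip install pyngrok` in a cell before the failing one is the usual notebook way to do that. A plain-Python sketch of the same idea, in case it helps: check whether the module can be imported and only install it if it can't. `ensure_package` is a made-up helper name, not something from the notebook.

```python
import importlib.util
import subprocess
import sys


def ensure_package(module_name, pip_name=None):
    """Install a package with pip if its module can't be found; return True if an install ran."""
    if importlib.util.find_spec(module_name) is not None:
        return False  # already importable, nothing to do
    # Use the same interpreter the notebook runs on, so the package lands in the right place.
    subprocess.check_call([sys.executable, "-m", "pip", "install", pip_name or module_name])
    return True
```

Calling `ensure_package("pyngrok")` before the `from pyngrok import conf, ngrok` line should get past the ModuleNotFoundError, though a missing package usually means an earlier install cell didn't finish, so other packages may be missing too.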
umm
I think you should switch to another notebook. I don’t usually check #✨│ai-help much, but I’ve seen that they no longer use the original w-okada, since it’s no longer being updated. Instead, they use its forks, which you can see with the -realtime command
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Tg-Develop Fork, with extra features, but it supports only Nvidia GPUs on Windows 10/11 unlike other options that have wider support. and without cloud options GUIDE
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project. GUIDE
Deiteris' fork (modified version) of wokada that doesn't get updates anymore. GUIDE
For Windows Nvidia, Both Wokada Tg-Develop fork and Vonovox have similar performance & quality. Users should read the pros and cons for both and choose based on their differences, such as UI and Vonovox's paid effects.
Read Wokada Tg-Develop Fork Pros&Cons & Vonovox Pros&Cons
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
Could I run it locally with amd r7 200?
AMD is not very good for running RVC properly, and I don’t think you’ll get much out of only 2 GB of vram.
Try that notebook; it might work for you since it keeps the same interface as the original w-okada, just with some changes.
Do you mean AMD Ryzen 7 CPU? While W-Okada voice changer could work with CPU only, that's not recommended for most scenarios. For a W-Okada Colab notebook, try this one. https://colab.research.google.com/github/tg-develop/voice-changer/blob/master-custom/Colab_RealtimeVoiceChanger.ipynb
!howtoask
- Check Docs & Guides: Your answer may already be in the AI Hub Docs or the https://discord.com/channels/1159260121998827560/1159513888199540817 channel.
- Search the https://discord.com/channels/1159260121998827560/1192011222023950368 : Look for existing posts that solve your issue. Do not invade someone else's post.
Tell your:
- Full GPU Name: (e.g., NVIDIA RTX 4060 8gb vram desktop)
- Operating System: (e.g., Windows 11)
- Detailed Description: What were you trying to do and what went wrong?
- Tutorial Used: Link to the guide you were following.
- Screenshot: A picture of the full error message.
To maintain a legal, safe & ethical community, we will NOT provide help for:
- ANY illegal activities.
- NSFW/Porn.
Requests for these topics may be ignored, not helped and result in moderation action.
- Be Polite & Patient: Our helpers are volunteers. You may ping the Helpers role once.
- English Only: Please keep all conversations in English.
- Don't Ask To Ask.
Try downgrading Vonovox to v.1.6.9, the latest stable version. v.1.7 is a beta/pre-release version, so a bug or glitch is kind of expected.
GPUs do matter for W-Okada voice changer. The performance roughly depends on the generation and GPU architecture (Ada Lovelace, Blackwell), the VRAM (8GB, 16GB) and the specs of a GeForce RTX GPU (like CUDA cores), not just the tier in the model number (xx60, xx90).
While the guide for Deiteris' W-Okada fork lists recommended chunk and extra settings for known GPUs, those "max settings" are closer to known averages than absolute limits. Most people use an "extra" value of 2.7 s, but in a real scenario the chunk value may need to be raised above the listed one if the voice changer runs alongside a graphics-demanding game.
Do you have to do this for Vonovox as well or is it meant to be done only for W-Okada?
“1 - Open Task Manager, click "Details"
2 - Right-click audiodg.exe and set the priority to "High"
3 - Right-click again and choose "Set affinity" then select only CPU 2.”
does anyone know how to fix the thing where when your using your mic
all of your audio sounds like your under water
Runway Gen-4.5
just download that on ur computer and run it
its the best one on the market apparently so u will have to tweak and overclock ur systems if u want max of it
i was wrong
its licensed so its not open source anymore or it wasnt ever
so just use their website to use it on cloud
ur gpu is well enough tbh
You shouldn't have to do this for anything, it should just run normally after u download it
I don't understand what u mean
I have screenshots showing the process actually had 1.9 GB of data in the Working Set (RAM) initially. Then it dropped to 300 MB.
Since the app didn't crash, that missing 1.6 GB of data had to go somewhere. It wasn't just 'empty reserved' space because it was literally sitting in RAM a few minutes prior. Windows definitely PAGED it out.
The thing is, audio still sounds the same and delay hasn’t changed so I don’t know what’s going on here.
you must be using an old version, did u get it from a tutorial on youtube?
Yes, but the same thing also happens with vonovox and Deiteris W Okada
that's odd
with all versions basically lol
you should use either vonovox or wokada tg fork, deiteris is ok but anything else is outdated
but that's weird that they all do that
yeah i’m using vonovox. Shouldn’t audio be impacted when some of it is running on your ssd instead of your ram though? cause this is happening to me and it still sounds the same lol
I wouldn't know as I don't know the impacts of using it on different stuff
I keep whatever voice changer I use in my downloads
What settings do you recommend for RVC? My dataset is studio quality, but the resulting product has excessive metallic artifacts in the letter “s.” (t de esser)
have you tried using legacy core 1.5, it's a really good pretrain
ı use klm 4 pre train 48khz
personally using the 32k version has gotten good results every time
klm 4 is kinda old if I am remembering correctly
hmm
u have download link ?
the pretrain can be found here
https://discord.com/channels/1159260121998827560/1453062294446805244
it has many different versions, 32k, 40k, and 48k, as well as spinv2 I think and also Refinegan
Does it work properly for Turkish too?
I wouldn't know as I speak english but you can try ^^
Well thank you for your help I'll give it a try
you're welcome! hope the results are better for you
so uhhh gradio just killswitches itself when you try to open the public url on colab
idk if it's a me issue but this didn't happen when i was about to train a model on colab earlier
guys what voice changer should i take? i have rtx 4060 laptop gpu, should i get local or cloud instead?
speaking from my experience the 4050 laptop ran the voice changer just fine so if you have storage root for local but if you don't have enough storage go for cloud
i just followed the guide and it still cant hear me at all
aight thankss
im tryna use the L voice, but how I configure the settings and get the best ones?
did u get ur voice changer off a youtube tutorial?
no?
what version are u using, it'll be easier to help if I know what one u have
the only 3 that are recommended and up to date are wokada deiteris, wokada tg fork, and the best currently, vonovox
@viscid moss
-rt
Still looking for UVR5/MSST or something? You have been pinging Eddy for the same topic three times this year.
W-Okada voice changer (b2397) fork and Vonovox will work with an NVIDIA GeForce RTX 4060 Mobile GPU, locally. The storage size doesn't directly impact the performance of the voice changer, though a drive with plenty of free space (> 256 GB) is preferred.
Also, your account "pronouns" part is questionable, by the way. 
Do you mean W-Okada voice changer? And what is your PC GPU?
i fixed it by reinstalling so no worries but i got another question
im using RVC and im having trouble selecting models
No, I'm more of investigating deeper issue.
okay then im using Gtx 1080
W-Okada realtime voice changer or Applio RVC (retrieval-based voice conversion)? The initials RVC don't always mean realtime voice changer.
how can i know which one it is?

hello everyone, can someone please connect me with an Ai training job or any given programming task I would really appreciate 🥺
I already DM
Does someone have problems with odaka aswell? Yesterday it was perfect and now those high pitches and cut offs are there and I changed nothing
If it's general knowledge, you can talk about it in here. It doesn't always have to be a single moderator who knows about UVR5, unless you were trying to contact Eddy about bug fixes for his UVR5 fork.
I'll just wait for the reply, It's a personal matter and I need a little help with it.
Most people here for sure. What is your PC GPU? And did you follow any tutorial before?

CPU: Ryzen 5 3600x gpu: rx 5500xt, I always have high temp and yesterday it was PERFECT, but today it’s just messy and whenever I play a game as soon as I speak it keeps micro freezing etc. Yesterday none of that happened. + the voice cuts off and it glitches like in those micro high pitch stutters mid speaking
Ok, but which W-Okada version you're using?
The std one, only one working for my amd
Try this W-Okada version. https://github.com/tg-develop/voice-changer/releases/download/b2397/voice-changer-windows-amd64-dml.zip
The dml one doesn’t work to me somehow
I tried it from huggingface and it just opens the cmd and closes right after
It says something abt windows pkg and then writes a huge chunk of commands and then just closes
What I sent to you is a different (b2397) version to what you're using (likely being v.1.5.3.18a).
leemme try
The issue with v.1.5.3.18a W-Okada, especially its DirectML variant, comes down to its outdated and buggy code: the program only uses the CPU even if your AMD Radeon RX GPU is present.
ohh and that explains the high pitches?
because with the voice im using there shouldnt be any, its one of the best teached ones
No idea what "high pitches" refers to, but if you mean background noise getting into your microphone, that's likely the cause, or the program is so laggy that the audio pitches up and down unexpectedly.
no i mean, when i talk normally it sometimes gives like those high beeps between the speaking of the voice, plus it stutters or cuts off completely sometimes
Still, though. 
this one is way better
the one u sent me
can u send me the settings for noise suppression? i cant change it somehow
like the noise 1 and 2 doesnt wanna check
AI Hub FR or AI Hub France no longer exists. Where you are now is the "AI Hub by Weights" Discord server, the second iteration of the previous AI Hub that was taken down in 2023.
These three "noise suppression" settings only work when you set audio mode to "client". If you set audio mode to "server", both the noise and echo choices will be greyed out. It's a quirk in every W-Okada version.
You're welcome. Make sure to set extra number to 2.7 s for more audio quality, though always check performance number at top right screen if there any delay. 
aahh okkay i'll use vonovox then, i'm afraid that my 8 gb vram can't handle it at first, thanks
uuhhm... just ignore it
What is your query again?
what program can be used to generate the images that are listed in the ai-images section?
I have a question, it's my first time using w-okada and with each voice that I use, and listen back to the recording of my voice, the voice is always so choppy and cuts out mid word. Any idea how this can be fixed?
Stable Diffusion.
I need someone to tell me if there's an alternative to Hugging Face to remove noise, echoes, etc., because the page isn't working when I try to access it.
I think this is the right place to ask. Any models I upload myself have super high res, never gets below like 3000 ms, and when I first use the models i uploaded, they start off at like 100ms and work just fine until it quickly skyrockets in ms. i've uploaded even the most downloaded models in this server and I never get anywhere close to lower, even using the quickest settings. Using the premade models I usually get around res 200ms. Please help. (i'd send pictures but i cant)
More like UVR5.
What is your PC GPU? And did you follow any tutorial or guide before?
Yes, this is the right channel for query about voice changer. What is your PC GPU? Did you follow any tutorial before?
9070xt, and yes I followed the tutorial, down to every setting.
Also I apologize if I don't respond, I have work in a few hours so I'm laying down, but I shall be back here soon to figure out my issue! 
Tutorial I used:
https://youtu.be/SxdnGxicJOg?si=OUXQbwaUpZp54DYy
For better voice changer version, you might like to try this version. https://github.com/tg-develop/voice-changer/releases/download/b2397/voice-changer-windows-amd64-dml.zip Simply, the v.1.5.3.18a W-Okada has long been outdated, its DirectML variant is very buggy as well.
Apple Silicon (M2, M3) or Intel Mac?
macBook air
Yes, that's the MacBook name. But which CPU/chip does your MacBook Air have?
To check your Mac's CPU/chip, click the Apple icon at the top left corner of the screen and go to "About This Mac".
Try this W-Okada version https://github.com/tg-develop/voice-changer/releases/download/b2364/voice-changer-macos-arm64-cpu.tar.gz, and make sure to follow this part of the guide as well. https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/#download-for-apple-silicon-mac
Last update: November 22, 2025
who can help me to get the Server URL
how to fix the delay
Can someone help uh my gpu is rtx 2060 super so my chunk size is not going lower than 2400 but guides say it should atleast go down to 128 this is my version vcclient_win_cuda_2.0.78-beta
That version is very old
💔
You should switch to wokada tg fork
I thought it was the latest and they discontinued it feck chatgpt
Lemme get u the links
👁️👄👁️
U can't trust a robot lol
You'll need these, the first one is a virtual audio cable and the other two are for the voice changer
https://software.muzychenko.net/freeware/vac470lite.zip
Possibly just limitations in the code or it being weird
ya sure it will run on my gpu..?
You can try 🤷♀️
💔💔
Btw to use this one extract the 001zip and then place 002zip in it, since 002 can't be extracted
Idk why it's needed but it is
And for the virtual audio cable just extract and run setup64
Then to run wokada tg fork run mmvcserversio
It's an exe file
so ya mean after extracting the 001.zip and put the 002 zip in the 001.
Got that already
are those in 001?
Wokada tg fork is the 001 and 002 zip thing
aa..I see..
Yup
ty for the help
You're welcome! If u need more help just @ me here or if u need to send pictures and can't here just ask to dm me ^^
oki
!howtoask
Using RVC with virtual audio cable and the output isn't producing my voice at all.
What can I do to fix this?
Wdym?
Do you mean the real-time voice changer
yert
When I start it, the vb doesn't work at all
Don't hear my voice even though I have Listen to Device enabled
Uh oh
In RVC, output is vb input and my input is my normal microphone. Then in windows settings, I have the vb output selected as the mic
I hope he doesn't turn me into an ice statue..
which graph should i follow?
Did you see
For training a model don't follow the graphs, I learned just listening to each epoch is better, start around 100-200 epochs
I did yea, you have them set up correctly but idk why it isn't working
the vb output mic just doesn't work, dont know what to do
Okay, then I'll test it every 50 epochs
Oki doki
this hasn't happened to anyone else before?
You could try switching to vac lite
I'll get the link
To set this up just download it, extract the zip file and run setup64
Not as admin btw just regular
Doesn't work, now I don't know if it's the vb cable or rvc or my computer preventing them from working for some stupid reason
the vb mic doesn't even show the green bars when I talk, so it's literately not picking up my voice at all
damn, yea idk what could be the issue did u get the voice changer off a youtube tutorial possibly?
if so it's outdated
I'm just using normal RVC from github
is there a new rvc that I'm not aware about?
which github page?
there are 3 currently yes, wokada deiteris fork, wokada tg fork, and vonovox
do all of them work
they worked for me so I would think so
So I shouldn't be using the standard one then?
this one
I looked at the link u sent and I've never seen this one before so most likely it's outdated
what gpu do u have?
rtx 5060 ti
ooh nice
that one is peak, I have the 5070 ti
siick
I'd say prob wokada tg fork as that's what I use, tho vonovox is better tg fork is easy to learn and has fast model swapping times
vonovox also has a ton of features that might confuse u
alr thanks ill give it a try
let me get u the link so u don't download the wrong one
thanks
all u gotta do is just download both, extract the 001 zip and put the 002 zip into the folder for the 001 zip since 002 cannot be extracted but is needed for whatever reason
kaggle is better tbh, 30 hours free
can't train more than one model at a time but still, for free 30 hours, and colab at most will give u maybe 4 for free
per account
is it only web
never used web ones before
the interface shows up on web browser yes
but it's still on local
runs from your pc gpu and the task manager
Can I open a second Google Colab workspace while the first one is open?
IT WORKS
LETS GOOO
Thank you so much
yea but u need to use a second google account to train the other model with it
you're welcome! do u have any questions abt it?
like how to use it or anything
nah I figured it out thanks
alr, hope it works well for u in games and stuff
hope so too
I have a question actually.
I want to switch the processing unit to use my GPU instead of my CPU but it doesn't seem to work when I click Start Server.
Just wondering if it is possible to use my GPU since it would run better
No yeah I did change it but clicking Start Server does nothing
The performance stats don't move
i know this is yesterday but um this is happening to me
Can I get some performance tips (specs rx 580 8gb
I3 10100)
Hi everyone, could you help me download the program to create voices for my characters, please?
ive got a 4060 gpu and when i open start http bat it doesnt do anything js opens then closes
!howtoask
Try W-Okada fork. https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/ https://github.com/tg-develop/voice-changer/releases/tag/b2397
what link theres 2?
thats amd @hallow thistle im nvida
Download https://github.com/tg-develop/voice-changer/releases/download/b2397/voice-changer-windows-amd64-cuda.zip.001 and https://github.com/tg-develop/voice-changer/releases/download/b2397/voice-changer-windows-amd64-cuda.zip.002. AMD64 is another CPU architecture name for x86-64 or x64, used in most Intel and AMD CPUs, and it doesn't refer to AMD Radeon GPU that way. The term "CUDA" mostly refers to NVIDIA GPU.
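To make the naming less confusing, here is a tiny sketch that splits a release file name into its pieces. It only assumes the names follow the `voice-changer-<os>-<cpu arch>-<backend>` pattern seen in the links in this channel; `describe_asset` and the `BACKENDS` descriptions are illustrative, not something shipped with the fork.

```python
# What the last part of the file name refers to (descriptions are informal).
BACKENDS = {
    "cuda": "NVIDIA GPU (CUDA)",
    "dml": "DirectML (e.g. AMD Radeon GPUs)",
    "cpu": "CPU only",
}


def describe_asset(filename):
    """Split a name like voice-changer-windows-amd64-cuda.zip.001 into OS, CPU arch and backend."""
    stem = filename.split(".", 1)[0]              # drop .zip.001 / .tar.gz
    os_name, cpu_arch, backend = stem.split("-")[-3:]
    return {
        "os": os_name,
        "cpu_arch": cpu_arch,                     # amd64 = x86-64, used by Intel and AMD CPUs
        "backend": backend,
        "runs_on": BACKENDS.get(backend, "unknown"),
    }


print(describe_asset("voice-changer-windows-amd64-cuda.zip.001"))
```

So in the CUDA download, "amd64" describes the CPU architecture and "cuda" is the part that points at NVIDIA.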
What is your PC GPU?
Or rather, what processor or what version of Windows do I use?

No joke question. To check your PC CPU and GPU, open Task Manager, go to Performance tab, see CPU, GPU 0 or GPU 1 on the left panel.
Could you please send me the download link for my PC?
Because you keep asking for it but never actually answer my questions. https://huggingface.co/IAHispano/Applio/resolve/main/Compiled/Windows/ApplioV3.6.0.zip
Is that the download link?
Why should you ask more? That's a download link.
But is this the download link for the program?
It's not that hard to click that download link. 
yes it is
W-Okada voice changer or Applio RVC (non-realtime)? While the voice changer could work with an AMD Radeon RX 5xx GPU, it would likely struggle with higher settings (Extra 2.7 s on bXXX W-Okada fork). Applio RVC (retrieval-based voice conversion) can work with an AMD GPU, but it needs some tweaks.
The user has Google Colab Pro tier, so the runtime limit now depends on the user's Colab compute units and not always limited to 2 - 4 hours within 24 hours, which usually happens with non-Pro free users. Kaggle also has an option to link Colab account (that has Pro tier enabled) for more Kaggle GPU quota, by the way.
interesting
With Colab Pro, the tier lets you run a few more notebook instances on the same account, but training 2 models at the same time (on separate instances using the same GPU) sounds impractical or impossible: resources would be shared across the notebook instances, making both slower and potentially draining compute units faster. For full performance, running only one instance is always better.
thanks for info
What is the most stable real-time software at the moment?
I’m currently using Applio v3.6.0 to train a voice model. Could anyone recommend the best configuration settings for achieving optimal results?
hello everybody, I have never ever did anything AI related other than using AI, I wanna get into it and learn about it and models and stuff, I dont know where to begin or how to, all I know is that you need a decent pc, I have a 4080 super and a ryzen 7 7800x3d, anyone who has tips please share and thanks (;<->;)
I use windows 11
guys can someone help, when i hit the start.http nothing happens....
Tg Develop's W-Okada voice changer. I was about to tell you this 7 days earlier, but you were away.
What is your PC GPU?
!howtoask
- Check Docs & Guides: Your answer may already be in the AI Hub Docs or the https://discord.com/channels/1159260121998827560/1159513888199540817 channel.
- Search the https://discord.com/channels/1159260121998827560/1192011222023950368 : Look for existing posts that solve your issue. Do not invade someone else's post.
Tell your:
- Full GPU Name: (e.g., NVIDIA RTX 4060 8gb vram desktop)
- Operating System: (e.g., Windows 11)
- Detailed Description: What were you trying to do and what went wrong?
- Tutorial Used: Link to the guide you were following.
- Screenshot: A picture of the full error message.
To maintain a legal, safe & ethical community, we will NOT provide help for:
- ANY illegal activities.
- NSFW/Porn.
Requests for these topics may be ignored, not helped and result in moderation action.
- Be Polite & Patient: Our helpers are volunteers. You may ping the Helpers role once.
- English Only: Please keep all conversations in English.
- Don't Ask To Ask.
which Ai source is this?
This is Tg Develop's W-Okada voice changer fork. Basically, it's an implementation of W-Okada (MMVCServerSIO). https://cdn.discordapp.com/attachments/1159290139609137264/1457950706500763809/image.png?ex=696b0ce7&is=6969bb67&hm=8bd91ae54b7db47d22da490dd6d6c5c145640cd42f5a235d0e217b8018939f9b&
is this AI for free? and can you train models in it?
W-Okada can't train a voice model; it only does inference, which means converting your voice in real time using an RVC voice model.
how do you use this W-Okada? and is it for free?

W-Okada is a free and open source program, much like RVC (retrieval-based voice conversion; like Applio RVC fork).
Nvidia
Which NVIDIA GeForce RTX?
is it something you use online or do you need to download it?
How am I supposed to answer that?
give me a link?
What? Since your PC has no NVIDIA GPU (assuming you have no PC GPU), and you wanted to try it, this is the Colab notebook link for that W-Okada fork. https://colab.research.google.com/github/tg-develop/voice-changer/blob/master-custom/Colab_RealtimeVoiceChanger.ipynb
okay thanks
Generally, I don't like answering trivia (like your queries) when the information can be found either in #1159513888199540817 or on Google. If you do some research and understand a bit about why the W-Okada voice changer works the way it does, that would be great.
Yeah ti 3050
@hallow thistle Also, is it impossible to upload models to weights? because when i click on "train model" the button that says "upload model" disappeared.
Download and try Tg Develop's W-Okada fork. https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/ https://github.com/tg-develop/voice-changer/releases/tag/b2397
Okay!! Thanks so much

also, is there any other way to train models besides the applio colab and weights? like, is there any other colab for training models that WORKS nowdays?
Why should you want more? Does Applio RVC on Colab not work?
it does work, but... i dunno... once there were more colabs that allow you to train voices. is this the ONLY colab that does it now?
Applio RVC is the best-known working RVC fork. Other RVC implementations are either outdated or no longer maintained.
Try Tg Develop's W-Okada voice changer. https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/ https://github.com/tg-develop/voice-changer/releases/tag/b2397 The one you use is an original and outdated W-Okada.
bro I want something useful
Yes, that one is more useful, you just overlooked it.
Download both of them.
why
Simply, these are split zips of a single zip file.
Why why? You use WinRAR or 7-Zip to open the .zip.001 one.
why me ?? I don't know anything about this topic
I felt you were Arab
and for
I am
but I don't know anything about the AI models that are created here, I am specialized in another area of software engineering
Arab, my brother
That doesn't mean the person knows about voice changer.
I'm a girl, and yes I am arab
Weights removed the option because they're greedy and horrible
True. This site is falling apart completely. It's like they're trying to keep everyone away from this site
It's very sad to see them remove useful features, like downloading the outputs of ai vocals and now uploading good models
Deprecated. Use vonovox. That one does actually work if you have an nvidia gpu
i was wondering that too
ugh
this sucks
I'm trying to download tg-develop's version but i get that the 002 rar file is damaged... I've tried downloading it several times. Anyone experienced this?
Just place the 002 zip into the folder of the 001 zip after extraction
And run mmvcserversio
You can't unzip it but it's still needed
so directly into the MMVCServerSIO folder?
Just drag and drop into the folder of 001, then u can run mmvcserversio so wokada tg fork starts
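For anyone curious why the 002 part "can't be unzipped" on its own: split archives like these are usually one zip cut into numbered pieces, so only the whole sequence is a valid archive. A minimal sketch of joining the pieces back together, assuming they are plain byte-wise splits (which is what 7-Zip's split-volume feature produces). The file names below are hypothetical, and opening the .001 part with 7-Zip or WinRAR while both parts sit in the same folder does the same job without any code.

```python
from pathlib import Path
import shutil


def join_split_parts(first_part: Path, output: Path) -> list[Path]:
    """Concatenate name.001, name.002, ... (in order) into one file.

    Assumes the parts are raw byte-wise splits of a single archive.
    Returns the list of parts that were joined.
    """
    # "archive.zip.001" -> "archive.zip"
    stem = first_part.name.rsplit(".", 1)[0]
    # Keep only siblings whose last extension is purely numeric (.001, .002, ...)
    parts = sorted(
        p for p in first_part.parent.glob(stem + ".*")
        if p.suffix[1:].isdigit()
    )
    if not parts:
        raise FileNotFoundError(f"no numbered parts found next to {first_part}")
    with output.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)
    return parts


# Hypothetical usage: both parts must be in the same folder.
# join_split_parts(Path("voice-changer.zip.001"), Path("voice-changer.zip"))
```

If one part is missing or only partly downloaded, the joined file is incomplete, which would explain a "damaged" error when the 002 part is opened by itself.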
PyInstaller\loader\pyimod02_importers.py:378: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
☹️
<@&1159293204038955078>
Hello
that increases the index, the index basically just controls the model's "accent" it's not needed to be moved past 0 tho ^^
same thing happening here
is there a tutorial? I'm on PC windows and i have a GTX 1070?
To download the voice changer
and this is why from now on i'll be moving models to jammable
this happens on the main acc but not in secondary ones
i wonder why...
what chunk size can i use? rtx 5050
what version of the voice changer are u using?
I use a 5070ti and I can get pretty low, all the way to 122.7 for chunk
tg
try chunk at 122.7 and extra 2.7
i feel like its coming out too fast
what?
isnt that what chunk does?
the lower the chunk the faster it comes out but the higher the chunk the slower it comes out?
why does it sound like there is a delay
anyone know whats goin on with the applio notebooks rn?
this what i just got in kaggle:
Save the link for later, this will take a while...
Traceback (most recent call last):
  File "/kaggle/working/program_ml/app.py", line 30, in <module>
    from tabs.inference.inference import inference_tab
  File "/kaggle/working/program_ml/tabs/inference/inference.py", line 9, in <module>
    from core import (
  File "/kaggle/working/program_ml/core.py", line 20, in <module>
    from rvc.lib.tools.model_download import model_download_pipeline
  File "/kaggle/working/program_ml/rvc/lib/tools/model_download.py", line 14, in <module>
    from rvc.lib.utils import format_title
  File "/kaggle/working/program_ml/rvc/lib/utils.py", line 9, in <module>
    import wget
ModuleNotFoundError: No module named 'wget'
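The traceback above ends because `import wget` fails, meaning the `wget` Python package is not installed in that environment. A minimal sketch of checking for a module and installing it with pip if it is missing. The helper names are made up for illustration, and whether installing the package is enough to get this particular notebook running is an assumption, not something confirmed here.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


def ensure_installed(module_name: str, pip_name: str = "") -> bool:
    """Install the package with pip if the module is missing.

    Returns True if an install was attempted, False if it was already there.
    pip_name is only needed when the PyPI name differs from the module name.
    """
    if is_installed(module_name):
        return False
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", pip_name or module_name]
    )
    return True


# Hypothetical usage in a notebook cell, before launching the app:
# ensure_installed("wget")
```

In a Kaggle or Colab cell, `!pip install wget` is the shorter equivalent.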
Not sure, I was just using the kaggle notebook for applio and it worked fine
hmm
That's close. A lower chunk value means less delay, while a higher chunk value means higher quality.
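To put numbers on the chunk setting: a chunk is a slice of audio measured in milliseconds, and the model has to finish converting each slice before the next one arrives or the output starts cutting out. A rough sketch of that arithmetic, assuming a 48000 Hz sample rate. The function names are illustrative only and are not part of the voice changer.

```python
def chunk_samples(chunk_ms: float, sample_rate_hz: int = 48000) -> int:
    """Number of audio samples in one chunk of the given length."""
    return round(chunk_ms * sample_rate_hz / 1000)


def keeps_up(processing_ms: float, chunk_ms: float) -> bool:
    """True if each chunk is converted faster than it takes to record it."""
    return processing_ms < chunk_ms


print(chunk_samples(128))    # 6144 samples per 128 ms chunk at 48 kHz
print(keeps_up(100, 128))    # True: conversion finishes before the next chunk
print(keeps_up(500, 128))    # False: audio backs up and cuts out
```

So a lower chunk only helps if the GPU can still convert each chunk in less time than the chunk lasts.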
@viral mason By the way, aside from sending these people links, how much do you know about the voice changer? 
What is this Python program/environment supposed to be? And is your PC GPU still AMD Radeon RX 570?
what chunk size can i use if i want higher quality?
I have some mixed thoughts about whether Weights.com is still considered good or not, as much as people here, especially Local_Worm, keep shitting on the site as if the website itself is some political material.
As much as I feel guilty for not actually paying for the site, even though I once won a free Weights subscription as a prize, Weights and related websites (Voyages) are only good for "draft" creations, especially AI covers, inference and some model training. Most of this could be done through separate software like UVR5 and Applio RVC, but that workflow is more complex than the Weights website itself, so I think some lazy people wouldn't even do it.
i cant add models from #1175430844685484042 to vcclient

What is your PC GPU? And did you follow any tutorial before?
when i try to upload any model it says it's missing a toml file
!howtoask
every model i download only has 2 files and the default ones from vcc have a lot more
You still have not directly answered my questions.
intel core i3-9100F and yes i used tutorial
That's an Intel CPU, not a GPU (graphics processing unit).
Nvidia Geforce Gtx 1650
Try Tg Develop's W-Okada fork. https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/ https://github.com/tg-develop/voice-changer/releases/tag/b2397
it keeps showing cannot install device driver (1072 - the specified service has been marked for deletion)
Try extracting the MMVCServerSIO folder somewhere else, like "D:\MMVCServerSIO" (it shouldn't be in your Desktop folder), and check your GPU driver.
@hallow thistle
what is the best version
of tg okada
the most faster
and realistic
alr
While the voice changer might work with an NVIDIA GeForce GTX 980 Ti, the entire GTX 9xx GPU series is slightly older than GTX 10xx and the more recommended RTX 20xx, so don't expect it to always run fast; the program might struggle with that GPU.
Is there a more optimized version?
Oh. There's Vonovox, a complete voice changer alternative, but this specific software is known to work best with RTX 20xx, so even if you get a GTX 980 Ti to work it would likely struggle either way. So a "W-Okada voice changer fork" is your only hope to run locally. Otherwise, for full performance, consider a more recent PC with a better GPU (like a GeForce RTX 3060) or an online website instead.
is it realistic?
i meant does it sound human and not robotic
The audio quality depends on the chunk/extra settings and the RVC voice model itself, not just the specific version.
Weights was good until they removed model uploading
Is Gradio down?
Yo
How do you fix the res ms thing skyrocketing when you're using the voice changer in a VC
Like when I'm not speaking it's like res 14 ms, and when I do speak it launches up to 500ms and sometimes even above 1k, making my voice cut badly
I'm running a gtx 1650
I know that extra should be set at 2.7 and not any higher, as it'll cause cutoff issues. Chunk depends on the person's GPU and PC, as not all can handle lower values, and the delay changes when lowering or raising the value.
I don't know anything about fixing errors caused by the person's computer or if they're missing anything like python, I also cannot help at all with the cloud versions because I know nothing about setting them up
But besides that stuff I mainly know that if they're not using wokada deiteris fork, wokada tg fork, or Vonovox they are using an outdated software
Probably stuff I'm missing I just woke up
guys
How can i fix the constant cutting in my voice
with the chat bot
i mean voice bot
???
Like the voice is cutting
i verified and no external sound except my voice is
like hearing
I'm very confused, could you explain what you mean by voice bot
@distant hamlet i post it here because in general i cant post images.
This is my tensorboard of my first tries to train. With different settings (because me dum and a bit confused about applio settings).
And i have the feeling i get something wrong.
Thank you a lot.
Maybe this is a bit dumb to ask. But what applio settings would you use for a dataset of 1h 13 min that is with a sample rate of 48k and very diverse with normal talk and many emotional talk and noises.
I used BS-Roformer-Viperx-1297 in UVR5 to remove the background music. And in RX 10 i used this module chain (screenshot) to go over all the files. The raw audio data had background music and other noises, and personally i hear a lot of room/reverb in it. So i wonder if the workflow i did is a good way to go. @distant hamlet
I guess since your dataset is so large, it works out. Maybe batch size of 10-12?
chain looks fine. how bad is the original audio tho? And are they all pieced audio from different sources? I personally prioritize finding best-quality audio to begin with > trying so hard to repair mid/bad-quality audio. unless it’s reallly hard to find audio then that’s an exception
The model of the orange line, which was trained for 8h with a batch size of 8, was bad but better than the other ones with batch sizes of 16-22 that only trained for around 2h.
Epochs are something i'm still wondering about even after reading up on it: what does it really mean when you, let's say, train for 400-1000 epochs? More is not better, i got that, because of overtraining. But for that there is tensorboard, and applio even has a setting to stop training after xx epochs without any better progress.
But doesn't that mean i could always go for more epochs and just find and pick the best epoch? Or does a high epoch setting alone affect quality? No, right?
I usually find de-reverbing from MVSEP (or other AI models) to be sufficient. RX’s deverbing might not be necessary. and wait, what’s loudness control? lol
gotcha. so how many epochs was the bs8 model at 90k steps? and how did it sound to you? that line looks like it was potentially continuing to trend downwards or just about plateau, but just evaluate from how it sounds
I think, if you're fine with a little wall of text....
I have a voice model that i created by merging 2 models in Okada.
I really like that model and it sounds really good, but it is old, like from 2023 and 2024, and it has this high-pitch and ss/zz/hiss problem that makes these weird ai voice sounds.
Since back then i wanted to fix that model, but i used already existing ones. Sadly there was some sort of wipe back then? Not sure, i was never very active in AIhub.
So i was not able to find the original uploader of the 2 models.
And since then i've tried to find better base models. When there was still a "commission section" i got some models made by the "Master model makers" but... to be honest they didn't reach the quality of this one random model i got before the wipe or whatever happened at that time.
They had too much hall/reverb, or were too muddy, or had the same hiss/zz problem.
So i tried merging different models i thought could maybe help get what i want. I could only get a small upgrade from that.
In short, now in 2026 i hoped there are better ways to reach my goal, like "pretrains" and using my 4090 to just do it myself.
Back then i was told that trying to make a model with emotional data is not a good idea and doesn't work out well.
Now at least "gemini" kind of tried to tell me that this changed or would be possible.
So i got raw data from youtube of the voice i wanted, with many different emotional sounds and a lot of normal talking. gemini said 13 min of emotional stuff and around 30-40 min of talking would work.
So i used UVR5 and RX 10 as i showed, and even manually cut A LOT of stuff UVR5 missed in RX10, and cut too-long silent parts between talking, but only where i felt the pause was too long.
So the dataset is a big mix of different sources of the same voice.
help me
So gemini was telling me it would be good to "normalize" the audio to bring it all to the same level?
Like laughing or screaming would otherwise be too loud and whispering too quiet for training. Because i was very unhappy with the first training try, and checking the dataset i felt like... it still had too much strong reverb and noises of, for example, a PC being near the mic.
So for the 2.0 cleaning gemini suggested "loudness control"
that would be around 250 epochs from what i see
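For reference, steps and epochs convert into each other through the dataset size: one epoch is one full pass over all the sliced training segments, and one step is one batch. A small sketch of that conversion, assuming one step per batch. The segment count of 2880 is hypothetical, picked only so the numbers line up with the 90k steps and batch size 8 mentioned above.

```python
import math


def steps_per_epoch(num_segments: int, batch_size: int) -> int:
    """Batches needed to see every training segment once."""
    return math.ceil(num_segments / batch_size)


def epochs_from_steps(total_steps: int, num_segments: int, batch_size: int) -> float:
    """Convert a TensorBoard step count into epochs."""
    return total_steps / steps_per_epoch(num_segments, batch_size)


# Hypothetical dataset of 2880 segments, batch size 8, read at 90k steps.
print(steps_per_epoch(2880, 8))            # 360
print(epochs_from_steps(90000, 2880, 8))   # 250.0
```

This also shows why a bigger batch size moves through epochs in fewer steps: doubling the batch size halves the steps per epoch.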
if i visit the website the execution instantly got terminated
would be just like this after it finished lanching and it got terminated
gotcha, I figured it was some sort of compression or normalization. which is good
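As an illustration of what "bringing everything to the same level" means, here is a minimal sketch of RMS normalization on raw samples in the range -1.0 to 1.0. This is a deliberately simple stand-in and not what the RX 10 loudness module does internally. The -20 dBFS target is an arbitrary example value, not a recommended training setting.

```python
import math


def rms_dbfs(samples: list[float]) -> float:
    """RMS level of the samples in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def normalize_rms(samples: list[float], target_dbfs: float = -20.0,
                  peak_limit: float = 1.0) -> list[float]:
    """Scale samples so their RMS hits the target, without clipping."""
    current = rms_dbfs(samples)
    if current == float("-inf"):
        return list(samples)  # pure silence: nothing to scale
    gain = 10 ** ((target_dbfs - current) / 20)
    peak = max(abs(s) for s in samples)
    if peak * gain > peak_limit:
        gain = peak_limit / peak  # back off so the loudest sample still fits
    return [s * gain for s in samples]


whisper = [0.01, -0.01] * 100              # very quiet clip, about -40 dBFS
louder = normalize_rms(whisper)
print(round(rms_dbfs(louder), 2))          # -20.0
```

A whisper and a scream pushed through the same function end up at a similar average level, which is the effect described above.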
looks perfectly normal then. have you tested to hear how it sounds at 250e?
So the sound was... a bit hard to explain...hmmm
There was glitchy noises (hmmm sounds for example) and had the hiss problem.
But i could hear the emotional talk quality and so on. Sorry not sure how to explain it the best.
But the biggest problem was :
- glitchy voice sounding
- hiss problem
- metallic/reverbing?
any inference audio sample you can show? you can DM me if you want
why is this paused now? how do i use it again?
the 2.0 cleaned dataset i trained with 16 batch size.
Should i try go down to 8 and normal training settings?
Was my idea of a lot of emotional audio in the dataset maybe a wrong idea? Emotional talk i mean like :
- funny weird mouth noises that for example some vtuber do.
- whispering
- burst of laughing or other noises
i manually cleaned and removed what i felt could be too bad in quality or might not work well. But maybe it still contains too many problematic sounds?
There is a mix of talk from like 2022-2023 and from 2025.
So there is a little difference in, for example, the accent of the voice. But my goal is my own voice made from 2 different voices, so i don't care about accent.
It would only be a problem if using audio from different periods of the person's voice, with a different accent and maybe a small change in audio from a different mic setup, hurts the training a lot.
i send you one one moment
sup guys, so, Im trying to run qwen image with the upscaler fine tune. I got a rtx 3060 12gb + 32gb of ddr4 ram. Now, I tried both the quant4 and quant8, nothing, im always hitting OOM, like Im probably doing something wrong, because what I wanna do is: I run the model on the ram, and then I offload some layers to the gpu, this works fine in text models, but vision models... idk, they just dont wanna work. Any recommendations?
-vonovox
-help
!vonovox
Bro. Can someone give me this disc's vonovox repository
The hell. Make sure to read the help guidelines before you start asking. What is your PC GPU? And did you follow any tutorial before?
The "repository" in question: https://github.com/dr87/Vonovox
I'll assume you have an NVIDIA GPU, since you never stated anything about it and just demanded the voice changer.
Why do you sound aggressive about it
It was just a question bro
You dont have to reply if you dont want to
Because literally everyone thinks I sound angry when I never meant it. 
For "free" Colab users, Google Colab can disconnect your current runtime (especially one with a Web UI) at any time, which is expected behavior. With Colab Pro, you could run the Applio RVC UI or any Web UI notebook on Colab without a problem. This is not a bug or defect specific to the Applio RVC UI Colab notebook; it also happens with other notebooks that use Gradio, ngrok or other Web UIs, like the "W-Okada voice changer".
The same issue **doesn't** happen with Applio RVC no-UI because this specific notebook doesn't use Gradio or any web UI codebase; certain features/commands are separated into code cells like this.
This file was blocked because files like this from the internet arent safe
how do i fix
@hallow thistle
What is this file supposed to be?
!howtoask
check my forum it has the full image
Full GPU Name: AMD Radeon RX 7900 XT
Operating System: Windows 11
Detailed Description: I use the basic W Okada client but wanted to see if i can maybe use something that can work better for AMD but ended up making it worse lol just wanna know if i messed up or if my system is cooked and keep with what ive been using (when i say worse i mean my system lags and my ms is INSANE, i cant get it to lower at all even with chunk at 800ms)
Tutorial Used: https://github.com/deiteris/voice-changer?tab=readme-ov-file
On your b2332 W-Okada fork, try these settings:
Chunk: 128 ms
Extra: 2.7 s
GPU: AMD Radeon RX 7900 XT
okay
its laggy still it keeps being "close" to the perf but when i raise it it will only keep going up
that looks old
so if i match 208 i would need 250 and etc
oh?
i clicked the download link it said as recommended
are you using wokada deiteris possibly?
yea
oh nvm ur good
oh did i fuck up
What do you mean you're running the older version like v.1.5.3.18a?
i wont lie im not sure where to check to see what verison i am on


On top right of voice changer interface, it's obvious.
You set GPU to AMD Radeon(TM) which is an integrated GPU, not the mentioned AMD Radeon RX 7900 XT which is a dedicated GPU.
Simply, a dedicated GPU is better than an integrated one.
i saw that too yee
im just trying everything mbmb
this is the ms im at with the right gpu^^
Set F0 det to "rmvpe_onnx".
Try setting crossfade overlap to 0.15 and enabling "force fp32". These settings aren't required, but they're useful if you want higher audio quality.
thank u so much!!
went through the normal process of setup with colab applio, and when i open the public url the colab "completes" the task, making gradio unusable... is this a bug?
ask ai
both chatgpt/gemini can guide you through that
well. trying to work with chatgpt on this but i'm not exactly getting very far. i don't really code things, i just like using colab because it's usually simple..
ask AI
ai's your best helper in 2026
whats ur issue mainly
ill get an answer for u
I don't believe that.
Ok, AI bro.
according to gpt at least, the colab is cancelling the gradio public link upon opening it.
in as simple terms as i can explain, i click start server, i wait for the public url, i click the url, gradio says a session isn't open, and coming back to colab it displays as "complete" with a check mark.
atp gpt gave me a bunch of code to add so i probably already need to refresh anyway. idk why i even tried to see if it could help at all.
again, i don't code, and whenever colabs have bugs like this i'm usually at the mercy of other people coming out with new ones or something.
i've been using the same one (afaik) for months without issue.
it's uh.. linked to the applio colab guide.
Try deleting your current Applio RVC notebook, and then re-import the link https://github.com/IAHispano/Applio/blob/main/assets/Applio_Kaggle.ipynb to your account again.
alright, tried that, didn't seem to make any difference, but i also don't know if i deleted it properly. it ran as if it was still in.
i could try a different google account maybe?
i have spares, i'm a free user and i hate time limits
i swapped accounts and it worked, so... i have no idea why it happened to that account in particular
either way i guess i'm all good now.
Is there a good tool for voice/speaker separation? Like if i want to remove TTS or 2-3 other speakers and only want one specific voice? I used WhisperX with a little script but i was wondering if there is an easier way, or maybe even one that works better than what i use.
Anybody?
why is it so delayed or sometimes i aint even hearing it
The tensorboard extension is already loaded. To reload it, use:
%reload_ext tensorboard
Reusing TensorBoard on port 6006 (pid 3062), started 0:02:05 ago. (Use '!kill 3062' to kill it.)
Ngrok URL: https://hiltless-marbly-brianne.ngrok-free.dev
WARNING:ngrok.tunnel_ext:error connecting to upstream error=Connection refused (os error 111)
An error occurred connecting to Discord: Could not find Discord installed and running on this machine.
- Running on local URL: http://127.0.0.1:6969
i'm getting an error like this, what's the reason?
/content/Applio
An error occurred extracting the index: need at least one array to concatenate
If you are running this code in a virtual environment, make sure you have enough GPU available to generate the Index file.
huh?
same here
/content/Applio
No wav file found.
/usr/local/lib/python3.11/dist-packages/torch/utils/data/dataloader.py:626: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(
Not enough data present in the training set. Perhaps you forgot to slice the audio files in preprocess?
An error occurred extracting the index: need at least one array to concatenate
If you are running this code in a virtual environment, make sure you have enough GPU available to generate the Index file.
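The "No wav file found" and "Not enough data present in the training set" lines suggest that training started on an empty or unprocessed dataset, and the index error looks like a follow-on from that. A quick sketch of checking a dataset folder for audio before starting a run. The folder path in the usage comment is hypothetical, and where Applio actually looks for files is not confirmed here.

```python
from pathlib import Path


def find_wavs(dataset_dir) -> list[Path]:
    """Return every .wav file under the folder, including subfolders."""
    root = Path(dataset_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".wav"
    )


def describe_dataset(dataset_dir) -> str:
    """Human-readable summary to read before pressing train."""
    root = Path(dataset_dir)
    if not root.is_dir():
        return f"{root} does not exist"
    wavs = find_wavs(root)
    if not wavs:
        return f"{root} has no .wav files"
    return f"{root} has {len(wavs)} .wav file(s)"


# Hypothetical usage in a notebook cell:
# print(describe_dataset("/content/dataset"))
```

If this reports no files, re-uploading the audio and re-running preprocessing before extraction and training is the first thing to try.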
how to adapt this ai voice?
What AI voice?
Like how to use the ai voice in the game
No, but like is it about Applio RVC (non-realtime) or W-Okada voice changer?
!howtoask
@hallow thistle is there a version of replay that works?
No. Applio RVC and UVR5 are your only hopes.
what's UVR5?
two questions: 1, can this app be used on PC? and 2, is it possible to train rvc models with it?
How do you not know how UVR5 is supposed to work?
UVR5 doesn't use RVC voice model. It's a program that separates audio into stems.
can you make AI covers in UVR5?

i don't understand this answer. just say yes or no.
UVR5 is not your typical AI cover maker program, the what.
in an online method of Tg Develop's W-Okada (Kaggle), the option to use server is greyed out, is there a reason why?
No, use applio for that
Anyone?
I need some help
Hello everyone.
I’m testing inference in Applio with models trained by me using SPYN V2 as the embedder.
I’ve noticed that if I leave everything at the default, like the index at 0.75, the voice sounds very close to the original, but the pronunciation of the phonemes becomes a bit strange.
I also notice this with models trained with cvec: in Applio I need to tweak things to try to make it sound better, while in Replay, for example, it already sounds perfect, with correct pronunciation.
I’ve heard comments that Applio doesn’t seem to be very good for inference, but it’s the only one I know with support for SPYN V2.
Has anyone experienced this and has any suggestions?
If I lower the index to a value below 0.75, for example 0.3, the pronunciation gets better, but it loses some of the voice characteristics.
nothing can be done, it's a side effect of the index files
Yes, but when I do inference in Replay with a model trained with Cvec I don’t notice this, and the index practically stays at the same value…
Although there is a difference, with SPYN V2 and the index at 0.75 in Applio the voice sounds very close to the natural one, but both with SPYN V2 and with cvec I need to make adjustments, whereas in Replay I just run inference.
Well, at least that’s been my experience with inference using both.
Replay still doesn’t have support for SPYN V2 to compare.
maybe they're not really using the index file
bc index files will always have some negative impact in the pronunciation, it's not normal to have perfect flawless pronunciation with them
It could be… It’s strange, because when I import the model it shows that it has an index. But it makes sense, because when training in it the index is not generated.
Do you think it’s better not to use it to mitigate this problem?
yep, index files use a quite old technology (circa 2020?), they're good at increasing the similarity between the model and the dataset but they cause weird/bad pronunciation sometimes
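The trade-off described above (0.75 sounds closer to the dataset voice, 0.3 pronounces better) matches how the index rate is commonly described for RVC: a linear mix between the features taken from your input audio and the nearest features retrieved from the index. A simplified sketch of that mix on plain lists of numbers. This is an illustration under that assumption, not Applio's actual code.

```python
def blend_features(input_feats: list[float], retrieved_feats: list[float],
                   index_rate: float) -> list[float]:
    """Mix input features with retrieved index features.

    index_rate = 0.0 keeps the input features untouched (index ignored);
    index_rate = 1.0 replaces them entirely with the retrieved ones.
    """
    if not 0.0 <= index_rate <= 1.0:
        raise ValueError("index_rate must be between 0 and 1")
    return [
        index_rate * r + (1 - index_rate) * x
        for x, r in zip(input_feats, retrieved_feats)
    ]


source = [0.0, 0.0]     # stand-in for features from the input audio
dataset = [1.0, 1.0]    # stand-in for the nearest features in the index
print(blend_features(source, dataset, 0.75))   # [0.75, 0.75]
print(blend_features(source, dataset, 0.0))    # [0.0, 0.0]
```

Seen this way, setting the index rate to 0 amounts to not using the index file at all, which fits the suggestion above that a tool with near-perfect pronunciation may not really be applying it.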
@analog obsidian can u help me
i think its a simple problem but i cant seem to find a solution too
please ping me when ur here so i let u know what i need help w
okay, i have a problem: i'm on an intel gpu so i downloaded the vc client win std beta, extracted it, opened the dist folder and ran start_http in there, and it didn't load the voice changer. all it said in the terminal was
C:\Users####\OneDrive\Desktop\ai voice changer\dist>main.exe cui --https false --no_cui True so i need help loading the voice changer
@craggy bough
did you get it from here https://github.com/tg-develop/voice-changer/releases/tag/b2397
Last update: November 22, 2025
yeah thats pretty old
this ones recommended
ok i need help with smth else now
and
Intel(R) Core(TM) i3-1005G1 CPU
is that recommended to download vc client
gpu matters more
okay my gpu it doesnt say, it says intel uhd graphcis
graphics
ok nvm
use one of the cloud options
Last update: August 5, 2025
they're so cooked
where that at
you havent seen my hd4000 yet
idk which one to pick, i downloaded vc client for girl voice changer
i js followed a tut
turns out its outdated
ew don't help them anymore
what
dont follow tutorials cuz theyre all outdated
and its for people with proper gpus
what contraption is that lol
okay thank you for being helpful
hamburger menu, realtime voice changer, cloud
i swear this happens every time i use kaggle and i forget how i got around it all the other times
where do i get the zip files for these?
why is it not letting me upload any voices
Are you sure you wanna use this specific W-Okada version, which is made especially for Beatrice voice models? Beatrice models are rare here in "AI Hub by Weights". "RVC (retrieval-based voice conversion)" models are more common since they give significantly better audio quality, especially RVC v2.
W-Okada voice changer or Applio RVC? What is your PC GPU? And did you follow any tutorial or guide before?
For help, use "@ helper" instead of a specific user. 
HI! I have a question regarding training RVC models, more precisely the choice of sample rate. So far I've been mostly sticking to legacy core 1.5 48kHz pretrain so that was my dataset's sample rate, but I wonder if there's a chance the models would train better on lower frequency samples (with appropriate pretrain, e.g. 32/40kHz legacy core).
Is sticking to 48kHz alright, or should I rather choose a minimum sample rate that fits the frequency spectrum of my dataset input? I haven't checked it so far so perhaps my input samples don't actually utilize the entire 24kHz frequency range
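For the bandwidth half of that question, here is a minimal sketch of the arithmetic only; it says nothing about which rate trains better. A sample rate can only represent content up to half its value (the Nyquist limit), which is where the 24kHz figure for 48kHz comes from. The candidate rates are the 32k/40k/48k options from the question; the function names and the 15 kHz example cutoff are hypothetical.

```python
# Rough sketch: which candidate sample rate is enough for a given
# highest useful frequency in a dataset. Names here are illustrative only.

CANDIDATE_RATES_HZ = (32000, 40000, 48000)


def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def smallest_sufficient_rate(max_content_hz, rates=CANDIDATE_RATES_HZ):
    """Return the lowest candidate rate whose Nyquist limit covers
    max_content_hz, or None if no candidate is high enough."""
    for rate in sorted(rates):
        if nyquist_hz(rate) >= max_content_hz:
            return rate
    return None


if __name__ == "__main__":
    for rate in CANDIDATE_RATES_HZ:
        print(f"{rate} Hz covers content up to {nyquist_hz(rate):.0f} Hz")
    # Hypothetical dataset whose useful energy stops around 15 kHz.
    print(smallest_sufficient_rate(15000))
```

The highest useful frequency of a dataset still has to be measured separately, for example by looking at a spectrogram of a few clips.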
That is ancient technology bro
I can see the dust on it
What gpu do u have
models train better at 32k since hifigan is not great for 40k and 48k, i know this personally because i train pretrains and i know what rvc is capable of
and its just bad for high sample rate, too many problems, like worse breaths and esses
That's some great info, thanks!
you should use wokada tg fork, I'll get you the links
thank you brooooooooooo
u will need the first one to connect your voice changer to games or discord
ok
and the other two are for wokada tg fork
for vac lite, run setup64 after extraction. for wokada tg fork, after u download both, extract 001 then place 002 into the folder of 001, since it cannot be extracted but is needed anyways
needs both to work but idk why
so its making me reinstall vb cable?
no? it's a different one that works the same but doesn't cause issues on windows
vb cable isn't as recommended since it causes weird issues on windows sometimes
?
extract 001 and then place the 002 zip file into the folder of 001
after u do that just run mmvcserversio in the folder of 001
wait theyre the same thing no?
they're two different files
oh ok
where did u find this new version from because i still look here for the main things
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Tg-Develop Fork, with extra features, but it supports only Nvidia GPUs on Windows 10/11 unlike other options that have wider support. and without cloud options GUIDE
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project. GUIDE
Deiteris' fork (modified version) of wokada that doesn't get updates anymore. GUIDE
For Windows Nvidia, Both Wokada Tg-Develop fork and Vonovox have similar performance & quality. Users should read the pros and cons for both and choose based on their differences, such as UI and Vonovox's paid effects.
Read Wokada Tg-Develop Fork Pros&Cons & Vonovox Pros&Cons
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
then u can find the three main voice changers used now
I would recommend you vonovox since your gpu can definitely handle it but it's currently in a beta stage adding new things and I'd rather give u the newest version of it than the current version that has less features than the beta
how do i pick that?
what?
the vonovox
if you'd rather use that I can give you the download
you'd need to delete the one I just gave u tho to save on space
whats the difference
there's too much stuff but it's basically the best one of the three
oh ok
if you'd like to use it here's the download
https://github.com/dr87/Vonovox/archive/refs/tags/v1.6.9.zip
just run setup first then run start
they're .bat files
do i extract it
yup
ok so how is it the best?
it produces more natural speech, has a lot of different features to block out bg noise messing with the model so weird sounds don't come out or give random voice cracks
there are optional paid effects like adding reverb or an 8bit effect, although you can easily bypass that by using a DAW like fl studio or just using voicemod
it's mainly to support the creator which is always nice
the third one is a little outdated compared to the other two, it's wokada deiteris fork
basically the original wokada but made more up to date code wise and has a few quality of life features, tho the other two have them as well
depends, which one are you going to use
alrighty, after the setup is done run the file called start
ok it worked thanks so much gangyyy
you're welcome!
if u need more help ask me here
btw to import a model just download one from here, extract and press one of the empty slots and insert the .pth file
and underneath is a box that says index I believe
import the index file for the same model in that
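The import steps above can be sanity-checked before opening the voice changer. A minimal sketch, assuming the extracted model folder holds a .pth file and a matching .index file; the function name and the folder used in the example are hypothetical, and the script only looks at file extensions, it does not load or validate the model.

```python
from pathlib import Path


def find_model_files(model_dir):
    """Return (pth_path, index_path) found under model_dir.

    Either value is None when no file with that extension exists.
    Only the first match (sorted by path) is returned for each type.
    """
    root = Path(model_dir)
    pth_files = sorted(root.rglob("*.pth"))
    index_files = sorted(root.rglob("*.index"))
    pth = pth_files[0] if pth_files else None
    index = index_files[0] if index_files else None
    return pth, index


if __name__ == "__main__":
    # "." stands in for wherever the model archive was extracted.
    pth, index = find_model_files(".")
    print("model file:", pth if pth else "missing")
    print("index file:", index if index else "missing")
```

If the index file shows up as missing, the model can usually still be loaded with just the .pth, but the index slot stays empty.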
yep
im finishing revamping applio notebook ||which wont be even accepted but idgaf, they dont want autotune infer fixed||
and i wonder which badge set is the best
i think 2, 3* or 4
* - uses website colors
anyone know if https://colab.research.google.com/github/iahispano/applio/blob/master/assets/Applio.ipynb?authuser=1&pli=1#scrollTo=IlM6ll0WDuOG
is having issues? i keep receiving "No interface is running right now" message and can't use it
Say something.
Applio RVC can train a voice model. I don't know what to say about this.
!howtoask
- Check Docs & Guides: Your answer may already be in the AI Hub Docs or the https://discord.com/channels/1159260121998827560/1159513888199540817 channel.
- Search the https://discord.com/channels/1159260121998827560/1192011222023950368 : Look for existing posts that solve your issue. Do not invade someone else's post.
Tell your:
- Full GPU Name: (e.g., NVIDIA RTX 4060 8gb vram desktop)
- Operating System: (e.g., Windows 11)
- Detailed Description: What were you trying to do and what went wrong?
- Tutorial Used: Link to the guide you were following.
- Screenshot: A picture of the full error message.
To maintain a legal, safe & ethical community, we will NOT provide help for:
- ANY illegal activities.
- NSFW/Porn.
Requests for these topics may be ignored, not helped and result in moderation action.
- Be Polite & Patient: Our helpers are volunteers. You may ping the Helpers role once.
- English Only: Please keep all conversations in English.
- Don't Ask To Ask.
I think they're an NPC stuck repeating the same thing
im here
Voice Changer Client Demo
first guide is vonovox
I figured it could be the thing itself
is the best of those three
awsh man i cant send media
but does it say
For Windows
Download this: VAC Lite (Virtual-Audio-Cable by Muzychenko)??
is there a step by step on how to set this up completely?
Last update: November 21, 2025
dis guide
:D
yess ofc
ill see about what i could do and hopefully itll stop making me sound like a robot
im sure it will, its a lot better code wise
@tame oracle ive gotten to the point where the files are already exported and i installed the "setup64", what now?
for the VAC lite?
yes

Words alone don't always paint a picture. For better understanding, here's a screenshot.
That's not the actual program, it's the "help" doc for Virtual Audio Cable.
where do i find that?
This is the actual control panel for Virtual Audio Cable.
OHHHH
No, you don't need to set anything in this control panel.
okay so im basically set?
This is what you gotta do. https://cdn.discordapp.com/attachments/1159290139609137264/1459041990804115497/image.png?ex=697056be&is=696f053e&hm=f9f501b5af541dd9aa4c577d76111b767982c91367065446636f86a7a9b3546a& https://cdn.discordapp.com/attachments/1159290139609137264/1459041991290650760/image.png?ex=697056be&is=696f053e&hm=f9cad34a51bd2908bbcab47a043851feaab807b00b82f640845f114a2b9e6797&
All of them are now checked
is there an option where like
okay im going to sound so slow but like possibly add the models?? 😭
This is Tg Develop's W-Okada fork.
This is where you upload a voice model to the voice changer.
so i have to go to that website?

No way, you're looking for Vonovox, not the W-Okada fork.
By the way, if you wish to stay with Vonovox, I can help you a bit with it. But if you'd like to try that W-Okada fork, which has the easier interface, I can send you links.
Hi guys, I've spent all day yesterday trying to find a cheap API provider for Kling video models. Does anyone know a workflow, an API, or any way to get Kling video generation priced under 0.5$ per 5 sec? 🙏
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
- by IA Hispano (Google Colab)
- by Hina (Google Colab)
- by Eddy (Google Colab)
- by Eddy (Google Colab)
- by Deiteris & Hina (Google Colab)
- by Shiro & Eddy (Google Colab)
- by Nick088 (Google Colab)
- by Nick088 (Google Colab)
- by Jarredou & Makidanye (Google Colab)
anyone know how to fix this issue of fuzzy output? I'm new so I don't really know what I'm doing yet
RTX 4060 laptop GPU
I'm using airpods but nothing is using airpods input, so I know it's not bluetooth distortion
This is the voice im using:
https://discord.com/channels/1159260121998827560/1403422896680079390
So I've got the "generating buffers" thing going on and I can't find a single way to fix this on discord or online
Anyone have any actual assistance with this?
Hey guys
I'm new here
I joined because I don't have a good idea of how gen ai models work, but find their use of copyrighted materials problematic
I want to create software that efficiently and quickly poisons Gen AI through encoding music, audiobooks, images etc
Does anyone here have an idea of how I could go about doing that?
Hey guys, any quick help? I installed via 'W-Okada Fork Guide' but whenever I try a model my voice never changes. Anyone have a fix? GPU 5070TI
which guide did you use?
hmm I'd switch to either vonovox or wokada tg fork tbh
both are better than deiteris
-rt
ill try rn
vonovox is currently the best of the three but I personally use tg fork for the fast model swapping time
if u do use tg fork u will need the 001 zip and 002 zip
i'm trying vono first
alrighty, all u need is just to download the current version 1.6.9 and run setup first, then run start
ty bro
np!
also i've seen people talking abt malware inside of w-okada software, were they tripping or what?
if it isn't from here it most likely is a scam
a lot of people are scummy and try and trick people with versions of old wokada that have viruses
these are the only trusted sources
yeah i figured
alr ty
Is there a standardized regular speaking audio sample for both male and female voices that can be used for testing inferences?
I'm looking for something that covers a good amount of speaking sounds just to know if a model works well
what version of wokada are u using?
also u cannot upload it if it's a json file sadly
not really but I do have these
that's ollllllld
Thanks for these, it's good to know I'm not the only one just ripping audio clips off whatever shows up on YouTube
no problem! if you'd like more talking audio or just singing samples I got plenty
what gpu do u have btw, since ur using a really old version of the voice changer u should upgrade
oki doki
What’s this new model you speak of?
I’d be really interested to learn about it
failed experiment
What’s the latest architecture you recommend
Specifically for singing models
Also what’s the other rvc server you speak of
