How to hear myself | AI HUB | Page 1

lucid python Jan 16, 2026, 6:55 AM

#

GPU: AMD Radeon RX 5700 XT.

Operating system: Windows 11.

I can't hear what I say at all. I followed the W-Okada guide on Discord as best I could, and I still can't hear my audio.

calm knoll Jan 16, 2026, 11:44 AM

#

lucid python GPU: AMD Radeon RX 5700 XT. Operating system: Windows 11. I can't hear what I ...

On your voice changer, the chunk value is set way too high, which delays the audio by approximately 2.1 seconds (2100 ms), so try reducing the chunk value to 130 ms. Set the Extra value to 2.7 s. Also check that your "monitor" output is set to your speakers or headphones so you can hear the program.
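As a rough illustration of why a large chunk means a long delay: a chunk-based converter has to collect a full chunk of audio before it can start converting it, so the chunk length is roughly a lower bound on the delay you hear. A minimal sketch of that arithmetic, assuming a 48 kHz sample rate (the sample rate and function names are illustrative, not taken from the program):

```python
def chunk_ms_to_samples(chunk_ms: float, sample_rate: int = 48_000) -> int:
    """Number of audio samples that must be buffered for one chunk."""
    return round(chunk_ms / 1000 * sample_rate)


def samples_to_ms(samples: int, sample_rate: int = 48_000) -> float:
    """Duration in milliseconds of a buffer holding `samples` samples."""
    return samples / sample_rate * 1000


# Compare the chunk length from the screenshot with the suggested one.
for chunk_ms in (2100, 130):
    n = chunk_ms_to_samples(chunk_ms)
    print(f"{chunk_ms} ms chunk = {n} samples buffered before conversion can start")
```

Processing time and the audio API add more delay on top of this, so the real latency is higher than the chunk length alone.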

lucid python Jan 16, 2026, 11:55 AM

#

still won't work

#

and monitor is my headphones

calm knoll Jan 16, 2026, 11:56 AM

#

What about your full screenshot?

lucid python Jan 16, 2026, 11:57 AM

#

calm knoll Jan 16, 2026, 12:00 PM

#

The program is working; look at the performance stats in the top-left corner of the screen. What do you mean you can't hear the program?

lucid python Jan 16, 2026, 12:06 PM

#

it just doesn't work, how can i show you

#

is it supposed to be working by the stats?

#

i switched to server and now i can hear it but it has bad quality

calm knoll Jan 16, 2026, 12:18 PM

#

Can't help with an E-gurf voice model, but try another anime voice model from #1175430844685484042.

#

Set F0 to rmvpe_onnx, not something else.

lucid python Jan 16, 2026, 12:19 PM

#

it works in server kinda, client doesn't tho, what's the difference between server and client

#

do you got any advice to make it sound good or should i test it myself

#

it sounds a bit robotic and i got a little bit of echo

calm knoll Jan 16, 2026, 12:23 PM

#

Or try this newer W-Okada fork. https://github.com/tg-develop/voice-changer/releases/download/b2397/voice-changer-windows-amd64-dml.zip

lucid python Jan 16, 2026, 12:30 PM

#

can i keep both versions or do i got to delete the current one

calm knoll Jan 16, 2026, 12:30 PM

#

Extract the newer one to a different folder and keep the older one, just in case the newer one fails.

lucid python Jan 16, 2026, 12:35 PM

#

I'll test it in a few hours, if it goes well I'll just mark this as solved

#

By the way, does a better mic improve the voice quality?

calm knoll Jan 16, 2026, 12:39 PM

#

That's a myth about the voice changer. A better microphone doesn't improve the quality of the generated voice in the final result, though a better-working microphone can sometimes suppress a bit of the background noise reaching it, regardless of its price.

lucid python Jan 16, 2026, 12:42 PM

#

So it's more important to look for noise suppression microphones with an alright quality instead of a mic with insane quality?

#

Though the latter may also have nice noise suppression

calm knoll Jan 16, 2026, 12:50 PM

#

A physical microphone doesn't have its own noise suppression feature; noise suppression and echo cancellation are usually done through software-level settings. Some pricier dedicated microphones have a built-in AI noise-removal processing unit, but as I said, these devices are more expensive than the non-AI ones.

lucid python Jan 16, 2026, 12:52 PM

#

Do you have any advice for people with noisy mics? I feel like my mic doesn't have good noise suppression

#

Or how should I manage this issue

calm knoll Jan 16, 2026, 12:55 PM

#

By the way, robotic-sounding audio sometimes has to do with the voice model itself, even if the Extra value is set to the recommended 2.7 s on the W-Okada fork. That doesn't necessarily mean the model was badly trained, but I once heard that "low" dataset audio in RVC voice model training might be the issue.

lucid python Jan 16, 2026, 12:56 PM

#

Oh maybe, but if that's the issue I'll end up using others, right now I just downloaded one for testing, but ty, I'll try the newer version in some hours though client probably still won't work

#

Is server better than client or what's the difference?

calm knoll Jan 16, 2026, 1:02 PM

#

Audio modes in the voice changer: "Client" is simpler and lets you use the voice changer's noise suppression and echo cancellation options, although it generally has higher audio latency because it mainly uses the older "MME" audio API. "Server" is more complex but more flexible: it lets you pick any sample rate and audio API (MME, WASAPI, ASIO), although the program sometimes fails if it is set to WASAPI with a mismatched sample rate.

#

When you set the audio mode to "server", these noise/echo suppression options will be greyed out and unavailable. This is a known quirk in every W-Okada version.

lucid python Jan 17, 2026, 12:02 AM

#

should i add both the index and the pth if a model has both

rigid meteor Jan 17, 2026, 12:33 AM

#

The index changes how the voice's speech sounds and how much it blends with your own voice.
It is optional.
Index files tend to be large, and if you're going to use them you may need to adjust the chunk size to compensate, since processing the voice takes more time with the index enabled.
Only the pth file is required.

lucid python Jan 17, 2026, 4:30 AM

#

it runs well but now when i talk it cuts off the last second of the phrase, or sounds robotic at the end

#

a lot of the time thats happening

#

nvm its robotic

rigid meteor Jan 17, 2026, 5:33 AM

#

If you changed the extra, chunk size or crossfade, you should swap your processor once to cpu, and then back to gpu0 to fix a bug

#

that should help with lagginess, choppiness and huge amount of processor usage

#

for robotic sounds... idk, it could be the model, it could be a filter, it could be software
you might be able to fix some of it by speaking more loudly into the microphone or adjusting the input volume
you can try to fix some of it by adjusting extra or crossfade, but don't forget to swap processor (cpu and then back to gpu)

#

it might help to lower protect to 0.30 or lower

#

but do also check your microphone settings; it's possible it's caused by some setting.

calm knoll Jan 17, 2026, 6:57 AM

#

arissip

calm knoll Jan 17, 2026, 7:12 AM

#

lucid python nvm its robotic

There are some simple workarounds if you still care about 100% perfect audio quality. First click "stop server", then go to advanced settings, set "Crossfade overlap" to 0.15, and enable "Force fp32".
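For context on what a crossfade overlap does: converted audio comes out chunk by chunk, and the end of one chunk is blended into the start of the next so the seam is less audible. A minimal sketch of the idea, assuming a simple linear ramp and a 48 kHz sample rate (the function name, the ramp shape, and the sample rate are illustrative assumptions; the program may use a different curve):

```python
def crossfade(prev_tail: list[float], next_head: list[float]) -> list[float]:
    """Blend the end of one chunk into the start of the next with a linear ramp."""
    if len(prev_tail) != len(next_head):
        raise ValueError("overlap regions must have the same length")
    n = len(prev_tail)
    blended = []
    for i in range(n):
        w = (i + 1) / (n + 1)  # weight of the new chunk, rising across the overlap
        blended.append((1 - w) * prev_tail[i] + w * next_head[i])
    return blended


sample_rate = 48_000  # assumed sample rate
overlap_s = 0.15      # the "Crossfade overlap" value suggested above
overlap_samples = round(overlap_s * sample_rate)
print(f"{overlap_s} s of overlap = {overlap_samples} samples blended at each seam")
```

A longer overlap gives a smoother seam but means more audio is blended between chunks, which is why it interacts with the chunk and Extra settings.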

lucid python Jan 17, 2026, 10:36 AM

#

Ty I'll try this, by the way does internet connection matter at all?

lucid python Jan 17, 2026, 10:37 AM

#

rigid meteor for robotic sounds... idk, it could be the model, it could be a filter, it could...

It's true that I didn't do this. The earlier version I used had a warning about this bug and the newer one didn't, so I thought the bug didn't apply anymore

calm knoll Jan 17, 2026, 10:47 AM

#

lucid python Ty I'll try this, by the way does internet connection matter at all?

When you run the W-Okada voice changer locally, an internet connection is **required** to download some files, while most actual functions (like converting your voice in real time) generally don't need internet. The online cloud options (like Colab and Kaggle), however, require internet for everything to function, not just the voice changer that runs within the service.

#

As much as I implied, the internet **has nothing** to do with the audio quality. Aside from simple settings and the RVC voice model, how do you know if something else affects the audio quality in the voice changer?

lucid python Jan 17, 2026, 11:38 AM

#

calm knoll As much as I implied, the internet **has nothing** to do with the audio quality....

Do you mean if there's anything I believe may be causing bad quality?

#

If so, maybe the mic in my headphones is kinda mid

#

Should I record and send it here?

#

Like an .mp3, .wav or wtv

calm knoll Jan 17, 2026, 11:44 AM

#

Still believing like that?

lucid python Jan 17, 2026, 11:45 AM

#

Wdym

calm knoll Jan 17, 2026, 11:47 AM

#

Do you still think your microphone is the issue in all of this? We've had a discussion earlier.

lucid python Jan 17, 2026, 11:49 AM

#

Yes I remember, I thought it kinda influenced the input but if you say so I'll stop focusing on my mic, though should I send a test audio? Reading whatever?

#

Maybe I'm tripping and it's not that bad but im not sure I'm not an expert

rigid meteor Jan 17, 2026, 5:14 PM

#

It could help see what exactly you mean with robotic noise.
The voice changer should only be reacting to voice though, not to noise. I'm not sure how your microphone by itself would be causing robotic sound. It could be caused by background noise instead, and I've noticed that some lower-quality models can output robotic-sounding audio on certain sounds, or at random intervals when using the index.

#

Some people speak really softly; whispering, I mean. Some models cannot handle whispers and produce robotic-like sound in their place. That is why I mentioned that speaking louder might help.

dense pulsar Jan 17, 2026, 5:58 PM

#

a better microphone improves the quality of the model due to the input being resampled to 16k

if your microphone sounds muddy and bad at that sr then the model will struggle and have different problems like bad pronunciation, robotic sound, etc

but if your microphone is clean and high quality then the model (more specifically, the embedder) is going to have an easier job translating your voice to the model's voice, giving better results

some models are just bad and robotic tho, that cannot be fixed

#

ah and the index file is just a file (xD) that stores the accent found in the dataset

index at 0 makes the model use the accent/pronunciation of the source audio (your voice)

if you set a value higher than that the model is going to blend the index file's pronunciation into the result

#

but in realtime i've heard it screws up pronunciation most of the time, it's more useful in non-realtime inference
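The index setting described above behaves like a blend ratio. A minimal sketch of that idea, assuming a plain linear mix between the features extracted from your voice and the closest features looked up in the index (the function name and the tiny example vectors are made up for illustration; real feature vectors are much larger and the lookup itself is not shown):

```python
def blend_features(source: list[float], retrieved: list[float], index_rate: float) -> list[float]:
    """Mix source-voice features with features retrieved from the index.

    index_rate = 0.0 keeps the source pronunciation untouched,
    index_rate = 1.0 uses only what the index returned.
    """
    if not 0.0 <= index_rate <= 1.0:
        raise ValueError("index_rate must be between 0 and 1")
    return [index_rate * r + (1 - index_rate) * s for s, r in zip(source, retrieved)]


your_voice = [0.0, 2.0]   # hypothetical features from the microphone input
from_index = [1.0, 4.0]   # hypothetical nearest features found in the index

for rate in (0.0, 0.5, 1.0):
    print(rate, blend_features(your_voice, from_index, rate))
```

This also shows why the index costs extra processing time: the lookup has to happen for every chunk before the blend can be applied.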

lucid python Jan 19, 2026, 8:57 AM

#

rigid meteor It could help see what exactly you mean with robotic noise. The voice changer sh...

my bad for the slow reply, here's a demonstration, don't mind what i read, it was a random newspaper, but you will quickly get what happens

rigid meteor Jan 19, 2026, 10:58 PM

#

It does sound like an issue with the model; for some reason it seems to have trouble determining the vocal pitch.
Can you try a different voice model and see if you still get this odd effect?

#

https://discord.com/channels/1159260121998827560/1175430844685484042

lucid python Jan 20, 2026, 12:15 AM

#

This is the only girl voice I've seen that is good

#

I feel like I found the solution, slightly at least

#

I got to talk more like clear and calmly

#

If I speak too quick or too high or low it glitches

lucid python Jan 20, 2026, 12:17 AM

#

lucid python I got to talk more like clear and calmly

And slightly loud

#

Not too loud just not whispering or speaking unclearly

rigid meteor Jan 20, 2026, 12:29 AM

#

One tip: try to sit with your mouth directly in front of the microphone,
and figure out from which angle it picks up sound best; some microphones have an odd pickup angle. The back side will be quieter than the front.
You should be heard clearly at about a fist's distance (2 to 6 inches).

With a headset, keep the microphone very close to your cheek, at the corner of your mouth.
Also, try adjusting the microphone volume or gain on the hardware if you can; if not, you can increase the input volume in the voice changer, or use effects like the audio compressor to make it take in your voice louder.

lucid python Jan 20, 2026, 12:49 AM

#

What are voice effects

#

They're useful?

#

I thought they were a random feature

rigid meteor Jan 20, 2026, 1:11 AM

#

Personally I prefer to use OBS's filters, which do the same thing.
Voice Effects are basically adjustments to the audio received or sent.
Gain, for example, makes you louder (or quieter if you use negative decibels).
A limiter prevents your loudness from going over a certain value.
A compressor is basically a conditional effect: it makes the audio quieter or louder when it detects audio at a specific decibel level.
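To make those three effects a bit more concrete, here is a minimal per-sample sketch, assuming samples are floats between -1.0 and 1.0. The function names and default values are illustrative only and are not the voice changer's or OBS's actual settings. The compressor shown is the common "downward" kind that only turns loud audio down, and it skips the attack/release smoothing that real compressors apply:

```python
import math


def db_to_linear(db: float) -> float:
    """Convert a decibel value to a linear amplitude factor."""
    return 10 ** (db / 20)


def linear_to_db(amplitude: float) -> float:
    """Convert a linear amplitude to decibels (floored to avoid log of zero)."""
    return 20 * math.log10(max(abs(amplitude), 1e-12))


def apply_gain(sample: float, gain_db: float) -> float:
    """Gain: positive dB makes the sample louder, negative dB quieter."""
    return sample * db_to_linear(gain_db)


def limit(sample: float, ceiling_db: float = -1.0) -> float:
    """Limiter: never let the sample exceed the ceiling."""
    ceiling = db_to_linear(ceiling_db)
    return max(-ceiling, min(ceiling, sample))


def compress(sample: float, threshold_db: float = -20.0, ratio: float = 4.0) -> float:
    """Compressor: above the threshold, reduce the overshoot by `ratio`."""
    level_db = linear_to_db(sample)
    if level_db <= threshold_db:
        return sample  # quiet audio passes through unchanged
    out_db = threshold_db + (level_db - threshold_db) / ratio
    return math.copysign(db_to_linear(out_db), sample)


for s in (0.05, 0.5, 1.0):
    print(s, compress(s), limit(apply_gain(s, 6.0)))
```

Because the compressor only reduces loud parts, the overall result ends up quieter, which is why a gain stage is often placed after it.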

#

In the past these types of 'effects' were only done by people through expensive hardware, or through expensive VST plugins.

#

or by programmers, if they know their way around complicated tools that adjust the audio driver directly (well almost)

lucid python Jan 20, 2026, 1:15 AM

#

Are these placed like along with the chunk etc settings or where are they located visually

#

Within the menu

rigid meteor Jan 20, 2026, 1:43 AM

#

Scroll down

#

calm knoll Jan 20, 2026, 1:56 AM

#

lucid python Are these placed like along with the chunk etc settings or where are they locate...

This is where the "audio effects" are located. Click the "+" button to open the "add audio effect" pop-up.

lucid python Jan 20, 2026, 2:38 AM

#

where should i put this, output or input

calm knoll Jan 20, 2026, 2:38 AM

#

Output.

lucid python Jan 20, 2026, 2:42 AM

#

is there any recommended settings for compressor

calm knoll Jan 20, 2026, 2:44 AM

#

While I know how the "compressor" effect works in audio engineering, I'm not sure how to explain this one.

lucid python Jan 20, 2026, 2:45 AM

#

also how can i remove the effect, it wont let me

#

just in case

calm knoll Jan 20, 2026, 2:47 AM

#

lucid python also how can i remove the effect, it wont let me

Click stop server, scroll down, and you'll see a red trash can icon on that FX.

lucid python Jan 20, 2026, 2:48 AM

#

it's like merging with the ui

calm knoll Jan 20, 2026, 2:50 AM

#

lucid python it's like merging with the ui

If you make your browser window too small, the voice changer UI will look squeezed.

lucid python Jan 20, 2026, 2:55 AM

#

oh yh fixed now

#

is this a common issue or why cant the ai say hello or some specific words are like harder

calm knoll Jan 20, 2026, 3:02 AM

#

lucid python is this a common issue or why cant the ai say hello or some specific words are l...

This thread is gonna be a long one: one issue gets solved, then another one comes in, and so on. aronathink

lucid python Jan 20, 2026, 3:03 AM

#

my bad

rigid meteor Jan 21, 2026, 6:10 PM

#

lucid python is this a common issue or why cant the ai say hello or some specific words are l...

Depends on the model and how it was trained. Some models have certain tones louder than others even if you clearly speak at the same volume. You can fix this with the compressor effect, but don't forget to add gain since the compressor also lowers the output volume slightly.

#

If you feel like you sound too soft, you can also use an expander, but be a bit careful where you place the threshold when using both of these. Your audio only needs to be somewhat more equal, not completely equalized. If everything has the same loudness then you won't sound natural.
EQ can help with things as well. Human voices by default have stronger midtones than low and high ones; you can bring them closer to each other by adding or removing some volume (decibels).

#

A good Equalizer and Compressor setting will make you sound as if you're talking in a studio, but you'll need to find the correct settings for each voice. Unfortunately Audio Effects are shared, which is a flaw in tg-develop's design.

rigid meteor Jan 21, 2026, 9:13 PM

#

lucid python is there any recommended settings for compressor

the settings depend on the voice model
the default settings can be okay, but you may want to make the threshold smaller, reduce the attack and increase the release values so that it acts quicker and compresses more.
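For a feel of what attack and release control: a compressor usually follows a smoothed loudness level instead of reacting to each sample directly. A minimal sketch of such a level follower, assuming a simple one-pole smoother and a 48 kHz sample rate (the names, defaults, and smoothing method are illustrative assumptions, not the voice changer's implementation):

```python
import math


def smoothing_coeff(time_ms: float, sample_rate: int = 48_000) -> float:
    """One-pole smoothing coefficient for a time constant in milliseconds (must be > 0)."""
    return math.exp(-1.0 / (time_ms / 1000 * sample_rate))


def envelope(samples, attack_ms: float = 5.0, release_ms: float = 100.0,
             sample_rate: int = 48_000) -> list[float]:
    """Track the loudness of `samples`, rising at the attack speed and falling at the release speed."""
    attack = smoothing_coeff(attack_ms, sample_rate)
    release = smoothing_coeff(release_ms, sample_rate)
    level = 0.0
    tracked = []
    for x in samples:
        target = abs(x)
        coeff = attack if target > level else release
        level = coeff * level + (1 - coeff) * target
        tracked.append(level)
    return tracked


burst = [1.0] * 100  # a sudden loud sound
fast = envelope(burst, attack_ms=1.0)
slow = envelope(burst, attack_ms=50.0)
print(f"level after 100 samples: fast attack {fast[-1]:.3f}, slow attack {slow[-1]:.3f}")
```

In this sketch a shorter attack makes the tracked level (and therefore the compression) kick in sooner, while a longer release keeps it engaged for longer after the loud sound has passed.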

warped mulch Jan 26, 2026, 3:54 PM

#

can someone help i got a good gpu

#