#✨│ai-help
What is your PC GPU? And what do you use the voice changer for?
5070ti
nvidia
nvidia gpu
im trying to use the voice changer
theres no issue i havent started
Roleplay, girl voice or something? 
for video/ fun not really girl voice
There are Vonovox and Tg Develop's W-Okada fork; both are known voice changers that work with the GeForce RTX 50 series.
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Tg-Develop Fork, with extra features, but it supports only Nvidia GPUs on Windows 10/11, unlike other options that have wider support, and it has no cloud options.
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project.
A Realtime Voice Changer with similar performance to Vonovox & Wokada Tg-Develop Fork, with extra features.
Deiteris' fork (modified version) of wokada that doesn't get updates anymore.
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
which ones the best?
Vonovox gives better audio quality, while Tg Develop's W-Okada is easier to use. There are trade-offs.
for nvidia I suggest vonovox
in the vids i was watching it said they were free are they?
anything off yt for voice changers is outdated, but yes they're free
yeaa ik one of them showed this dc so im asking in here cuz they're all 2 years old
how can i go about downloading vonovox?
1 moment I'll get the download
Download Vonovox. https://huggingface.co/dr87/vonovox/resolve/main/Vonovox_beta_17_11.zip
you need this too
https://software.muzychenko.net/freeware/vac470lite.zip
you said vonovox wasnt really simple so is there any videos
Last update: March 30, 2026
"Many Effects are Premium (paid), such as Low Quality Mic" is it mostly like this? i just want to use models from the models channel not really any effects will i be fine
yea
Most core features in Vonovox are free, like RVC voice model. "Paid" effects are optional, not really needed. 
those r optional
voice changer
i downloaded it and its open but how do i use it just to hear myself
how can i delete vonovox and voice cable so i can redownload it
Be patient. Unlike W-Okada, though, Vonovox has only one output device. You either set the output device to "your speaker" in Vonovox or hear yourself through Discord. https://cdn.discordapp.com/attachments/1159290139609137264/1446358776587489351/image.png?ex=69d92654&is=69d7d4d4&hm=d4e3afdbe81daa1f372c0a8f4f8294c7f48fa662c501236c19e47a9253260a25&
i set it to my speaker and line 1 and still couldnt hear myself on discord
I don't know, I can't identify an issue from your words alone. Send a screenshot here.
Did you know? "Exclusive mode" is an audio mode in WASAPI/ASIO that gives a single program (like Vonovox) sole control of an audio device, which mutes other programs using the same device for as long as it is active. It's better to turn this mode off.
when i go on discord and set my speaker to line 1 i see it moving but cant hear anything and i turned exclusive off
shows thiss
Why does Vonovox work for others, though? Did you at least follow the guide I sent you?
yes
Send your full screenshot of Vonovox.
You set the input device on Vonovox wrong.
thats what you said to put it to
it only works when i set it that way and then do the opposite on discord
On Vonovox: input is microphone, output is Line 1.
On Discord: input is Line 1, output is speaker.

it only works for me when i do opposite of this
Elaborate?
when i do line 1 as input on vono and then line 1 as speaker in discord it works kinda
That's not how it works.
its the only way i can hear myself with the voice changer for me
the pitch is off though, is there any way to fix it? sounds high pitch, i tried 2 different ones
To hear yourself on Vonovox, you set output device to speakers, or go to Windows' Sounds settings and do this.
how can i delete everything so i can restart
what about voice cable
If you make the same mistake again, you should question yourself. I gave you the most widely agreed approach, and you're literally doing the opposite.
i did exactly what you said, it didnt work again, the only thing that worked was doing the opposite
If you set "Line 1" as the output in Vonovox and "Line 1" as the input in Discord, that is correct. But if you set "Line 1" as the speaker in Discord and "Line 1" as the input in Vonovox, that is incorrect, because you send all of Discord's sounds (including ping sounds) to Vonovox through Line 1, instead of sending Vonovox's output to Discord as intended. You're just confused, bud.
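The routing described above can be written down as a small check. This is only a sketch: the device names, the `AppAudio` type, and the `routing_problems` helper are hypothetical and are not part of Vonovox or Discord.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AppAudio:
    """Input and output device picked in one application (names are illustrative)."""
    input_device: str
    output_device: str


def routing_problems(vonovox: AppAudio, discord: AppAudio, cable: str = "Line 1") -> list[str]:
    """List what is wrong with a mic -> voice changer -> virtual cable -> Discord setup."""
    problems = []
    if vonovox.input_device == cable:
        problems.append("Vonovox input is the cable, so it receives Discord audio instead of the mic")
    if vonovox.output_device != cable:
        problems.append("Vonovox output is not the cable, so Discord never gets the converted voice")
    if discord.input_device != cable:
        problems.append("Discord input is not the cable, so others do not hear the converted voice")
    if discord.output_device == cable:
        problems.append("Discord output is the cable, so call audio and pings are fed into the cable")
    return problems


# The setup recommended in the chat.
correct = routing_problems(
    AppAudio(input_device="Microphone", output_device="Line 1"),
    AppAudio(input_device="Line 1", output_device="Speakers"),
)

# The reversed setup that "kind of" worked for the user.
reversed_setup = routing_problems(
    AppAudio(input_device="Line 1", output_device="Speakers"),
    AppAudio(input_device="Microphone", output_device="Line 1"),
)

print(correct)
for problem in reversed_setup:
    print(problem)
```

The recommended setup yields an empty list, while the reversed one trips every check, which matches the explanation that Discord's own sounds end up travelling through Line 1.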
If you don't believe me, you can ask fellow members who used voice changer here. 
i do, it just doesnt work for me
One message removed from a suspended account.
This AMD Athlon CPU (released in 2018) isn't really that old, though it is positioned below AMD Ryzen 3. AMD Radeon Vega 3 is an integrated GPU, so probably skip that. Do you mean you want to run the voice changer CPU-only? Because of course it's gonna be slower.
One message removed from a suspended account.
which rvc related program?
Elaborate:
- your pc os
- what are you trying to do: AI Covers, TTS, E Girl Trolling / Catfish or Roleplay
can someone help me'
This is a General AI Discord Server, elaborate:
- your pc gpu
- your pc os
- what are you trying to do: LLMs, AI Covers, TTS, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
ive downloaded everything and tried to open MMVCServer and the cmd prompt comes up to download stuff but ive done that 3 times now and the actual rvc prompt hasnt come up yet
4060 gpu
i5 8400
windows
hey are EaseUS Voicewave and Voice ai good?
I want to have different voices that sound good and not 2022 choppy. both male and female
Why male and female voice models? 
FivemRP
we don't really do anything but humans
Check out Vonovox. This voice changer gives better audio quality than any W-Okada version.
@hallow thistle
is there a guide for this?
<@&1159293140440723499> Hacked account in help channel.
so should i upgrade from wokada to vonovox?
is it the same delay orrr
so are they good? are there any better alternatives? using windows 11 and gpu doesn't do well
Am I supposed to answer these trivial questions or something?
Vonovox and Tg Develop's W-Okada are better. I'm not sure about EaseUS one, but Voice.ai is a scam one.
understood ty
huh
like whats the downside to changing
what tutorial link are you using? Are you trying to do TTS, AI covers, E Girl Trolling / Catfishing or Roleplay?
we don't suggest those, it's better you elaborate if you want to hear the alternatives used here
Give a deep voice model
there are tons of #1175430844685484042 , which one are you looking for? like ben10 or e boy deep RVC Voice model to troll?
is it open-sourced model?
Behind the MVSEP website, many of the separation models and programs (like UVR5) are open source.
Hola,
GPU: rtx4070 ti super 16GB vram
OS: Fedora KDE Plasma 43
What I am trying to do: I installed wokada TG-develop fork and works with the model & want to link/send the output of the fork to another program like discord or anything else.
The Issue: Can select my mic as input but when it comes to select the output device, I see no virtual cable showing despite having portaudio installed (did I miss anything from the docs?).
**the tutorial link:** The link from the docs on the TG-develop fork (Realtime Voice Changer > Local > TG Develop's)
I used before deiteris fork on windows and works nice and had vac and was all fine but first time trying to use TG fork and on linux with portaudio. From what I heard, portaudio doesn't create a virtual cable? And that u may need to use pipewire? If anyone knows better how to set up this, I would appreciate a lot ^^
can you give me the link for download the realtime voice changer
well it's in the docs, literally. Also Sapphire just gave the link for docs above
uhm can you teach me how to download it
-rt

GPU: rtx4070 ti super 16GB vram
OS: Fedora KDE Plasma 43
What I am trying to do: I installed wokada TG-develop fork and works with the model & want to link/send the output of the fork to another program like discord or anything else.
The Issue: when opening the web interface, audio processing is locked onto server (when the first time it was working fine but after trying to create a virtual cable with pipewire, it went haywire) and I cannot start the server and run the voice changer at all. Also I set the Sample Rate at 48000hz but it gives errors and changes to 44100 while saying the input/output/monitor supports only 48000hz... I don't get what is wrong here... And I tried 2 browsers: firefox & opera gx + tried the troubleshooting from TG fork about this issue with audio processing locked onto server. At least if the server was working...
**the tutorial link:** The link from the docs on the TG-develop fork (Realtime Voice Changer > Local > TG Develop's)
Yes many of them are but in mvsep u can just use those models directly and free...use UVR5 if u want locally
This is a General AI Discord Server, elaborate:
- your pc gpu
- your pc os
- what are you trying to do: LLMs, AI Covers, TTS, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
Idk what the first one is but voice.ai is bad and it's paid, if you have AMD gpu use Wokada Tg fork, if you have Nvidia use Vonovox
It depends on what setting you have the chunk size at like always
What is this fedora KDE thing? Do you not use Windows?
it's a linux distro. I did use before windows and well it was working fine
Anyways since you have a 4070 use Vonovox, idk how well it works on Linux but it's worth a try
https://huggingface.co/dr87/vonovox/resolve/main/Vonovox_beta_17_11.zip
First download is for the voice changer, second one is a virtual audio cable that connects it to both discord and any game you play using it
I've seen vonovox and wish I could have tried but from what I understood in requirements or "pros/cons", it said that it works only on nvidia gpu with windows only or so I understood
Ngl I thought it was from Mars lol
You could try but I'm unsure if it specifically only works on windows, I just know it's Nvidia only
In case tho you'll need to switch to Wokada tg fork, idk which one is the Nvidia Linux version tho
I would like to know if there's a good voice changer, also if there's a good woman voice as I run a VTTRPG and would like to stop hurting my throat and instead using a voice changer. I already have some male voices to use that I like, but I don't feel VCC from Okada branch is working good for me. I used it for a while but was not able to configure a good output for it. I have a NVIDIA RTX 4060 with 6gb, 16 ram and an Intel core i9 14900HX. I'm running in Win 11.
will try
Lemme know if there's any issues ^^
yeah, vonovox is windows-only

the way interface jumps from 48000 to 44100 is normal in Windows release of tg too, IDK why it happens but it has never caused any issue for me so I didn't really care
Second link is a lite version?
I have no idea why...
sadly can't help with the virtual cable issue, I haven't ever run w-okada on Linux so I lack experience here
also wanted to ask, how I would install that VAC470lite on linux as it's, from what I can see, windows installer :))
I mean didn't try yet with wine so not sure how it works
yeah it's ok.
I was going to suggest wine, but I'm not sure it would work since it's a "virtual machine"
I tried to make a virtual cable with pipewire but seemed to break client audio processing
It's a virtual audio cable that connects the voice changer to games and discord etc
OH its virtual audio cable, I have it already installed. Thanks!
Do you have VB cable or the one I sent? The one I sent you is recommended over VB cable
VB causes odd issue on windows sometimes
This one doesn't
It's the same, Just checked the one I have already installed
Ah
Yea, VB caused a lot of issues for me
I will give vonovox a try! Also, where can I get pre trained models? I'd prefer them in spanish but I think I can work around with english ones XD
Right here good sir
https://discord.com/channels/1159260121998827560/1175430844685484042
And also here: https://voice-models.com
Thanks for the help!
hell?
XD
God, this sounds a lot better tan VCC 
Never heard of Tan before but I'd think it would lol
Voice Changer from okada branch, I was using that one but, this is better XD
<@&1159293140440723499> weird account
I only know of the original wokada, Wokada deiteris fork, Wokada tg fork, and Vonovox
As well as Applio real-time
That's somewhat newer like Vonovox
I see, might take a look at that one too
hi i got a nvidia gpu 5090 does anyone have the fork okada i used this one before but lost it
you should use Vonovox
what are you planning on using it for btw just curious
hanging with friends i like using solo leveling voices and sh
they sound so realistic i used vonovox but its just not like tgfork
cool!
I'll get u the downloads rq
I don't understand how people lose stuff like this, do you randomly delete it or what
nah i needed to reset my pc aand sh i had to many files
sorry for taking up ur time to get the files im really thankful tho!
I could never, everything I have on my pc is too important to lose
i had that one before, its way easier, vonovox is so confusing to me
but the beta is easier than tg fork
all you need to do is change block size around and pitch
everything is done for you
wait can u show me what fork looks like because im lowk dont know if were talking abt the same thing
this is vonovox
this is Wokada tg fork
Vonovox isn't complicated at all
neither are
I wouldn't sacrifice quality just for one to be "easier"
that's just me tho
oh wait now im weirded out so with wokada it would be on the browser and it be like sounding so nice but ig that must be fan made or smth
yea same w me
so i had one
that looked like okada the normal one
but it was on browser
sm guy gave it to me
i just use vonovox thank you fuck me i mst be confusing aha
is there a website for voice models
??
most likely it was outdated then if some rando gave it to you
there's plenty here but also a site that has them too
ohh what siteee
this one!
I'd check here first as this place has a lot more quality control over good models
if I was the egirl models would have their download links removed to stop dirty scammers, like those weirdos in this channel https://discord.com/channels/1159260121998827560/1420775879759630448
people joining just because some random old yt video said they have them here makes me feel some kinda way
w-okada not working on rtx5060, a little help?
elaborate:
- your pc os
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
- the tutorial link
- windows 11
- real time voice changer (RVC)
- tutorial link?
RVC doesn't mean realtime voice changer, it means Retrieval-based-Voice-Conversion,
this is a General AI Discord Server and many people confuse it, that's why I'm asking what are you trying to do since there are different tools: AI Covers, E Girl Trolling / Catfishing or Roleplay?
Also, did you use any tutorial or download link for whatever are you using right now? What brought you there?
- i know what RVC is and what it stands for, i've used w-okada before so it isn't my first experience with it. I recently upgraded to a new PC with RTX 5060 and by the looks of it, the pytorch hasn't been updated. i'm here because of the second to last option

Use Vonovox it's the current best
The downloads you need are here
thanks 💛
You're welcome!
Anyone knows where can I find a feminine but not too feminine voice (femboy)
hi do you know how i can hear myself while using vonovox?
I do yes! one second
why?
what's your pc gpu? (Nvidia or AMD) and what do u plan on using it for? just curious ^^
better not be with egirl models 
u promise you're gonna use normal stuff like Goku or Darth Vader etc
ok
peak
here's the two downloads you need ^^
first is the voice changer, second is a virtual audio cable to use it in games etc
nah it's really easy
just extract both zip files, for vac lite (the virtual audio cable) just run the file called setup64 and then install driver
and for the voice changer run mmvcserversio
yea no weird setup like the old one
and it runs on browser
for the first time it could take a bit but it shouldn't take long
are you able to send a screenshot?
you have your settings wrong
input should be mic output should be line 1
you're using a different virtual cable but yea should be fine
Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (0, 26) at dimension 2 of input [1, 128, 6] tf is this 😭
I see many, but which is the best free one that can do tts with an rvc model, same as applio?
i am setting up a full off grid property(for when shit hits the fan) and need help with the ai aspect to control (cameras,hydroponics,gates,water) i have been diving into it with ai and they recommend i start with MS-01 but i also want to run 120b models and just would like someone to talk to who knows a little more than me...
can i get help? everytime i run the start this pops up
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
by IA Hispano
Google Colab
by Hina
Google Colab
by Eddy
Google Colab
by Eddy
Google Colab
by Deiteris & Hina
Google Colab
by Shiro & Eddy
Google Colab
by Nick088
Google Colab
by Nick088
Google Colab
by Jarredou & Makidanye
Google Colab
i got this on rvc mainline, what do i do?
its the third time this happens
that's a bot you're replying to lol
you should probably be using Applio btw on Kaggle, google colab kinda stinks
Why are you using the mainline RVC?
kaggle?
rn i was trying on colab, the link that should send me to the ui didnt work
i wanted to do some covers
is there any reason for it?
Anyone able to help me with an ai voice model?
Curious what ai voice trainer thingy it is
hello !! I've been using this and it stopped working so I was wondering if there was a new version <3 vcclient_win_cuda_2.1.4-alpha
Hi! this is a veryyy old version
jai
what is your pc gpu (Nvidia or AMD) and what do u plan on using it for
hai
5070
you should use Vonovox, I have to go very soon so I'll get you the downloads
here ya go
first link is for the voice changer second one is a virtual audio cable (it's recommended to use it over vb cable)
yessss I have vb cablee !!
the second link tho is recommended to use instead of vb cable, it does the same thing but sometimes vb cable is buggy for no reason
so im switching out of deiteris fork to a different program
which ones the better option performance wise
tg-develop fork or vonovox
If you have Nvidia use Vonovox, if you have AMD use tg fork
Since Vono is Nvidia only
damn
I thought this was a funny joke, repeating after each other
but instead it turns out they're all the same bots 
@low shard triple kill here
why not suggest the docs btw?
are you trying to do e girl / e boy trolling / catfishing?
This is a General AI Discord Server and there are many voice changers, elaborate:
- your pc gpu
- your pc os
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
RVC is STS (speech-to-speech) only, no TTS can natively use RVC models
there is no RVC Mainline Cloud port suggested, all are abandoned, I will remove them
they got already killed
So tts can't do the same as applio, where you have an rvc model and a tts that converts the voice to the rvc model you have?
is there any free website alternative than weights gg that allows the use of custom rvc models ?
hi, question: what is the group's stance on the limits of AI assistance when it comes to writing in research papers? Is it the provenance of the ideas or the style of writing and prose of the human ideas that are being written?
Basically, where do you see the limits of what AI assistance should not cross?
RVC is Speech To Speech
Applio is an RVC Fork (modified version)
To do "TTS", Applio first uses Edge TTS to generate the input audio (the tts voice you see), then runs the RVC model over it
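The two-stage flow described above can be sketched like this. It is not Applio's actual code or API: `tts_with_rvc`, `fake_edge_tts`, and `fake_rvc` are hypothetical stand-ins that only show the order of the stages.

```python
from typing import Callable

# Placeholder stage types for illustration only; these are not Applio's real interfaces.
Synthesizer = Callable[[str], bytes]        # text -> speech audio (the Edge TTS role)
VoiceConverter = Callable[[bytes], bytes]   # speech audio -> converted audio (the RVC role)


def tts_with_rvc(text: str, synthesize: Synthesizer, convert: VoiceConverter) -> bytes:
    """Two-stage "TTS": make speech with a stock voice, then convert it speech-to-speech."""
    base_audio = synthesize(text)   # stage 1: ordinary TTS voice
    return convert(base_audio)      # stage 2: RVC model applied over that audio


# Dummy stages so the sketch runs without any TTS engine or voice model installed.
def fake_edge_tts(text: str) -> bytes:
    return b"stock-voice:" + text.encode("utf-8")


def fake_rvc(audio: bytes) -> bytes:
    return audio.replace(b"stock-voice", b"rvc-voice")


result = tts_with_rvc("hello", fake_edge_tts, fake_rvc)
print(result)
```

Because the RVC stage only ever sees audio, any program that produces speech audio could in principle fill the first stage, which is the point made in the chat about RVC being speech-to-speech.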
there's no unlimited free site, it would just go bankrupt
it's better you tell your pc gpu and os
Yeah, my problem is I can't use my model with Applio's speech model because of Microsoft, that's why I want a program like Applio
Help me
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
by IA Hispano
Google Colab
by Eddy
Google Colab
by Eddy
Google Colab
by Tg-Develop
Google Colab
by Deiteris & Hina
Google Colab
by Shiro & Eddy
Google Colab
by Nick088
Google Colab
by Nick088
Google Colab
by Jarredou & Makidanye
Google Colab
Hi everyone! Does anyone know how to make endless streams?
hey what do I do with the D and G pretrain thingys from my training?
I thought an index and my model pth would be the only result if I'm being honest
you can, you just need to use both an edge tts and rvc model
there isn't any tts that can use rvc models other than the way i explained
This is a General AI Discord Server and there are many voice changers, elaborate:
- your pc gpu
- your pc os
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
those are needed only for pretrains or to continue training, you don't need to include them when posting a normal rvc model
gotcha, I gotta keep them when I want to resume training oki
It's easier to get the people what they need rather than giving them stuff that they'll probably not read
i mean it's like giving them a car without instructing them how to drive
the guides are made to be read to understand how a program works, else why even spend so much time making them
Both programs are very easy to use and setup, I should probably tell them though each time how to run each
i mean if you want to manually step by step give them everything everytime, but that's not going to help them when they need an update or want to know what setting does what
Fair, but Vonovox specifically has only 2 settings that ever need to be touched: block size and pitch
I guess for Wokada tg fork it's a little bit more, just chunk size and extra time
OH SHIT NEW APPLIO COLAB
How can we create the sound we want here?
just use Kaggle, it's better than colab as it gives 30 hours a week for free
colab gives like 4 at max
why would you do that, that's weird
<@&1159293140440723499>
Can you tell me the way again and how because I'm new to these stuff
what about applio, it doesnt work anymore either?
it didnt work yesterday
this is how I use it, I use it on Kaggle
whats with the dataset?
is it for training?
yea that's for training you don't have to add any datasets
to import a model tho should be the same
just go to download section then paste the link from huggingface
like on google colab?
idk how to use Applio on colab
but applio's interface is the same on all platforms
local, kaggle, colab
should be the same
o
ok
btw, are there any new models to train my datasets with? or im ok with Ov2?
that was the newest one last time i made one
titan, ov2, Ren3, any of those are super old and bad because they cause harmonic distortions that we didn't know about back when we first used them
pffffffff
dont tell me i have to do some models again? D:
like
yea 😭
10 of my models use ov2
use this pretrain it's brand new and honestly for me from testing it's great https://discord.com/channels/1159260121998827560/1492203850747216083
Hi! What coding LLM is best for 12 GB VRAM atm?
does kaggle have a limit of how much gpu can i use?
I am looking for something like the Real-ESRGAN model, is there any alternative for upscaling images?
30 hours per week of combined GPU time
how much did google colab have?
Frequently limited to T4 GPUs with ~12-hour max sessions (but often interrupted sooner) and 90 minutes of idle timeout. You may be restricted for days if you abuse resources.
what number of gpu should i be putting here?
Why won’t it let me generate, bruh
It won’t lemme send pic wth
I’m trying to generate an image yet it says it’s not permitted no matter what I delete
I’m trying to make an image of Maxie and Mega who are two Pokemon characters
Gives me this “Content that violates our community guidelines was detected in your generation. Your gems have been refunded. Please try again with different parameters.”
Even though there’s no nsfw
I just put “Maxie from pokemon ORAS with a younger guy with black hair, red eyes, fluffy black collar, black and red shirt uniform, red cape”
Ain’t nothing wrong with that
If you explicitly want to use RVC models, use Applio
If you actually want better TTS, try other TTS programs
RVC isn't the best for TTS
huh
I wasn't here for a couple months
Applio Colab works fine
The VibeVoice TTS model, which is developed by Microsoft, is one of the best.
where
what's been updated?
just use kaggle for applio 💔
I use a dual nvidia gpu setup, is it possible to make Vonovox use a specific gpu?
Thanks!
what are you talking about? you should specify
I cannot call at the moment sorry
whatever you have is outdated then
what is your pc gpu (Nvidia or AMD) and what are you using the voice changer for?
super outdated yea
right here. first link is for the voice changer second one is a virtual audio cable which will connect the voice changer to games and discord
https://huggingface.co/dr87/vonovox/resolve/c8034f5f6d50648a8109bb4f847182362e2b779b/Vonovox_beta_17_11.zip
No, just run setup64 for the virtual audio cable then install driver
And for Vonovox just run setup
Any yt tutorials are outdated for voice changers
Sure
It's not that difficult, just follow what I said here
Extract both after download and run the files I said there
Pitch at 0 works for most models if you're a guy, but if you're using a female voice pitch it up some until it sounds right
I personally have my block size at 0.50 but it works well at 0.30 which is default
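To make the two settings above more concrete, here is a rough sketch of what the numbers mean. It assumes the pitch setting is measured in semitones and the block size in seconds, and it uses a 48000 Hz sample rate; none of this is confirmed for Vonovox in this chat, and both helper names are made up for illustration.

```python
def semitones_to_ratio(semitones: float) -> float:
    """Frequency ratio of a pitch shift, assuming the pitch setting is in semitones."""
    return 2.0 ** (semitones / 12.0)


def block_samples(block_seconds: float, sample_rate: int = 48000) -> int:
    """Samples buffered per block, assuming the block size is given in seconds."""
    return round(block_seconds * sample_rate)


# Pitch 0 leaves the frequency unchanged; +12 semitones doubles it (one octave up).
for shift in (0, 12):
    print(shift, semitones_to_ratio(shift))

# A bigger block means more audio is collected before each conversion step,
# so the block length is a lower bound on the added delay.
for block in (0.30, 0.50):
    print(block, block_samples(block))
```

Under these assumptions, moving from 0.30 to 0.50 trades roughly 0.2 seconds of extra delay for a longer chunk of audio per conversion step.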
It's alright
Excuse me?
I don't do that
<@&1159293140440723499> weirdo
I won't be helping you further
what ai is ran on terminal?
i cant run the start_http.bat file for the voice changer, any tips
did you get it from a yt tutorial?
anyone know why, after following the audio cable and mic steps for input and output, it wont work? Like on roblox i cant hear anything through my mic?
why the hell did the creator change it like this now i cant choose an index
nvm its text to speech
the ai vc client doesnt work at all it doesnt make any sounds i checked all configs
the old versions worked fine but i dont wanna use old version i want new ones
im on win_ cuda 2.1.4 alpha
i have amd cpu
and nvidia graphic card
i am trying to speak
well
yea
use Vonovox, what were you using the old voice changer for btw, just curious
Does anyone here use Kaggle who can help me? I trained a model, everything was going perfectly, 275/300, then it started throwing errors and stopped training, and everything started throwing errors.
this?
yo
so im tryna set up the vcclient from w-okada after a while of not using it [i had deleted it] and im on a new version trying to set it up with voicemod, i genuinely cant figure it out. i already have the cable stuff and whatnot
that's really old, if u have an Nvidia gpu u should swap to Vonovox but if you have AMD u should use Wokada tg fork
what's ur pc gpu?
I'll get u the download but I have to leave soon
@swift thunder idk how to fix your error but look at this short tutorial I made in case you did something wrong
alright! first link is for the voice changer second one is a virtual audio cable which will connect the voice changer to games and discord
https://huggingface.co/dr87/vonovox/resolve/c8034f5f6d50648a8109bb4f847182362e2b779b/Vonovox_beta_17_11.zip
pretty sure i already have the cable stuff
unless it got a few updates since ive downloaded it
are you using Vac lite or VB cable, they're two different softwares but do the same thing just VB cable causes issues sometimes
i honestly dont know, it just says cable, and its been a few years since ive downloaded it
nevermind its vb
I'd recommend the one I sent then just in case
yeahhh
k both folders are done downloading
do i unzip both and install the new cable?
Yep
For vac lite just run setup64 (not as admin)
And most likely you won't need to restart your pc either
If you like you can uninstall it the same way you installed it
But you don't have to
Same setup just using the other cable
What are you using?
There are no default voice models that come with it, whatever you're using is outdated
What's your PC gpu? (Nvidia or AMD)
And what do you want to do with the voice changer, just curious
I need help T-T, i am looking for some good male voice models for realtime, do u guys know any good ones
man the hard truth about clipping audios on your own from 1 to 10seconds wasn't giving me good results at all
same goes for letting applio cut audios on its own 😔
can def say og 48k pretrain is giving noticeably worse results than legacy 1.5 48k pretrain
are there any documents on what needs to be avoided or kept when isolating a dataset?
like, for example, regardless of whether it's true or not
if audio utilizes stereo heavily (sound from left or right), you should turn it into mono (just an example not confirmed to be true)
if you can cut audios on your own from 1 to 5 seconds(RVC limitation), it's better to cut on your own to make better quality dataset (just an example not confirmed to be true)
why there are so few to no documents on how you SHOULD process a dataset?
RVC works with mono audio only
Your stereo data was simply downmixed to mono at preprocessing step xd
yeah....
I thought there was a warning about this but maybe not
I was wondering why I got irregular volume sometimes with my model
apparently model also studied the lowest volume part of the audio when the music was coming from only left or right
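The volume effect described above can be reproduced with a toy downmix. The sketch assumes the downmix simply averages the left and right channels, which is a common method; the exact method used by the RVC preprocessing step is not stated in this chat.

```python
from __future__ import annotations


def downmix_to_mono(left: list[float], right: list[float]) -> list[float]:
    """Average both channels sample by sample (assumed downmix method)."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def peak(samples: list[float]) -> float:
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


tone = [0.0, 0.8, 0.0, -0.8]
silence = [0.0] * len(tone)

centered = downmix_to_mono(tone, tone)       # same signal in both channels
left_only = downmix_to_mono(tone, silence)   # signal panned hard to the left

print(peak(centered))
print(peak(left_only))
```

With averaging, a sound that exists in only one channel comes out at half the level of the same sound placed in the center, which would explain uneven volume in a model trained on heavily panned audio.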
I let applio do the normalization but
wasn't able to tell huge difference on my own when I heard the dataset after the auto normalization from applio
There's also some debate on pre vs post normalization
If your data has noisy silence removed then post should be quite good
afaik, pre is normalization before cutting and post is after cutting
Yeah
I just see no reason to go for pre
unless your dataset is suffering from low quality audio issues
If there was lots of dirty silence in your dataset, post would blow it up
But yeah, other than that it's seemingly better
idk how you deal with the brief silences between lyrics or speech though
like 0.1 to 0.4 seconds dirty silences
someone who knows def should put it on the docs
why did the mmvc file stop opening for the voice thing it was opening yesterday now gotta reinstall
a) ignore them and deal with it
b) manual cutting
c) smartcutter (my go-to)
Though I mostly train with video game voiceover. It's clean out of the box.
see I do either
a) manually cutting them to close the silence gap
b) completely silence that dirty silence part without closing the gap
hello
but I can't tell which is better or should be avoided
i have problems with mmvcservice
no works :,v
and i have all
the VB- virtual cable input, input 16inch and output
Ideally replace dirty silence with pure zeros and leave just a bit of silence between words/sentences (e.g. 0.1s)
1 day to the other stops working
How you're gonna do that is a separate thing
i reinstall but dont works :,vv
thank you so much for the clear answer
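One way the "replace dirty silence with pure zeros" idea could be sketched, assuming samples are floats in the range -1.0 to 1.0. The function name, the `threshold` value and the `min_silence_s` value are illustrative assumptions, not settings taken from any tool mentioned here.

```python
def zero_dirty_silence(samples, sample_rate, threshold=0.01, min_silence_s=0.1):
    """Replace runs of low-level noise with exact zeros.

    A run counts as dirty silence when every sample stays below `threshold`
    (absolute value) for at least `min_silence_s` seconds.
    """
    min_len = round(min_silence_s * sample_rate)
    out = list(samples)
    run_start = None
    # The infinite sentinel is never quiet, so it closes a run at the end.
    for i, value in enumerate(list(samples) + [float("inf")]):
        quiet = abs(value) < threshold
        if quiet and run_start is None:
            run_start = i
        elif not quiet and run_start is not None:
            if i - run_start >= min_len:
                out[run_start:i] = [0.0] * (i - run_start)
            run_start = None
    return out


# Toy signal at 10 samples per second: three noisy-quiet samples in the middle.
noisy = [0.5, 0.001, -0.002, 0.001, 0.5, 0.003, 0.5]
print(zero_dirty_silence(noisy, sample_rate=10, min_silence_s=0.3))
```

Short quiet stretches (shorter than `min_silence_s`) are left untouched so that quiet parts inside words are not gated away.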
anyone wanna help me make a AI voice website
I can't pay anyone but I kinda wanna see how hellish this could be
I don't know how tf to do anything ;-;
really gotta put it on AIhub documents though
I think the main problem is there's lots of uncertainty around dataset preparation
Lots of aspects for which people have different approach
at least the ones that are generally good to do should be listed up on the doc
So it's hard to tell "this is the way. This is the only and right way"
Perhaps, yeah
instead of having nothing should be better
the info of something generally good + the reason why it's generally good = an easy step for anyone, can logically think from there to guess and try some better ways to do things
rather than shooting themselves in the foot

Definitely, agree
:c
It just doesn't open or what happens?
This is weird
Especially that another person above just had the same issue
just one favor to ask, can you share a screenshot of any of your processed audio file because I want to see how you processed it?
preferably with spectrum
like this is a random pic from online but I usually edit out these clicking or tearing parts of the spectrum to process but NOTHING else because I have no info for anything else
just the pure silencing out part like you and I discussed a little bit earlier
this is one of my datasets (from a game, too)
other than concatenating all of it and silence truncation i didn't do much here
i certainly don't do any precise adjustments like that (although they can sure be beneficial, depending on the dataset and things you adjust)
yeahhhh
yours looks vastly different from mine I think I got a better idea now
thanks again
they don't always look that similar, I guess various timbre might turn out quite different
though obviously clean human voice will have lots of shared properties in the spectrograms
this is from the very first dataset I did, I think that higher frequency parts needs to be cleared out? or is it just only in your screenshot case idk
some dirty silence parts can be seen in here too
it might also be my spectrogram settings TBH, not exposing too much dirt in the highs
only major difference is just I added silences in between sentences
those settings are what I mainly use when looking at the harmonics
ahhh
maybe mine was showing 48k spectrum I think
I figure because yours is showing til only 15k but it's just my guess
my data is 32k so it can only peak at 16kHz, hence the range
in your case it can go up to 24k
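The numbers above follow from the Nyquist limit: a sampled signal can only represent frequencies up to half its sample rate. A tiny sketch (the function name is just for illustration):

```python
def max_frequency_hz(sample_rate_hz):
    """Highest frequency a signal at this sample rate can represent (Nyquist limit)."""
    return sample_rate_hz / 2


for rate in (32000, 48000):
    print(rate, "Hz sample rate ->", max_frequency_hz(rate), "Hz max")
```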
yeppp
oh and just one more thing before I go
I let applio cut my one long audio file for a test
and it chopped some of the last parts from the first sentence and then put it into the second sentence's first part
is that a problem or not at all?
or abruptly cutting it mid sentence?
good question
TBH not sure how it affects the final model
damn
Personally I don't mind it and just do use the autoslicing on my concatenated data
but
slicing phonemes in half
definitely can have some negative impact compared to when the samples would simply go from silence, to audio, to silence again
ohhhh
Haven't ever done any research on this but that's what I would expect
at least it's much better than having no answer
Whether it has a massive impact or little-to-no impact at all? No idea, maybe someone else knows
I will go for manual cutting from 1 to 5 on my own and see if it helps any better
apparently 5 is the max for applio
to process while training
There is one more problem with it though
(and i guess the main reason for equally-lengthed 3s clips)
hmm?
The training pipeline utilizes 3s segments and cuts off the rest. So if you provide it a sample of e.g. 4s, it will still only use the 3s and ignore the 1s.
If you provide a sample of 8s, it will process 2x 3s samples and discard the remaining 2s
(or at least that's how I understand it, recently saw a discussion on this)
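The arithmetic described above can be sketched like this. It only models the behavior as the speaker understands it (fixed 3 s segments, remainder dropped), which is not verified against the actual training code. The function name is hypothetical.

```python
def segment_usage(clip_seconds, segment_seconds=3.0):
    """Return (number of full segments used, seconds discarded) for one clip."""
    full_segments = int(clip_seconds // segment_seconds)
    discarded = clip_seconds - full_segments * segment_seconds
    return full_segments, discarded


for length in (4, 8):
    used, lost = segment_usage(length)
    print(f"{length}s clip: {used} x 3s used, {lost}s discarded")
```

Under that assumption, clips whose length is a multiple of 3 s would waste nothing.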
ohhhh
So eventually the outcome is often similar - cutting words in half and discarding some info
I swear I saw it somewhere in this discord that applio can process up to 5 sec hmm
gotta go for 3s to be safe I guess
This needs further verification I suppose 🤔
Don't want to state i'm 100% sure of something when i'm not
gotcha thanks a lot
hello everyone im new to ai stuff, i wanted to change a voice to another to make some ai song covers, i've tried using RVC but i have an AMD GPU (RX 6700XT) and cant get it to work, could someone help me getting it to work, or maybe guide me towards another ai i could use to change one voice to another? any help would be appreciated.
My specs are:
Rx 6700XT gpu
Windows 10
hi, anyone know why google colab keeps crashing or disconnecting when i’m generating roblox assets? not sure if it’s a GPU limit thing or what.
Hi, anyone know how to create realistic TTS with human nature voices? Like breathing laughing?
Hello! evening, morning to everyone! im just curious about why my W-Okada starts to slow down and genuinely becomes unresponsive, is it an internet thing?
it only started to act like this after a few seconds tops
This is a General AI Discord Server and there are many voice changers, elaborate:
- your pc gpu
- your pc os
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
do you need help?
what
I need it 🙁
what models? are you trying to be like ben10 or e girl / e boy / trolling catfishing?
need what? an rvc voice model? check #1175430844685484042 or #1159289738314919936 , or make it yourself
Last update: April 4, 2026
I should make the Sapphire Message about the Guidelines and Elaborating more visible, not sure why it's so ignored
Yeah, trying to get a good eboy one, im kinda tired of getting catcalled and all that. I currently have wokada set up (my gpu is a 3060)
Oh yeah
are you like trying to troll / catfish people or trying to like get privacy in games because you get harassed as a girl?
Hello, im trying to run tg-develop w okada fork on ubuntu server but when i run the server it does not accept arguments
kat@kat-server:~/Voice-Changer/MMVCServerSIO$ ./MMVCServerSIO --launch-browser false --https true
usage: MMVCServerSIO [-h] [--log-level {debug,info,warning,error,critical}] [--launch-browser]
MMVCServerSIO: error: unrecognized arguments: false --https true
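The usage line shows `--launch-browser` with no value placeholder after it, which in argparse-style usage output means a switch that takes no value. A small sketch reproducing that behavior, assuming the option is a plain `store_true` flag (this parser is a hypothetical stand-in, not the fork's actual source):

```python
import argparse

# Hypothetical parser that mirrors only what the usage line above shows.
parser = argparse.ArgumentParser(prog="MMVCServerSIO-sketch")
parser.add_argument("--log-level", choices=["debug", "info", "warning", "error", "critical"])
parser.add_argument("--launch-browser", action="store_true")

# parse_known_args returns the leftovers instead of exiting with an error.
args, leftover = parser.parse_known_args(["--launch-browser", "false", "--https", "true"])
print(args.launch_browser)
print(leftover)
```

With a flag like this, passing `--launch-browser` at all turns it on, and the word `false` after it is just an unexpected extra argument. Leaving the flag out is how you keep it off. The error also lists `--https true` as unrecognized, which suggests this build does not define that option in this form.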
did you run this through some sort of noise removing process? if you did, I'd like to know how you did it and why
because I certainly didn't make my dataset to have voids all around the spectrum
not in spacing between audios but spaces inside of them like swiss cheese
well, as i said, it was already clean as it's from a game so it's kind of an "easy" dataset
so lots of samples concatenated with 100ms breaks in between
- smartcutter on top of that to possibly clean up silences from within the samples
rvc fills the voids when the model upscales the audio, it doesnt matter
I know it does because I already read you said it a couple of times in the discord
BUT
do I want to make those voids on purpose is the question
no
hmm okay
so voids on only spacing for now
I thought it could make clearer voice output idk
there will be always quality loss due to how rvc works, it will never be 1:1 with the dataset
yep and we're just trying to mitigate the losses as much as we can
and this is the part of the thing too but it's a shame I guess
good question, i would have to compare this part from samples to before smartcutter
yeah somehow I thought those voids inside of the audio spectrum not the ones from spacing could enhance outcome quality
this is smartcutter
and this is original
(slightly different scale too because the second one is before downsampling to 32k)
ah so original already had those voids in
yeah, looks like it
I admire the attention to details, I don't look that precisely usually xD
gotta do what I was doing just now
hey I just wanna make my dataset processing worth a while
the only trick to enhance the quality of a rvc model is to get a better dataset, recorded with a decent mic, low noise, and no editing at all
what i usually do is slap the data i got into training and filter it later, if the model develops some flaws
there are more questions to ask after I realized what is a better dataset in general
raw wav audio files
like hmm
like a recording of yourself, without any editing to that audio clip
no mp3 compression
no "voids" in the spectrum
should I avoid putting chest voiced audio when the majority of the audio is modal voiced? kind of stuff
this just saved me an hour or two tbh
yep you want consistency, in both timbre and audio quality
yeeahhh
I was confused because I read you need diverse audios but at the same time you can't put drastically or moderately different styled audio (in terms of speaking / singing)
yea with diverse they mean pitch variety, not monotone audio
gotcha
you have to teach the ai the whole voice range of your speaker
I just gotta undo my voids on spectrums I did for the past 30 minutes
and I put too much trust on index since I tested out british accent model as a realtime model
why not actually? I mean if it was to be added manually then sure, perhaps not worth the hassle. But assuming it's inserted automatically, isn't it better to discard the noisy silence?
ah and also word variety, you dont want the speaker to repeat the same words often
(even if the noise is low anyway)
and it only sounded somewhat decent when I mimicked the british accent decently
so a lil disappointment there but we carry on
with voids you're talking about the empty space in the spectrogram? like the ones caused by compression?
or you're talking about the silences between samples? lol
oh i thought you meant the silence
mb
i mean, it fills them up with random shit but it's not ideal
better to have real data there
we really gotta update AIHUB documents
hmm in theory i could fill those with RX's spectral reconstruction. Wonder if it's better or worse than leaving it untouched.
Interesting thing to check I guess, not sure how that reconstruction thing performs
probably better off leaving it untouched unless your dataset is low quality in that terms
i'd rather train the original data, tho i have tried upscaled audio before as dataset and it came out fine
i know pretrains dont like upscaled data tho
but finetuning is different... soo

yea tbh i think it's fine as long is not for pretrains
what's your take on non-verbal audio for a dataset though
like sighing, laughing(moderate not high pitched), humming sounds like hmm or etc
out of those three laughing is the worst i think
wouldn't expect occasional sighing/humming to break the model
oh now it makes sense
my british accent model had A LOT of laughing or giggling
f the singer ig
adding lots of noises like that will probably cause the model to insert them into normal speech
which is rather undesired xD
yeeeep
one of my lazy trainings was Ellie from TLOU which had lots of shouting/screaming/growl-ish angry voicelines, beside normal speech
and it got quite audibly rendered into speech in the model
especially "heavier" speech with stronger emphasis turned out raspy like the screams
"soft" speech was more-or-less unaffected
but yeah, it was an experiment to see how it affects the model and turned out as expected, it's rather bad
now I think the most difficult thing to do in the processing dataset step is which theme of the audio you want to mainly use as a dataset
you can't just put everything and hope for the training to turn out good at everything right?
well, ideally if the data is consistent, then this is a non-issue
so I gotta choose what kind of audio I mainly want to train
like, for someone who doesn't know what a consistent data is
I, myself would put shouting, crying, singing, grumping, sarcastical speech, screeching, mocking etc in the same dataset
and it wouldn't be able to make generalized voice like I expect it to
for example like the one you've just said from TLOU
ideally that would be great, but for a better and more flexible architecture than RVC
yeah, hopefully some day
although a part of me doesn't want it
due to how common catfishing and other stinky use cases are
if I have a dataset of crying, angry, annoyed then I have to choose one
oh it's just an example
even singing method can vary for the same person
and we have to choose only one of them to make a model for now
at least realtime or retrieval tech isn't going anywhere to be developed further at the moment
TTS industry is going to be advancing just fine and at least it's not for catfishing
So far I know humming, grunts, yelling (kinda), singing, coughing etc. work
Especially when used with a good pretrain like Legacy core 1.5 or 1.6
I think, in practice, audios that are drastically different produce unstable outcome because it's not just pitch is different
I usually keep singing out of a dataset if it's mostly a talking model
that I gotta agree
so any model that sounded less robotic with fewer artifacts has this monotone-like feeling in my experience
i think those things also will depend on the ratio between the types of data
screaming only = bad
screaming just a bit = not as bad
Good example is my model of Doey
for simple explanation would come from talking models
let's say you want a model to sound like a specific game character in general
but you have a dataset that consists of just talking, shouting, being grumpy, or idk mocking
ratio wise idk what ratio that can f up the model
i am new on this and i want to know how to have a solid base to start learning Ai....is coding must?

So can excessive coughing really mess up a dataset?
It's mostly for General Grievous from star wars
idk we will have to find out ourselves
both ratio and the fact that coughing is in the dataset in the first place is a problem or not
Hmm
ideally it reproduces the same cough every time you cough
but I can imagine the model to blend that coughing into normal speech and f up the whole speech you're trying to say
idk if it's prevented from the training phase I don't really know
that's pretty much my Ellie case i think and for that the answer is right there
but i would expect that with not-so-much shouting it would be way better
that is, in this case a lot of coughing is in the dataset of that general grievous
so maybe same with coughing
I think moderate tone differences express happiness, sadness, and annoyance can be done and I've seen a few
but above that, I don't think we can do that
that for sure, nothing wrong with expressions
but people expect more than just little expressions to fall into the category of "oh I should put this in the dataset"
His voice is usually this really gravely somewhat robotic voice but for the most part it's pretty human sounding
Not sure how the sound of his voice could affect the training
so did I for one or two first models I trained
one way to find out 8)
like glados and etc characters
I think his voice alone isn't a problem at all
it is that god damn non-verbal things are always the problem
it's quite harsh at times
not to mention the "too much expressive" speech if you're training a character model
i'd worry a bit that RVC could exaggerate this after training
but usually it does well with all kinds of funky voices
yeep
whether clean human speech or something very artificial
e.g. both robotic-ish voices i tried training had some flaws resulting from RVC learning some parts of it too well and exaggerating them
one was resulting from low frequency content in the voice so that was more or less architectural limitation
the other was a 'cyborg' voice which is 90% human with slight electronic buzzing
and it affected the model a bit too much too
doesn't sound great even though it's actually not so far from the original
For this model he has some efforts and pure screaming in it but most of it is talking, the thing is his voice goes from talking in a deeper more serious voice to a more silly higher pitched voice
https://discord.com/channels/1159260121998827560/1349526562197995570
example
I love making silly models
I heard it and it sounds like high pitch sounds are blended with his serious tone of voice
me too
That model is a bit older tho so I no longer use it
Still struggling to make a new one that I actually like
I just wanted to point that out because it might have been the very thing we were talking about the dataset ratio or the
too much diverse expressions generate lower quality model
Makes sense
Doey has serious and silly voices which are drastically different so I figure
That might be the reason why e-girl models were thriving back then
that sounds about right
I'm feeling car sick
I find it strange tho how rvc can make a model like this somewhat where it can change pitch to match the random voices this character changes to
But struggles with a character that just has a lot of range in their voice like going from serious dark voice to higher more bubbly voice
The limitation is real and I'm coping
I hope some rich dude will show up and improve this stuff
Oh have you made a model like this?
My friend has yes
I've tried many times but never was left satisfied
one day I can say unhinged shit as glados that sounds so natural to the point others might think it could've been from the actual game
I mean my model sounds pretty damn good as long as you add autotune to it when using live
I used one from here for a gartic phone session and it was fun
i think maybe it's achievable by separating the various voices by some criterion
splitting by features is probably not a way
but pitch maybe?
like, having some vastly different samples of a low voice and high-pitched completely different voice
could try with a deadpool model I was gonna try training
and perhaps training with some low batch size to make it not generalize so well on purpose
since he got that normal unc voice and his silly voice lines
Not for trolling, but just so that i can talk without people being weird
My favorite Deadpool as of recently is the one from marvel rivals
that is a respectable use case
Exactly the one that I was gonna do
Say, does Kaggle Applio still work?
or should I just try that google colab
As long as you're not using any of those e girl models you're still a person
Idk how i would do that as a woman but yeah
xDD
Yup! Still works and it's the only version I know how to use anymore
doesn't that charge you for a server or anything?
since the age of AI I cannot imagine any provider without paying a server
"Mommy asmr" ahh voice
wrong but good attempt
Wdym
it's free 30h weekly
where are they pulling their cash from wth
i'd say it's amazing xD
it's google level of flex in terms of a provider that is
-# I delete my acc every time my time is too low and just use one of my other emails and it resets the time each time forever giving me infinite training time
Totally don't do what I said
Kaggle is W
Love it
Colab is Doo Doo because the time limit for anything is like 4 hours max
though how long does it usually take for a single epoch to train?
my local training time for an epoch would be 24 seconds
very broad question, considering one epoch of 1min dataset is a bit different than one epoch with a 5h dataset xD
ok hm
let's say 1 epoch = 50 steps
how long did it take you on Kaggle
so it's around like
15 to 18 minutes of data
Not sure but I'd say it trains quickly
Really depends on dataset length
anran_klm_dc_32k_4b | epoch=139 | step=7367 | Current time: 23:50:47 | Time per epoch: 0:00:25
this is from epochs with 53 steps
25s
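The log line above is enough to work out steps per epoch and time per step. A rough sketch, assuming the step counter started at zero with epoch 1 and was never reset (for example by resuming with different data); the helper names are made up for illustration:

```python
def parse_log_line(line):
    """Split a 'name | key=value | Key: value' log line into a dict of fields."""
    fields = {}
    for part in line.split(" | ")[1:]:  # skip the model name
        separator = "=" if "=" in part else ": "
        key, _, value = part.partition(separator)
        fields[key.strip()] = value.strip()
    return fields


def hms_to_seconds(text):
    """Convert 'H:MM:SS' into a number of seconds."""
    hours, minutes, seconds = (int(x) for x in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


line = "anran_klm_dc_32k_4b | epoch=139 | step=7367 | Current time: 23:50:47 | Time per epoch: 0:00:25"
info = parse_log_line(line)
steps_per_epoch = int(info["step"]) / int(info["epoch"])
seconds_per_step = hms_to_seconds(info["Time per epoch"]) / steps_per_epoch
print(steps_per_epoch)             # 53.0
print(round(seconds_per_step, 2))  # 0.47
```

Comparing seconds per step rather than seconds per epoch makes the numbers comparable between datasets of different length.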
so that's similar to my spec with acceleration option on
damn
really similar I tell you
3060ti with i5-14gen 32gb ram
Shhhh
good bot
I have a 5070ti but I refuse to train locally bc it's confusing, just takes a lot of room on my pc, and can cause my vr to do janky stuff like freeze ect
okay do I install applio on Kaggle or
yeah because it's hoarding your pc resources and VR is a heavy game
i prefer to not train locally just because it's a waste of money when i can do the same in the cloud for free xd
and also no need to keep the PC running for additional hours and blocking me from doing something
Real
the last time i trained a lot on my PC, my energy bill reflected it 
Yikes
the only issues emerge in case of huge datasets that exceed the disk space available on kaggle for free xd
Good thing I've never gone over an hour of audio
Well there was that one time with Kratos
hey I know we're simping Kaggle here but can I get a link or an explanation on how to set up applio on Kaggle
I have a whole video
one of my first nice models was the Witcher, dude has almost 11h of voiceover in the game 
on the contrary
This is how to do it
I've heard of the game but have no idea what it is
if you'd watch a video about it it would be more confusing ngl
😭
I like scripting, I just have a script with a couple variables that takes all the necessary configuration and it does all the magic with one click 
thanks a lot
entire training procedure in one short string
yeah, runs preprocessing, feature extraction, index generation and then runs the training
afterwards compresses all data into zip
No need for such specific epoch
and then i just download it
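A rough sketch of the "one script, a couple of variables" workflow described above. The stage names come from the chat; everything else is an assumption. Each stage here is a placeholder that only writes a marker file, where a real script would call the training tool, and the only real work done is zipping the output folder with the standard library.

```python
import shutil
import tempfile
from pathlib import Path

# Stage names taken from the description above. These are placeholders:
# a real script would invoke the actual preprocessing/training commands here.
STAGES = ["preprocess", "extract_features", "build_index", "train"]


def run_pipeline(model_name, total_epochs, work_dir):
    """Run every (placeholder) stage in order, then zip the output folder."""
    out_dir = Path(work_dir) / model_name
    out_dir.mkdir(parents=True, exist_ok=True)
    for stage in STAGES:
        (out_dir / f"{stage}.done").write_text(f"{model_name} epochs={total_epochs}\n")
    archive_base = Path(work_dir) / f"{model_name}_results"
    archive = shutil.make_archive(str(archive_base), "zip", root_dir=out_dir)
    return Path(archive)


with tempfile.TemporaryDirectory() as tmp:
    archive_path = run_pipeline("demo_model", total_epochs=250, work_dir=tmp)
    print(archive_path.name)
```

Keeping the configuration in a few arguments means one call runs the whole chain and leaves a single archive to download.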
I mean you can resume training from where you left off too right?
i mean, in this case, 250 is when it ends
I guess 200, 300, 350 all good unless it's overtrained
in this case the 250 is almost surely much more than i need
you can resume from where you left off
i can later reupload the data and restore it before continuing training
if needed
but i usually just pick a large number of epochs to "ensure" i won't need to resume training later
though sometimes i still do it later
sounds about right
in my case i don't have data persistence so it's not like all the trained data stays there between runs
that's why i need to reupload the necessary files if i want to resume
not much work anyway
yep
isn't a GUI's whole point to make it less of a chore to navigate
each to their own, I guess
for me making a script and then just running one command is more convenient than opening a GUI and clicking through all the things
but GUIs are definitely convenient for lots of people
I'm spoiled by the linux world
ahh
I didn't get it at first because locally applio saves the latest settings you've used

@viral mason oh yeah, when you put silences in both start and the end of a 3s clip
do you include those silences in the 3s in total or exclude them from the total of 3s
I'm not sure, I use one entire dataset and let Applio do the silent clips and all that, I truncate silence of the entire audio in audacity before putting it in
hmmm oki
I was gonna manually slice clips into 3s
and I was wondering if silences in a clip should be counted as total seconds
Yea no need to manually do all that nonsense
the model learns silence thanks to the mute files, don't manually add silence to the samples
Applio automatically slices your audio pretty sure
That's why I put one entire audio file
yeah but I tested it and it cuts the sentence mid way into two clips that it sounds just weird
hmm
I wanted to manually slice into 3s myself
got it
model takes that 3s audio, then learns using segments of 0.36 secs, it doesn't learn the 3s at once
no manual silence pauses in the dataset and no manual cutting
truncate the silence in audacity then use simple slicing
Simple slicing?
truncate the silences and how long is that supposed to be after that?
30ms? 50ms? 100ms?
300 ms
300ms? sounds a lot more generous than I thought
yea coz the model still needs to learn natural silence
i train pretrains, thats what i use and works

You use 0.3 seconds? I have mine at 0.2
0.2 is fine too
0.1, 0.2, 0.3
yes
I'll try 0.3 then
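What "truncate silence to 0.3 s" means can be sketched as shortening every silent run to a maximum length instead of removing it. This is an illustration of the idea only, not Audacity's actual implementation; the function name and the `threshold` default are assumptions.

```python
def truncate_silence(samples, sample_rate, max_silence_s=0.3, threshold=0.0):
    """Shorten every silent run so it lasts at most `max_silence_s` seconds."""
    max_len = round(max_silence_s * sample_rate)
    out = []
    run = 0  # length of the current silent run
    for value in samples:
        if abs(value) <= threshold:
            run += 1
            if run > max_len:
                continue  # drop silence beyond the allowed length
        else:
            run = 0
        out.append(value)
    return out


# Toy signal at 10 samples per second: a 0.5 s gap gets cut down to 0.3 s.
signal = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
print(truncate_silence(signal, sample_rate=10))
```

Changing `max_silence_s` to 0.1 or 0.2 gives the other values mentioned above.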
Do I need to enable truncate tracks independently if I'm using one audio file or nah
idk, i leave that on always because im paranoid 
when the audio is 48k but the spectrum only goes up to 19k on the chart, so it's effectively 38k
gotta love mp3 bro
Use Wav
if the file is already mp3 he cant do much
yeah
Sad
converting it to wav is going to preserve the mp3 compression
it's a lost cause
I do but not often since they do that too
Do what?
youtube compression and all that
Ah
I often get lower quality ones
check either https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/ or https://docs.aihub.gg/realtime-voice-changer/local/vonovox/
don't use youtube video tutorials
Last update: April 1, 2026
Last update: March 30, 2026
when people neatly edit an entire video of a character voice lines but download it with dlp
it's garbage
Do they have Nvidia?
I wish they could upload it somewhere else that can be lossless