#✨│ai-help
I might make an awesome tunneling list for jupyter notebooks, who knows what can happen in the future
It looks like one SQL file went missing from Weights' database, which somehow also makes model searching unusable. The "My Creations" part is still working fine, but without the creation count number.
is v.1.5.3.18a the latest one?
this is a general ai server, I’m guessing you’re talking about original wokada from an outdated youtube video tutorial, they are all old, that’s over 2 years old
can anyone teach me how to download the voice changer?
oh sorry i didn't know that
so that method is still working? right
antasma returns to ai hub 
it’s severely outdated, don’t use it
it’s fine dw
Hey, trying to export a video through the storybook on openai,
it's not using the correct videos even after clicking each one and replacing them, exporting just gives the original video before any changes.
Anyone able to explain what I might be missing
Best girl rvc model
where do I download the ai hub
AI Hub is a discord server, not a program
have you read the rules?
are there any alternatives for kaggle? @low shard
specifically can I change the applio url to a public url like the one in colab?
nvm did it myself
lightning.ai and google colab
I transplanted google colab's gradio to kaggle
Applio Kaggle recently updated because of an ngrok issue, check https://docs.aihub.gg/rvc/cloud/applio-kaggle/
Last update: September 30, 2025
💀 honestly why kaggle applio doesn't have gradio as an option
Ooooo
the latest Applio Kaggle uses Gradio + LocalTunnel, that's because it needs a 2nd tunnel to show the filebrowser (needed bc kaggle built in filebrowser isn't as good as google colab one) and tensorboard (which displays automatically in google colab but not kaggle)
If you modify it to work ONLY with gradio, you won't see the filebrowser nor the tensorboard
Trueeer
Also what's the file browser you are mentioning about here? As in like the directory? Or am I missing some important point?
btw lightning.ai is also a good option, it allows web uis on free tiers without needing encryption (so no ban risk), gives around 80 hours of T4 GPU freely per month, and can use only gradio since it displays the tensorboard correctly
You know that 3rd public url you see that basically opens as a file explorer? The program's name to do that is Filebrowser
Wait how fast is the training compared to kaggle?
Never have I ever clicked on the third url, what does it even do? Why is it there?
🥲 I'm someone who rarely uses tensor board cuz I simply don't hear much difference between 100th epoch and 280th epoch (10th epoch to 100th easily recognizable though)
that depends on the cloud gpu being used,
you can technically use way more powerful gpus on lightning.ai for free at the cost of lower free gpu time monthly, might be better you check https://lightning.ai/pricing for more info (they give like 15 credits for free per month)
I'll try to sign up in the morning I have to freshly train a new batch anyways, also can I import the dataset from kaggle to my drive?
like if you compare a L40 to the T4x2 on kaggle, the L40 would be faster but you'd get less gpu time
if you compare lightning.ai's T4 to Kaggle's T4x2, kaggle would be better
kaggle does offer more gpu time since it's 30 hours weekly, tho it's more risky technically
I mean, it's your choice
Nope, it isn't like google colab
I mean you could like download your dataset and just upload it to your drive
I'll try to dig in main thing I'm worried about is data set 🥲 I wish there was some way to import my data set from kaggle to lightning directly
The data would cost 4 GB 🥲 I only get 10GB per day
don't you have your dataset locally saved too?
🥲 I did most stuff in cloud like I directly imported to kaggle from Google drive public url
wait so is your dataset saved in your drive currently or not?
I have to dig into the lightning's ui maybe I can find a way
Yesss
That's why I'm asking is there any way to directly import 🥲 instead of downloading and re uploading
Can I make my dataset public in kaggle and use that public link to directly import?
Or does direct import work in lightning from drive?
I think the best way would be that you modify the code to use gdown to download your dataset from your google drive 
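A minimal sketch of the gdown idea above, assuming the dataset is a single file on Google Drive shared as "anyone with the link". The function names and the `dataset.zip` output name are made up for illustration; only `gdown.download` comes from the gdown package.

```python
import re
from typing import Optional

# Patterns for the two common Google Drive share-link shapes.
_DRIVE_ID_PATTERNS = (
    re.compile(r"/file/d/([A-Za-z0-9_-]+)"),   # .../file/d/<id>/view
    re.compile(r"[?&]id=([A-Za-z0-9_-]+)"),    # ...open?id=<id> or uc?id=<id>
)


def extract_drive_id(url: str) -> Optional[str]:
    """Return the file ID embedded in a Drive share URL, or None."""
    for pattern in _DRIVE_ID_PATTERNS:
        match = pattern.search(url)
        if match:
            return match.group(1)
    return None


def download_dataset(share_url: str, output_path: str = "dataset.zip") -> str:
    """Download a publicly shared Drive file with gdown (pip install gdown)."""
    import gdown  # imported lazily so the helper above works without it

    file_id = extract_drive_id(share_url)
    if file_id is None:
        raise ValueError(f"no Drive file ID found in {share_url!r}")
    # gdown.download returns the output path on success.
    return gdown.download(id=file_id, output=output_path, quiet=False)
```

In a notebook cell you would call `download_dataset("<your share link>")` and then unzip the result into whatever folder the training notebook expects.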
I can't seem to find any info about directly importing kaggle or google drive datasets directly into lightning.ai
Ooooh they have 10 hours of a100? @low shard or can I go for h100 or l40s
Maybe I'll try to dig in a bit on how the lightning's ui feels first 🥲 maybe after seeing the workspace I might get an idea
I mean you don't need very expensive and fast gpus like a h200 for RVC btw 😭
Okay so how fast would the training speed be on a100 or h100 or l40s,
For context my data set took around 12 Hours to reach 200 epochs in T4x2
I mean this one particular project is like a one time thing so I'm not gonna probably use lightning afterwards
Or at least until a new version of branch drops
🥲 I tested spin v2 , it doesn't feel much of an improvement compared to contentvec too at least for my Tamil songs
are you sure you actually want to train on like h100? first time I see someone wanting to do that since they usually cared more about getting more free gpu time monthly
😭 another thing is I'm going to extract and train with crepe tomorrow , bcz I feel like rmvpe doesn't get the pronunciation and overall tone correct
Nah I don't care much 🥲 as I don't train multiple sets, I just want to finish my thing as soon as possible (you already know I train for like 30-40 hours per week in colab right 💀 that was really annoying so faster the better)
how long is ur dataset btw 😭
This current one is 58 minutes
🥲
I went through audacity for noise gate and truncate as well (used applio docs not aihub)
I'm not really sure about the exact training speed gain from an h100 compared to t4x2, but theoretically it should be faster, the h100 has way more tflops and cuda cores
Even though the results with rmvpe are satisfying, I feel like the pitch could be more accurate with crepe? (And better pronunciation? correct me if I'm wrong, currently crepe extraction has the best quality when there is no noise in the training data set right?)
Lemme google it how much percentage
you love to experiment i see?
Yeaa 🥲 trial and error always
Idk much peeps doing this so imma do it
dedication fr
😭😭 and honestly sometimes I wonder what can I even get out of this because that 3 hour voice data there, is just my own plain voice 💀
theoretically, but tbh most just stick with rmvpe: #🔊│ai-development message, #📑│making-models message
So h100 seems 3x faster than T4
I'm getting no access for it
💀 I trained 285 epoch with rmvpe, so why not venture out into crepe for same model and see the difference (plus this time going to go with default contentvec instead of spin v2)
tbh I always trained rmvpe like most people
be aware to not expect too perfect results for things like non speech sounds, rvc models usually don't super realistically scream for example
I mean you can experiment however you want ofc
experimenting is good
just telling ya that usually rmvpe is more suggested
And the male singer I'm training for have real high pitch he screams a lot and sometimes the slicing size of 5 seconds is not enough
🥲 mmm true that
I will try my best, and it seems the h100 at 5 hours is not that useful, as it's only 3.5x faster than the t4
Which means only about 1.7x faster than t4x2 theoretically, and yea it's not gonna get under the 6 hour mark to train 200 epochs (plus also considering I would be going with crepe, which is significantly slower than rmvpe? idk, heard some rumors like that, have to go with something like L4)
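The back-of-envelope estimate above can be written out as a small sketch. It uses only the figures quoted in the chat (12 hours for 200 epochs on T4x2, H100 roughly 3.5x one T4) and assumes the two T4s scale perfectly, which is not measured here. The function name is made up for illustration.

```python
def estimated_hours(baseline_hours: float, speedup_vs_t4: float,
                    baseline_gpu_count: int = 2) -> float:
    """Scale a measured T4xN training time to a faster single GPU.

    Assumes the N baseline T4s scale perfectly (N times one T4), which is
    optimistic for the baseline and therefore pessimistic for the new GPU.
    """
    speedup_vs_baseline = speedup_vs_t4 / baseline_gpu_count
    return baseline_hours / speedup_vs_baseline


# Figures quoted in the chat: 12 h for 200 epochs on T4x2, H100 ~3.5x a T4.
h100_hours = estimated_hours(baseline_hours=12, speedup_vs_t4=3.5)
print(f"H100 estimate: {h100_hours:.2f} h")  # prints "H100 estimate: 6.86 h"
```

Under these assumptions the run would not fit in a 5 hour allowance, which matches the conclusion in the message above.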
Welp gotta get some sleep now already 2 am here and it's freaking 36°C
Yeah people don't usually want to train on such high end gpus just for RVC 😭
f0 extractors dont change the quality of the model, but they can affect how stable the voice is
😭oo
Stable as in more pitch accurate?
yep
some bad f0s can make the voice sound uhh wobbly idk how to explain it, english is not my first language lol
And more tone accurate? Also feeling closer to his real voice, assuming the data set is really clean
😭😭 it's okay I get it
But it kinda sucks with pronunciation when index goes up
Swift?
Ohh YEA did you try out swift? @analog obsidian
haha but unironically yeah, its better than harvest and pm lol
Aren't these really old names?
nah that just contentvec being bad
💀 this is giving me mangio rvc memories
I used spin v2
yup lol
yeah, dio is a pretty old f0 that in fact got removed from applio as it's not suggested at all
🥲 like both custom pretrain and extractor
tbh I haven't really played around with it much, but I heard some mixed, a bit more negative, opinions about it in the Vonovox server
It should be technically a bit better than rmvpe, but I can't say that it is in practice since i saw some negativity in it
😭 i was too amateur back then idk how was it tbh
ah no idea, when i tried spin v1 index it was actually way better than contentvec
Maybe because it was English data sets?
For me spin v1 was hella worse than contentvec
Spin v2 kinda feels similar to contentvec
But I can't pinpoint which is better
you surely love experimenting to even try pm lol
As both have pronunciation issues, maybe I will bet on spin v2, at least in the last 2 audios I tried it kinda felt like the pronunciation was good
💀 what's in there I can't open it it's 2 am and I'm in bed
spin v1 is less fuller than cvec due to some issues, but the pronunciation is still way better than cvec
i know coz i compared them too
😭😭 wait how much would you rate compared to spinv2?
spin v2 was meant to fix the fullness problem
can't give it an opinion because no good pretrains for it
i was bored 😭
💀 ahhh maybe then Mine was undertrained when spin v1 was used?
is there gonna be like a legacy core pretrain for spin-v2 btw?
noobies v2 spin pretrain has degradation, i showed him
What about the current one I think @simple ore mentioned

this is degradation
💀💀💀ಠ_ಠ
Holy shit gng again I can only hear this tomorrow I will take a listen
Degradation means the voice getting more metallic away from original data set voice?
Ex-fucking-actly I was feeling it around 280epochs
I had to turn on autotune
To somewhat balance it
😭😭
im trying to find a temporal fix for the degradation problem of rvc
It felt a bit robotic? Idk how to explain it
after that yea i can try porting it to v2
goodluck
🔥🔥
So would you suggest I train with contentvec then? @analog obsidian
yea keep using cvec and the og pretrain
Thanks for it 🙏 imma try with crepe tomorrow and update how it goes
no dont use crepe, its broken in rvc
due to the way the pipeline works or something idk why honestly, i just know it doesnt work like it should be
rmvpe was made with rvc in mind so it works as expected
-# it honestly feels like RVC is held together by duct tape
FR 😭
I saw you talking about ddsp-svc or smt before, did you also try seed-vc?
I was thinking of giving seed vc a try for fun, it's 0shot anyways so it's easier to use
i havent tried seed-vc, but honestly ddsp-svc seems to be a great rvc replacement, same quality, realtime support, being able to train with 8gb vram gpus, fast training
it doesn't degrade like rvc
which branch are you talking about tho? the main one is kinda inactive since 10 months
yxlllc is who made the realtime code of rvc iirc
and trained rmvpe
thx
i mean, hopefully one day this can 100% replace RVC

I think that it's more probable we get a replacement than "v3"
it needs a pretrain so we can train using smaller datasets, currently it's training from scratch so it needs a 1 hour dataset minimum
the only issue is we have already thousands of RVC models, so we are kinda stuck on RVC which isn't very sustainable (the average user mostly cares about inference rather than training)
yeah changing from rvc to ddsp-svc is not an easy task
but it's a good alternative, members of the rvc team are involved
ddsp-svc realtime asks for slightly more vram than rvc
but lower than so-vits-svc realtime
What even is ddsp, what's the difference between it and rvc
Does it produce better models?
Error
unhandledrejection
no error stack
NotSupportedError: AudioContext.createMediaStreamSource: Connecting AudioNodes from AudioContexts with different sample-rate is currently not supported.
98/t/Un/this.setup@127.0.0.1:18888/index.js
im getting this when i put my main mic input n
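The error text itself says two audio contexts are running at different sample rates. One way to look for a mismatch is to list the default sample rate of each input device. This is only a sketch: the 48000 Hz target is an assumption (use whatever rate the voice changer is set to), the function names are made up, and the live query relies on the third-party `sounddevice` package.

```python
from typing import Iterable, List, Mapping


def devices_not_at_rate(devices: Iterable[Mapping], target_rate: int = 48000) -> List[str]:
    """Return names of input devices whose default sample rate differs from target_rate."""
    mismatched = []
    for dev in devices:
        if dev.get("max_input_channels", 0) < 1:
            continue  # skip output-only devices
        if int(dev.get("default_samplerate", 0)) != target_rate:
            mismatched.append(dev["name"])
    return mismatched


def list_mismatched_inputs(target_rate: int = 48000) -> List[str]:
    """Query real devices via sounddevice (pip install sounddevice)."""
    import sounddevice as sd  # lazy import: only needed for the live query

    return devices_not_at_rate(sd.query_devices(), target_rate)
```

If the main mic shows up in `list_mismatched_inputs()`, setting its sample rate in the OS sound settings to match the other devices is worth trying before reinstalling anything.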
Hi, I have a question. What is the latest Google Colab notebook available so far for training AI voice models? In my case, the last one I used to train a model was Applio in the Cloud, but it took much longer than doing it with another Colab. And I don't know if there is a newer Colab, since my PC can't train locally.
I would recommend using kaggle or lightning.ai as both give way more hours for free users compared to colab
oh ok, but for lightning.ai do I need to enter any more information or just create the account with email?
lightning and kaggle both require an email and phone number verification
kaggle is easier to learn tho
oh ok thanks a lot I'll see how I do it 😄
can anyone else not hear themselves
it doesnt say my mic is moving or making noise, i uninstalled, reinstalled, used different voices
and nothing
ive tried everything and this isnt working for no voice
its literally just making my pc lag like crazy
which w-okada version are u using
im not saying it depends on the version, im checking if ur using the correct one
the latest one let me see rq
not sure tbh i just know it was like 218
2.1.8
or sum
it has 18 at the wnd
end
u gon help me or wha
what link is it?
yea pretty sure that's outdated as hell, what gpu do u have?
i got a nvidia 5060
wait really, the video was 3 months old, i alr used 2 different links
i think it's a problem elsewhere, just not sure where
yea u should use vonovox as it's the best most up to date one and the only one currently still actively getting updates
can you send me the link to it
Can someone help me out, Applio on lightning ai is sleeping
all youtube tutorials use outdated stuff because they don't know about the new stuff and or are just being stupid
srry about taking so long I'm on vrchat
I asked Nick but I think he's busy or some
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Deiteris Fork, with extra features, but supported only for Nvidia GPUs on Windows and without cloud options. GUIDE
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project. GUIDE
Most suggested WebUI with the best general support for many platforms. GUIDE
For Windows Nvidia, Both Wokada Deiteris fork and Vonovox have similar performance & quality. Users should read the pros and cons for both and choose based on their differences, such as UI and Vonovox's paid effects.
Read Wokada Deiteris Fork Pros&Cons & Vonovox Pros&Cons
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
I could use a little help since my brain isn't working I guess.
Under the Wokada fork, I had it set up with Voicemeeter and Minihost. VM for routing, Minihost for noise cancellation (which works wonders).
For Vonovox, under the Audio devices, I'm missing all the other input/outputs from ASIO and the WASAPI setup is foreign to me using Minihost. Just broke on me when I tried to change it.
If I change Vono to WASAPI, my devices are there but obv Minihost doesn't work b/c it's set to ASIO.
Anyone got any ideas on how to fix/set this up without completely changing my setup?
its no problem i was eating anyway
ye i can see that
could you send me the link to the right one tho
it should be in the guide but sure!
ok thanks because i lit cant find it in the guide at all
i got vonovox like u said im so confused how this works
its different from the other and i still cant tell if it working or not
yo

yo it aint working tho ive tried 3 different apps
and none of them work can somebody please help
where is the support team in here bru 😭
they honestly mad useless
what are you using
vonovox
someone said to use vonovox cause its better
and i cant get it to work on here either
Last update: September 6, 2025
uh huh
if hes dming you its probably a scam
oh rly?
i havent used any of the voice changers so id just wait for nick to come online
im on a mac 
dammmmmmmmmmm
is there a official tutorial
cs the guide is just full of links
where
can u send me the link
or if it's in the server
send it in priv then
yes you can
lollll
u were right
is there anybody i can ask
that ain a scammer bot
wait am i supposed to have vb cable downloaded for me to hear my own voice and for my voice to work?
@low shard but hes sleeping rn
am i supposed to have vb cable downloaded
if only you read the guide
nooo i dont wanna use it on dc or games, i just wanna hear if it works and it didnt work
ima see rn
....
can the world just end already
So fish speech local only works on Linux right?
Looking at the link on their github page. But I just wanna make sure
https://speech.fish.audio/install/
can anyone pls tell me what im doing wrong
i cant hear my self and the volume doesnt work
there should be docs for installation on other platforms
btw you can try another cloning TTS: #📰│dev-updates message
The links on the ai docs file for fish speech leads to their github, and the installation link on their github leads to the link that I just sent.
Alternatively F5 TTS has an installation guide but I'm not sure if it's compatible with Windows. Is chatterbox just better?
why do you go back to that old wokada version after having vonovox not working on you
chatterbox is claimed to be comparable to 11labs in some ways
it didnt work either, the sound was weird and idk how to fix it since i only know how to fix it in w okada
and this isnt the old version, it was the most recent one
so can anybody help me bru
like what settings could i have done wrong
bro pls where is the customer support at
or wtv its called
please check the aihub docs for the fork version
already sent it but he doesnt wanna read
u did?
sorry bru i was prolly fixing chrome up
i didnt see it
ok ima uninstall this one
where do i go lol
i pressed the link in the guide
nvm found it
downloading rn
said it was the latest version
Any possible assistance with this?
Interesting... I've attempted to install chatterbox and this error showed up. Is there anything I should do? Also, if I get past the installation, would this open up a local tab?
try cloning the repo and installing that
when i try to use a voice changer the voice sounds so robotic and cuts out how do u fix it?
python 3.11 only
I'm going to level with you, idk what that means
git clone
or download https://github.com/resemble-ai/chatterbox/archive/refs/heads/master.zip and unzip
then pip install -e . from that folder
dot at the end
Sorry mb missed it
hold on, there's some other issue still
I just need en tbh
still isnt working sb pls help and why does this error keep coming up when im pressing start it doesnt go away
you'll get the original en model
Still on the folder?
I got a question, im new to this voice changer client demo. A friend of mine helped me install it, everything is set up, I have the virtual cable though none of the voices seem to be working? Is there a reason or something im missing? Do I need to reinstall the application? Or is there something in the application I need to adjust.
anybody can help rq bru
got the one u told me abt and still dont work
I've uninstalled and ran the command, but the folder is still empty
if you did not create & activate the virtual environment, then it got installed under C:\users\user\appdata\local\python ...
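To see whether a virtual environment is active and where a package actually landed, a quick standard-library check can help. This is a sketch, not part of any guide; `chatterbox` is assumed to be the import name of the `chatterbox-tts` package, and the helper names are made up.

```python
import importlib.util
import sys
from typing import Optional


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv."""
    return sys.prefix != sys.base_prefix


def where_installed(module_name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None


print("virtual environment active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
# "chatterbox" is assumed to be the import name of the chatterbox-tts package.
print("chatterbox found at:", where_installed("chatterbox"))
```

If the last line prints `None`, the package is not visible to the Python you are running, which usually means it was installed into a different interpreter or environment.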
dude seriously where r the mods at what r they doing
what r they even mod for if they aint helping this is wild
what does the error say?
Wait at some point you said Python 3.11 right? I got 3.10 but also, no python folder under appdata/local
3.10 would probably work
that
these r my settings
and in the command setting it says this
can you type what it says, I can't read your screenshot
oh ok lol igu
An error occurred during voice conversion. Check the command line window for more details.
and in command window it says
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
what is your gpu?
and installing a version compatible with your GPU
as well as a proper virtual audio cable
it was compatible the one i installed
i alr have a virtual audio cable thing i downloaded it
click the 3rd link above
wokada deiteris fork?
Okay, I found the file do I just run the installer?
no
it is already installed if you ran pip install chatterbox-tts
ive never downloaded from this what do i press
you download 3 *.z* files
and you unzip them using 7-zip or winrar or winzip
so the first 3 right?
yes
this gonna be a long process? since i need to download 3
read the guide
wtv im downloading it rn let me see guide aswell
Is there a tab when I run chatterbox?
oh yes i figured it out ill lyk if it works noobie thanks a lot
u should be a mod
better then the mods they have rn
you can run this
change AUDIO_PROMPT_PATH=r"voices\david.wav" to the reference voice you want
How about if I have a voice clone I wanna use?
or this
as I said, change the AUDIO_PROMPT_PATH to the .wav file with the reference voice
and tts3.txt to the text file you want it to read
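The script being discussed was shared as an attachment and is not in this log, so here is a minimal sketch of what such a script could look like, based on the usage shown in the Chatterbox README. The paths are placeholders taken from the messages above, `output.wav` and the helper names are made up, and the heavy imports sit inside the function so the file loads even before everything is installed.

```python
from pathlib import Path

# Placeholder paths: point these at your own files.
AUDIO_PROMPT_PATH = r"voices\david.wav"   # reference voice to clone
TEXT_PATH = "tts3.txt"                    # text file to read aloud
OUTPUT_PATH = "output.wav"                # hypothetical output name


def read_script(path: str) -> str:
    """Load the text to speak, collapsing line breaks and extra spaces."""
    raw = Path(path).read_text(encoding="utf-8")
    return " ".join(raw.split())


def synthesize(text: str, audio_prompt_path: str, output_path: str,
               device: str = "cuda") -> None:
    """Generate speech in the reference voice with Chatterbox."""
    import torchaudio as ta                      # lazy imports: heavy packages
    from chatterbox.tts import ChatterboxTTS

    model = ChatterboxTTS.from_pretrained(device=device)
    wav = model.generate(text, audio_prompt_path=audio_prompt_path)
    ta.save(output_path, wav, model.sr)
```

To use it, call `synthesize(read_script(TEXT_PATH), AUDIO_PROMPT_PATH, OUTPUT_PATH)`. It writes a .wav file and does not open a browser tab.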
Ah I see I'll try that out now, thank you
@low shard as it turns out lightning is not available in my region
it keeps cancelling my download @simple ore and says site wasnt available
is that a me problem os it like that for everyone else right now
can you explain what's ddsp-svc ?
https://github.com/yxlllc/DDSP-SVC/tree/6.3
there is a readme here
sorry i just went and read

now after reading that I've got a doubt
is rvc the bestest voice conversion tool out there?
(i'm asking in terms of which has more quality)
or are there any other top end conversion tool
ive heard so-vits-svc and ddsp-svc can beat rvc if you have big datasets
that's far better than rvc
diffsinger is also good if u want to create ur own vocaloid lols
are those 2 only? but aren't both of those open source?
wait so current market's top conversion tools are all open sources?
i mean i need to search up what's a vocaloid
you mean like voice conversion as tts or in general
for singing so-vits, rvc, ddsp-svc are good
tts idk i dont use that lol
i mainly use rvc for singing conversion
but not real time though @analog obsidian
i rarely use real time stuff, so I just need top notch quality for singing conversion 
so-vits and rvc are technically the same
ddsp-svc is diffusion based, so better than gan (rvc/sovits)
but uses mostly the same shit as rvc
cvec and nsf hifigan
also i thought about why isn't there an option to change hifi gan
there are new models at least according to applio or aihub docs?
because the only way to fight degradation in rvc pretrains is to either getting a huge dataset or completely changing the architecture
@analog obsidian
as in changing the vocoder?
I want to try out refine gan but i have no single clue if i can even change them
are there pretrains for refinegan?
the refinegan implementation in applio is not the refinegan of the paper, is noobies version of it
so is not "refinegan"
but noobies refinegan

how do I use it then?
is Gabox's voc_fv4 still the best for vocals?
ask him to do a pretrain for it (wont be possible since no one knows how to fix rvc degradation)
ahh wait can you send the degradation you sent at night?
i wanna have a listen
i like it more than fv5
yea
even with pretrain I want to know how do you even change from hifi gan to something else in ui?
I didn't experience this sorta behaviour this just feels like there is a lot of gaps in vocals
if you change the vocoder you'll be training from scratch, which needs at least 44 hours of data to provide something good
aww, I can go online and find the person's 44 hours of concert
it happens with bigger datasets, smaller datasets cant overwrite the pretrain fast enough to experience degradation
it's going to be really really really hassle though maybe gonna take me 3 days to even get them properly organized
bigger how many hours are we talking about
i started to notice it when i trained 4-5 hour datasets
and pretrains gets degradation at epoch1
thats how bad the rvc arch is
lol
ewww
i mean the 48k pretrain of rvc sounds like this
did you notice or experience similar in contentvec?
gang I have to ask something to you
is this epoch 1?
i dont remember how many epochs rvc-boss trained the 48k pretrain
thats the og pretrain 48k btw
my epoch 1 - 30 always sounds like this
models do inherit the pretrain degradation too
like i've trained multiple multiple times and my first upto like 30 epochs sounds exactly like the degradation wav you've sent
it only gets better after reaching about 50? then slowly noticeable improvements till like 100-150
then the improvement is kinda not noticeable at all
😭 so that's why all of the contentvec interference I've done had this effect at the beginning?
is not the embedder who is causing the degradation, is something else we dont know lol
the generator just fails
😭 then how does it improve as we move over to like 50 epochs
😭 these terms are honestly confusing man
i cant find Gabox's voc_fv4 on mvsep
if degradation is this thing as you've sent in these attachments, it has been always there in starting epochs, or by degradation are you hinting towards the quality starts to feel like the first attachment after a huge amount of epoch?
holy hell
well i call it 'degradation' but honestly doesn't have a name
this 'issue' is when the generator can't average properly the dataset
so you get a mess
🥹 and are you hinting it can happen both at starting and when overtrained?
basically yeah

why are my downloads failing on chrome ive tried like 30 times and they failed every time my wifi is good i even used ethernet
is not there, but it's in a colab and uvr local
also doesn't ddsp-svc has some sort of UI? or I have to do everything manually?
@analog obsidian ohhh also you got the link for new kaggle gradio combo?
everything manually
yes
nop sorry idk 
awww fuck no i will have to learn a lot
it will be annoying till i get used to it
oops nvm
let me go to my good ol kaggle and train with contentvec crepe
crepe? nobody uses that anymore tho
crepe is not viable because it gives an f0 to noise
why not?
f0 is just pitch extraction process right?
the second step is called pitch extraction
thats done using an f0 estimator
the job of this f0 is to teach the model how your voice sounds at different pitches
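For anyone confused by the term: f0 is the fundamental frequency, the pitch of the voice in Hz, estimated frame by frame. The sketch below is not RVC code. It is only the standard conversion from Hz to note names, to show what an f0 track represents. The sample track is made up.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_midi(frequency_hz: float) -> float:
    """Convert a frequency to a (fractional) MIDI note number, A4 = 440 Hz = 69."""
    return 69 + 12 * math.log2(frequency_hz / 440.0)


def hz_to_note_name(frequency_hz: float) -> str:
    """Name of the nearest equal-tempered note, e.g. 'A4'."""
    midi = round(hz_to_midi(frequency_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"


# A toy f0 track in Hz; 0.0 marks unvoiced frames (silence or noise).
f0_track = [0.0, 220.0, 222.0, 261.63, 440.0, 0.0]
notes = [hz_to_note_name(f) if f > 0 else "-" for f in f0_track]
print(notes)  # ['-', 'A3', 'A3', 'C4', 'A4', '-']
```

An estimator that assigns a pitch to noisy frames would put a note where the "-" entries are, which is the problem described above for crepe on noisy data.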
aww fk i have to make it quick
but what if my data set has absolutely 0 noise?
and I've properly preprocessed using noise gate and truncated?
then rmvpe is still more accurate than crepe for speech

for songs?
songs too
😭
rmvpe was made with singing in mind
it was trained with a singing dataset
m4singer
crepe used speech only
damn so i'm stuck with rmvpe?
rmvpe is probably the best f0 ever made
not gonna lie
lol
for human voices
is this rmvpe propaganda
for instruments, etc crepe is better

alright imma take 1 copy of the song I want with rmvpe (as I've already trained in it)
no big surprise the f0 made for rvc is the best for rvc, crazy
made by members of the rvc team
who had in consideration how rvc works
💀 rmvpe is solely made by rvc team!?

wait then what are these benchmarks showing
instruments , sounds, and speech
f0s are trained using different types of datasets actually
lols
its crazy
the f0 doesnt have an impact in quality
it just tells the model how your voice sounds like at different notes

imma go brute force
does anyone have rtx 5000 series and the downloads just wont work at all
bro theres no way the mods are this trash at their job
ive spent 8 hours in here tryna figure out how to get this done and they are ghosting
May I ask if anyone knows whether an RVC model trained locally performs better than one trained on Weights?
Or are the results about the same?
My English is poor
depends on the gpu they used on weights or ur gpu locally
@lost temple I've rejected your dm, please ask here or post a thread in https://discord.com/channels/1159260121998827560/1192011222023950368
Which element of the GPU determines the quality of the model?
If I want to perform local training, where should I look for recommended GPUs? I am curious whether the GPU Applio uses is recommended or not, and I also wonder if there is any method to know the GPU that Weights uses.
Thank you!
!howtoask
- Check Docs & Guides: Your answer may already be in the AI Hub Docs or the https://discord.com/channels/1159260121998827560/1159513888199540817 channel.
- Search the https://discord.com/channels/1159260121998827560/1192011222023950368 : Look for existing posts that solve your issue. Do not invade someone else's post.
Tell your:
- Full GPU Name: (e.g., NVIDIA RTX 3060)
- Operating System: (e.g., Windows 11)
- Detailed Description: What were you trying to do and what went wrong?
- Tutorial Used: Link to the guide you were following.
- Screenshot: A picture of the full error message is very helpful.
To maintain a legal, safe & ethical community, we will NOT provide help for:
- (E girl, as an example) catfishing/trolling, scamming, impersonation.
- NSFW/Porn.
- Any illegal activities.
Requests for these topics will be ignored and may result in moderation action.
- Be Polite & Patient: Our helpers are volunteers. You may ping the Helpers role once.
- English Only: Please keep all conversations in English.
check https://docs.aihub.gg
Last update: August 5, 2025
that seems like old original wokada, dont use youtube tutorials
I think you just said in dms that you fixed it rn, and yeah i have some irl stuff or sleep sometimes
I want to make an AI cover but I couldn't understand it.
that's alright now i understand
what u using
nothing, but I want to use python and rvc but I dont understand
It is so difficult for me
I asked chatgpt and python downloaded a few things. It confused me even more.
vac lite is suggested instead of voicemeeter
please elaborate more
damn where are u from? did the support say anything specific?
you aren't getting ghosted, Helpers here are only volunteers, there are very few helpers that help with different things, and ofc they can be busy too
can you tell your pc gpu and operating system?
nvidia 5070 and 32ram intel i9 ultra
that's great, check https://docs.aihub.gg/rvc/local/applio/
Last update: August 9, 2025
dont use chatgpt for this
nope its the one linked in the guide
"5 hours" runtime is a bit higher than average free runtime which is usually "4 hours" for free users. 
Sri Lanka man I'm not surprised tbh
Nope the docs have no realtime voice changer that has that version, which seems to be original wokada
Either you followed a YouTube tutorial, or you didn't read the full docs tutorial and just clicked the first GitHub link which was just for introduction and proving that it's open source
Anyways what's the issue?
btw I know the issue is solved, but have you seen https://github.com/gradio-app/trackio ?
Heyyy I need help again can you help me
Sure if you elaborate :)
until yesterday that used to work
i mean until today
now Line1 reacts to nothing I guess
Idk it doesnt work at all
wait weren't you using vonovox or atleast i suggested you that instead the last time we talked?
yess i used but I liked that one better
but now it doesnt work
oh wait
it is not windows 10 compatible now?
i remember something like that
why would you prefer wokada deiteris fork over vonovox? vonovox still gets updates
is the input in discord set as line 1?
i don't think that should be the issue, even tho windows 10 isn't going to receive free updates anymore (unless you're in the EU)
your audio setup should look like:
-
wokada deiteris fork:
- input: microphone
- output: line 1
-
other programs:
- input: line 1
- output: headphones
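The routing above can be sketched as a tiny sanity check. This is only an illustration: `Routing` and `routing_problems` are made-up names, the device labels are plain strings, and nothing here talks to the real audio drivers or to either program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Routing:
    """Input/output device names as they appear in a program's audio settings."""
    input_device: str
    output_device: str


def routing_problems(changer: Routing, app: Routing,
                     virtual_cable: str = "Line 1") -> list[str]:
    """Return a list of mismatches between the setup and the routing described above."""
    problems = []
    # The voice changer must listen to the real microphone, not the virtual cable.
    if changer.input_device == virtual_cable:
        problems.append("voice changer input should be the microphone")
    # The converted voice has to be sent into the virtual cable...
    if changer.output_device != virtual_cable:
        problems.append(f"voice changer output should be {virtual_cable}")
    # ...and the other program (Discord, a game) has to read from that cable.
    if app.input_device != virtual_cable:
        problems.append(f"other program input should be {virtual_cable}")
    return problems


if __name__ == "__main__":
    changer = Routing(input_device="Microphone", output_device="Line 1")
    discord = Routing(input_device="Line 1", output_device="Headphones")
    print(routing_problems(changer, discord) or "routing matches the setup above")
```

If the list comes back empty and there is still no sound on Line 1, the routing itself is probably not the culprit.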
of course I have no problem with that
it is Okada not changing the voice
either doesnt take the input
or give the input
Weird thing is
you need to set the audio routing as i said, it will be different on wokada deiteris fork compared to the other programs
I do that
there is no voice in line1
so, can you confirm you did the audio setup just like in the #✨│ai-help message?
yup
it is just either Okada doesnt take my input voice or doesn't process
oh
okay
Solved it
The CPU name is either Intel Core Ultra 9 or an older Intel Core i9. The letter "i" has been dropped from the most recent Intel Core CPUs, which are now named "Intel Core 3/5/7" and "Intel Core Ultra 3/5/7/9".
yeah some browsers could have issues some times
you too
Hello, 👋 I'm new here on the discord and I'm a total noob to the whole AI thing. I've been playing around with it for the past couple of weeks and recently set up a SillyTavern client. It works pretty well so far for text generation with a model that I run through KoboldCpp, but I wanted to try adding text-to-speech to it as well.
I have installed a TTS provider called SileroTTS and run it locally in a conda environment. Just running this combination (KoboldCpp + SillyTavern + Silero) works totally fine and I can generate TTS replies from the chat messages.
Now to my current problem: when I activate the RVC extension inside SillyTavern and try to have a custom voice for the tts then I get two error messages which are these:
- TypeError: Failed to execute 'blob' on 'Response': body stream already read
- TypeError: <RVC module> RVC Voice Conversion Failed INTERNAL SERVER ERROR
I hope someone here is maybe able to help and if this is absolutely the wrong place to ask about this topic then I'd be happy about some suggestions in which places to ask for some advice.
the link you sent it worked for the download but when i open the file or wtv its stuck in command prompt or it fails instead.
this is what it says
An error occurred connecting to Discord: Could not find Discord installed and running on this machine.
- Running on local URL:
To create a public link, set share=True in launch().
yo
yo
anyone has any female models that can laugh?
It just seems like an alternative to wandb and tensorboard, but that would mean changing the code, and for what? what we have is more than enough
Although I think you meant that we could take the code from this and integrate it into Applio’s interface, right?
A lightweight, local-first, and free experiment tracking library from Hugging Face 🤗 - gradio-app/trackio
i mean yeah, applio could maybe integrate the tensorboard / wandboard / trackio directly into applio, that's a good idea
If I want to cut out background noise, do I have to do it manually or is there a more effective method? i have tried uvr5 instrumental and de noise, but its still there. I have cut out keyboard noises and such by hand but that takes forever. im new to this
just use gabox's fv4
https://huggingface.co/spaces/TheStinger/UVR5_UI
do this and then just put the vocals of whoever you're making a model of into the slot and use MelBand Roformer | De-Reverb by anvuew, and then finally UVR-De-Echo-Normal.pth in the VR ARCH section (denoise is optional)
How to make a ai cover
btw if the anvuew model destroys a voice entirely like this just use de-echo de-reverb and skip the other echo model I suggested there unless there is extra echo
I really like de-reverb v2 by suicidal, great reverb remover
idk if it's better than the one by anvuew tho
could test
How to get the "start_http.bat"
Would anyone be willing to join vc and help me out 😭
soo
what 
it went hell, rmvpe is much better
especially for the parts where the singer screams
like crepe messes it up like the one you shown in degradation
Hi guys, does anyone know of any models like veo3 where I can generate voice and video, but for longer than 8 seconds?
without sound, there are plenty
framepack is basically unlimited length
I'll have a look into that ty
how about with synchronised audio?
Download Vonovox 🐠🫧
Can somone help me to fix the voice changer
Sure, what gpu do u have and where did u get the download?
I got it from a video
Yeah it's most likely outdated like over a year old, even if the video is a few months old
Oh its says over a year
They use outdated stuff because they don't know about the good up to date stuff
Well what gpu do u have
How do i see that?
Open your task manager and click on performance
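If Task Manager is confusing, a script can ask the Nvidia driver instead. A minimal sketch, assuming an Nvidia card with its driver installed (that is where the `nvidia-smi` tool comes from); the function name `nvidia_gpu_names` is made up here, and AMD or Intel GPUs will simply not show up.

```python
import shutil
import subprocess


def nvidia_gpu_names() -> list[str]:
    """Return the GPU names reported by nvidia-smi, or [] if it is unavailable."""
    # nvidia-smi ships with the Nvidia driver; without it there is nothing to query.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # Driver problems, timeouts, etc. are treated as "no answer".
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    names = nvidia_gpu_names()
    print(names if names else "nvidia-smi found no GPU (AMD/Intel cards are not listed here)")
```

An empty result does not mean there is no GPU, only that this particular tool could not report one, so Task Manager > Performance is still the fallback.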
Sorry I was eating I'm back
Sorry for spamming
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Deiteris Fork, with extra features, but supported only on Nvidia GPUs on Windows, and without cloud options. GUIDE
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project. GUIDE
Most suggested WebUI with the best general support for many platforms. GUIDE
For Windows Nvidia, Both Wokada Deiteris fork and Vonovox have similar performance & quality. Users should read the pros and cons for both and choose based on their differences, such as UI and Vonovox's paid effects.
Read Wokada Deiteris Fork Pros&Cons & Vonovox Pros&Cons
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
Download wokada deiteris fork
Read the guide and download it, download link is a little down just gotta scroll a bit
Should say amd
Here?
Yes just scroll down a bit until u see the download links
Once u download it unzip the zip file and open up the mmvc server sio exe
What the hell is this
Oh clash royal
could we call so i can screen share for you
its easy for me
Please could we do that
Here?
Hello sir
Are you here?
Yup!
okay tell me when i can screen share for you
Mk
It's no problem I love helping people out with the voice changer
Btw you're not trying to do any e-girl stuff right...
If u are you're on your own
No im not
I promise
I just always wanted a voice changer
But you are gonna help me right?
Of course
I'll be ready to help u on a VC soon
just a sec
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
- by IA Hispano: Google Colab
- by Hina: Google Colab
- by Eddy: Google Colab
- by Eddy: Google Colab
- by Deiteris & Hina: Google Colab
- by Shiro & Eddy: Google Colab
- by Nick088: Google Colab
- by Nick088: Google Colab
- by Jarredou & Makidanye: Google Colab
are there any good colabs to train models rn?
both kaggle or lightning ai would be better as they give more time for free users
does anyone have a rvc voice change for the rtx 5000 series none of the ones i tried work so far
-rt
try vonovox
Everyone recognizes my voice changer voice because I seem to be speaking too close to my mic. Are there any tricks to make the voice changer more realistic? It's getting annoying. I've tried so many things.
move away from your mic lol
not working
what voice changer do u use btw and what gpu do u have
just admit to them that you're using voice changer
if they're bullying you, it's time to stay away from them
u alr told me this
it doesnt work
SOMEONE PLEASEEE GIVE ME A VOICE CHANGER FOR THE rtx 5000 series
it does but you never read the guide
dude it literally doesnt i tried it about 20 times
the download fails everytime and i tried 3 different browsers and my wifi is good
its bugged
😔😔😔 bru irs been 2 days where the mods at
i think they dead or something
how r they inactive in the server they mod in for this long
now pls answer
which voice changer have you tried other than vonovox and the old wokada 18a?
there this one someone gave me
i think it was called deteries
and it was a specific part made for 5000 series
it downloaded and all
but when i run it
it says something like set “launch=true” and other stuff
and there is
no launch
i checked all files
wdym?
it should never close instantly, so what error message did it show?
mods moderate the server
many of us dont use the voice changers
it doesnt even open😭
its stuck in command center
and says the thing set true launch
ohh ur q mod?
that means the terminal window has already shown up but not the gui in the browser
did it also show the link like 127.0.0.1 thingy?
yes that it showed that at the top
what does that mean my browser didnt change so idk
enter the url in the browser
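For anyone stuck at the same spot: the terminal staying open is normal, the UI lives at the local URL it prints. A rough sketch of pulling that URL out of the terminal text and opening it, assuming the line looks like the "Running on local URL:" message quoted earlier. The sample address and port in the demo are hypothetical, use whatever your own terminal shows; `find_local_url` and `open_local_ui` are made-up helper names.

```python
import re
import webbrowser
from typing import Optional

# Matches the "Running on local URL:" line and captures the address after it.
LOCAL_URL_PATTERN = re.compile(r"Running on local URL:\s*(https?://\S+)")


def find_local_url(terminal_output: str) -> Optional[str]:
    """Return the first local URL found in the terminal text, or None."""
    match = LOCAL_URL_PATTERN.search(terminal_output)
    return match.group(1) if match else None


def open_local_ui(terminal_output: str) -> bool:
    """Open the UI in the default browser; returns False if no URL was found."""
    url = find_local_url(terminal_output)
    if url is None:
        return False
    return webbrowser.open(url)


if __name__ == "__main__":
    # Hypothetical sample output; the real port may differ on your machine.
    sample = (
        "Running on local URL:  http://127.0.0.1:7860\n"
        "To create a public link, set share=True in launch()."
    )
    print(find_local_url(sample))
```

The "set share=True" line is only a hint about public links, not an error, so it can be ignored for local use.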
no actual way thats how to do it
if it works ima request u to be. amod for this server
what
im saying no way thats was how to do it all along
im gonna test it rn js opened pc
IT WORKEDDDDD
NO WAYYYYY
THANK YOUUU
now let me see if the voice finally works 6 apps layer
ok then
i jinxed it i think
it opened the site right away
then it closed and now when i type the url it says error
ok it opened again lms
dont hesitate to share the screenshot if something wrong happens
ok it opened but im confused on this
it says index and voice model but wont let me drop files
to use my goku voice
that is applio realtime
I'd suggest to put the model files in logs folder
ok so the whole folder or js the pth and index
Applio\logs\yourmodel
yes i just put them
then click refresh
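The copy step can also be scripted. A minimal sketch, assuming the layout mentioned above (`Applio\logs\yourmodel`); `install_model` and the paths in the comment are made-up examples, not part of Applio.

```python
import shutil
from pathlib import Path

ALLOWED_SUFFIXES = {".pth", ".index"}


def install_model(applio_dir: Path, model_name: str, files: list[Path]) -> Path:
    """Copy the .pth and .index files into <applio_dir>/logs/<model_name>."""
    target = applio_dir / "logs" / model_name
    target.mkdir(parents=True, exist_ok=True)
    for file in files:
        # Only the two model files are expected; anything else is likely a mistake.
        if file.suffix.lower() not in ALLOWED_SUFFIXES:
            raise ValueError(f"unexpected file type: {file.name}")
        shutil.copy2(file, target / file.name)
    return target


# Example call (hypothetical paths, adjust to your own install and downloads):
# install_model(Path(r"C:\Applio"), "goku",
#               [Path(r"C:\Downloads\goku.pth"), Path(r"C:\Downloads\goku.index")])
```

After copying, the model still has to be picked up by clicking refresh in the UI, as described above.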
ok i just did
lms
YO IT WORKEDDD
do i press unload voice
or nah
and for the input do i use regular mic
for some reason i still cant hear my self
testing sound its just 2 ringing noises and gone
hey ik its probaly me being dumb but im trying to use the voices but its lagging out like crazy. i checked my cpu and it goes to a hundred percent. anyone know if they can help me or got a solution?
💀
nah im ngl im slow on that i was on the wrong spot
but
i still cant hear my self
i turned on monitor sound thing
and i cant hear my self when im using the site
why is that
I dont know, as far as i know applio realtime is held together by duct tape
what is that
its the only thing i can use since the others r bugged
can you tell me as if im not using applio
cause evry site or app i used
never could hear my self
Have you ever tried setting the output to line 1 and going into discord and setting your input as line 1 then click test mic?
nope never tried let me try rn
bro ain no fkn way
it worked
but one problem the sound is so laggyyy
like my words come out after a minute of saying it
do you know what that is
im having the same exact problem
check your chunks
oh dang is it also laggy or just delayed
bet lms rn
should i make it higher or lower for best noise
the lower makes it sound worse but it makes it come quicker
i hear 196 or 256 is a good spot but at the same time idk cause my voice changer doesnt even speak properly rn
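The trade-off can be put into rough numbers. This is a back-of-the-envelope sketch that assumes the chunk setting is a length in milliseconds and the audio runs at 48 kHz; both are assumptions, since each program defines its chunk value differently. The function names and the 50 ms / 300 ms inference times in the demo are made up.

```python
def chunk_samples(chunk_ms: float, sample_rate_hz: int = 48_000) -> int:
    """Number of audio samples in one chunk of the given length."""
    return round(sample_rate_hz * chunk_ms / 1000)


def min_delay_ms(chunk_ms: float, inference_ms: float) -> float:
    """Lower bound on delay: a full chunk must be recorded, then converted."""
    return chunk_ms + inference_ms


def keeps_up(chunk_ms: float, inference_ms: float) -> bool:
    """True if each chunk is converted before the next one has finished recording.

    When this is False the backlog grows, so the voice drifts further and
    further behind instead of staying at a fixed delay.
    """
    return inference_ms < chunk_ms


if __name__ == "__main__":
    for chunk in (196, 256):
        print(chunk, "ms chunk =", chunk_samples(chunk), "samples,",
              "delay at least", min_delay_ms(chunk, inference_ms=50), "ms")
    # A slow machine (e.g. CPU pinned at 100%) may need longer per chunk than the chunk lasts:
    print(keeps_up(chunk_ms=196, inference_ms=300))
```

Under these assumptions a smaller chunk lowers the fixed delay, but only as long as the machine can still convert each chunk in time.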
ok i just put it at 196 ima test it after i try and figure out best settings for goku voice
ok ok
hey btw what voice changer thing are you using? mine looks way older and diffrent XD
Just use vonovox
Your wifi don't like u
Wdym
vonovox doesnt run on your browser
it starts off strong then purposly cancels the download
i tried using
brave
microsoft edge
and chrome
Thats just bad internet
ill try again right now and show you
i got ethernet bro
bad isp
its not internet trust me
ill show rn
ok downloading rn ill show when it fails
but one thing
do you know why when im testing volume i say hello and it works but really late then when i say a different word it says hello again from the first time
I dont know man, i dont know / use applio realtime
Its just setup now
?
Ofc
where can i download
im ons its
lms rn
do i download both source code zip and tar or just one
im on the 1.6.9
just this
just downloaded it and extracting rn
one question
why does urs say setup bat
mines never has the bat thing
its always js setup
You have extensions hidden
bet ty
just unhid file
how does vonovox work lol
it was alr fixed
i just wanna know how to use this thing
cant hear my self on discord
my volume completely cuts out when i start it why
bru anb can help???
why this gotta be so difficult js for me bro this is crazy why my shit gotta be like this
wayyyy too impatient
wdym dude u cant be fr
its been 2 days bro
i been asking for help for 2 days and whenever i get to the part that is always bugged evb asleep or sum
youre on a server where every helper is a volunteer
you cant expect immediate replies
2 days i shouldnt expect a reply?
nah bro idk abt that
can u js help me and get it over with bru plsss
and every time someone tries helping you decide to switch to something else
why cant i hear my self i have tried 3 different mic testing sites
CAUSE THE ONE I USE DONT WORK😭😭😭
and whenever i get to the most important part they ghost for a day or sum
just one thing bro why is it so delayed and cuts out at times
its like rly delayed
im using vonovox
yk what i could change to fix it?
wokada is a voice changer
yes ik does it work good like no delays and stuff?
ohhhhhhhhhh
do you need me to type in larger font?
Anyone know of the best locally run (runnable on my own pc -- I have a 3060) ai song generator (or a song cover maker, or even one that allows me to use a rvc voice model to make a song with that voice). I am wanting to use one of these ai programs to make a "unreleased" song of an artist.
Does this not work anymore? https://huggingface.co/wok000/vcclient000/resolve/main/vcclient_win_cuda_2.0.65-beta.zip?download=true
It won't download
Where do I download?
that’s an old version of the beta of original wokada
which isn’t suggested
don’t use youtube/video tutorials for ai realtime voice changers
!howtoask
Is there one similar that works the same?
please read the guidelines above and elaborate
Read the help guidelines, it tells you what you need to elaborate to make us help you
it’s not suggested
are you talking about the formant shift in vonovox?
no in rvc
i have a male voice I want to swap it with female song
well the pronunciation is really weird for starters, like idk how to explain it, the words are correct but the way they're said is really weird
it’s better to lower the pitch and play around with it
the pitch is correct as I've turned it down to -12
but the pronunciation is awkward af
like there is a feminine touch in the pronunciation?
so i need some help in figuring out what timbre and quefrency mean
Could i get some help?
!howtoask

absolutely no use then to train with crepe?