#✨│ai-help
lol
I am curious, will AI RVC ever improve to the point it can mimic coughing and laughing (odd sounds that can be made from the human mouth in general) Cause if someone screams with a high pitch the model wouldn't comprehend what the user is trying to accomplish
Yea they all sound like robots, or russians. I try to set the voice but it doesnt exist or its the preset built in ones
for ARIANAGRANDE_ES_BY_SZAJEAN.pth
would I set the voice to something? or leave it empty? If i dont set it then shes a russian man
You can't expect much, RVC is meant for STS (speech-to-speech). All the python package does is make an API request to Microsoft edge tts, then use that audio as the input for RVC inference
Ms edge tts is multilingual and nice quality but not emotional at all
Ohhhh
I getcha. lol, makes a lot of sense now. I'm making a bot to have a convo with and ms sam is trash so was looking to better. The built in voices don't seem to sound like that though if I change to a different en_* voice
Applio is a good alternative for that kind of application
Thats a python lib?
Applio does the same thing 😭
It uses Microsoft edge tts API requests
Awww
Nope it's an RVC fork (modified version)
should this question be asked somewhere else? unsure if it doesn't belong here
It does have improvements over RVC, but you can't expect it to sound emotional in applio either
Only the future can tell us
Unfortunately the original devs of RVC kinda left it to work on GPT-SoVITS, which is a TTS AI
Are you sure I don't have to set the voice param to match the voice name in the model file? I just don't know what that name would be
Hopefully applio and our engineers will make improvements but who knows
Ah, cause I do intend to do roleplay with AI Voice but not wanting to sound awkward in certain situations
I mean setting a tts voice which has the same gender and language helps for sure, but it doesn't give emotions
Microsoft edge tts was mostly meant for just reading text on the browser, Soo yeah
it can depend on coverage and quality of the dataset and pretrain used. you can try this https://discord.com/channels/1159260121998827560/1339155300720054316 but dont expect to be always optimal for those edge cases
That's very experimental though
Iirc hifigan was disabled in applio main branch for now not being stable enough
Ohh, yeah huge improvement by setting the voice to en-US-AvaNeural
Is there any examples of it being used? Real time scenarios
I have tried training some metal singer model
Yeah it does help, tho not in the parts of emotional speech, like not as emotional as 11labs
How did it go ?
we need a new embedder for that, but thats very complex to do at the moment
not too optimal but not bad for making cover mixes
seoul finetuned contentvec and was able to make models scream
@urban fractal are you having any other issues ?
Well it's something
Nope
That can be considered a revolutionary step in real time voice changer no?
nick super quick question how should i know how much epoch to use ?
I'll share a video and code once I get it together
yeah
There isn't a right amount
Use the tensorboard
but rvc does not stand for realtime voice changer
Oh, thought it does 🤔
do you have a tutorial video or doc perchance for that?
Retrieval-based-Voice-Conversion
noobies said xeus can work but takes much resources
wokada
Ah
Last update: Dec 24, 2024
Then in terms of Wokada, has there been any noticable improvements lately?
w-okada is merely a gui for realtime inference
ah its just a file for applio
it's just a gui, it can't get any better
we need to improve actual rvc
Wokada is a program that basically focuses on the realtime rvc inference
There are 2 versions
Original made by Wok
And deiteris fork made by deiteris
The deiteris fork initially brought many advancements, which now are also in the latest original wokada, but deiteris fork is suggested still for having more options and less bugs, along with the removal of Beatrice models which were low quality compared to RVC and were experimental
every change the gui may receive will be an optimization or qol change
There was also a go-realtime.bat rvc GUI for realtime inside of rvc mainline (og rvc), but that's really outdated compared to the ones I mentioned
tier list for performance atm:
- deiteris
- mainline realtime
- original w-okada
I mean it has improvements in performance
This is the better W-Okada.
I got a random question as well. Besides the new vocoders is there any team that is actively working on, for example a RVCv3?
rvc boss feels rvc is already perfect and doesn't need anything new
don't expect something from him
Rvc boss is focusing on GPT-SoVITS
Ok
they tried rvc v3 but didn't like the results
I see so new vocoders are the only way forward for now?
Ah I am certain this is what I have but not sure if I have the latest version, does it update automatically?
i installed everything sir , can you tell me the settings
imo the new vocoder thing is nonsense, we need a new embedder
Weird, could you try to open CMD and do env/python.exe -m pip install --upgrade gradio
anyone working on new embedders?
those are up to you
what you want to sound like
What would a new embedder look like? I’m not familiar with those
seoul was finetuning contentvec last time, but i have no idea how that went
W-Okada doesn't update itself. You must download one and install again.
he's trying to make models laugh in realtime
male to female what should i use
again its up to you
Ban
is KLM 5 RefineGan best for talking female with medium high pitch voice?
it would improve: the model ability to laugh, whisper, etc
worked! ty
just mess with it until you're happy
any recommended settings would help
@simple ore btw are you aware of this?
😭
It seems that Gradio needs to be updated
is KLM 5 RefineGan best for talking female with medium high pitch voice?
You're welcome
Is it alright if you can give me the right link for it?
our current pretrains are already pretty good
Pretrains help mostly for language tbh
And is there any sorts of logs regarding new updates, when it was ,what it introduces
thats old, use klm 4.9
Oh awesome, one of the things I hope in the future is that RVC will be able to handle more synthetic voices since there are characters that I want to make but I am struggling to be able to train.
dont touch refinegan pretrains, yet
if i got 17.1 minutes of datasets would a epoch of around 450 be okay?
This is the bitcrush effect. It's basically another distortion effect that reduces the resolution or bandwidth of audio to sound harsh and noisy.
4.9 is a hifigan pretrain
Ah I see, so the core of it is RVC and there'll only be improvements to Wokada if there are more revolutionary steps for it?
What is your PC GPU?
4070 ti
if rvc gets better, w-okada will also get better
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
w-okada can't get any better by itself
it's just a gui for inference
the only thing that can be improved there is performance
Hell yeah let's run Wokada on a raspberry pi 6 🔥
Lmao
I was thinking more of an intel duo core
Duo? Why not 0.5 🔥
there wont be NPU support to work on as well
true true
W-Okada simulator. 
Let's make Wokada run on hopes and dreams
NPU?
Yes
More like Intel Pentium. 
It's not a headset with a mic attached, the mic and headset are separate
refinegen disabled in applio?
the latest intel & AMD laptops
Hmmm, Celeron.
Doesn't sound like a problem to me then, Towers all the way
You need to activate it by using this #🔥│model-maker-chat message
no access
Try increasing in sens a bit
Neural Processing Unit
GPU + CPU for ai laptops
Which is barely usable since no programs support it
They can I’ll just grab the link to the file
I mean there are, but pretty few if you compare it to Nvidia GPUs
Kind of funny how an AI is refusing to work with another
But AI laptops sound pretty cool
AI is mostly supported on Nvidia GPUs
You can also do it on AMD and Intel but it's much more of a pain
Though looking at the site I was provided, last update was December 2024, so I do have the latest build for it
AMD in most cases needs zluda, a cuda emulator
Ok first in 250% and out 400% is a bit excessive, lower in to 100, out ig doesnt matter too much if u want it to be loud but test it at 100-200% range anyway
Try audio: server mode with s.r. 48000 and [windows wasapi] prefix on input and output. This uses better audio device than client. Could help
In advanced
Protocol sio, force fp32 mode on
Can someone help me transfer music to the lyrics of a singer? My PC is a mess now and it's urgent
Hmm, but does the RVC inside of Wokada come updated everytime you launch the program?
no
pls if anyone can help me dm me
if rvc gets a new important change (like a new embedder) the w-okada dev has to update the gui to use the new embedder
Please elaborate the issue and what's your PC GPU
Ah
When was the last time RVC got an update?
Well since it says december 2024, I guess we haven't reached that goal
5 months ago (mainline)
3 days ago (applio)
Oh, ok
i deleted the model files but they're still visible and usable for some reason
and trying to delete causes a problem
it's hard to find a fast alternative, all of the cool ones require an absurd amount of vram for both training and inference
Did you delete them from model_dir?
no, its like right now my pc is a bomb like i can use it cause my CPU and GPU are in dust due to a blackout last night, I'm on my phone, that's why I'm asking for help
from wokada itself
@low shard when clicking the generate index it does this
Like 4090 amount of vram?
hmm
i'm tired of people hitting themself over the head, instead of simply downloading the compiled version
Oof
i just followed the doc my bad 😭
i think xeus is able to run locally but it ask for too much
i see
He did use the compiled version
im shit at using pcs bruh
Only way I can think of making it work is by hooking 2 4090s in one PC and leaving one to power the demanding embedder
can anyone help or...?
4090 only has 24 gigs of VRAM. We would want more like 80 lmao
Isn't that the same steps of https://docs.aihub.gg/rvc/local/applio/
Last update: Apr 01, 2024
You'd be needing an entire NVIDIA A100 to run that
4060 is enough to handle voice changer alone, plus streaming
Ah no I know that
my 4070 ti is doing fine work
But we're mainly talking about the demanding embedders
that's correct steps, but that error only happens when you download old source and try to install it.. so it grabs outdated library
But what about those AI TOPs in the 50 series? Could they be made of use to RVC?
what's your pc gpu
1650
💀
ohh so ur pc is broken and ur trying to do ai covers on phone?
Oh god
it's fine dw
ajam ajam
you know how to do it?
Actually are they called AI TOPS
or just TOPS
Honestly not sure what they are or what they do
but the download link is the same as the one in applio docs
Yeah doubt 4gb of vram can do anything unfortunately
damn
You can probably get a 1080 ti for cheap
It has plenty of vram
But well, not the fastest
you can inference fine, but about training, you're limited
you can train but at a low batch size which is usually more unstable
like 2 or 4
any RTX are better due to having tensor cores
That's true yeah
main branch applio has checkpointing, the speed lost isnt that much
These tensor cores making me tense up...
wasn't that giving worse quality
it was the inplace thing
im training my models in mainline now

so checkpointing works all fine now?
why
yup
6 gb vram is recommended for that
are there any other steps for tensorboard with applio besides opening the .bat?
well, it's the original rvc
@trim sparrow if u want to do training, you could also try cloud, meaning you will use a remote good pc and won't run on your pc
but, it has limited time in free tier
do you know how much time limited?
also cloud notebooks sometimes break easily since the cloud provider update packages and python version for example
whenever i open it i go to the link and it says no dashboards for this current data set
4 hours max of gpu daily on colab (u need a google acc)
30 hours weekly for kaggle (needs an acc + phone number verification and harder)
22 hours monthly on lightning.ai (needs an acc + phone number verif)
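To compare the three free tiers above on the same scale, here is a rough sketch that converts each quota to hours per week. The 30-day month and the `weekly_hours` helper are assumptions made for illustration, and the Colab figure is only an upper bound since the daily time is not guaranteed.

```python
DAYS_PER_WEEK = 7
DAYS_PER_MONTH = 30  # assumption: a 30-day month, for a rough comparison only


def weekly_hours(hours: float, period: str) -> float:
    """Convert a quota given per 'day', 'week' or 'month' into hours per week."""
    if period == "day":
        return hours * DAYS_PER_WEEK
    if period == "week":
        return hours
    if period == "month":
        return hours * DAYS_PER_WEEK / DAYS_PER_MONTH
    raise ValueError(f"unknown period: {period!r}")


# Quotas as quoted in the messages above
quotas = {
    "colab": weekly_hours(4, "day"),          # upper bound, daily time varies
    "kaggle": weekly_hours(30, "week"),
    "lightning.ai": weekly_hours(22, "month"),
}

# Provider with the most free GPU time per week under these assumptions
best = max(quotas, key=quotas.get)
```

Under these assumptions Kaggle comes out ahead, which matches the advice in the channel to prefer it for training.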
that's because you need to start training first for a bit
and then refresh it
ah okay you think kaggle would be a good option?
do you know the website?
reach epoch 1 then try refresh it
this server is english only btw
cloud is way suggested for phones
@crude flame https://www.youtube.com/watch?v=bXEUto3lMyk
how i can get smooth voice like this ? i am not talking about the model but this looks so good
what's inside of your model's folder (logs/model_name) ?
you need to check the tutorial inside of the kaggle
oh right I didn't send you lmao sorry
i actually have no clue but it started working when i restarted applio
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest free way, use https://weights.com/ which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.com: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio UI Colab: RVC Fork with some extra features like TTS
- RVC AI Cover Maker UI: Automatically Separates the Vocals and Instrumentals, converts the voice and mixes them back
it tells me to put ngrok token but i dont see the place where i input it in the code as shown in the doc but it says this
in applio colab you just type it in the field
im on kaggle
im not seeing that at all, i scrolled through the cells and tried the find thing to just search up for !pip install pyngrok and it didnt show anything nor did !ngrok config
if i use server mode do i need to download a noise cancellation software for the bg noise since sup2 doesn't work
yeah im not finding that anywhere
post a screenshot
that's an entirely different colab
is it this one https://docs.aihub.gg/rvc/cloud/applio-kaggle/
but you can do it here
ah okay thank you
I don't think this fits as an AI related question but I'm wondering what role do I need to be able to send a message in a model thread to ask a question there.
Need separate yea
Does anyone know how to train voices remotely? My GPU isn't good enough to train a model.
any recommendations
Steelseries sonar
for nvidia - broadcast app, here's how good it is https://x.com/thegunrun/status/1252789873699745792?lang=en
have u trained with refinegan
I have not, I likely won’t until or if it gets supported on wokada
keep getting (The parameters of the pretrain model such as the sample rate or architecture do not match the selected model) when using it
Im almost certain that’s because you need to use the pretrains in the discord
Applio doesn’t have any official pretrains
For refinegan anyways
when i choose
im using KLM 5 RefineGan keep getting
(The parameters of the pretrain model such as the sample rate or architecture do not match the selected model)
as I see @tight ether hasn't tried making another pretrain with the latest code
alr thx
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/kaggle/tmp/.venv/lib/python3.10/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/kaggle/tmp/.venv/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
return await self.app(scope, receive, send)
ERROR: Exception in ASGI application
File "/kaggle/tmp/.venv/lib/python3.10/site-packages/gradio_client/utils.py", line 973, in _json_schema_to_python_type
f"str, {_json_schema_to_python_type(schema['additionalProperties'], defs)}"
File "/kaggle/tmp/.venv/lib/python3.10/site-packages/gradio_client/utils.py", line 919, in json_schema_to_python_type
type = get_type(schema)
File "/kaggle/tmp/.venv/lib/python3.10/site-packages/gradio_client/utils.py", line 880, in get_type
if "const" in schema:
TypeError: argument of type 'bool' is not iterable
An error occurred launching Gradio: When localhost is not accessible, a shareable link must be created. Please set share=True or check your proxy settings to allow access to localhost.
i get a error when trying to run through kaggle
that's really weird, try to make a new code cell, put uv pip install --upgrade gradio then run the run start cell again
"error: No virtual environment found; run uv venv to create an environment, or pass --system to install into a non-virtual environment
"
sorry I forgot to put ! above that code, what if u add that?
i added it
it wouldnt work without the !
should i js create a env? "!uv venv .venv"
what if u remove the uv?
i'm doing a quick test rn, installing requirements
i just did the venv .venv thing and it worked but now ngrok is saying theres multiple instances from before n i have no clue how to terminate
either restart session or try to check if you're able to in the ngrok dashboard
TypeError: argument of type 'bool' is not iterable
An error occurred launching Gradio: When localhost is not accessible, a shareable link must be created. Please set share=True or check your proxy settings to allow access to localhost.
this is so annoying
idk i give up on this
I found a fix for applio kaggle
delete the notebook you made
make a new one
add !uv pip install gradio==5.25.2 -q below !uv pip install numpy==1.23.5 -q in the 1st cell
then follow the guide as normal
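To confirm that the pin above actually took effect, one option is to check the installed version from a notebook cell. This is a minimal sketch using only the standard library; the helper names `installed_version` and `matches_pin` are hypothetical, and `5.25.2` is simply the version quoted in the fix.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package: str, pinned: str) -> bool:
    """True only when the package is installed at exactly the pinned version."""
    return installed_version(package) == pinned


# Example check inside the notebook environment
print("gradio installed:", installed_version("gradio"))
print("matches 5.25.2:", matches_pin("gradio", "5.25.2"))
```

If the printed version differs from the pin, the install cell most likely ran against a different environment than the one Applio starts from.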
@simple ore since vidal is not active, is anyone able to even update the kaggle anymore? 
im using the google colab one right now it seems to be working is it worth it or should i try notebook again?
it's not as much suggested as kaggle for training, since the gpu time is random daily and max 4 hours
and you don't know how many epochs to train, soo
use the latest kaggle notebook
i put this under the install one?
it is the latest kaggle notebook, i just tried it myself making a new copy
@low shard you can just import the notebook directly from github instead of taking 2-month old one
how do i do that?
the github one?
how do i import it from github?
use import notebook, then
oh we always used https://www.kaggle.com/code/deiant/applio
since it's in https://docs.aihub.gg/rvc/cloud/applio-kaggle/
Last update: Jan 13, 2025
do you have the github link for it?
@crude flame u gotta update the docs then i guess
training with refinegan slower? or just me
that one is old, but same fix
not much, about the same
new code uses fp32, so a bit slower
applio docs need to be updated too, https://docs.applio.org/applio/getting-started/other-alternatives#set-the-environment uses https://www.kaggle.com/code/deiant/applio/notebook
I assume Vidal owns it, so he needs to update it
how do i put my database inside of Applio?
do i put the dataset inside of the "working>Program_ML"
@simple ore
thank you so much and also is there any way to see the console so i can see the progress?
nvm got it, thank you again you and nick!
which is the best version for ai cover? fork or mainline rvc?
Vidal owns the site ?
Share a screenshot of wokada and discord settings
Mainline/ applio rvc
Original wokada and Wokada deiteris fork are meant for realtime
ok wait
!give-media-perms 1h @austere harness
i use google chrome
Check if it works when u use client
he owns the notebook the site links to, but it should be just a copy of the repo
nope doesnt work
this doesnt work too
Did u set input to microphone and line 1 to output after using client
yes i did exactly that
no audio
Did u also follow this when u did the 3rd step
Yeah, what I was saying is the applio docs should be updated to using the GitHub import version instead of the Vidal link
yes i think so
Could you check?
stop the conversion, then disable sup1
which is better?
mainline for inference
Between mainline and applio? Quality wouldn't change much
other voice models work through discord, but this one for some reason just gives no audio i dont know why
well, i'm not sure about that... the copy is frozen for a specific release
whats inference now 😭
something will be better right ? thats why they are 2 not one
getting the vocals for ai cover 😲
It could prob just be a model issue then, it's the only single one that doesn't work?
Could u also send the link to the model download link
It's prob just a bad model
yes all other models i tried work on discord just this one gives no audio through the discord, i set to monitor the voice i can hear the voice model when i set monitor to my speakers
so i should do rvc mainline to make model then use appolio for making covers?
but theres no audio on discord
applio is easier to use
dont care about ease , need the best one
mainline better?
Both wouldn't change in quality
😭
theyre the same thing brah
Mainline and applio quality are the same
so whats the difference
mainline is the original rvc
What changes in applio is the user interface being easier and some performance improvements compared to mainline which is the original
yea and appolio is the fork
And some features like tts
doesnt work lol
change to client
what do u suggest
appolio ?
also doesnt work lol

Applio* yeah
First, what's your PC GPU ?
rx6600 , it supports amd right?
i have 1080ti but its being used at another system
You need a special guide for AMD
like the onyx ?
Because AMD is less supported in most ai programs than Nvidia
no worries i will get help from my friend he has 4070
You need to use Zluda
A CUDA emulator
It's just longer to follow but will work
i will follow it but do u think i should use my own machine or nvidia will be better?
Well that RTX 4070 will have better performance
In speed terms
Quality should be the same on ur amd GPU if you use rmvpe
yeah i know that mate but thank for you explaining so simply
you are so good!
4070TI Super is 6-7x faster than 6700xt for Applio
6600 is memory limited
dont have that , he has 4070 laptop so its a mobile version so it will actually perform like 4060 desktop version
yeah, 4070 -30% for laptop
@low shard so is there a fix for the voice model not working or should i just get a new model?
i just tried it myself, it works fine
try re-uploading it in wokada
and be sure u extracted the zip
ok
@low shard which batch size should i use with kaggle?
depends on dataset length
8 should do the job
and then save should i put that on 20?
and for the epoch should i start at 250 and then when it starts dropping just stop the training?
ok it works now
5 or 10 should be fine
put epochs to 500, it will train from 0 to 500
and u check the tensorboard how it goes
you're welcome
alright bet thank you so much for all the help
should i mess with any of the advanced settings or just not touch those?
u should be fine as long as ur using rmvpe
alright brotha ill lyk how it goes thanks again!
Hey guys I need to make an AI cover using Yoda and I don't know how to do it in any sense, I have used jammable and a lot of other services and it's not very good, does anyone have any ideas on what I can try?
what's your pc gpu?
all that those sites use is in reality the FOSS (Free and Open Source Software) named RVC (Retrieval-based-voice-conversion)
you can run it locally on your pc
Ur gpu?
Just wondering
wbu
is this normal?
this is the tedious part of rvc, there's no way to tell
its all random
thats for epochs
there would be a point where more epochs make little to no difference
Im training with 8 minutes. The female can talk
So
i want to make ai cover
yeah too many epochs doesn't mean its better
if you want to do a singing cover, a singing model its better
i can sing any bullshit ?
i know shit
in singing
i made it in past
but anything less than that works
are you recording your own voice? then sing, use your whole range
singing + talking or just singing bullshit ?
just singing
okay
bullshit works ?
@low shard mb for all the pings is this bad for the model or normal?
wdym ?
the pre train ?
yuh
it's normal
you can sing anything
also it just started
cant i make it mix?
overtraining doesn't happen instantly
no point in mixing talking if the model is only gonna be used for singing
it can be good to get the tone of voice or no? i did that with mine
how can i know when to stop it ?
the model will randomly use the speech samples during a song
instead of fully using the singing samples
welp i guess we will see how it plays out 😭
its all random so i cant even predict that
AI is gambling 🔥
it would be better yeah
havent tried that
try 500 epochs
and hear all of them until you find the one that sounds more natural

💀
graphs don't help in choosing the epoch, they're meant to tell you how well the model is doing during training
1 hour set bs 8
hmmm
around 4 hours?
u mean 100 200 300 400 500 ?
bs means?
batch size means each time it takes backup right?
batch size is how many samples rvc learns from at a time
too high its bad
Too low also its bad
u meant that right?
u said 8 i will do 8
i will spam ping u when i have issues ty so much buddy
😭
i dm u ?
alright ts got me stressing 😭
just ask here bro there are like 10 people that can answer ur questions
loss goes down = less losing shortly explained
but i like u 👉 👈
if u want the easiest way ever possible, weights.com exists too

both of yall are banned 🔥
but i am a cute panda
training models aint hard
just tedious
i know but i want to impress my 3rd gf with my ai cover
so wanna make it better than previous ones in past
record yourself, after dat denoise your audio, then run this in audacity
i will require it when i do in practical
and use the simple slicing in applio, 3s and 0.3s of overlap
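As a rough illustration of what 3s slices with 0.3s of overlap mean for training, the sketch below estimates the slice count and the steps per epoch. It assumes each slice starts `chunk - overlap` seconds after the previous one, which may not match Applio's slicer exactly; the function names are hypothetical.

```python
import math


def count_slices(total_seconds: float, chunk: float = 3.0, overlap: float = 0.3) -> int:
    """Estimate how many fixed-length slices a dataset produces.

    Assumes each new slice starts (chunk - overlap) seconds after the previous one.
    """
    if total_seconds <= 0:
        return 0
    hop = chunk - overlap
    return max(1, math.ceil((total_seconds - overlap) / hop))


def steps_per_epoch(num_slices: int, batch_size: int) -> int:
    """Number of training steps needed to see every slice once."""
    return math.ceil(num_slices / batch_size)


# A 17.1-minute dataset, like the one asked about earlier in the channel
slices = count_slices(17.1 * 60)
steps = steps_per_epoch(slices, batch_size=8)
print(slices, steps)
```

With these assumptions a 17.1-minute dataset gives about 380 slices, so a batch size of 8 works out to 48 steps per epoch.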
i will lost it till then
so i have to disturb u again
i will ping u again tomorrow
should go to bed
Good morning and good night
btw jokes apart u guys really helped a lot, it didnt fix anything yet but still u helped haha, i appreciate it a lot, thanks a lot for your time buddies
❤️
good night 
i retruth this
@low shard still normal or no
I have an nvidia rtx 3050
Problem is I need to do it quickly, I don't want to have to start training anything from scratch
😭
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.com: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio UI Colab: max 4 hours daily, not granted, of GPU
- RVC-AI-Cover-Maker-UI Colab: Automatically separates the vocals and instrumentals, converts the voice and mixes it all back together
Easiest possible (automatically separates vocals & instrumentals) : weights.com & rvc-ai-cover-maker-ui
easiest cloud: Ilaria rvc zero
easiest local: Applio
if u just want a quick thing as easy as possible then use weights.com
Thank you so much
yes
you're welcome and let me know
should i stop it at 16:50 i started it at 16:00 or around that
can sm1 send me okada voice changer thru google drive its so complicated to download
hello i need help with the ai voicechanger
wdym? elaborate the issue more
huh
what's your pc gpu?
what's the issue? and what guide link did u use?
amd 6700xt
laggy and slow in VR
share a screenshot of ur wokada
!give-media-perms 1h @toxic nacelle
should i stop now?
not yet
1 sec
alright
@low shard
lemme guess
u used a youtube tutorial?
because those settings are completely wrong and ur using the original wokada
yeah, noone want to help me around
uninstall everything you got off it
uninstall also vb audio cable, it gives random issues on windows
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
read the 1st link, wokada deiteris fork
Alr
@low shard should i keep it going or stop it
@lofty lichen you realize i can see deleted text
Welp it's in English because that's the most spoken language internationally
understandable
It's better you try to read it, use a translator, and let us know for any issues
yeah but idk where is the download
alrightly
Wait still a bit
im pretty sure its at epoch 289
Yes, that's the good thing
This isn't a graph about quality
It's about the machine learning model loss in the training
Read how the tensorboard works in the docs
@low shard is this right?
GPU: ur amd GPU
Extra: 2.0
Chunk: since youre going to use it in a game, you will need to check the perf value at top left when you click start, then put a little higher value as chunk
Did you do the step 3 and download vac lite?
not yet
You need to
you're looking at a wrong chart
look at 3 g losses - fm, mel, total
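The raw loss curves are noisy, so the trend is easier to read after smoothing. Below is a minimal sketch of a bias-corrected exponential moving average, similar in spirit to the smoothing slider in the tensorboard UI; the `weight` default and the sample values are made up for illustration, and this only shows the trend, it does not pick the best epoch for you.

```python
from typing import List, Sequence


def smooth(values: Sequence[float], weight: float = 0.6) -> List[float]:
    """Bias-corrected exponential moving average of a loss curve.

    weight close to 1 means heavier smoothing; 0 returns the raw values.
    """
    smoothed = []
    running = 0.0
    for step, value in enumerate(values, start=1):
        running = running * weight + (1.0 - weight) * value
        # Divide by (1 - weight**step) so early points are not pulled toward zero
        smoothed.append(running / (1.0 - weight ** step))
    return smoothed


# Hypothetical per-epoch values for one of the g losses
raw = [30.0, 27.5, 29.0, 25.0, 26.5, 24.0, 25.5, 23.5]
print([round(v, 2) for v in smooth(raw)])
```

A smoothed curve that keeps drifting down suggests training is still progressing, while the final choice of epoch still comes down to listening to the saved checkpoints.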
@low shard ok fixed but kinda slow
is it this one?
do i keep it going or should i stop it
(im at epoch 435)
@simple ore
@low shard
Hey, I actually just started a chill little server where we share useful tools, automation tricks, and side hustle ideas.
If you're into that, I can send you the link!
I need help! My applio is still fucked up and i can't make ai covers or TTs voices
I have a good GPU It's Nvidia 3050 Geforce Rtx too
I din't get any errors
Hi! what are the differences between client and server? Other than echo/supp1-2? [Okada]
once again your options are 1) use compiled version that does not require 'installing' step, 2) env\python -m pip install gradio==5.23.1
I already have that
then open command prompt and run env\python -m pip show gradio
i mean now and post a screenshot of the result
is this an
now run the command I gave you
i already
and upgrade it to 5.23.1
dont bs me, i see thru your lies
now run applio
ok will work?
noobies if my voice is kinda crackly near higher pitch would it be a good idea to add a dataset of a little of similar artists that have proper pitch or would that mess it up?
yes it finally worked
im adding falsetto & some higher freq/pitched vocals to the dataset to see if it fixes the problem. also took around 60% of the talking out to make sure it doesnt interfere, but left some just for tone. 22 minutes long, 650 epochs, lets see how this goes i guess
you either need a singing pretrain or you need to add singing data
ima try this and see how it goes if its still not to my liking ill find a pretrain but i thought i read somewhere its bad to work on a existing pretrain
hey so like, is this thing supposed to use 56% of my cpu? it wasnt doing it yesterday
also Im using okada's changer
Hey guys quick question with tts-with-rvc
Is there a way to speed up the TTS voice?
depends on the tts you're using
I'm just on windows
applio uses edge tts, there's a speed parameter
tts = TTS_RVC(
    model_path="cartman.pth",
    index_path="cartman.index",
    voice="en-US-EricNeural",
    f0_method="rmvpe"
)
Yea MS crap so there should be a way to control the speed
I dug into the library but didnt see where its getting the tts audio from and how it generates it
Oh I think I found the speech function in there
async def tts_communicate(text,
                          tmp_directory=None,
                          voice="ru-RU-DmitryNeural",
                          tts_add_rate=0,
                          tts_add_volume=0,
                          tts_add_pitch=0):
rate should be non 0
+5 if you want 5% faster
Got this error though
path = tts(text=ai_response, pitch=0, index_rate=0.85, tts_add_rate=5, tts_volume=2, is_half=True)
TypeError: TTS_RVC.__call__() got an unexpected keyword argument 'tts_add_rate'
it is tts_rate
oh different function there though, but I see the same option here: https://github.com/Atm4x/tts-with-rvc/blob/main/tts_with_rvc/inference.py
I tried tts_rate too and it made no change but no error
add a debug print before `communicate = tts.Communicate(`
and print the value
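A rough sketch of what that debug step could look like when calling edge-tts directly. The edge-tts `Communicate` class takes `rate` and `volume` as signed percent strings such as "+5%". The helper names `percent_arg` and `speak` are made up for this example, and how tts-with-rvc builds its own rate string internally is not confirmed here.

```python
import asyncio


def percent_arg(value):
    """Format an integer like 5 or -10 as the signed percent string '+5%' / '-10%'."""
    return f"{int(value):+d}%"


async def speak(text, out_path, voice="en-US-EricNeural", rate=0, volume=0):
    # Imported lazily so percent_arg() can be used without edge-tts installed.
    import edge_tts

    rate_arg = percent_arg(rate)
    print("rate sent to edge-tts:", rate_arg)  # the debug print suggested above
    communicate = edge_tts.Communicate(
        text, voice, rate=rate_arg, volume=percent_arg(volume)
    )
    await communicate.save(out_path)


# Example call (needs network access and `pip install edge-tts`):
# asyncio.run(speak("Hello there", "out.mp3", rate=5))
```

If the printed value is still "+0%" after passing a rate, the argument is getting lost before it reaches edge-tts, which would explain hearing no change and seeing no error.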
anyway, that project uses ancient rvc code
why neither selected
is
that y
ty
it sounds robotic
as shit
where can i fix that
sounds robotic
Send your full screenshot.
Set the chunk number to around 46 ms for less audio delay. Set "F0 Det." to regular rmvpe. You have set the extra number correctly, so if the audio still sounds robotic, it may be the voice model itself. Try another one.
Hello everyone, I am starting a project. What I need is a co-host for my YouTube gaming streams. I already created a working model, but I have some problems: it uses the ElevenLabs API, so it costs me a lot. I would like you guys to give me ideas on which open-source voice cloning tool is good enough to do a Brazilian Kratos voice. I already have 5 minutes of isolated voice to use for the model. I am also using the OpenAI API, but I would like to do this project entirely open source. Can I have some directions on what to use for my idea?
I have:
I9
4090
64 gb ram
at 46
it goes red
the perf
If you use W-Okada with a game or something, increase the chunk number until perf number turns green.
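To make the chunk/perf relationship concrete, here is a small sketch. It assumes that "perf" is the time spent processing one chunk and that it has to stay below the chunk length for the changer to keep up. That reading of the green/red indicator is an assumption, and so is the 48 kHz sample rate. The function names are hypothetical.

```python
def chunk_samples(chunk_ms, sample_rate=48000):
    """Number of audio samples in one chunk of the given length."""
    return round(sample_rate * chunk_ms / 1000)


def keeps_up(chunk_ms, perf_ms):
    """True when a chunk is processed faster than it takes to play back."""
    return perf_ms < chunk_ms


print(chunk_samples(46))   # samples per 46 ms chunk at the assumed 48 kHz
print(keeps_up(46, 60))    # processing slower than the chunk: audio falls behind
print(keeps_up(100, 60))   # a bigger chunk gives the GPU enough time per chunk
```

The trade-off is that a bigger chunk also means more delay before you hear the converted voice, so the usual approach is to raise it only as far as needed.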
I feel dumb. I was setting it to like 3 or 5. Setting it to 50 sure does the trick. LOL Thanks
good to know
what is this?
This is the UI of the W-Okada fork, the realtime voice changer.
do you have a link perchance
What is your PC GPU?
That will never get fixed. There's another better one to use.
What's the best way to do a duet locally? Right now I'm using ultimate-rvc, and I'm not sure I can do it there, so I assume I need to separate the vocals another way, but I'd much prefer to do it locally. Specifically male and female vocals.
unfortunately you can't download this great model
I tried MVSEP and it didn't really do that good of a job. I'm trying UVR5 and it just refuses to separate the male backing, messing around with different models to hopefully find a system that works.
how do i continue training a model
i wanna add a new dataset to it but i dont wanna re render the old dataset that was already processed
you can continue with the same dataset
wdym?
I would not advise trying to use a finetuned model as a pretrain for a larger model
the base has to be generalized enough
and by training a finetune model you're cutting off some of the parts
no im just adding more singing/rhythm to it
just start over
ts took like 5 hours😭
happens
so i cant just add the new audio to the same dataset and process it or will that basically just restart it or
as I just explained
imagine you have a block of marble and you're carving a statue out of it
a pretrain is a roughly cut shape of a human being, then you finetune it by cutting it into the shape of a bald man with a beard
and now you want to turn that into a statue of a woman with big tits holding a vase
did I make a point?
sorry im not 100% with the vocab for a lot of the ai stuff. even if, lets say, i created the model from scratch and its just singing/humming etc and i just wanna add more, would that still fall under that?
yeah just not 100% on the vocab for that stuff, but i get ur point. just not sure what a pretrain or finetune is
but either way ill just start over im probably gonna go to bed anyway and let it sit
I gave you totally not AI-based explanation
pretrain is the weights you're using as a base
finetuning is using a small dataset to finalize the voice model
ahhh okay
n quick question for the dataset does it have to be in just one file for all the audio or can it be multiple?
it can be many files in the same folder
ideally they need to be same quality recording-wise
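One quick way to sanity-check a dataset folder before training is to list the format of every file. This is only a sketch using Python's standard `wave` module: it reads PCM `.wav` files only, the function names are made up, and matching sample rates are just a rough proxy for "same quality" (they say nothing about noise or mic differences).

```python
import wave
from collections import Counter
from pathlib import Path


def wav_formats(folder):
    """Count (sample_rate, channels, bytes_per_sample) over the .wav files in a folder."""
    counts = Counter()
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as handle:
            key = (handle.getframerate(), handle.getnchannels(), handle.getsampwidth())
            counts[key] += 1
    return counts


def is_consistent(folder):
    """True when every .wav file in the folder shares one format."""
    return len(wav_formats(folder)) <= 1
```

If `wav_formats("dataset")` comes back with more than one entry, the odd files are worth re-exporting so everything matches before preprocessing.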
can someone walk me through setting up realtime voice changer using these ai models
alrighty thank you also do you have any clue on how to delete old models its taking up storage for kaggle😭
can someone tell me what happened
use kaggle it worked good for me
should i stick with fp16 or fp32 @simple ore
damn i just paid, few days ago for it
full error
so bad token?
or whatever it says
Make sure you censor your ngrok token before sending a screenshot, otherwise someone could snatch your token and use it themselves. What is your PC GPU? W-Okada can work locally, without the hassle of getting an ngrok token for Colab/Kaggle.
Hii is there an ai which can like fake mouth movement on a video where the persons mouth is closed
too late for that 🙂
i did refresh
fp16 is not stable, more often than not the model just explodes
you can see that by a bunch of triangles shown on the charts
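A small illustration of why half precision is fragile in general: fp16 has a very narrow numeric range, so large values overflow and tiny ones round to zero. This sketch uses only Python's `struct` module (format `e` is IEEE half precision). It shows the number format itself, not what actually happens inside RVC training, and `to_fp16` is a hypothetical helper name.

```python
import struct


def to_fp16(value):
    """Round-trip a Python float through IEEE half precision (struct format 'e')."""
    try:
        return struct.unpack("<e", struct.pack("<e", value))[0]
    except OverflowError:
        return float("inf")  # the value does not fit in the fp16 range


print(to_fp16(65504.0))  # largest finite fp16 value, survives unchanged
print(to_fp16(70000.0))  # too large for fp16
print(to_fp16(1e-8))     # too small for fp16, rounds to zero
```

fp32 has far more headroom on both ends, which fits the advice above to train at 32 when in doubt.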
thank god i just put it at 32 and got in bed😭
So what is your PC GPU?
With that GPU, it should work. Just won't be that fast like one in Colab/Kaggle. https://rentry.co/ForkVoiceChangerGuide#download-nvidia-on-windows
That's pretty much it. Most W-Okada Colab notebooks are broken. Even using the W-Okada fork on Colab can get your account terminated, especially on the free tier.
No excuse. If you have bought Colab Pro or compute units, make sure you keep it set to T4 GPU instead of A100 or L4, because these two GPUs eat more compute units than T4 one.
Kaggle gives you the option to use two T4 GPUs for free, 30 hours a week. But if you have bought Colab Pro or pay as you go, then use it to your advantage I guess.
So does RealtimeVC mean it can't be used anymore?
Don't worry about it. You can still use the W-Okada fork. But make sure you have the right ngrok token if you wanna run it on Colab or Kaggle.
got it, thank you so much
What's up?
@steel forge warning while sleeping?
What
there are many other cheaper GPU rentals
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
while sleeping
You were warned for a gif you sent 9 hours ago of an animated walking penis 😭
Just don't send that gif again
lmao then remove the gifs from the server
No
yip
Posting what a mod warned you about in public chat won't fix your issue.
Ok fine i guess
just dont
anyone can search it on the gif at the side of the chat
then tell owner to remove gif access
its not considered nsfw by discord since its there
By tenor*
that gif should also be reported in the site itself since I'm sure it'd violate its TOS
And it doesn't really matter, we make our own rules in our own server. Our staff saw that as too NSFW, so they removed it.
So if you can avoid sending things that might be on the line, we'd appreciate it. We have kids here
not my problem
Ok
@hallow thistle still the same
Don't be a dick here. Anyone can make a mistake. The warning should make you aware of what you're doing. If you think you were treated unfairly, you can write an appeal on AI Hub's website and let the other mods review it.
what's the issue @hallow thistle ?
You need to scroll to the side to see the rest of the error
"Failed to accept"
voice changer colabs tend to have issues, another option is this
https://www.kaggle.com/code/suneku/voice-changer-public
Although you can still use Colab for other AI programs like Applio or UVR5. 
many of my numbers can't be used on kaggle but ok i'll try
I've never had any problem using W-Okada on Kaggle. The error can happen when you skipped some important steps like where to start the notebook and the settings.
are these voice settings good enough guys?
Use Virtual Audio Cable lite instead. VB-Cable gives random issues to Windows users, as many people complained to me.

That's fine
Yea looks about right for everything
I'm so used to seeing nightmare settings so yours are nice to see
If you have any problem using VB-Cable, you can try switching to VAC next time.
Force fp32 mode on in advanced for improved quality
yo
i need help
my voice changer is super laggy when i run it
how do i fix that
im on AMD
Server can also have less delay using wasapi or asio
Share a screenshot of ur wokada and what u are using it in
i literally cant share a screenshot
there u go
Is it possible to tailor an AI voice to your own? Like if someone is making an AI model for me, can they also tailor it to my voice so that the sound is better, more fluid, less choppy?
Hai. Can anyone show me how to train model in colab? I just came back after a year so i don’t remember anything 
@low shard
!give-media-perms 1h @past barn
Holy shit you're using an ancient version
That's prehistorical
Lemme guess, you used YouTube tutorials?
All video tutorials are outdated asf
Did u first try to check if ur PC GPU is good enough
Its not good thats why i wanna use colab
Where can you find the latest voice changer build ?
You can check out this guide 🙂
https://docs.applio.org/applio/getting-started/other-alternatives
Alright thank u 
Anyone got info since i only know of youtube ones ?
Github
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
rtx 3060ti
Wokada doesn't support it yet
Nice, get the windows Nvidia one
When will it?
Who knows
RefineGAN is experimental
It is disabled even in applio main branch
Since it's not that stable yet
Thank you.
Someone made a voice model of himself and merged it into the model he wanted in small percentages, to get his own intonation into it. I tested it myself but it didn't do much. Choppiness is usually the result of a bad model, since realtime is way more demanding when it comes to sounding clean
Also, the colab notebook gives you only 4 hours of free runtime daily, but if you want a longer runtime, then you can use Kaggle notebook, they offer up to 30 hours of free GPU/TPU time per week
anyone know why my voice is cutting out while talking
same as for discord
my voice only cuts off while using echo feature on noise, anyone know why?
Max 4 hours, but it can be different daily
One day could be 4 hours
Another could be 1
Another could be 3
It's not granted time
Kaggle has granted time instead
Share a screenshot of ur wokada
!give-media-perms 1h @outer fog
What game are u playing
Hey, I'm trying to play the app in server mode but I can't see my headphone in windows WASAPI
I want to use my gpu but it shows me there is only cpu option to choose
how can i use my gpu instead cpu
how to fix mic background noises?
is it possible to kinda silent them?
because my mic is ig broken or something
the ai voice detects these noises as my voice and there is like a permanent moaning
and my voice has a kind of an echo without even activating echo
did u follow correctly the wasapi guide?
check sup2
elaborate your issue:
- what's your pc gpu
- what do u want to do
- a screenshot of the program
!give-media-perms 1h @stone lynx
holy shiet
that's an ancient version, an old version of original wokada
over a year old
never follow youtube tuts for wokada
they are all old
also vb audio cable gives random issues on windows
uninstall all you got off youtube
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
read the 1st link, wokada deiteris fork
there's none
you gotta run the .exe
it's a different version of the program, the deiteris fork
please be sure to not skip any steps of the guide
wait it says
It says it will open in the browser.*
so i cant use it as program anymore right?
it is still a program, it runs locally on your hardware
both programs always used a web user interface made in javascript and typescript
the only difference is that original wokada made its own browser window to open the locally hosted url, which iirc was removed in deiteris fork because it sometimes caused performance issues
i just updated the guide to tell users to check that part around the parts where it says it will open in the browser
How can I completely uninstall mangio rvc from my pc?
i think u already got helped there #✦│chat message
yes :))
now it works ty
which one is better
to use
Use rmvpe onnx for best quality
Fcpe is good only for being faster but shittier quality
Hello
Is there a voice changer where I could modify my voice just partially? Something like voice model changing voice by 20% let’s say. Aim to mix, modify, not transform completely. I was checking w-okada and some other software, either they don’t have such features or I don’t know how to use them. I’m experimenting.
hello again
When I speak, I hear a small voice repeating the last word I said. Is there a way to fix this?
I don't think so, maybe @pastel oak got an idea though
Share a screenshot of ur wokada
Dont think so
Alternative i can think of is making a model of your own voice then merging it with one you want with percentages
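For the curious, merging by percentages boils down to a weighted average of the two models' weights. The sketch below shows the idea on plain dictionaries of numbers. It is not the actual merge code of any RVC tool, `merge_weights` is a made-up name, and real checkpoints hold tensors plus config that both models must share.

```python
def merge_weights(own, target, own_share):
    """Blend two models' weights; own_share is the fraction (0..1) taken from `own`."""
    if not 0.0 <= own_share <= 1.0:
        raise ValueError("own_share must be between 0 and 1")
    if own.keys() != target.keys():
        raise ValueError("both models need the same layer names")
    merged = {}
    for name in own:
        if len(own[name]) != len(target[name]):
            raise ValueError("layer sizes differ: " + name)
        merged[name] = [
            own_share * a + (1.0 - own_share) * b
            for a, b in zip(own[name], target[name])
        ]
    return merged


# Toy example: 25% of your own model mixed into 75% of the target model.
mine = {"layer": [1.0, 0.0]}
theirs = {"layer": [0.0, 1.0]}
print(merge_weights(mine, theirs, 0.25))
```

A small share like 0.2 matches the "20%" idea from the question, but how audible the result is depends heavily on both models, as noted earlier in the chat.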


i guess it used system in and out