#✨│ai-help
1 messages · Page 251 of 1
lol
you're honor i am innocent
lmaooo
i didn't
that's pretty obvious it did not load anything
yeah it is
remove "
i just don't read the cmd

is that the reason??
and for fks sake, move the whole thing into C:\Applio

i'm preprocessing it
wym
sorry i didn't mean to be rude
my fingers genuinely typed that faster than my brain
i'm not even lying
I mean move the applio folder to the root
root of what
because keeping software under Desktop is not right
well i don't really consider it like that tbh
idk why
also i do need to access it quite a bit
for model files
so having to search for it everytime is a bit more annoying then just clicking straight to it
also holy moly this preprocess is taking ages
i have my trainer in my desktop too 
yay
Is there any alternative to vac lite (have to pay for a license) or VB audio (doesn't seem to work on win 10)
you dont have to pay for vac v4.7.0 lite
Any clue why I can't get any inferencing models to show up in RVC WebUI? I got everything working as far as I can tell, but putting downloaded .pth and .index files in their respective locations (weights and logs) does nothing.
vac lite will keep spamming my ears with that automatic voice about the trial, it is annoying
hi so it's been over half an hour preprocessing aint done
should i be concerned
you did not download the lite version
yes
1
the cmd hasn't updated yet
oh wtf
i accidentally hit a key on my keyboard in cmd
did you accidentally click on the terminal window and selected a text
and it started preprocessing
no
i hit enter
and it started it
are you FUCKING KJIDDING ME
yes, you gradded a mouse
selecting a text
and it froze everything until you press enter and copy the text
insane
use this, uninstall the other thing
That to bad.
I already installed it im using it in google colab
Oh.
um so i extracted the features and i scrolled down and i see stop training
instead of start training
i ain't even training anything 🥀
because UI thinks you're still training
f5
then make sure you re-select model name and sample rate
Okay I'm using this now, voice works fine. But when paired with discord and testing the voice, I cannot hear myself. Is it only supposed to work inside a VC?
you just need to route the audio properly
and use push to talk in discord / turn off Krisp voice or whatever it is
guys how to use applio in google colab for ai covers
google colab tends to disconnect users when using gpu instances on an external webui
i'd recommend not to go that way if thats not your only choice
have you even tried weights yet?
no
does anyone know where i can find a good real time voice changer software?
i heard okada wasn’t bad but i’m not sure if there’s better that’s available
diabolical, it doesn't even appear in the dropdown
check it out, you can create ai covers in the weights app, no need to go through all that trouble unless you specifically want full customization over what your doing
click refresh button
i did
weights ai model training sucks imo
no need to pay @prime dagger you can create 20 for free per day
24 threads seem crazy, must be an intel i7 cpu
but anyway using more cores in preprocessing & feature doesnt quite give benefit over 1 thread
well ngl, you probably either didnt give it a good dataset, or just messed something up
cuz its perfectly fine imo
hell nah i hate intel
🤷♂️ maybe
lets goooo
finally starting training
yes i learned my lesson
fucking hell my pc made my room so warm now
me when heatwave inside my room

lolll
what does this random spike mean?
suddenly went up and then went back down
noise
it's normal
what does that mean
why? is it better voice conversion quality than rvc?
check grad norm g and d
grad norm G should be less than 1k
should be around the 500 range
rvc = ai voice cloning software, retrieval based voice conversion
w-okada = rvc models in realtime
vonovox = rvc models in realtime
uh this?
i don't really know what i'm looking at
imo loss graphs aren't worth enough to watch, rvc finetune barely/never diverges and they lack the issues that are present in training from scratch
if grad norm g below 1k = fine
they can be a bit misleading too, for example, g/total going down while your model is frying
mel always goes down
fm always slowly rises
kl always converge at some value
d/total is a bit useful since it tells you if your discriminator is either too strong or too weak
theyre ok
why what happens if it goes above 1k
chances of the model having some weird issue increases
can be anything, it's random
may also affect the quality too
what i do for my models is to save every 10 epochs and hear them
interesting
then you just choose one epoch that isnt robotic and sounds ok
client or sever for okada which is better fir quality
setting input volume above 100 may possibly cause distortion due to clipping
tl;dr it affects more of audio latency
i let it train for a little while longer
it doesn't look like much has changed i think
hear it
oke
don't blindly trust the graphs, like i told you before, g/total likes to go down even when the model is overtraining
what is the current value of d/total?
Very quick question chat, so if i download a model from weights, and mess with the settings a bit I can use it as a real time voice with okada? Im assuming this makes it sound better quality?
i stopped the training
i'm testing it out
by ear
if it's overtrained you'll notice the model sounds compressed, robotic, very unnatural
compressed as in the effect or like compressed as in quality
as the effect
now ur job is to go down every 10e
eventually one epoch will sound ok
see? thats why i told u to forget about the graphs xD
lmaoo yeah 😭
@analog obsidian is this accurate?
weights models are regular rvc models
so yeah they work with realtime voice changer apps like w-okada and vonovox
and also local rvc
which do you recommend?
idk i don't use other people models
I see, so they are all fairly similar though quality wise? I know very little but from what I can see online there seems to be a lot of settings you can tweak to change the quality of the voices
^ at least in w-okada
the quality of the model depends in how good the quality was in the dataset, if the dataset had trash quality, you'll get a trash model
w-okada settings only change how the inference works, you can't change how good a model naturally sounds
lets say a model has trash quality, even if u set extra 5s and enable fp32 inference, the model will still sound bad
okay thanks for the thoughts Im going to attempt to download okada and a model and see if I can get it working
good luck 
I'm getting "No features exist for this model yet. Did you run Feature Extraction?" no matter what dataset I'm using. I even tried a dataset I recently successfully trained with and I'm still getting the same message. Is it a bug?
hi guys morning afternoon or evening. i have a question. i dont have a nvidia only amd so im using google colab to train a ai voice. rvc v2 disconnected. what does the red thing mean when i run dependencies. cuz i always keep getting the red thing on different browser and even my brother pc who has amd also. i check the video and the guy in the vid never showed what happen after he run dependencies
ive got the voice changer working only thing is, whats the best way to tweak the settings to get the desired effect?
its kinda inconsistent with the settings and Idk if I should be more focused on tweaking my mic sensitivity or the voice gain/tuning
everything about w-okada here: https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/
Last update: May 5, 2025
i wanted to wait to respond until after i trained a model to see if it worked but i'm like 10 hours in and just got to epoch 100/500 so uh it'll take a couple days 💀 but i did follow your instructions and it seems to have worked perfectly so far, so thank you very much!
@golden walrus hello, do you perhaps have some model that support spin? I wanna test new embedded
is it possible to combine two voices while having it sound normal when training?
what happened to the total number of AI Voice uses?
test several model checkpoints
the results may be somewhat unpredictable
my model suck
Just wanna see how spin work with rvc, doesnt matter if it sucks
when i try to load in it keep sayin waiting for web and itm does forever
google colab isnt working it says no version found for fais gpu
okay, i will send you the model.
it's more likely outdated
um i kinda wanna use it cause my gpu is integrated
and i have no idea if i can run it
try either of these:
https://docs.aihub.gg/rvc/cloud/applio-colab/
https://docs.aihub.gg/rvc/cloud/applio-kaggle/
Last update: June 15, 2024
Last update: Jan 13, 2025
tysm
how do i do real time voice changing in this
smh you didn't say for voice changer
it would instead be this https://docs.aihub.gg/rvc-voice-changer/cloud/w-okada-kaggle/
Last update: May 5, 2025
mb
oh ok
is that real time right?
the colab one may risk your account banned unless paying for the compute units
ahh ok
do i gotta repeat these steps everytime i use it?
the session state and data won't be saved after disconnection
(ask chatGPT if you don't understand it)
ah ok
the time limit is more than enough
how do i get more volume?
How do I get the voice changer to work in asio?
I would recommend using Wasapi if you are a normal user.
https://rentry.co/LessDelayWasapi
What does ASIO do?
ASIO accesses your audio devices directly, while the driver that you always use on the daily (which is "MME") goes through multiple layers within the Windows audio subsystem, ca...
you mean above the volume limit?
I already have an audio interface driver, should I download flexasio?
the guide says so
Don't the two drivers collide?
wdym? what another driver?
Audio interface driver
nah another one again
flexasio is a virtual audio interface driver
I use ASIO4ALL, but I haven't installed the program, other than being ReaRoute ASIO.
Can you choose the output for the input?
It's just this on W-Okada. As what I remember, ASIO4ALL lets you to change and select the main input/output devices there within the program.
Can you put output in the input? Not the microphone
Your question seems confusing, but the answer is "no".
Virtual Audio Cable provides both its input and output at the same time. A pair of speakers/headphones don't function as an input device, they are output devices.
I'm planning to set input:microphone output:output in the daw program and put output in voice changer input. Can I choose output in flexasio input?
Unable to set line1 in the output because the daw program returns to asio

i'll try thx
and why you use 44100hz??
I think it should be 48k for server mode, otherwise it may throw some error
you are server mode
Is that a problem? I did that mainly for example, not as a practical scenario.
Although I may set it to server mode in case if I have to use W-Okada with ASIO. However, I mostly use client mode in general.
What is the main factor that determines the quality of the voice changer? in this case w okada.
Equipment I have 3 1080TI's laying around and some hardware. My Main desktop is running 32gigs of drr4, intel i9 11th gen and an rtx 3090(24gig vram),
My Idea: connect the graphics cards into a rig(already done) add some kind of software layer to it so I can route all that power into w-okada or something simllar, then conntect the whole thing with fiber optics to my computer to stream the audio,
Why? I want my main computer to run without the AI bloat and such, peak performence, and if I can just use a 2nd machine for all the ai stuff and stream it to my pc, that would be ideal, I was also thinking 3 1080 ti's might also be better then this setup for AI voice changers in general, not mentioning the fact the desktop at the moment needs to run both the voice changer and all other apps I need at the same time.
1080ti is not ideal for the voice changer, but you can run it
have a second computer with 1080 running the server, connect to use from the browser on your main pc
Thats actually a pretty good idea, do you think the 3 1080 ti's can be the 3090?
i dont think you can use 3x cards at the same time
Is the gpu even the deciding factor is all of this?
This is currently my setup using my main computer.
It works very comfortably.
gpu is important if you want to run a voice changer and a game on the same pc at the same time
What if I'd like to build a pc for the AI voice?
a 3050 or higher would be ideal
Hm
1080 are old, but you can still use one to run the voice changer
Oh yeah but the performence will take a huge hit.
since that's all it is gonna do, there will be no performance issue
game on 3090, vc on 1080 in 2nd pc
The only reason I will switch the voice changer to a seperate computer is if the voice changer runs better on the 2nd pc
or the same.
Oh yeah I understand that part
you can just have a second card in your main pc
but uness you got an actual pcie x16 slot with at least x8 bus I would not recommend it.. 1080 on x4 gonna suck
so using a single 3090 is better
also RTX 2060 or 3050 is still better than the GTX cards
Why would it being real time matter? its pulling resources from the gpu, coulden't it combine the power?
(Note. I could be talking out of my ass, please call me out if that is the case.)
what?
Would a 3050 on a separate machine that is only used for the ai voice be better, or on par with the 3090?
mb I meant if in case using multi gpu workload for the voice changer alone, perhaps
which means like utilizing dual T4 in kaggle
it wont be dual gpu, you pick a gpu to run on in the voice changer
so a heavy game would run on 3090 while vc can run on second card (or second pc)
I see! the reason I decided to go with duel gpu's is because I already got a 3 gpu rig setup and running, so I was thinking it would be mega easy to just run ai on it if its better.
^this, and I don't think SLI/NVlink would benefit from it
1080 Ti will be just 1080 Ti
sli is long dead, nvlink requires higher cards
Yeah I guess so! if that is the case, then I would have to look into making a computer for it spesficly
If you guys have some time, and are even mildly intrested in it, I would love some help regarding that
make something out of spare parts
I want something that can be as good as my 3090 setup.
I don't have spare parts on that level.
These are my current settings, I need a system that can support this at the minimum
1080 is 8 years old, so something at that level
sell those GTX cards and you could get a decent RTX card in exchange
intel 11th gen is 4 years old
Definitely an option.
it is old but isnt the cpu a non factor in a machine that only runs AI?
I'd still not consider intel 10th gen or the AMD counterpart too old
well... what's your goal for the main pc?
Well I don't care about the main computer at the moment?
I'l talking about the other machine.
then find something old that can be used to run your 1080
i mean.. gee, my old spare rig with 4790k can run it
The main computer does its thing, boot games, do work, etc.
the AI pc needs to be able to run, the, well, AI.
to my understanding, the AI works like this: The better the gpu, the better it works. (faster response time, less glitching, lower delay, etc.)
I don't want to invest in a 2nd computer for ai for it to just run okay, I want something that can run good. really good. at least to the level my current computer runs things.
there's more to AI than just a realtime voice changer
Of course, but I don't need those.
I use the voice changer for work related reasons.
if you just want to use the voice changer, find a 10-year old piece of shit pc, plug your 1080, enjoy
If I do as your saying, and plug the 1080 "piece of shit" and run it for solely AI. would the ai, run better, or just as well as the computer I'm currently running.
meaning I can keep the chunk and extra.
the same or change them to even better?
3090 is 12-cylinder turbo charged engine, you have installed it into a regular car.. okay, it's fne. your 1080 is 6-cylinder engine, find a piece of shit honda civic to install it in and it will be fine
when you try to run a heavy 3d game and CUDA application on the same GPU at the same time, they compete for resources
unlike CPu where you can assign an application to only some of the cpu cores, there's no such thing for GPUs
so the game is asking the GPU to generate as many frames as possible and the CUDA application is asking it to crunch some nmbers as fast as possibl, that results in weird things
when you have a separate card, even in the same PC, to run CUDA task, everything work fine without delays or lag
So that would mean If I did plug the 1080 I could do this for example, and it would work?
not 2.7ms chunk lol
Ptff
GTX 10xx-series 320 ms chunk + 2.0s extra
RTX xx90 (e.g. 3090) 72 ms chunk + 2.7s extra
The 1080, or in this case the gtx seris, is too slow for me. I want to get something that has at least the same as the 3090.
The 3090 is largly overpriced, are there any equivalent cards you can recommend
would a 40 Series be better?
i can only tell you to check 1080 and play with the chunk to see how low can you go
since you're going to use it as the dedicated card, it can probably go lower
GeForce RTX 40 GPUs can be better than RTX 30, but their prices can be something.
Second-handed ones, real. 
Clearance (FOR PARTS) Gigabyte RTX 4090 Windforce (No Core & VRAM) With Box
found this for 135 dollars
https://www.ebay.com/itm/266771335969
English is not my first lang.
it is as dead as a brick
Yeah seems like it, thank you
You've been a huge help, and continue being really nice!
Making this whole thing be wayyy easier.
The "No core and VRAM" means there are neither actual chip or VRAM chip present on the "bare board", making the GPU unusable without them. This kind of bare board GPU is usually as donor part or a decoration in your room.
Thanks!
Atm, I'm looking in ebay and amazon for used rtx 3070s
as you guys suggested
Finding them for around 300 dollars so far?
So this is what Gigabyte NVIDIA GeForce RTX 4090 board without the chip and VRAM looks like. https://i.ebayimg.com/images/g/vVEAAOSwodJl~gRA/s-l1600.jpg
What should I opt for in terms for motherboard cpu or etc
you should read the description carefully
ah yes that topic is already done
Can somebody give me there voice changer settings please? Just one that’s efficient
Whats your gpu?
And which W-Okada version are you using? There are two versions of W-Okada.
Wow, that's an old W-Okada, outdated. Download and use the better W-Okada from this link instead. https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/#download-nvidia-on-windows
Last update: May 5, 2025
Btw from what I'm seeing I can probabbly obtain an rtx 5080, do you think that would be better for the AI pc?
I'm kinda stuck on picking all the other parts for it
does it need a cpu? what about the mother board? etc
RTX 50 GPUs are too new. Not every AI program has developed to work on this specific RTX GPU, although a few AI programs like Stable Diffusion can.
I need to prepare a voice sample to upload to rvc gui. This is a recording of the voice that I want to use in this program. I've been trying for an hour and it doesn't detect it after creating a zip file and importing it. I'll send it in dm because it's not possible here.
RTX 50-series doesnt work for applications using pytorch 2.6/older
And I assume w okada runs that then?
That doesn't mean W-Okada won't work at all. There's a specific version of fork W-Okada developed for RTX 50 GPU, the program itself is experimental and might have a bug.
Oh I see
I'd recommend vonovox in the guide
I also know this is a hot topic, but is w okada even the best ai realtime voice changer?
may not be beginner friendly but it works for your 5080
I'll check it
Hello peko!
This whole thing left the beginner friendly a long time ago
I wanna build a computer for an AI voice changer-
Also, separate question, would it be worth it paying someone to make me a voice model based on a database?
The Vonovox, while not as widely known as W-Okada, its UI looks like this. https://raw.githubusercontent.com/dr87/Vonovox/main/docs/images/ui1.png
There already is one based on the database I wanna use, but I'd like to find someone who can remake it to be even better
with such high end gpu, I'd recommend ryzen 7700X or 9800X3D and an affordable AM5 mobo as possible
including more laughs and more to make the whole voice changer feel as "human" as I can.
Lovley thanks thats great.
try not to laugh challenge
or just non-raspy "hahaha"
i mean.. geez.. if you're going to get 9800x3d, that's the top tier gaming CPU
using it for a voice changer spare parts PC is a huge overkill
@knotty moth why would you even suggest that?
it would be "upgrade your main pc, move its parts down to spare"
@ocean holly dont get carried away
I assumed for general use upgrade plan (main pc), not always specific voice changer
but ye mb the "moving parts down" problem may seem convoluting
he clearly said he does not want to touch his main pc
which is 4-year old one with 3090
Hm, is it though? I'm running the voice changer for an avarage of 6 hours a day, everyday, for more then a year.
I'd want this thing to be as good as it can be
Hey hello,
I am having issues training a voice model using the RVC WebUI
please help
First of all, what's your GPU?
And which RVC version are you using?
there'a reasonable expense and there's crazy
if you can install a second card into your main pc, for the price of just 9800x3d you can do that
THANK YOU SO MUCH JUST SET UP AND IT WORKS SO WELL THERES LITTLE TO NO DELAY AT ALL
NVIDIA RTX 3060 6GB Laptop GPU,
i cloned the RVC repo today. so i guess it's the latest version?
I'm not sure, but you can also try Applio since it gets updated more often.
-rvc
Right here you got some guides.
Applio? i'll take a look
i tried training via rvc webui on Windows 11 and it failed. now i am on WSL2 Ubuntu 24.04 and i managed to make the model.pth; but I can't make the .index....
Thank you Leo
you dont need wsl
I just wanted to try in case it worked on ubuntu..
WOW APPLIO LOOKS VERY GOOD
And it sets to my language automatically, no chinese mistranslation gibberish
Is there a link to get started on training voice models?
You can start by reading these docs.
thank you!
You're welcome buddy.
i think this is realy cool how people are making models and it's exciting to know how everything is done behind the curtains
Yeah
- rvc
I wish you all a good hello. I have a question which once has brought me to the limits of my nerves. Does anyone know either a working google colab image to video ai with 60fps and 1080p feature that does not take 192938389283883hours until its generated ? If not then does anyone know any ai that i can use ? One free video per day is enough. All help is greatly appreciated.
Anyone know how to remove voices on real time voice changer I think so put too many
I too like to eat a 12-course gourmet dinner for free once a day.
Please, maybe someone can create a model for RVC GUI based on a sample of my voice, the file is in .wav. I ended up with some squeaky voice and I spent half a day on this. It can be done quickly. I want to create good content and I can't fulfill my dreams.
It seems like you love to provoke other people, no ? I am aware of the costs to run all of this you narrow minded idiot. Normally i would go further into this and explain my reason to why i temporarely look for this but i dont have the patience and power to do this. Its kinda ironic that only assholes are met on "weights" server, your personality is just as rotten as their founders.
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
by IA Hispano
Google Colab
by Hina
Google Colab
by Eddy
Google Colab
by Eddy
Google Colab
by Deiteris & Hina
Google Colab
by Shiro & Eddy
Google Colab
by Nick088
Google Colab
by Nick088
Google Colab
by Jarredou & Makidanye
Google Colab
AI needs good hardware, no one is gonna pay thousands of dollars to give you free unlimited power lol
Rvc GUI is extremely outdated, don't use video tutorials they are all old
Also it can't be done quickly, it takes a lot of time
What's your PC GPU?
Hello. I've been having issues with Deiteris' W Okada Fork; when I'm playing + recording with OBS, sometimes the VC has this issue where the Voice sounds as if it were "inhaling" while talking or sounds super raspy. I'm guessing this is due to strain on GPU. However, the perf number always stays green (which, to my understanding, is the indicative of the GPU doing more than it can handle). I've tried the troubleshooting of this document (https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/) but nothing seems to completely "solve" the problem. Does anyone have any tips for this?
the voice sounding raspy is a model issue, probably using an asmr model/soft model instead of a regular voice
no way to fix that
just using a model trained with regular natural speech is the fix
i tried a bunch of epochs and all were about the same except the more down i went the worse it sounded
Oh nono, it's not asmr and the samples have no whispering nor raspy. Also, it only happens when I game+record+vc, it doesn't happen if the GPU is not strained.
did you trained the model yourself?
Yes, I did. It is possible I could've made a mistake somewhere along the way though.
I did. No matter the chunk, it happens from time to time.
enable sup2
random inhaling sounds happen due to noise being present in the input audio
since breaths are technically noise
rvc sometimes mistakes noise for breathing
Let me try this
I have another Okada VC that doesn't do that (the quality is way shittier though, that's why I wanna make this one work)
i havent compared vonovox vs wokada myself so idk
i think vonovox realtime inference has a completely new realtime code
its pretty much wokada but fast
wokada uses mainline realtime code iirc
instead of using fp32 it uses tf32
sorry, where is that option?
Is there any way to mute the playback of the voice from lighthost? I cant mute it because it also mutes the voice, meaning I cant actually output it anywhere
found it
u cant change it in vonox afaik
nono, no using vonovox
oh ok
Using this one
ah yeah the fork of w-okada, same code but optimized
i have compared them multiple times and they produce the same result
if ur results are very different from each other then ur doing something wrong with one
or could be some weird windows interaction
both apps are extremely buggy and unstable
i wont be surprised if somehow one works but the other not, despite being the same thing 
Don't.
That program is paywalled garbage.
Use either vonovox or deiteris' fork instead.
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
hey im tying to use rvc but for some reason stops working as soon as i put on my headset
can any1 help?
Elaborate:
- your PC GPU
- your operative system
- what you want to do
- what tutorial link are you using
- a screenshot of the program
the voice changer dont work fore me how to fix it?
hey can someone help me find what pretrain i need
Error(s) in loading state_dict for Synthesizer: i am getting this error
for the yuumy juice wrld model
I dont quite understand but you can't train using the existing model
what did you use?
details
screenshots
errors
this says nothing to me
it said the n_speakers was 9 and there was no config
I am on applio 3.2.8 bugfix is that why
from february
then wdym of this?
did he lowkey upload virus
pretrain is only needed for training a model (finetuning)
i was thinking it was because i didnt have a klm pretrain but I realized you do not need pretrain for model
yep
can you point me at what you had downloaded and tried to use as pretrain?
I think its because it cant find a config.json
so do u want to use the klm pretrain or the juice wrld model?
you cant use both of them
juice wrld
then go straight to the inference
i did it throws that error though
yep, my error is it just doesnt let me use the model and throws that state dict error
I am using applio 3.2.8 which should be fine because he released the model when that was the SOTA
hold on
did u change the speaker id? otherwise leave it 0
what do you mean by this? I changed nothing just plopped it in and it threw an error for having n_speakers = 9 or something
which version of applio you on?
could my version be causing this?
nvm ye just try the latest 3.2.9
i just dont wanna install new applio bc it says windows protected your poc
pc
should be fine but its unsigned still
download compiled version, unzip into C:\Applio
oh so I can just update preexisting folder?
thats a lot easier than fresh install ngl
I dont think it is 3.2.8 issue
dtype=torch.float16)}), ('config', [1025, 32, 192, 192, 768, 2, 6, 3, 0, '1', [3, 7, 11], [[1, 3, 5], [1, 3, 5], [1, 3, 5]], [10, 10, 2, 2], 512, [16, 16, 4, 4], 109, 256, 40000]), ('epoch', 345), ('step', 35190), ('sr', 40000), ('f0', True), ('version', 'v2'), ('creation_date', '2025-03-15T04:26:57.464005'), ('model_hash', '6e68437756a365f420100c31d8971f70a967755ff9670f7a460cab7b74bd4514'), ('dataset_length', '00:21:41'), ('model_name', 'DRFL(Scheming Model v1)'), ('author', 'None'), ('embedder_model', 'contentvec'), ('speakers_id', 1), ('vocoder', 'MRF HiFi-GAN')])
MRF HiFi-GAN
yes, you need 3.2.9
oh shit
I am nervous ngl I just tried to install 3.2.9 and i got a very sketch cmd prompt
theres 3.2.9??
2 months ago
???
thanks for your help @simple ore I really appreciated it man
saved me from confusion
Yo guys is there a guide to setting up version 2.14 with an nvidia 3080?
im only need help turn sims 4 into ai models and make photoshoots for my ai modelling show if someane can help me send me a pm is the user like antm by tyra banks
just make it yourself, it's easy and there's good free options
2.14 what?
is it possible to combine two voices and having a new voice while sounding normal?
oh so they won't sound like the combination of the two people?
even if the two people sound somewhat similar
if you mean the voice changer, you should get here
https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/
Last update: May 5, 2025
they're the combination of both models timbre
but the results of the merge cannot be predicted
most of the time the merge ends up sounding unnatural/bad
If it's two people then the model will just switch between the two depending on the pitch
but its all luck, sometimes it does sound good
You need more than two
oh oki
yeah keep trying until you get a good result haha
time to crank the slot machine
🔥
Vocoflex by dreamtronics is an option if you really want to blend the timbre
^ don't forget this when merging voices
can anyone help answer my question how are 2 providers gpu prices extremely diffrent
like vast ai has a 4090 for 0.30-0.40 but runpod has it as 0.60 does that mean vast is less reliable and cant fulfill the same things runpods can
different providers, different service.. shared means there may be a wait time
but its the dame gpu hows the price that diffrent
is their no catch?
btw its vast ai
the cheaper one is that a bad provider?
just the level of sharing
sorr ybut what does level of sharing mean
their 0.336
runprod: Community Cloud $0.34/hr
thats insanely low isnt it?
Secure Cloud - 0.69
no i have it on show secuure cloud only it showes the first one as 0.336
yeah ik but vast ai secure cloud has 0.336
for a 4090 24gbvram
is that not insanely low?
no, about right
there are cheaper clouds too
yeah but running mistral 8b is it possible to reach 10 req a sec on a 4090 24gb vram?
but I guess there may be some shenanigans involved in the calculation of hours
as my pc 4070 ti super only reaches
Metric Result Notes
Requested QPS 10 Number of requests sent per second
Actual Completed QPS 1.5 – 2 Requests fully processed per second
Average Latency 3 – 5 seconds Time taken to complete each request
Effective Throughput ~90 – 120 requests/min Completed requests per minute
ill check
if it preformers like this on a 4070 ti super 16gb vram how much could it preform on a 4090
mistral 7b?
no 8b
okay, it looks small enough to run with full gpu load on 4070tis
not sure how can I check that
.
this data could help?
i mean personally
?
yes, you can train a model, then use it in realtime
if it can only handle 2req in a 4070 ti super 16gb vram
ho wmuch do u think it can in a 4090 with 24gb vram
like 8?
oh ok ty
Mind checking my DM? I could offer you free 10 hours' RTX 4090 GPU just want to collect some user feedbacks
if you train a model using, for example, french speaking audio, the inference using this model, even without an index, may sound with a small accent
having both speaking and singing audio by the same person is better than not
you can include many
3 minutes is barely enough
how do i download
i fixed it, but i use a r7 9800x3d, rtx 5080+rtx pro 6000 blackwell, tiny 11 core 24h2, 128gb ram, and i was trying to do real time voice replacement bc im trans, got it working by running it on my 5080 instead of pro 6000
after reading realized my main gpu wasnt supported so switched to secondary
just for the voice changer+gaming?
lmao your money spent on pro 6000 could instead be for three 5090s, also the former is known to have kinda worse optimization on the geforce driver than the 5090.
and so I think ye it may only support the consumer RTX 50-series
im not sure what you're actially using, but vonovox is recommended for your gpu
Last update: June 2, 2025
nearly 50 hours later but it finally finished and it works lets gooo 🎉
This is a General AI Server, we won't be focused on voices anymore
Elaborate:
- your PC GPU
- your operative system
- what you want to do
- what tutorial link are you using
- a screenshot of the program
in discord?
This is a General AI Server, we won't be focused on voices anymore
Elaborate:
- your PC GPU
- your operative system
- what you want to do
- what tutorial link are you using
- a screenshot of the program
amd rx580x 8 gb windows 10 girl voice ive used this tut https://www.youtube.com/watch?v=SxdnGxicJOg&t=305s i cant send ss of the program here
Did you even check the comments?
yes
I said that video uses an over year old program 😭
It has outdated info
There's no updated video tutorial
Uninstall everything, it was just a waste of time
then whats the new program
-realtime
Guides for Programs that use RVC Models in Realtime for Calls/Games
Most suggested. GUIDE
ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE
Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated
alr
Uninstall everything ofc
i got amd tho
Yeah, it works
i see only the nvidia download
You just gotta read it and get the AMD version
It's literally below that
Also, don't skip parts
Skipping a single part Can fuck up your whole audio system
Be sure you delete the old program, and vb audio cable
Get vac lite
below nvidia download is opening on windows
i openmed other link
What
Everything is in the guide, read it carefully
Because messing up can accidentally make you not be able to hear anything from your PC anymore
Did you click yes?
yes
Did you set your default audio devices?
No, you need to change, show a screenshot
That's the specific part that If you mess up fucks up your whole audio system 😭
Check also the playback
idk
Anyways
Download the AMD version
Be aware that open source ai is complex to run, it isn't like ChatGPT
What anti virus so u have
Put the folder as an exception
The program is open source
You can even check the code yourself
norton
what a joke
it installed itself
it closed
nun appeared
and i opened the web link
and it closed
now i got this
what now
Extract everything, not just the .exe
Last update: May 5, 2025
Disable your anti virus
It doesn't do things by itself
i did
It's your anti virus
Are you on a school PC?
Be sure to fully disable it and add the file and folder as an exception
Norton is noisy asf
ill just unninstall it
cant even unninstall it
what is happening w my shi
now i dont got permissions
on my c driver
Learn how to uninstall Norton security products from your computer. This article contains steps to uninstall Norton device security from Window 10, Windows 8, and Windows 7.
real
i opened
the file
it said sum
and then it closed
and i got this again
why this so hard
@low shard
what do i do
this is an open source ai program, it's not a product made by a company like ChatGPT
its made by the community, ai isn't user friendly
you sure your anti virus is off?
ive unninstalled it finnaly
can u show a screen recording of u downloading it, extracting it and opening it?
with norton you can never be sure
there may still be services and startup tasks leftovers
lemme try
and they'll keep nagging you 6 month later
is windows defender enabled?
reboot and make sure it is
gpU: ur rx580
chunk: 400
input: microphone
output: line 1
monitor: headphones optionally to hear urself
then add a model https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/#adding-models and click start
Last update: May 5, 2025
then set the input in the other program as line 1
do i change the f0
?
i need help to set up my settings
can u help
?
@low shard
i sound horrible
on wokada, you can optionally use more advanced settings for benefits:
-
Force FP32 mode: on (THIS IS OFF BY DEFAULT! Turning this on improves stability. Increases VRAM usage by 200 MB)
-
Reduce the delay via: https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/#reduce-more-delay
Last update: May 5, 2025
be sure to get a good model
not all models are good
also set extra to 1.5
now i sound off
elaborate
my friends tell me i sound bad
find a better model
use the suggested models in the guide
or search in over the 20k rvc models
what guide
the one where you got the program
it has EVERYTHING in it basicallu
ok
the words
i speak
are weird
thats the thing
its not even saying them correctly
@low shard
how fix
that's not how it works
it loads the whole dataset, all files, shreds them to little pieces and learns from that
yes
show a screenshot of ur wokada
set extra to 2.0
then
CHECK THIS TRIANGLE ALWAYS
ITS THE MOST IMPORTANT THING ON AMD
does not matter
if it still sounds bad, try another model and check that ur microphone is good
can u suggest a model pls?
Last update: May 5, 2025
is it running on CPU for some reason?
is it a laptop?
may wanna invest into some proper cooling stand, yes not surprising
AI is a computing intensive task
but I'm make sure it actually runs on GPU
check task manager/performance
does it look like that when training?
that's hot
may wanna get a cooling stand or something or it gonna melt
gotta be rich
I have told you to stop using the old mangio
Mangio is outdated asf
it's the same as using windows xp in 2025
mangio is abandoned since 2023
all y tuts are old
applio is not going to fix your hot gpu temps, imo dont train with that, the chances of u melting ur gpu are pretty high
use cloud solutions
ok but be careful, i know someone that actually potentially melted their gpu because of rvc training on a laptop
i used to train on a 3060 laptop gpu
it never melted
battlefield 5 made it crash though
i did like 20-30 min
maybe you got a better cooling system
or his cooling may be defective/not good enough
imo i'd rather not risking my laptop for a rvc model
idk, it was really hot
i almost burned myself lol
Yo, do you guys know any apps to make rvc models, I'm a beginner?
You can use Applio on Kaggle, it's all online and free, I can help u btw just dm me
whats the difference between okada and vonovox?
vonovox runs better, in simple words
its a completely new thing
the quality is the same between all realtime clients
what changes between them is optimization and quality of life settings
Hey everyone,
I’m really interested in learning how to build automations for clients — things like automating tasks between apps, setting up workflows, etc. The problem is… I’m a total beginner. I have no idea where to start or what tools I should even be looking at.
I’ve heard of tools like Zapier and n8n, but I don’t really understand the differences, use cases, or how to use them in real-world client projects.
Are there any YouTube channels, tutorials, or other resources you’d recommend for someone who’s just starting out and wants to learn this from scratch?
Any advice, direction, or beginner-friendly content would be super appreciated!
Thanks in advance 🙏
is possible to train rvc model with colab?
yes
There's any colab still working? One I've found so far don't work 'cus colab changed the way it works.
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
by IA Hispano
Google Colab
by Hina
Google Colab
by Eddy
Google Colab
by Eddy
Google Colab
by Deiteris & Hina
Google Colab
by Shiro & Eddy
Google Colab
by Nick088
Google Colab
by Nick088
Google Colab
by Jarredou & Makidanye
Google Colab
I am wondering that my audio files that I training with, do they have to be in wav or mp3 format?
Wav, so they are lossless
This is a General AI Server, we won't be focused on voices anymore
Elaborate:
- your PC GPU
- your operative system
- what you want to do
- what tutorial link are you using
This is a General AI Server, we won't be focused on voices anymore
Elaborate:
- your PC GPU
- your operative system
- what you want to do
Ok what colab do you recommend I should use for training?
You should first check that your PC is good enough, colab is only for people with bad pc
What's your PC GPU and operative system?
Don't use cloud as your first option
I've been told flac and wav back and forth, which is really best for training?
Cloud has limited free time
I don't have pc unfortunately, I use my phone
you're cooked
Why?
Both are good, both are lossless, simply FLAC is a bit smaller
Why?
Oh damn, your only option is cloud then, but remember it's harder on phone, the user interface isn't made for phones
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (max 4 hours of daily T4 16gb gpu not granted for free, not much hours for training, but easy to use, there's a paid tier):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus, either T4x2 16gb each or P100 16gb, only free):
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly, Free Studios run 24/7 but require restart every 4 hours. There's a paid tier):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest way and for free, is using https://weights.com/ which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.com: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio UI Colab: RVC Fork with some extra features like TTS
- RVC AI Cover Maker UI: Automatically Separates the Vocals and Instrumentals, converts the voice and mixes them back
You can use them, you will connect to a remote good PC with limited free time
Just the interface won't be comfortable
Ok
Phones aren't powerful at all, AI is intensive asfff, it's like running GTA 6 on a phone
Damn
ChatGPT runs on cloud, that's why it works on phones
Ah, do you have a pc?
It doesn't run directly on your phone
Yeah why?
You can still make it on your phone ofc
Just not as comfortable as a pc
And not able to do it locally
If I may ask you, can you please make it on your pc? My voice model I mean
Sorry but I'm not looking nor have enough time for that, it takes a lot of hours to train a voice model
And speaking of phones, I went into gradio when I used applio but when went to the dataset section, i couldn't click on the file
You can either make a free model request in #1159289738314919936 , or make it yourself on cloud
Have you checked first the model isn't already made by someone else?
And what cloud are you talking about exactly?
No
It hasn't been made by anyone else
The ones I mentioned above #✨│ai-help message
You can search rvc ai voice models at:
- https://discord.com/channels/1159260121998827560/1175430844685484042
- In https://discord.com/channels/1159260121998827560/1163592055830880266 , Do /find with @earnest musk
- https://weights.com/ (login required)
- https://huggingface.co/models (but watch out cus in hugging face there arent only rvc ai voice models)
- Suggested Models for Realtime Voice Changing (Wokada)
- https://voice-models.com/
- https://thevoicemodels.com/ (for Turkish Models, login required with discord and level 2 on their server)
if there isnt one, you can:
- make it yourself with our docs guides
- Ask a free request in https://discord.com/channels/1159260121998827560/1159290139609137264
- Be aware that we don't allow any paid comms, so don't fall for any "pay me 20 dollars and i will make the model for you" dm
:wave: @low shard, How can I help?
Available Commands:
• @weights find <query> or /find <query> - Search for RVC Voice Models
why does this happen in the UVR hugginface space?
it'll add a pop at the very start for no reason
.... @low shard
Alright, if that's so you can make a free request, or read the guides for doing it on cloud
@viscid moss are u aware of this?
it happens mostly on datasets for me although when extracting vocals from songs or just from anything with already existing background music or anything like that it doesn't add that pop
That's our ai hub documentation yes
Ok
load the file in audacity
zoom in to the start
would using fl studio work as well, I really don't use audacity for editing my datasets
if it does no start from close to 0 you'll hear the pop
whichever method lets you see the waveform
ah
pop happens if you cut he audio wrong
how do you cut audio wrong?
like trimming the silence at too high value
If I don't have any storage on my phone to do the model on cloud, what can I do?
this is low enough not to pop
If you don't have any free storage, delete something on your phone
?
this is gonna pop
any of these is compatible with colab's python 3.11?
Elaborate:
- your PC GPU
- your operative system
- what you want to do
- what tutorial link are you using
@languid lodge cloud shouldn't be your first option, you should check if your pc is good enough first
I posted it already, the thread on #1159289738314919936 @low shard
alright, goodluck
Yeah let's hope so
Hi! I found a character's AI, but it's in English. Is there a way to make it speak Japanese?
Elaborate about your issue with detail.
Are you referring to an RVC model or some sort of character ai chatbot?
RVC model
You can try with testing a japanese speaking/singing audio with the model.
what is now the best way to use use rvc models.. like i talk and the app uses the model
i am still using rvc gui so i need a new one 😭
idc about the time of rendering, i want the one that currently has the better quality, locally
that is confusing
That's weird. It's working for me without that pop sound
like I record a mp3 and then the app uses a model which i give and it makes the original mp3 sound like the model
what
nvm
that's how it works
although
