#✨│ai-help
could someone help me out? I'm trying to set up a local Claude Code with a free model in VS Code, but I'm getting an error when I type "claude" in the terminal. I'm surely missing a step; I'm not a coder, I'm just new in this space
i just watched a video and did everything that the guy said but i think im missing some basic stuff
ok
wow how helpful
Thanks.
basically you need to import
import what
hope this helps!!
bro is trolling my life
your life?
yea
why
cause i needed help and u troll me
im not trolling
what do i need to import i cant read ur mind btw
chat gpt said i should install an ai extension so that vs code can connect with openrouter
yea
can i probably send u a video its just 5 min but im surely missing smthing
and maybe i can get ur opinion on it aswell maybe u know better options
W8ights isn’t a real server, this looks like a scam, don’t trust it, weights closed and there’s no official continuation
many staffers from there started bomb promoting here and got banned, i think i need to make an announcement on this later
rvc is retrieval-based-voice-conversion, not realtime voice changer
elaborate your pc gpu, your pc os, what are you trying to do and any tutorial link you’re using
that looks like an old version of wokada
can i get an invite?
please don’t troll users
invites can’t be shared in the server, and its a scam
i wanna know what type
- RX 9060XT
- 25H2 Windows
- Deiteris Fork
not really but the filters on other generators are too restrictive
Hi guys, I've been trying for a week now to prompt a UGC character that speaks German without mistakes, but no matter what I do it doesn't work. Has anyone here faced this problem and can help me, pleaseee?
OK thanks
I can DM it to you
Could you DM me the invitation too? We are going to investigate the scam
anything NSFW related isn't allowed
This is a General AI Discord Server, please elaborate:
- your pc gpu
- your pc os
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
i was trying to find out some script to video generating
ai
as i use a voice tts model for text to speech
sure, thank you
so i just need something that will generate the video, or at least help me
Hello, I'm trying to start making videos on TikTok, with the character of Makise Kurisu, but I don't know where to find a voice of hers to put on the character, does anyone know how to find voices of anime characters for free? I have seen many people on TikTok who have them and don't pay anything
You can check here https://discord.com/channels/1159260121998827560/1175430844685484042 or here
https://voice-models.com
But if you don't find anything you like you can make one yourself, it's free and easy
Small question i want to isolate main lead vocals from a recording, what model can be used best to remove crowds and backing vocals in UVR?
I don't use local uvr but MVSEP does the same thing as it and I personally use this model
it's pretty good from what I've seen
Ah okay, yeah im already in the queue atm XD
mainly for bg vocals but I believe it works on crowds too
okay how do you know which models are the correct ones to use, because i have to wait in line behind 231 other people. so you know if i know which one to do i wont waste time figuring it all out
is there some sort of guide that tells you what each model is for specifically, or is it just kind of trying and seeing what works really?
in other words whats really the learning curve when it comes to picking models?
make sure u make an account to lessen the wait time
it's free
anyway for vocal separation in general I use this
and for reverb I use these two, stereo separation first then room reverb
thanks, although the wait time only lessens when you buy premium it seems
dang, I just used the separation tool and swear it was only a few ppl using it at the time
I haven't paid
yeah mostly in the morning here in europe there are only 1-20 online
but thank you so much for the help!
one last thing, for echo I use this one
tho if it starts removing breaths I use deecho-dereverb
I have a problem with the image, I use flux2 but in the place where I used inpainting I have a loss of quality, the picture in that place looks a little blurry, how can I fix this?
How would you do it in order?
To get clean vocals as much as possible?
For cleaning a song I use the BS Roformer vocal separation, karaoke (background vocals), stereo reverb, room reverb, then finally de-echo
@feral crater
anyone have like a bunch of rapper rvc? willing to pay
Paid commissions aren't allowed here, but you can ask in here https://discord.com/channels/1159260121998827560/1159289738314919936 or make a model yourself, it's free
If you're just looking for a model then look here
https://discord.com/channels/1159260121998827560/1175430844685484042 or here https://voice-models.com
Thank you
You're very welcome
When training a voice model, is it better to include or exclude vocalizations that aren't words or phrases? like grunts, screams, mouthy noises and such?
I personally keep them, but it's best to make sure there is more speech than those sounds
Fair fair, I figured as much. I just wasn't sure if it'd enhance or mess up the dataset
Ty! ^^
ur welcome! ^^
<@&1159293140440723499>
Hello y’all,
I’m planning to make a personal FitzyVA model (mainly for Cyn/Cynessa and Mel LARPing, mainly the latter though), and I was wondering, which program is the better option for expressing emotions when using RVC, (regardless of the dataset used?)
Personally, I think Vonovox is really good and has significantly lower latency. Meanwhile, Applio is also a solid option and works as more of an all-in-one program, though it does have some flaws in certain areas. But even then. It's really good. And from what I was able to gather. Vonovox is superior for real-time emotional expression. I'm not too sure though. Cuz I did test out both and got mixed results
So which one is better for expressing emotions and stuff, not just basic talking?
none, because the model doesn't learn expressions in training, it learns linguistics, pitch and spectrograms
the outputs can sound a bit monotonous because the RVC core was primarily designed for TTS, although it did come with an option for voice conversion inference (without using F0, so it’s also quite monotonous)
rvc is a kind of hack that removed all of the tts stuff and improved voice conversion by adding f0 and a self-supervised embedder (contentvec/hubert) and replacing the hifigan vocoder with nsf hifigan, although in older versions it is possible to use the standard hifigan, without f0 (it's worse)
tldr; rvc is kinda like an enhanced tts, but with the typical 2022 old tts flaws (not lying, the rvc core is that old), like not being able to be expressive enough
more modern TTS can do pretty good and realistic expressions because they learned them in their training; that tech did not exist when rvc was made, or was too heavy to run in realtime
realtime clients like vonovox, applio, wokada etc can't fix architectural issues like these
Did you get banned
I saw you joined and aren't there anymore
that isn't a helper, ping the helper role for help lol
anyway what voice changer are u using, what gpu do u have (nvidia or AMD) and what are u using it for?
wdym?
Alright, much appreciated
I have a cyn model
one without effects so it sounds just like the va
it's not public here but
yknow
thats kinda cool actually, i like when people include the actual va speaking for their voice models instead of just relying on the character clips
well she did basically make all her voice lines she recorded public on her yt
plus the advertisements for merch
ye
if you'd like I can send my current Cyn voice to u
if youre okay with it, sure
oh wait i got dms off
add me rq
I just finished training a model with applio and I'm getting an index file the size of 500mb, every index i've seen from others was 50mb, anyone know what's up?
most likely you used the faiss algorithm to train your index
move/delete the index file, go to the training tab again, put the name of your model, scroll down until you find the training options, click "advanced settings", in the index algorithm select "kmeans", click train index
now the index will be smaller at the cost of slightly less accuracy
Ooh I see, I'm fine with a slight loss, I just don't want a model file to be paired with an index 10x its size haha
Tyvm!
Huh.. generating an index with either Faiss or KMeans both results in a 500mb index file
huh that shouldn't happen lol
Could it be because I have a huuge dataset?
i mean yeah thats the reason, kmeans should give you a very tiny index file, i have trained index files of 60 hours datasets and with kmeans they're still very small lol
The dataset is about 50mb total but the amount of individual files is eheh.. 1875
im preprocessing some 3 hour dataset i have, let me see if i get a tiny index
hold on, you deleted the previous index, right?
Mhm yeah, both times i deleted the index, swapped the algorithm, then generated
waaait ur right, this thing is generating huge indexes
im looking at the code 1 sec
yep it's bugged
i fixed it
i love applio
when i clicked kmeans it was generating a simple faiss file
and now i got a 30mb file
bruh
rvc > train > process
no need to restart the gui, after you replace the file click kmeans again and train the index
Oh huh that's silly, fair enough, thanks a ton!
yep! it generated a 30mb file this time, it works!
it's not nsfw my guy
I just need less filters
*paid rvc voice model commissions AREN'T allowed
What?
i left
Hi
I7 13th gen 13700
Msi mobo mortar b760m
32gb DDR4
3090 GIGABYTE OC 24gb
2x1tb kingston mv3 ssd
1000w PSU
This will be the pc i will buy
Is it good
Is it enough need help asapp
Helpp
big dataset with unique sounds - big index
no
50mb index is a model created with like 5 min audio
big chonky model
yea but kmeans is not working currently in applio, i had to fix it lol
so Blaise fucked up something extra lol
i dunno why even 'Auto' is needed
just index_algorithm == "KMeans" or big_npy.shape[0] > 2e5
okay, made the fix
anyway, my point stands - big dataset, big index and it is actually good
No one said otherwise; I said it wasn't normal to have a 500 MB index when selecting kmeans.
glad it's fixed now
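For anyone following along, the check quoted above (index_algorithm == "KMeans" or big_npy.shape[0] > 2e5) can be sketched roughly like this. It's only an illustration: should_reduce_with_kmeans and its parameter names are made up here and this is not Applio's actual source. It just shows when the feature matrix would get shrunk with KMeans before the index is built.

```python
def should_reduce_with_kmeans(index_algorithm: str, n_vectors: int,
                              auto_threshold: float = 2e5) -> bool:
    """Return True when features should be clustered down before indexing.

    Mirrors the condition quoted in chat: reduce when the user picked
    "KMeans", or when the feature matrix has more than 200k rows.
    Hypothetical helper, not the real Applio function.
    """
    return index_algorithm == "KMeans" or n_vectors > auto_threshold


if __name__ == "__main__":
    # KMeans selected on a small dataset: reduction happens.
    print(should_reduce_with_kmeans("KMeans", 50_000))
    # Faiss selected on a small dataset: the full feature set is indexed.
    print(should_reduce_with_kmeans("Faiss", 50_000))
    # Very large dataset: reduced regardless of the selected algorithm.
    print(should_reduce_with_kmeans("Faiss", 300_000))
```

With a check like this, picking KMeans always takes the smaller-index path, which matches the 30mb result reported after the fix.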
Hey.. everyone. I wanted to ask what is the best app for RVC conversions? I use replay.. and sometimes the voice just cracks up somewhere.
And I dont have the GPU required for some apps that I saw on YouTube and stuff.
So if anyone can help me out here.. I will be great!!
what's your pc gpu and os?
It's a Microsoft pc
I dont really know the GPU
Because I am a noob... and I only ever used hugging face and uhh replay
Yea it's a Microsoft pc
You can google it.. because I cant upload the pic here..
Open task manager and click the performance tab
U can check the gpu there
Ok one sec
If it's Intel you can't really run anything good locally
Wait sorry ppl I cant find it one moment
I'd show u how but I'm not on my pc rn I'm still trying to wake up
Oh.. sorry I woke you up
is it running windows 11 or 10?
if so, search task manager, go to the performance tab and check the GPU component
or is it a Mac or a Linux PC?
Ok found it
11
Lol did that fire wolf guy get banned
alright, check what I told you
NSFW isn't allowed at all
K
I think you misread that question
So it says utilization is 1% and shared GPU memory is 0.8/7.9 GB.
While the memory is 0.5/7.9 GB
So...what now..?
?
It was about uncensoring anime bath scenes and looking for an adult video generation AI 😭
you should tell us the GPU name
!give-media-perms 3h @loud rivet
send a screenshot
Ok..
No one reads the rules 😔
Soo true!! Whenever I request for a model they dm. And ask for payment before making the model.
You have got an Integrated Intel GPU PC, it's not good for Local AI, that's why Replay was slow and having issues for you too btw,
It runs on your hardware
It would be better if you use Cloud (Remote PC) alternatives with a limited Free GPU time
it would be faster but of course not unlimited
what do you want to do? You could use the suggested cloud, or try to run on your PC CPU which will be slow ofc
What do u mean?
..I dont have patience.. astgfirullah...
you can either still try Applio (RVC Fork) on your PC CPU, which will be unlimited but slow, or use the faster and more recommended Cloud alternatives, which of course won't be unlimited for free because you're using someone else's GPU
you just want the easiest fastest thing you can get, so like the easiest AI Cover Maker for cloud?
Oh.. what do u recommend sir?
Oh... what's that...??
Good thing I don't take model requests
that's not allowed, if you see someone do that, report them by dming @vernal bone
you just want the easiest, fastest way to make AI Covers, right? I'm not sure what you mean
Yes
there's a cloud program that automatically separates vocals and instrumentals like Weights did, try the cloud easy version: https://docs.aihub.gg/rvc/cloud/aicovermaker-cloud/#google-colab
Last update: March 24, 2026
read the guide and let me know
you're welcome
oh and sorry to bother again mister. i wanted to ask the link that u shared me. does it also allow music mp3 files to be uploaded? because i make AI music. and want to make JJK characters sing them..
can anyone help me in ai
i always wanted to do something i do not know what to
i always wanted to make money with ai
pls anyone help
dm me
and message ai so i can understand
this server isn't really about "1 click easy free money with AI", there isn't anything that will guarantee you free easy money in 2 seconds
Help the seller sent me this for the pc
I7 13th gen 13700
Msi mobo mortar b760m
32gb DDR4
3090 GIGABYTE OC 24gb
2x1tb kingston mv3 ssd
1000w PSU
Should i get it?
What kind of PC is this? The 3090 is good if it's the RTX 3090
Is it a desktop or a laptop
Pc
Intel for the CPU is kinda just ok, the other stuff I'm unfamiliar with as I'm not super into computers
How much will it cost?
Its 901usd
Equivalent in dollars for the whole pc
This is the 3090 gigabyte brand , its my first time with gpu pc so i need help
Depends on what u want to do with it
Ai and video editing and fl studio
Could you expand on what u mean by ai, there's tons of stuff like video, images, rvc, etc
Rvc
Video editing and fl studio can be done for sure on a 3090 just fine
I only have 900 usd
I can add the 1 usd so i could buy it
Should i get it or nah i would regret for the rvc voice ai in this server
Vonovox should run just fine on a 30 series, I have a 5070ti and it's also pretty good there
You want real-time right?
Yes
Yea the PC you're buying should work just fine with everything you want to do
5070ti should i get it instead of 3090? But 3090 has higher vram wouldnt that be best for ai voice training
Since 5070ti is newer
Hmm
I mean I train on cloud bc I'm lazy so I dunno how it is with training voices locally but I'm sure it works just fine
But I definitely think if you can afford it to get a 50 series Nvidia gpu over 30 series
But i also want a text ai or train ai they said u need higher vram for that
Hm
I can only afford a 3090 for higher vram
I'd ask Lyery as I'm not as smart with PCs and rvc as him
Might get better insight from him
Just ping once
you want it for both realtime and training? hmm, the 5070ti is super fast and should give you good performance in realtime
it's true that you need more vram if you want to train models, but in rvc you don't really want to go above batch size 8 (about 5gb in fp16 mode, 7gb in fp32)
realtime inference uses 2gb of vram iirc
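To put those numbers side by side, here's a tiny sketch. The VRAM figures are just the approximate ones quoted above (batch size 8 at about 5gb in fp16 and 7gb in fp32, realtime at about 2gb "iirc"), and the 24 GB / 16 GB card sizes are the ones mentioned in this chat. headroom_gb is a made-up helper name, and real usage will vary with settings.

```python
# Approximate VRAM needs quoted in the chat, in GB (rough, not measured here).
VRAM_NEEDED_GB = {
    "train_bs8_fp16": 5.0,
    "train_bs8_fp32": 7.0,
    "realtime_inference": 2.0,
}


def headroom_gb(gpu_vram_gb: float, workload: str) -> float:
    """VRAM left over after the workload; a negative value means it won't fit."""
    return gpu_vram_gb - VRAM_NEEDED_GB[workload]


if __name__ == "__main__":
    for name, vram in (("RTX 3090", 24.0), ("RTX 5070 Ti", 16.0)):
        for workload in VRAM_NEEDED_GB:
            left = headroom_gb(vram, workload)
            print(f"{name}: {workload} leaves about {left:.1f} GB free")
```

Going by these figures, both cards have plenty of room for RVC training and realtime, so the extra VRAM matters more for other things running alongside it.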
hmm hard decision... the 3090 might be useful in cases where you play unoptimized games that use lots of vram
Yes also the 3090 price for the whole pc the owner/seller is able to get it to 901 usd so i could buy it right now
also, seems like faster gpus do get slightly less delay, so in theory the 5070ti is technically better for realtime, but yea, the vram
it's also way faster than the 3090 for training models
Its my first seeing in marketplace a whole 3090 pc less than 1000usd so should i get it, i only have 900usd the seller lowered it to match my savings
more vram helps getting less stutters too
Yes
yea tbh the 3090 is a good option despite being older, just because it can provide a smoother experience with fewer stutters
Should i get it then i wont regret it for real time voice/ the ai voice thingy in this server @analog obsidian ?
im not 100% sure, because the 5070ti is indeed faster and will give you better fps
I cant afford a 5070ti pc i only have 900usd for a whole pc
So 3090 can handle it even if its older?
yup
Thanks
Is there anything else i should ask other than this
thats enough for rvc
whats the directory / name of this program again
what program
Is it better to have separate audio files, or combine them all into single file for voice model training? Or does it not matter?
Does it matter if it's tons of files? Like hundreds or thousands?
Hi, i make videos for youtube, but i want to add the voices in applio, how can i use the TTS? Does someone have the link to use it?
The best ai model: hi, ive tried to find the best model overall, but i dont know what i should use. i dont like cloude at all, ive used it for 5 months. is there any other?
-rvc
Likely a realtime voice changer.
cloud?
NVIDIA GeForce RTX 3090 is already enough, potentially a bit overkill, for simple RVC inference, but for voice model training its extra VRAM is an advantage. RTX 5070 Ti is way newer, but RTX 3090 has more VRAM. Kingston NV3 is a budget SSD; it uses QLC NAND, which gives more capacity but is slower than TLC NAND, so if you move a lot of files around, a Kingston KC3000, WD Black SN850X or SN7100 might be better. Make sure the motherboard's BIOS is up to date. Should you get that one? It's still your choice if you have the budget.
Another concern is the power supply. It's listed as 1000 W, but always check its brand, model and condition, because you don't want it to start smelling or shut off suddenly during runs. The DDR4 RAM speed is not stated, but it's probably a bit slower than DDR5 anyway.
This is a general ai discord server, what program?
there’s a tts tab, do you already have applio? If not, what’s your pc gpu and os?
Hi, I have a problem with Flux2 Klein. I'm trying to use inpainting to remove text, but in the areas where I applied it, there's a blur issue, it looks kind of smeared. The second problem is that in those areas the color changes compared to the surrounding environment. Is there any way to fix this?
Hello, i can help you fix this
Well, how can this be fixed?
Let's talk in dm
Why direct message?
Yeah, that is a common thing with inpainting. The blur usually comes from denoise being too high, so it over-rebuilds the area instead of blending it, and the color shift happens because the model is reinterpreting lighting in that masked region
Try lowering the denoise so it keeps more original detail, keep the mask tighter around the text, and do it in two passes: first remove the text, then a lighter pass to match texture and color
If I change the prompt, won't it improve the situation?
Do you mean lowering denoise on ksampler? I reduced it from 1.00 to 0.75 and now it doesn't remove anything, it just changes the color in that area.
yo does anyone have a working rvc colab? i cant find one
sorry to bother anyone, but is there an RVC model of Sash Lilac from Freedom Planet to download anywhere? ive been searching for so long, and the ones that were here are sadly gone since Weights had its farewell, and applio is giving me a 404
have you tried applio on kaggle? it works fine for me
im new to this sadly and not the smartest there is i know of applio but not kaggle im just looking for a working download for sash lilac to put on applio
not really sure what sash lilac is but kaggle works well, it's a cloud option that can use applio
oh my bad, i didnt mention this in my first message: she is a character in a game similar to sonic. i also took the time to look up kaggle, i know what it is now and ill keep it in mind. all im looking for is a working download, since all the others ive tried to find dont work anymore. id mainly use it for tts on applio
ah I see, idk at all about getting applio working locally but I know how to use the interface and train models etc
best bet would be looking here
-rvc
it should be in the applio docs
Any good, realistic models you’d recommend?
not really sure tbh, just check around https://discord.com/channels/1159260121998827560/1175430844685484042 or https://voice-models.com


i see, i was hoping someone could share theirs from before it became impossible to download
wdym?
i found older downloads here but it all lead to 404 not found or just dead links for sash lilac so i was wondering if anyone have it they can share it with to me but its understandable if no one have it she isnt a very well known character and so is her game
is this the character you're looking for?
I found a 13 minute voice line video of her that could be used to train a model
yes but her voice to use on tts
ill see how i can make one on my own if my brain can handle it if and if its easy and if not ill have to wait for someone to make it or can share the old one thanks for your time for helping me out and sorry i didnt mention the clear idea of who she was and is and from
I have a video I made on Kaggle's applio if you'd like to try that
I did skip making an acc and verifying but you do need to verify by phone or one of the other options it gives or the notebooks do not function
Oh cool thanks ill keep it in mind for now
<@&1159293140440723499> hacked account
<@&1159293140440723499> promoting nsfw
I once used Adobe Dreamweaver to create a HTML page. 
Check out my Neco-Arc model in #1175430844685484042. 
Following the RVC Applio tutorial on the ai hub docs,
I have a RX 6500 XT
Shader ISA: GFX10.3 (gfx1034)
Do i follow the top one or bottom one?
If its the bottom one do i follow this or the other gpus one?
Nevermind, checked the repo under the "Other GPUs" tab, it is in there
hey guys
im new to this ai stuff
i tried running llama 3.1 8b version on ollama locally on my system
i said "hi"
10 seconds later i heard my pc fucking screaming
then it shut down
what do i do?
Full GPU Name: NVIDIA GeForce GT 610
Operating System: windows 11
Hello! I just came here on one question: I have Projects (folder organization) on the ChatGPT Free plan, I also live in Poland 🇵🇱 so I don't have ads (yet, lucky me).. I do have more features, and I wonder - Should I move to Go or Plus? I usually run not a lot msgs per day, but I do give some kind of context (not a lot, but some details, I'm not an advanced user) and the weakest model is still enough for me I guess? I'm also using uBlock, I wonder if it works with ChatGPT lol
don't even try to run LLMs
This GPU is so weak bro
This GPU is too old to run any AI locally.
yeah that might be too old to run locally
not sure if the GPU would even get detected, it's probably running on the CPU with RAM (and you probably have 8gb of RAM)
unless you try extremely small models, it's best you try cloud
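A quick back-of-the-envelope sketch of why an 8B model is too much for that card. It only counts the weights (parameters times bits per parameter), so real usage is higher once context and runtime overhead are added. weights_size_gb is a made-up helper name, and the 4-bit case is an assumption about a typical quantized download, not something checked against the exact file that was pulled.

```python
def weights_size_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    n_params = 8e9  # "8b" model, as mentioned in the chat
    for label, bits in (("fp16", 16), ("8-bit", 8), ("4-bit", 4)):
        size = weights_size_gb(n_params, bits)
        print(f"{label}: about {size:.1f} GB for the weights alone")
```

Even at 4-bit the weights come to several GB, so on a very old GPU with little VRAM the model ends up in system RAM on the CPU, which fits the "probably running on the CPU" guess above.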
in free time, pls answer my question 😊 It's important to me
uBlock Origin (and lite) probably should work fine
If you're not running into any issues with your current free tier, it might be best to keep using it
You should decide on it depending on your needs, because if you're doing just small things and it works fine, there isn't really a need
But if you needed more powerful models, for like coding or complex math tasks, or are running out of free messages, you might want to try other services like Gemini, Claude, or subscribe to chatgpt
Like for example, you wouldn't need a 5K super powerful pc if you're just going to use it for Microsoft Word, if you get the analogy
it's your money and your needs, so it's best you make the decision by evaluating what you want to do, instead of blindly buying a 2nd car you might not even use because your 1st car works fine
btw I would suggest learning how to prompt it better, giving it clearer instructions and more info can help a lot with results, especially on complex tasks

I'm thinking about a PC upgrade
below 100,000 rupees tho ($1,053)
what should i get?
what specs should i upgrade
If you mean to upgrade a GPU in your current PC, there's NVIDIA GeForce RTX 3060, but there's another better workaround. With $1000 budget, you might rather expect a complete newer mid-tier PC with more recent specs.
4th paragraph: "But if you needed more powerful models, for like coding or complex math tasks.." - math is not for me, but I'm learning to code, often just asking questions about interesting settings to tweak on my PC or which tool to learn that will suit my needs.. BUT I'd rather get some assistance with decision-making, I don't expect AI to be my doer
I'm not a programmer tho, I don't work with ChatGPT, it's rather a set of hobbies, not PRO work, so just chilling out.. want to improve myself
Sounds interesting - I recently asked my GPT about this topic, and I was pretty satisfied with the answer.. It gave me this example, if you'd like to rate it as a real person.. I want objective feedback, another perspective 😊
NOT
What headphones to buy?
BETTER
I'm looking for headphones for work (long sessions, comfort, and good sound quality). What are some reasonable options for a mid-range budget, and what are the real differences?
===
I suffer due to ChatGPT prices in PL 😭 The Plus plan is 20$, it's going to be 72PLN by the currency calculations, but it says, like, 100PLN (which is 25$ in this price, and currency converter says 91PLN).. It's weird..
okay
you could get an rtx 3060 12gb vram or rtx 4060 ti 16gb vram
thx
guys whats this. this happens when i change index rate and when i click on start in vonovox beta
Rupees like, from the legend of Zelda?
Hi. I'm trying to run gemma4:26b through Continue on VSC
I can't add it to models tho no matter what's in the config
.\runtime\python.exe -m pip install faiss-cpu
run this command in vonovox directory
Does anyone know which version this is? I reformatted my computer and can't find it.

This is Wokada tg fork
But what gpu do u have, there's an Nvidia version and AMD version
I feel like those tik toks where they go "what kind of fruit is this"
nvidia



U dont wanna just, try it out first?
It's ok the only options you ever need to touch is the pitch
And block size (controls the delay and quality)
I know it looks like a lot but most of that you don't need to even touch
Btw make sure to import the index
Right underneath the embedder
I don't have an index on a model I like
Yah, it lets u use the voice changer on discord or in games

It's that second link I sent here in case u didn't have one ^^
hi guys im new just wondering how do i use a voice changer

where
Where it says output device, it's underneath the one that says audio device settings
Hey! What gpu do u have (Nvidia or AMD) and what do u plan on using it for?

It's ok take your time finding it
now what
Send a screenshot rq so I can see what changed
Nice
The important part tho is knowing your gpu, if it's Nvidia you can download Vonovox, if it's AMD you need wokada tg fork

It won't let me upload the index, but on the other one it did
I didn't like it

This seems like it's fine now. If the model is female and your voice is deep / you're a guy, try pitch 3-12, and the opposite if you're a girl
And if you're using a female voice and are a female keep at 0 same other way around
uh?

Ok so if you are a guy and using female voice pitch the voice up until it sounds good
If you're girl and using a guy voice pitch down until it sounds good
If guy using guy voice keep at 0
If girl using girl voice keep at 0
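If it helps to see what the pitch number does: assuming the pitch setting is in semitones (that's how RVC-style tools usually expose it, but check your client), each step multiplies the voice frequency by 2^(1/12), so +12 is one octave up (double) and -12 is one octave down (half). A small sketch, where pitch_ratio and shifted_hz are made-up helper names and the 110 Hz voice is just an illustrative number:

```python
def pitch_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift given in semitones."""
    return 2.0 ** (semitones / 12.0)


def shifted_hz(base_hz: float, semitones: float) -> float:
    """Fundamental frequency after applying the pitch shift."""
    return base_hz * pitch_ratio(semitones)


if __name__ == "__main__":
    base = 110.0  # illustrative deep voice, not a measured value
    for semitones in (0, 3, 12, -12):
        print(f"pitch {semitones:+d}: {shifted_hz(base, semitones):.1f} Hz")
```

So the 3-12 range suggested above goes from a mild raise up to a full octave, and 0 leaves the voice where it is.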
No need to change this it is good at default
AEAEAEAEA

Can you send me the link to the first one I mentioned?
-rt
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Tg-Develop Fork, with extra features, but it only supports Nvidia GPUs on Windows 10/11, unlike other options that have wider support, and it has no cloud options.
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project.
A Realtime Voice Changer with similar performance to Vonovox & Wokada Tg-Develop Fork, with extra features.
Deiteris' fork (modified version) of wokada that doesn't get updates anymore.
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
u need these two, download both, and extract the first one, then put the second one in the folder made by the first one
wdym
this is the one you want
that's what I just sent

this one is very old
you don't want that one
I didn't send u that one :(
did u have anything else running when u opened the one I sent?
no
hm

I am confused now
now what
close anything related to the voice changer then try one more time opening it
I may need to help you in a voice call if this happens again
that might help
how do u use the text to speech models?
Im not sure where to post this question, but it's regarding a1111. Its the program i started with and know it well. I had an nvidia 3060ti (8gb) and it was slow but it worked. Ive since upgraded to a 5060ti (16gb), was so excited to jump back into a1111, but then i found misery and pain, lol.
Is there any way to get it to run on a 50xx series GPU? Ive found some discussion online but its all waaaaay outdated, seems like everyone just stopped trying maybe?!? I currently have a driver that has cuda 13.2 and can find absolutely NOTHING that helps... is there help? Is a1111 just dead?
thats a very fine website, unfortunately I dont feel like digging through endless stuff that seems irrelevant to me in hopes that maybe I might find something that maybe is what I need. I want to ask a person and have an answer. Im just old fashioned that way
what is a1111
how do i find models that are less heavy to run?
what does this mean
noted
Whenever i try starting my notebook it gives this error in kaggle.
It does it everytime. I've tried making a new copy, making a new notebook, restarting my browser
nothings working mannn
I got this too
ok good, its not just me
I dont know which program to run for the voice changer... im kinda dumb
what gpu do u have (Nvidia or AMD) and what do u want to use it for? just curious
how to use ai voice changer without using gpu?
Why?
my gpu is low end
What would you use the voice changer for?
i want to sound like kanye west
What model?

It's not really recommended to run a voice changer with only a CPU. Try running the W-Okada voice changer on Kaggle instead. https://docs.aihub.gg/realtime-voice-changer/cloud/tg-develops-w-okada-fork-cloud/#kaggle
Last update: April 1, 2026
NVIDIA GeForce RTX 4090 or RTX 5070 Ti? You should set processing unit to your "GPU", extra time to 2.7 s and pitch extraction as rmvpe. Also, output and monitor audio devices look swapped but they'd still work.
5070 ti 16 GB

i got a question: if i train a model using a song that has adlibs, will the model come out well, and will it know how to work with songs that have adlibs?
most models in #1175430844685484042 are RVC models, are you trying to use those for TTS?
hii i want some advice how to find the perfect rvc model for me. like how do ppl find the natural ones that will fit them 
AMD Radeon RX 5700 XT or NVIDIA GeForce RTX?
I like Nvidia
A gpu without cuda is paperweight
ok
Check the applio docs to see how applio works, there are many different options like using it locally on your pc or on the cloud with Kaggle or Google colab
it's a little bit difficult to set up but thx
No problem! If you need help ping the helpers role
"Error: Pretrained model sample rate (40000 Hz) does not match dataset audio sample rate (32000 Hz)." both the dataset and pretrained model are 32k
are you 100% sure?
yep
link to the pretrain?
im trying to use this pretrain model (32k sr) and my dataset is also 32k sr but like
https://discord.com/channels/1159260121998827560/1492203850747216083
idk and its not only for that pretrain model ive tried other 32k ones and they all give the same error
if u chose 40k in applio when using a 32k pretrain u should choose 32k instead
did you select 32k in the interface? what are you using? applio?
yes and yes
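One quick way to double-check the "are you 100% sure?" part is to read the sample rate straight from each dataset file's header. This is only a minimal sketch: it uses Python's standard `wave` module (so plain PCM `.wav` files only), the function names and the folder path are made up for illustration, and it only verifies the dataset side, not the pretrain.

```python
import wave
from pathlib import Path


def wav_sample_rates(folder):
    """Map each .wav file name in `folder` to the sample rate stored in its header."""
    rates = {}
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            rates[path.name] = wav.getframerate()
    return rates


def mismatched(rates, expected_hz):
    """Return only the files whose sample rate differs from `expected_hz`."""
    return {name: hz for name, hz in rates.items() if hz != expected_hz}


if __name__ == "__main__":
    # "my_dataset" is a hypothetical folder name; point it at your own dataset.
    rates = wav_sample_rates("my_dataset")
    print(mismatched(rates, 32000) or "all files report 32000 Hz")
```

If every file reports 32000 Hz, the mismatch is more likely in the selected sample rate or in the pretrain files themselves.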
i have this too, but only if i try to create a new notebook. existing one works just fine somehow
I can't get it to work either, this is killing me
okay now i can't even run my older notebooks anymore :')
So is anyone using https://multinex.ai/memq?
obvious sponsorship
so i'm very much a beginner when it comes to training voice models, are there any like biig tips or things i should be doing when training? like certain settings or things i can do before or after to improve the quality of models? currently i'm using applio and using essentially default settings
Heya, I understand that chunk size controls the delay of the converted voice in the RVC app, and that a higher value also gives better results. Are the OS (Windows vs Linux) and the fork (Deiteris vs Tg) supposed to make such a big difference?
- On Windows, I need to set the chunk size to 192 ms at minimum for stable performance, with extra set to 2.7 s
- On Linux, I can go as low as 24 ms, although it's running tight (it sometimes goes yellow and the voice gets lag spikes). Also 2.7 s extra
I also tested different chunk sizes on Linux (50, 100, 150, 200...) and they all sound mostly the same in quality to me. So does it help if I run Linux at around the same delay as on Windows, e.g. 192 instead of 50 or 100? They do seem to sound the same, but I can't be too sure. I understand I'm using two different versions of RVC from two different people, but I wanted to know whether running at the same delay gives any benefit even though I could go lower.
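To put the chunk sizes above in perspective, here is a small arithmetic sketch. It assumes a 48 kHz audio stream, which is a guess and not something stated in the messages, and the helper names are made up. It only shows how much audio each chunk holds and how often the GPU has to finish a conversion to keep up.

```python
def chunk_samples(chunk_ms, sample_rate_hz=48000):
    """Number of audio samples in one chunk of `chunk_ms` milliseconds."""
    return round(sample_rate_hz * chunk_ms / 1000)


def chunks_per_second(chunk_ms):
    """How many chunks must be converted every second to avoid falling behind."""
    return 1000 / chunk_ms


if __name__ == "__main__":
    for chunk_ms in (24, 50, 100, 192):
        print(
            f"{chunk_ms} ms chunk: {chunk_samples(chunk_ms)} samples, "
            f"{chunks_per_second(chunk_ms):.1f} conversions per second"
        )
```

A 24 ms chunk has to be converted about eight times as often as a 192 ms one, which would fit with it "running tight" and occasionally spiking.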
this guide might be helpful if you havent read it yet
https://docs.aihub.gg/rvc/local/applio/#training
i'm glad I'm not the only one with this issue haha
hope it will work again soon, i like kaggle more than lightning ai
I cannot even use lightning ai it's too confusing
and colab is dookie with only 4 hours max
yeah kaggle is definitely more user friendly and simpler, but lightning ai has very very powerful gpus
their verification process is weird though
if u learn it :p
Probably it's your case because i can still run my old applio notebooks on kaggle
its confusing at first but you'll get the hang of it
But either way, @simple ore maybe you should check this out, tested it myself to check if that error was real and it is.
its user friendly but still slow
same speed on colab even tho its using two t4 gpus
meh I dislike how there isn't even a stop button anywhere clearly labeled for dumb people like me
uh
there is tho o - o
what exactly does not work?
if u click the cpu and gpu stats there's a button where can put applio to sleep
i could run them untouched earlier today, but as soon as i changed something i get permanent unknown errors trying to start the session
its really strange
@white shadow claims that when trying to import a new applio kaggle notebook and running it, this error pops up preventing its startup
yeah i get the same popup
that was never explained to me
hmpf
when does the error happen?
kaggle has an obvious power button to turn it off
Some seconds after clicking the power button
Tested it myself to see if i can recreate the error and it seems it's true.
tf is power button
start session
that lol
oh yeah that error happened to me a week ago i never told anyone about this error 
no matter what I press in kaggle I get that same error of it not working
start does it, that button as well
run all
just run install first, when it is finished run the other cell
That error.. Never happened to me on my current notebooks.
dunno, I just tried and it is fine
create a new notebook, then import applio/main/ kaggle thing
?
that's too confusing it should be always on the front without any extra steps
ty tho
wdym its literally simple
ur like that one lizard on the movie hoppers
not sure what that means but I still get the error when making a new notebook
I love this guy
okay, that happens when you select
maybe there's a space limit or something
so I just do nothing and let my model die if something bad happens
Just switch to variables and files after properly starting the notebook xd
No errors popped after switching
nothing to do with Applio
Already figured it prolly wasn't something related to applio
make a support ticket or something
Sorry if i looked like a dummy noobies
I mean, i only wanted to inform that issue in case of anything
Myself i never had issues with that
I figured how to get it working ty Leo
fixed now
@viral mason
fixed
How do I configure Vonovox?

how so?

The main thing needed to change is pitch and block size
i followed the guide perfectly and the thing doesn't work (using lightning cloud w-okada)
Anyone got the TTS Ivy voice?
oh i meant like tips for things to improve outputs
i already know how to use the program itself, ive previously trained about half a dozen models on rvc webui
i was wondering if there are any like checklist type steps that you should always do or any extra things you can do to make it better
you can try merging epochs of different training runs
for example, you train the same dataset two times; one with batch size 4 and another with batch size 8, then you merge both of them and see if the quality increases
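As a rough picture of what "merging" two runs means, the sketch below takes a weighted average of matching parameters. It is conceptual only: real checkpoints hold tensors rather than plain lists, the function name and `ratio` parameter are invented here, and this is not claimed to be how Applio's own merge feature is implemented.

```python
def merge_weights(model_a, model_b, ratio=0.5):
    """Blend two models with identical parameter names.

    `ratio` is the share taken from model_a; the rest comes from model_b.
    Models are represented as {parameter_name: list_of_floats} for illustration.
    """
    if model_a.keys() != model_b.keys():
        raise ValueError("models must have the same parameter names")
    merged = {}
    for name in model_a:
        a, b = model_a[name], model_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [ratio * x + (1 - ratio) * y for x, y in zip(a, b)]
    return merged


if __name__ == "__main__":
    run_batch4 = {"w": [0.0, 2.0]}  # toy stand-in for the batch size 4 run
    run_batch8 = {"w": [2.0, 4.0]}  # toy stand-in for the batch size 8 run
    print(merge_weights(run_batch4, run_batch8, ratio=0.5))
```

This only makes sense when both runs share the same architecture and sample rate, which is the case when the same dataset is trained twice with different batch sizes.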
besides that, you could also try using different batch sizes for the same dataset, maybe batch size 8 is going to work better for that specific one than lets say 4
also, i've seen some people use a lower lr for the discriminator, but honestly every time i change the default lr things don't sound as good as with 1e-4
perhaps a different pretrain can help too https://discord.com/channels/1159260121998827560/1235952130855010365
ooh noted noted, tyvm! ill definitely be trying those
i did see pretrains and that, is it just you get a custom pretrain and tell it to use it when training and it can potentially change the result? i've no clue about them
pretrains are RVC models trained using many hours of audio
each model learned to reconstruct a spectrogram in a different way; for example, the original pretrain learned to reconstruct the spectrogram based on the VCTK dataset combined with pitch shifting.
when you load the original pretrain, what you’re essentially doing is continuing the pretrain’s training with a new dataset.
so if the pretrain has a good understanding of the human voice/spectrogram, by continuing its training, your model will inherit all that previously learned knowledge.
well, when you load a pretrain different from the original, you’re continuing the training of another model that learned to reconstruct spectrograms in a different way, you get the idea
you can enable the usage of custom pretrains by enabling this
but first you have to download a pretrain G and D files (generator and discriminator)
and place them in rvc > models > pretraineds > custom
and then you just select the previously downloaded g and d
u can try these since they're the most recent
everything has to match, like, for example, legacy core 1.6 requires you to select 32k, hifigan and contentvec in the training tab, otherwise it wont work
huh, that is really friggin cool actually! thank you for being so informative with everything
ah and about the lower lr for the discriminator
if you open: rvc > train > train.py
you'll find these
with them you can either decrease or increase the lr of both the gen and the disc
in the past people were experimenting by decreasing the d_lr_coeff to something like 0.5 or 0.7
lr is how fast the model learns
the discriminator is what forces the model to produce realistic audio
i personally don't like to change either of them (1.0 is fine imo)
sometimes the discriminator gets too strong and prevents the generator from producing better quality results, that's why decreasing how fast it learns can sometimes produce better results
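The coefficient idea can be written out as plain arithmetic. In this sketch only `d_lr_coeff` and the 1e-4 default come from the messages above; `g_lr_coeff` and the function name are assumed by analogy, and the real variables in train.py may be named or used differently.

```python
BASE_LR = 1e-4  # the default learning rate mentioned above


def scaled_lrs(g_lr_coeff=1.0, d_lr_coeff=1.0, base_lr=BASE_LR):
    """Return (generator_lr, discriminator_lr) after applying the coefficients."""
    return base_lr * g_lr_coeff, base_lr * d_lr_coeff


if __name__ == "__main__":
    # Default: both networks learn at the same speed.
    print(scaled_lrs())
    # Slower discriminator, as in the 0.5 to 0.7 experiments people tried.
    print(scaled_lrs(d_lr_coeff=0.5))
```

With a coefficient of 0.5 the discriminator takes steps half as large as the generator's, which is the "weaken the discriminator a bit" experiment described here.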
i know everything i said is confusing but they're actual ways to potentially improve a model lol
one thing to keep in mind is that rvc doesn't work miracles: if a dataset sounds bad, the model is always going to sound bad after training on it
hi
i need help making an rvc model. i have the latest version of applio, windows 11 home, 32 gb of ddr4 ram, a ryzen 5 5600x and an nvidia rtx 5060 from MSI. i'm trying to make an rvc model of caine from tadc, since the rvc list doesn't have an italian version of it. my dataset is 1:20 minutes long but i don't know what settings to use. i edited it in audacity to make it sound better and exported it at 48k
hi, I can help you
what settings should I use in applio
use 32k
it's already on 32k, gemini recommended that I use RefineGAN
it's pretty useless, and if I'm not mistaken you'll have problems with inference later, IF I'M NOT WRONG
yeah okay, but you can only use it in Applio
I'd say every 50
I only have applio for making rvc models
then that's fine
I've started training it, I hope it sounds good. my first rvc model, the one of N, sounded bad even after 5 different attempts
Does anyone have suggestions on running an AI channel
which AI is best for animation / fictional characters
<@&1159293204038955078>
what is the software called for these voice changers
what gpu do you have?
3050
ik its cheekz but its a laptop
nah its good
Last update: March 30, 2026
follow this guide
k ty
its a confusing installation, do u think W-okada would be better
no.
Vonovox is simpler, what's confusing to you?
the vc cable thingy and stuff
its easy just install and restart your pc, sadly its the only way
bruh wtf
mxm
i watched this tut now it makes no sense at all
wdym?
i watched a tutorial video
what is a 16inch virtual cable
you should only follow the text guide from the docs
I get so confused, I don't know why 🤦‍♂️
ah, so you're Italian
message me in DMs
a little
bro was so confused they switched languages
AHAHAHAHAAHAHAHAAH
I'm making the model btw
if you're still interested
Can I upload an rvc model I made?
you have to first go through this and if your model is good you'll get accepted to post in voice models https://discord.com/channels/1159260121998827560/1453581938538315847
idk what it means by vocoder, i just trained it through rvc, i did it a year ago so im not entirely sure what it means
asks like if i used hifigan or refinegan
alright i made a submission, was i supposed to upload the model somewhere for someone to test or
what exactly
oh
i got a message
im not sure what embedder i used lol
is the embedder model architecture version?
embedder is most likely contentvec
need to upload the model to huggingface?
idk why but it says it should be a valid huggingface link
i gave it the link to my huggingface upload
https://huggingface.co/Grimden/Daniel_Of_Mayfair this is what i uploaded
the model is not uploaded properly, pth and index should be in one zip file
lol
still gives me this error Model link should be a valid Weights.gg or Huggingface Model Link!
i uploaded it as a zip file...
then do this
you're welcome!
Like I said in chat, if ur gpu is Nvidia download Vonovox and if it's AMD download tg fork
Butt
Why don't u wanna say
You have no need to be embarrassed of why u want to use the voice changer to be w girl
No judgement here, unless it's to troll
Very judgemental there

So you're not gonna troll right?
Good
Here ya go
Was looking for the links
Virtual audio cable, connects the voice changer to games or discord
Nobody can hear it without that
You're welcome
Nah, for the virtual audio cable just run setup64 then install driver
then Vonovox just run start
start.bat
Helppp
So i got the 3090 pc
Its in windows 11
Which is the voice softwares in the server more comfortable with? Windows 11 or 10??
I use Windows 11. 
This is a General AI Discord Server, please elaborate:
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
Real time voice change and voice training
it's outdated, did you come from some old youtube tutorials?
This is a General AI Discord Server, please elaborate:
- your pc gpu
- your pc os
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
ah ok because i grad it then followed YouTube tutorial telling me to go to a different place and then I saw the warning at the top
the link is for the original wokada, it's legit but it's really outdated and not suggested, the warning on https://docs.aihub.gg is related to people trying to impersonate us and not about the tools, or are you talking about another warning?
Last update: April 19, 2026
it would be best you reply to the questions so i can help you get better tools
could you send the tutorial link you're using?
https://www.youtube.com/watch?v=qiRpxHEoZZ4 was this one
That I'm pretty sure has the legit way of doing it the other one I just Google searched to find
it's kind of old, and not even related to the link you sent
the link you sent was about the original wokada, not wokada deiteris fork which the video talks about, i think you just clicked the first link
please answer those questions so I can help you better
and wokada deiteris fork isn't suggested either, it's more "archived"
Yeah like I followed a Google search to get to this one https://github.com/w-okada/voice-changer
Then the video sent me to this one
please elaborate:
- your pc gpu
- your pc os
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
those are needed, there isn't a one singular best app that works for everyone
It's ok now that I know it's safe and I didn't screw up somehow
4060 Ti
win10
Roleplay/dnd stuff
alr i replied here https://discord.com/channels/1159260121998827560/1500451925810872330
are you like looking to do roleplay or trolling as an e girl or eboy?
Doing all the voices for music production/music cover
Roleplay for dnd stuff
thats two , do i pick one or both is needed
Hi need some help how do i stop hearing my self from the voice changer
This is a General AI Discord Server, please elaborate:
- your pc gpu
- your pc os
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
Applio is an all in one but specialized in RVC Inference and Training, you could use it for realtime too but you could also experiment Vonovox which is specialized for Realtime
fr
its not working for me
What's not working? This is a General AI Discord Server, please elaborate:
- your pc gpu
- your pc os
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
isnt applio garbage for training?
i consistently heard people say it produces terrible models
Applio is an up-to-date fork of RVC; the original mainline RVC has been abandoned for years
Perhaps you're confusing it with an issue in the discriminator of an older version of Applio, or with the Overtraining Detector, which didn't work well
just relating what i heard from people
never tried with applio just because what i heard
what did you train with instead then? mainline RVC?
I wonder where you heard about applio being terrible because that seems just wrong to me
i used something else
might have heard it from here but im unsure
Applio is more suggested, did you use something off youtube tuts?
if you say so
what then?
FR
why is the voice changer so laggy
What the audio cable im supposed to use? I just reinstalled windows and I do not remember.
This is a General AI Discord Server, please elaborate:
- your pc gpu
- your pc os
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
This is a General AI Discord Server, please elaborate:
- your pc gpu
- your pc os like Windows 11 or 10
- what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
- the tutorial link used
i found out the issue and fixed it thanks though.
be sure you're not using one of those outdated yt tuts suggesting vb cable for windows
Good morning. In rvc, has anyone had the issue of noise suppression 2 stopping all audio output? My second PC is offline now and for whatever reason I can't get it to output voice
I had it working a few days ago but idk what its doing now
It def works normally when I have suppression 2 disabled
whats your voice changer and what gpu do you have?
Mmvcserversio. Realtime voice changer client for cuda. Nvidia 3070 GPU
you did not tell me the name tho, i suppose this is the original w-okada right?
you can either download vonovox (better for nvidia gpus) or tg developed fork (more like okada)
-rt
Guides for Programs that use RVC Models in Realtime for Calls/Games
A Realtime Voice Changer with similar performance to Wokada Tg-Develop Fork, with extra features, but it supports only Nvidia GPUs on Windows 10/11, unlike other options that have wider support, and it has no cloud options.
A personal fork (modified version) of Wokada Deiteris Fork, it just adds some Quality of Life improvements to it like supporting Spin Embedder and Audio Effects. Don't expect too much about it since the creator made it originally as a personal project.
A Realtime Voice Changer with similar performance to Vonovox & Wokada Tg-Develop Fork, with extra features.
Deiteris' fork (modified version) of wokada that doesn't get updates anymore.
These options are not recommended for use.
Not suggested, older versions in youtube tuts are even way worse. GUIDE
The program is worse compared to the ones above, and much less updated. GUIDE
OK Ill try them both! Thank you so so much for the help!
no problem!
guys what does "epochs" mean
it's just how many times the ai went over the audio used to make a model before it sounded good to whoever was making it :D
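To connect epochs with the "steps" numbers people quote for pretrains, here is a tiny sketch. One epoch is one full pass over the dataset, and each step processes one batch. The function name is made up, and whether a given trainer keeps or drops the last partial batch is an assumption (kept here).

```python
import math


def total_steps(num_clips, batch_size, epochs):
    """Training steps needed for `epochs` full passes over `num_clips` clips."""
    steps_per_epoch = math.ceil(num_clips / batch_size)  # last partial batch kept
    return steps_per_epoch * epochs


if __name__ == "__main__":
    # Hypothetical dataset: 100 sliced clips, batch size 8, 200 epochs.
    print(total_steps(num_clips=100, batch_size=8, epochs=200))
```

So the same number of epochs means far more steps on a large dataset than on a small one.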
ooooh thanks a lot!
you're welcome!
like in real life?
(what are you using? whats the gpu you got?)
please respond to the question i sent and provide me a screenshot of the program so i can guide you
yea dm me the screenshot
that is extremely outdated
This isn't a basic usage question. I'd like to share my idea of why converting a speaker (e.g. A) most of the time doesn't sound that good no matter the dataset. I've read some docs and they don't seem to cover this well; if I missed something I'm sorry, please don't hate me 🙏
I've been digging into the RVC training code and experimenting with different pretrained models (like Titan), and I keep hitting the same wall:
B to B conversion (same speaker) sounds fantastic (where B is the target)
A to B conversion (my voice to target) sounds glitchy, unnatural, and full of artifacts in many cases, even when my voice is similar to the target.
I think I've figured out why (might be very wrong)
If I understood right, RVC's generator consists of four main parts: the TextEncoder takes phonemes + pitch and outputs a content latent; the PosteriorEncoder + flow encode the real mel spectrogram into a posterior latent, then the flow transforms it to match the prior from the text encoder; the decoder takes the latent and the speaker embedding to produce audio.
During training, all these blocks are updated together using the same optimizer
When you fine‑tune a generic pretrained model on a single target speaker B, every part of the generator adapts to that speaker's data, including the TextEncoder.
Even though the TextEncoder only sees phonemes, it learns to produce a latent representation that's subtly tailored to B's vocal tract. The flow and the decoder expect that latent to decode into B's voice. The encoder is never exposed to other speakers during fine‑tuning, so it becomes speaker‑biased.
The result
B to B works great because of the encoder's B bias
A to B breaks: your voice A goes through an encoder that outputs a B‑biased latent, creating a mismatch with the content, and the decoder struggles to produce clean audio
The solution I'm planning
Pretrain the generator on a large multi‑speaker dataset, roughly 60 hours of my own voice (speaker A) and 50 hours from 15–16 speakers, forcing the encoder to extract only content
then freeze the TextEncoder and finetune
I still haven't tried this and I'm curious about the thoughts of people who are probably more experienced than me and may have already tried this.
And why isn't freezing the text encoder a native option?
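For anyone wondering what "freeze the TextEncoder" would look like mechanically, here is a framework-free sketch that just splits parameter names by prefix. The names `enc_p`, `enc_q`, `flow`, `dec` and `emb_g` are assumed VITS-style labels used for illustration and may not match a given RVC fork; in a PyTorch trainer the frozen group would typically get `requires_grad` set to False and be left out of the optimizer.

```python
def split_trainable(param_names, frozen_prefixes=("enc_p.",)):
    """Split parameter names into (trainable, frozen) lists by name prefix."""
    prefixes = tuple(frozen_prefixes)
    frozen = [name for name in param_names if name.startswith(prefixes)]
    trainable = [name for name in param_names if not name.startswith(prefixes)]
    return trainable, frozen


if __name__ == "__main__":
    # Hypothetical parameter names, one per generator block.
    names = [
        "enc_p.encoder.weight",  # text/content encoder (to be frozen)
        "enc_q.pre.weight",      # posterior encoder
        "flow.flows.0.weight",   # flow
        "dec.conv_pre.weight",   # decoder / vocoder
        "emb_g.weight",          # speaker embedding
    ]
    trainable, frozen = split_trainable(names)
    print("frozen:", frozen)
    print("trainable:", trainable)
```

Only the posterior encoder, flow, decoder and speaker embedding would then keep adapting to speaker B during the finetune.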
the only pretrains I'd use currently are Legacy core 1.5 and 1.6, and the new PABP one although it is a bit new and could be noisy
i mean, the point is not the pretrained itself
titan, ov2, rin3 are very old and cause harmonic distortions
its the way you fine tune it
iv tried almost all of them at this point
freezing the text encoder isn't native as it could cause issues to casual users who don't know much on rvc training
how do you clean your datasets btw and what are the usual lengths?
freezing te is not a bad idea, finetuning is a bit aggressive in rvc and you want to preserve a lot of things from the pretrain
i have tried freezing it before but i didn't notice improvements tho
usually already almost studio quality, from 5 to 30 hours, ive tried many sizes
oh
I mean do you remove room reverb, echo if there is any, ect
yea if needed but thats rare
ah kk
as almost always they need no preprocessing
multispeaker models have natural timbre leak sadly
og pretrain and every other leaks
tried with a generic pretrained?
yes, with the original pretrain
my idea was to from scratch train a multi speaker pretrained that has 60% of its dataset my voice
to see if on my voice it would sound less ai fart due to latent mismatch
(my first big explanation msg might be wrong, so correct me if i say something wrong)
it wont work as you expect, the other speakers are going to affect the timbre of your voice in the model
even if your voice is dominant
it still sees the other speakers
its quite bad sometimes yeah
when you finetune the pretrain that gets fixed, since every speaker in the pretrain gets overridden by that single speaker
but the actual pretrain itself leaks
thats what im actually trying, training a singular speaker (ljspeech) from scratch, then finetune other speakers
but i havent finished the training bc i dont have money lol
but what i think is that if you do A to A alone, the latent leaks like crazy, no? to make the dec's job easier
with that experiment i confirmed that yeah, without any other sids, ljspeech doesn't leak
Hi, so is elevenlabs using the same RVC for speech to speech? Its S2S is amazing with the same dataset.
hmm i haven't tried this yet
like, finetuning the same speaker found in the pretrain?
no
like
if i train a model from scratch with, let's say, 70 hours of my speech, then the latent is most likely leaking my timbre so that the dec has an easier time, so when i finetune it on a speaker B to get a better A to B, the dec has to fight the enc
I don't think so no, if the models cannot be used in applio or in realtime it's most likely something different
never tried this so i am not sure
but its a possibility in my head
well i tried finetuning my undertrained ljspeech pretrain with a voice that sounds very different to her and that model had no leakage
but i haven't tried it with a fully trained model from 0
okay so no leakage
but i assume lj is different from your voice right
im not using nsf-hifigan as the decoder/vocoder tho, and we found that vocoders do have an impact in leakage
always used hifigan
yea i finetuned some random anime character that doesnt even speak english, no leak
because my goal is to get the model actually reliable on my voice specifically. even if you train a model with 0 leakage, if the voice of the target isn't similar to yours you will always get problems
sure, feeding it your voice may not leak, but it still has to be perceptually good and reliable
true hmmm
that's why i thought about my plan of training a single speaker or multi speaker from scratch, so the text enc is good on my voice and is also hopefully invariant
so then i freeze it and fine tune the rest
the flow and dec are forced to map my voice to target
you should try your first plan, even if the multispeaker have some leaks, it still knows your timbre
yea
also you said that vocoders have their impacts, what do you suggest?
i tried refinegan but could never really say if it's really better than the others
keep using nsf hifigan, i tried training refinegan pretrains and they all suck, doesn't sound natural
no problem and good luck, i really never managed to ever train a functional nsf hifigan from scratch, the model always got stuck at some point where it refused to do any changes to the quality
so 1m steps and 1.5m basically sound the same
how much time it took to get there?
4 days
i feel your pain
the original pretrain was trained for about 2.5m steps
on a single gpu i imagine
okay
solid
maybe you should try finetuning the original pretrain first and see if your new "finetuned" pretrain + freeze TextEncoder helps
Sure, that's an experiment I can do, even though I doubt it will be reliable on my specific voice
italian
yes
thats it youre banned
🙏
aah but train your pretrain dataset using the og pretrain as base, thats what i tried to say (my english is terrible lol)
https://discord.com/channels/1159260121998827560/1235952130855010365 nothing here is trained from 0, everything is a finetune lol
so you are saying like, take my 60 hours dataset of me and finetune a pretrained?
exactly
i dont think that would help
maybe with a low lr
but that would make the enc biased to my voice, which is good, but then the enc might start leaking my timbre into the latent as that is the easiest path to reconstruct my voice
so the dec fights enc anyway

it would need to undo my timbre then
maybe you can do a single speaker pretrain if you augment the data?
like heavy augmentations that put pressure on the enc to keep only content
never tried that, idk
i mean it makes sense
i dont know add like formant, eq and pitch shift (must pay attention on that tho)
and also change sids if you augment
and with those experiments bye bye 50kW
@analog obsidian have you ever tried my pretrain?
you did the italian 32khz no?
yes
i used it
its alright
depends though
i had some voices that were kinda trash
but never had leakage
honestly the main issue is still always the same: it's not reliable on my voice
why?
not because it produces the wrong timbre but because of latent mismatch
if you haven't read it, i wrote my theory here of why, when you finetune a pretrain for A to B, it almost always doesn't sound reliable, or glitches out on some words / sounds unnatural most of the time
holy i can't write right
sorry for any misspelled words
Heya. im new to this hub . i tried researching first before asking here. using gpu 4070ti/windows 11. my question : i have decent experience using rvc, trained my own models and used the old wokada for the longest. and after a decent amount of tries i hit a plateau and im looking to improve my model. right now im trying to use the applio to train my model instead of rvc v2. which are the best options to pick from sample rate/ vocoder/embedder. custom pretrain etc.
[gpu_processing] cv2.cuda.GpuMat exists but missing: createGaussianFilter, resize, cvtColor – falling back to CPU.
mhh couldnt that be a model problem?
well yes me of the past, hence why the pretrain theory
maybe it wasnt trained on clean data(?)
?
i use studio quality
oh well
tried even datasets also similar to my voice, similar accent and all
never fixed this issue
deeplivecam error, its using my cpu instead of rtx 4070, i used to use this and get 30 fps now i get 3
no idea
what is deeplivecam?
indeed
hes a catfisher
wtf
pretty obvious
even if he is what even is deeplivecam 😭
face swap live
a thing to live render people faces, deepfake category
no 
do it
@analog obsidian for the multispeaker pretrain I might have an idea to avoid leakage and still train it to be very good at a specific voice. Let's say I have 60 hours of voice A and, idk, 50 hours from 15 other speakers. Since my goal is a pretrain that is very good at voice A but also invariant, maybe we could modify the batches to load, idk, 1000 clips of A, 1000 of B, and so on for all speakers. Even if speaker F has only about 3 hours compared to A, the batch has the same amount of each, so if the model tries to learn a specific speaker it would hurt the other speakers, and since the gradients have the same weight it might work better.
hmm sounds like a good idea to me, but never tried this so im not sure if it's going to reduce the leakage, but in theory, it should like you say
Yea same, would need to be tested
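The speaker-balanced batching idea could be prototyped roughly like this. It is an untested-in-training sketch with invented names: it draws the same number of clips per speaker and samples with replacement, so a speaker with only a few hours still fills its quota (at the cost of repeating clips).

```python
import random


def balanced_batch(clips_by_speaker, per_speaker, rng=None):
    """Build one batch with exactly `per_speaker` clips from every speaker.

    `clips_by_speaker` maps a speaker id to its list of clip identifiers.
    Sampling is done with replacement so small speakers can fill their quota.
    """
    rng = rng if rng is not None else random.Random()
    batch = []
    for speaker in sorted(clips_by_speaker):
        picks = rng.choices(clips_by_speaker[speaker], k=per_speaker)
        batch.extend((speaker, clip) for clip in picks)
    rng.shuffle(batch)  # avoid feeding the speakers in a fixed order
    return batch


if __name__ == "__main__":
    # Toy data: speaker A has far more clips than speaker F.
    data = {
        "A": [f"a_{i}.wav" for i in range(1000)],
        "F": [f"f_{i}.wav" for i in range(30)],
    }
    for speaker, clip in balanced_batch(data, per_speaker=4, rng=random.Random(0)):
        print(speaker, clip)
```

Whether equal per-speaker counts actually reduce timbre leakage is exactly the open question above; the sketch only shows the sampling side.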
Oh my goodness you're my hero, this new vonovox is so much better quality. Thank you again
no problem!
[VCClient] wait web server...340 http host wouldnt load
that's extremely old what gpu do u have? (Nvidia or AMD) and what do u plan on using it for?
im trying to train a model using replay (i have a 3060 12gb and im using windows 11) and i get this error
Error creating model: CUDA error: unspecified launch failure. Search for `cudaErrorLaunchFailure' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information. CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
how would i fix this
i have auto epochs turned on
replay still gives the same error with auto epochs turned off
im on version 8.7.2





