#✨│ai-help
good luck 
I know how to use kaggle and wokada tg but not the browser version like this
ye
Aight
actually no not if u have it already
Ohh oke
So this means when i wanna vc with the voice changer i gotta use discord on my browser as well

the Intel ARC GPUs also bad? I mean a CPU/APU I can get, that is bottom of the performance barrel, but the dedicated GPUs also?
Oh yea you did say that mb😭
dedicated gpus just cannot run ai at all
not locally
then what do they even have all that VRAM for xD
honestly no clue lol
Konichiwa
i mean i will put my own ai models, i know how but generally spongebob mario v1 from ultrakill etc.
mainly from games and shows
wonderful! was checking because a lot of ppl join here for weird stuff
here's the downloads!
first the voice changer second is a virtual audio cable
really? bruh
i just like to roleplay my favorite characters
so dw
and thanks for software!
no problem! if u have questions lemme know
yea, ppl can be weird lol
Does anyone who do models take suggestions? My pc isn't good for training so :/
Wasn't there a server network for finding voice models a while back? I remember there was a separate server where you could get german voice models but i can't find it, does anyone have the link?
When I click a Hugging Face link to download a voice model I get: Invalid username or password.
of course! I've made a tutorial video, it's not on yt but I can send it here
what are you trying to download?
- Goal: TTS
- Specific Issue: doesn't work
- Full GPU Name: MSI Geforce rtx 5070ti 16gb
- Operating System: win 11
- Tutorial Link used:
what are you using for tts?
could you show me the link you downloaded it from? Applio is the best tts if you're using voices from here https://discord.com/channels/1159260121998827560/1175430844685484042
oh yeah real time i think
real time is for talking into it and it makes your voice a different one, that's what you're wanting right?
like a voice changer
it's alright
since you have a 5070ti I will get you Vonovox, it's the current best for Nvidia
the first link is the voice changer
the second link is a virtual audio cable to connect the voice changer to discord or games
<@&1159293140440723499> hacked account
if I had my mod status back I could deal with this. ugh
+yo
Hey guys. How is everything going?
Need help, seems this setup is no longer working.
MMVCServerSIO
is it better than the one i have rn?
when the cmd pop up i dont understand how i can open the voice changer
massively better, vonovox is the best currently for realtime as it has improved quality and is much more optimized so it will run better
what gpu do you have (Nvidia or AMD) and what are you using it for?
GPU: NVIDIA GeForce RTX3050.
Im using it for voice dubbing for shorts clips 😄
i did it
cool! u should try out Vonovox, it's currently the best
https://huggingface.co/dr87/vonovox/resolve/c8034f5f6d50648a8109bb4f847182362e2b779b/Vonovox_beta_17_11.zip
don't you want to upgrade in case you're using something old?
Do i need to delete the files that I have? and DL vonovox?
it would be good to delete it in case the old one messes with Vonovox for whatever reason
u can keep the models u were using tho
yea!
you also have installed vac lite yes?
what is that
by any chance do you have a walkthrough vid how to use it(connecting to other apps and uploading models)?
it's the second link I sent
i already have virtual cable
what part are you confused on, just making sure so I can show you the right part
It's just I have an amd gpu so
ill run vonovox first then ill get back to you 😄
alrighty
did you download it from here? https://discord.com/channels/1159260121998827560/1175430844685484042
yes
clicking the link shows it's not working
that means it was most likely deleted, nothing can be done
F
you could always remake the model yourself, I made a guide in the vonovox server that shows how to get the voice you want and how to train it
i dont have start.bat on my file. Which one will I use to run vonovox?
I just want a good meiko and kaito, adachi rei and like narita taishin 😔
wdym you don't have start.bat? try redownloading it, that file should come with it when installing
it should be fine to run on Kaggle, which is in browser
follow this
- Goal: realtime
- Specific Issue: not sure this is the right spot to ask, but i downloaded vonovox and tested a model, and now when trying to move/rename the folder it's in, it doesn't work and it says 'the folder or a file in it is open in another program'. not sure if this is a usual problem but i cant find anything in my task manager that's holding this up (after closing vonovox and everything i could find)
- Full GPU Name: RTX 4060 ti
- Operating System: windows 10
tts? vonovox is for real time, like talking into it and it saying the same with whatever ai voice you chose
sorry right i meant real time
my bad
i think its now running. how can i monitor the voice?
all good, you can rename a voice in vonovox by right clicking it I believe inside the app
Where are the sounds located?
models?
in a moment I can show you, but go to control panel > sound > recording, open the virtual audio cable's properties, and enable "listen to this device" on the listen tab
i mean the overall folder where vonovox is stored. im more wondering because it would mean there's still a file running in my system from vonovox thats hidden somewhere in my programs (or so i think). not sure if its been a general problem or not. maybe its just a one time predicament ive gotten
I'm unsure tbh, I think it would be best to ask in the official vonovox server
alright thank you
I need a link to a website that contains audio files so I can search for them on Google.
what are you looking for?
you're very welcome!
The best voices in terms of quality and performance
if you want voices they're here
are you confused?
What is the difference and which is better?
Im all good now. Thanks @viral mason
quick question, does it matter whether I train a voice model on applio or mainline rvc
like, is it gonna sound better? train faster?
also, is using a pretrain a viable option for creating a voice model of a specific character? or is it just going to make it sound like a random person instead?
how do i install the program?
The only differences are the settings and features, but training code is shared anyway. My vote is for Applio
You always want to use a pretrain. You can't train a particular voice from scratch without one. It won't work like that
The pretrain contains a lot of knowledge, including how to generate the spectrogram at all. Then fine-tuning it for the particular voice is what makes it just learn to sound like it.
any good local RVC applications suggestions for macbook air m4 ???
Retrieval-based voice conversion or realtime voice changer? And what will you use the software for?
a realtime voice changer and im planning to show it to a friend to spread awareness on how dangerous it could be
i think a retrieval based one would work but im not sure if i can send the clips so i would prefer a realtime one

????
There's Tg Develop's voice changer for Mac Silicon. Follow the guide https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/#mac, and download https://github.com/tg-develop/voice-changer/releases/download/b2364/voice-changer-macos-arm64-cpu.tar.gz.
Last update: April 15, 2026
thanks a lottt
guys is this server purely for rvc? or can i ask about other ai too
it's general, so feel free to ask
how do you fine tune it?
just train a model using a pretrain
that's effectively finetuning the pretrain for a particular voice
this is pretty much how all models in #1175430844685484042 are made
how do I know which one to use?
!help-template
To receive assistance, you must provide your system details. Copy and paste the block below into your reply and fill it out.
⚠️ NO INFO = NO HELP
- Goal (e.g., TTS, AI Covers, Roleplay):
- Specific Issue:
- Full GPU Name:
- Operating System:
- Tutorial Link used:
• Check Docs: Many fixes are in the AI Hub Docs.
• Be Specific: Say "RTX 3060 12GB", not just "NVIDIA".
• English Only: Keep all discussions in English.
• No assistance for NSFW/Porn or ANY Illegal Activities.
• Read the [Full Guidelines](#1402790586028789830 message).
it was purely for RVC, now its general ai
what to use instead of weight ai for covers
What is your PC GPU? There's Applio RVC.
gtx 960
- Goal (e.g., TTS, AI Covers, Roleplay): AI voice conversion cloud using Kaggle
- Specific Issue: There's an output error in the log
```
2026-05-17 13:36:41.721330963 [E:onnxruntime:Default, cudnn_fe_call.cc:33 CudaErrString<cudnn_frontend::error_object>] No valid engine configs for ConvFwd_
2026-05-17 13:36:41.722017527 [E:onnxruntime:, sequential_executor.cc:516 ExecuteKernel] Non-zero status code returned while running Conv node. Name:'/hubert/encoder/pos_conv_embed/conv/Conv' Status Message: Failed to initialize CUDNN Frontend/onnxruntime_src/onnxruntime/core/providers/cuda/cudnn_fe_call.cc:99 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudnn_frontend::error_object; bool THRW = true; SUCCTYPE = cudnn_frontend::error_code_t; std::conditional_t<THRW, void, common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cudnn_fe_call.cc:91 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudnn_frontend::error_object; bool THRW = true; SUCCTYPE = cudnn_frontend::error_code_t; std::conditional_t<THRW, void, common::Status> = void] CUDNN_FE failure 8: HEURISTIC_QUERY_FAILED ; GPU=1 ; hostname=c2e0cb52600b ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/nn/conv.cc ; line=225 ; expr=s_.cudnn_fe_graph->create_execution_plans({heur_mode});
```
- Full GPU Name: GPU T4 X2
- Operating System: Windows 10
- Tutorial Link used: https://docs.aihub.gg/realtime-voice-changer/cloud/tg-develops-w-okada-fork-cloud/
- Exact time when it happened : after i click the model to start the ai voice conversion (on ngrok GUI)
hello
anyone have experience dealing with enterprise AI plan , i need advice on which option is better for enterprises using AI , a custom wrapper that have DLP( we need this) or enterprise plan purchase is a better option in any case
Is weights.com working? I can't access it.
Did you miss the news? Weights.com shut down a few months ago.
Yes, but I can't access the previous version of Replay?
anyone know why this happened?
@viral mason
i know you use kaggle
Yo! I use Kaggle, are you trying to use it for simple conversion like ai cover or realtime talking?
oh
nvm
i've used it before, like 3 months ago, never got this issue
have you looked at the guide to see if it may explain any issue you may have?
I know nothing at all on how to run the voice changers on kaggle just applio
Yes, but I can't access the previous version of Replay
which one is the troubleshoot guide for realtime ai voice kaggle?
@hallow thistle
When I go to Weights.com and I get the “Page Not Found” message. I need to access the previous version of replay. I’m on Windows 10
How do I make this sound less fucky </3
I used replay and I didn’t touch anything else
@viral mason
why yall use replay its garbage
When I go to Weights.com I get the “Page Not Found” message. I need to access the previous version of replay. I’m on Windows 10 with an Intel Core 13th generation
Sorry if I sound mean, but I just think those who go for Replay mostly prefer ease of use over having many features like in Applio RVC, and that's kind of understandable, at least to me.
I’m trying to transfer the song AI cover, and when I go to Weights.com to get the latest version of Replay, I get the page not found message
Has anybody else had this problem?
Thats what im used to using!!
What do you use?
replay is ok but there's also ai cover maker
Whats it calleddd
ai cover maker 😭
Hold on please
AI Cover Maker (or AICoverMaker) is basically another RVC fork.
Im back
Tyy
you're welcome! I haven't tried this but I heard Nick say it's like replay
im getting 190 perf, using 196ms only with 2.7 sec
Teach me for a.i cover
NOTE: Some RVC-Related HuggingFace Spaces got Paused by HuggingFace Staff without a response, such as Ilaria RVC Zero and Applio (old)
Huggingface Space by r3gm
Huggingface Space by IA Hispano
HuggingFace Space by Nick088. NOTE: Paused by the creator as you need to duplicate your own space.
the easiest option is to use either ai cover maker or replay, but Applio is also an option, tho that one requires you to separate the vocals and music for it to work properly
Hi i have been using vonovox and wokada for a few months now, but i want to begin learning how to make/train my own models, is there a tutorial video or some sort of a guide for that somewhere online?
Hi there! I made a super easy to understand 2 part tutorial for the Kaggle version of Applio, kaggle is like google colab but much better as it gives 30 hours a week for free users
applio is the main training software used for model making, lemme know if you're interested
can you send me a link for that tutorial?
MVSEP performs separation of audio into vocal and instrumental parts, extracts text from audio and it is free. Uses Artificial Intelligence.
I used these but error
if you follow the kaggle tutorial as shown there should not be any errors
Can you send it
Do i ask here for help?
yep
Hello, the bot told me to post here. Can anybody with the model maker role review my model and, if you like it, upvote it? It's a Lil Baby model
how do u make voice models for free? is it possible
- Goal making a voice
- Specific Issue: idk how gng
- Full GPU Name: AMD Ryzen 9 9900X processor, 64GB DDR5 RAM, a fast 2TB NVMe SSD plus 4TB additional storage, powerful Radeon RX 9070 graphics, and liquid CPU cooling. Comes with Windows 11 and Microsoft Office installed.
- Operating System: windows
- Tutorial Link used: I NEED ONE
can any1 send me a voice changer link
can anyone help me test my new ai chatbot site?
Yo I saw your friend request, I can help with this in the morning when I wake up bro
I'll be able to help you in the morning when I wake up, DM me in about maybe 10 hours or so
It's 1 am for me rn
Hi everyone, i have been working on a new AI project and i think it's finished, but I would love if someone could test it and share their thoughts, thanks!
Read up the applio ai hub docs guide
-rvc
- Goal: Real time voice changer for Roleplay
- Specific Issue: I was able to run the okada fork from AI HUB before, but I have started to face issues with the voice changer. It converts perfectly on the server option but gets squeaky or noisy when I try to run it on the client option.
- System Specs: Running on an Apple M4 Air
- Tutorial Link used:https://docs.aihub.gg
which voice changer are you using?
anyone help?
- Goal (e.g., TTS, AI Covers, Roleplay): Create a voice model
- Specific Issue: I don't know how to start
- Full GPU Name: RX 6700 XT
- Operating System: Windows 11
- Tutorial Link used: None
- Goal (e.g., TTS, AI Covers, Roleplay): AI RVC Roleplay
- Specific Issue: already selected input device, but still asking for one
- Full GPU Name: Tesla x2
- Operating System: Kaggle
- Tutorial Link used: https://docs.aihub.gg/realtime-voice-changer/local/tg-develops-w-okada-fork/
maybe anyone know which setting i've missed, why does it still ask for my audio input device when i already have one?
use the docs
who can help me get visuals for my tiktok
<@&1159293204038955078>
I can help with that, I've got tutorial videos on how to clean audio for a dataset and how to use Applio on Kaggle to train a model
what are you trying to use the voice changer for? btw that video is very outdated
all yt tutorials are old
Note: they're using one of wido's videos apparently.
tags of concern: #trolling #egirl #roblox
also the thumbnail is concerning, but not as much as their latest
how so? that video says "egirl ai voice changer"
that isn't allowed if you're using it for that reason
<@&1159293140440723499> (this may need investigating / judgement)
The video's transcript shows the following:
that's catfishing and whatnot
no accusations, just saying if that's the reason then no help
people often come from those weird tutorials and want to do bad things, so it's common to ask
It's just concerning. I'm not accusing you. Just saying its sus.
by using a normal model maybe? like Goku or something that interests you
would be funnier
Imagine u call ur friend and u answer as Spongebob
one sec, I'll get the links
here ya go, first link is the voice changer, and second link is a virtual audio cable to connect it to games or discord
u need both
for the virtual cable run setup64 and then install driver
and after for the voice changer run mmvcserversio
once u install it I can help with that
not currently
the program isn't difficult to use tho
u did these two things right?
yep that file
👍
did u extract the voice changer btw
both need to be extracted before running anything
lol
it's downloading everything needed for it to work, then it should open up in browser
Hi, how much does my microphone quality itself matter? I currently have a one year old headset and I use the microphone of it (its not trash, but not good either for sure), but i will soon get a dynamic mic. Will it improve the quality of the voice models? I sound a bit robotic sometimes, and it makes weird noises sometimes, no matter what settings i use in vonovox.
it's stuck?
try clicking the command window and hitting enter a few times, idk if that helps but interacting with it sometimes does stuff for me
cool it loaded
u use windows btw right?
the input (your voice) gets resampled to 16khz, that is cellphone quality, then the model uses the phone quality audio to extract the phonemes and do the voice conversion
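For anyone curious what "resampled to 16khz" means in practice, here is a rough Python sketch. It is only an illustration under assumptions: `resample_linear` is a made-up helper that uses plain linear interpolation with no anti-alias filter, and it is not what any voice changer actually runs (real tools use proper band-limited resamplers). The takeaway is that a 16 kHz signal can only carry frequencies up to 8 kHz, so a cleaner mic helps mostly by giving the model a clearer signal inside that range.

```python
import math

def resample_linear(samples, src_rate, dst_rate=16_000):
    """Naive linear-interpolation resampler (hypothetical helper, illustration only)."""
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the source signal
        lo = int(pos)                          # sample just before that position
        hi = min(lo + 1, len(samples) - 1)     # sample just after (clamped at the end)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out

# One second of a 440 Hz test tone recorded at 48 kHz (made-up input)
src_rate = 48_000
tone = [math.sin(2 * math.pi * 440 * t / src_rate) for t in range(src_rate)]

resampled = resample_linear(tone, src_rate)

# Highest frequency a 16 kHz signal can represent (half the sample rate)
nyquist_hz = 16_000 / 2
```

One second of audio goes from 48,000 samples to 16,000, and anything above `nyquist_hz` cannot survive the conversion.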
having a better microphone helps the model stabilize its outputs, leading to better pronunciation, and in some cases better quality too
is every model you tried robotic? if other models sound better it means the model you're currently using was trained using a bad dataset and perhaps it's overtrained
almost
mainly audio settings
run setup64.exe
not the one with the a
then install driver
make sure in ur audio settings over to the left to set your regular mic back to default
you should be
so in the voice changer make your input your regular mic, then output your new virtual microphone (line 1)
the what?
set your gpu to your AMD card, and fix the input and output if u need to
it's easier to check using discord before going to other ppl
do u have the settings right on other apps?
Every model is robotic for me, and i have tried a lot. I also used vonovox audio player to test out my voice models with other peoples voices from random english YouTube videos i recorded. Then the random "weird noises" and the "robotic feeling" was gone. I dont think the problem is the voice model itself, because in that case the voice models should be bad with other peoples voice as well, right?
i see, i think vonovox has a noise suppressor vst, but if you already tried that then yea it could be that your current headset microphone is not clean enough for the model to understand what you are saying
new mic should in theory fix it
How do I inference my audio (Applio Google Colab method) without the automatic "Instrumental" separation? just "cover" the audio as it is
I have a 4080 Super. What real time voice changer should i use?
- Goal (e.g., TTS, AI Covers, Roleplay): AI cover
- Specific Issue: replay isnt outputting converted audio
- Full GPU Name: nvidia 4070 ti
- Operating System: windows 11
- Tutorial Link used: none
it worked just fine like a month ago and now all of a sudden its not working anymore
share the downloaded program you’re using
If you got a supported gpu then the gpu
Replay
Replay is unsupported, it’s abandoned since weights shutdown forever, use Applio instead
-rvc
Is there any voice models i can download for applio? Is there a website for that or here?
#1175430844685484042, it's better if you elaborate
hi, i downloaded ollama and it runs well, but i want to know how i could teach my model specific data on windows 10
It says select voice model but i don't have any ready made, is there a well trained voice model here for applio so i can test how applio works
Hey is there a number of audio and audio length limit in applio voice training?
no
what the recommended amount of training
you need to use the help template
Hi everyone, I’m trying to create an English version of a Japanese song while keeping the original instrumental, melody, tempo, and structure.
What is the best possible way to it? Thanks
for free?
i dont think theres free programs
Applio RVC notebook?
yep
Does anyone know what sort of prompt/site/software people use to make those AI brainrot reaction images that are all over IG Reels? Those ones that are all like "when I step on the mango mustard golden dandelion 67 CITY BOII" or whatever
Hey is there anyone with just 5 minutes to get into a call. my head is so messy right now cuz im getting so frustrated, if someone would be willing to help me id really appreciate it
why do the sliced audio files sound low quality
When I listen to them
but my actual dataset sounds higher quality
anyone know
ChatGPT's image model can be used to make brainrot images as far as I know, but Google Gemini's Nano Banana and some other image generators are honorable mentions as well.

I posted some reference pics in the thread I started. Will start with taking some of the original/base pics to plug in as references and then prompt for things such as lightning and laser eyes lol
i need someone with chatgpt pro
Good luck. If you'd like to show your own generations, you can send them to #🏙│ai-images, but you'll only get image perms once your username turns blue here.
What is it about?
So im having so much trouble with cognitive overload, i cannot think straight, mainly because my interests are so many, and i jump all over the place. Ive tried physical planners and more, but i just cant seem to map it out clearly. Therefore ive started working with AI, such as adaptive.ai and perplexity computer, but there is just so much going wrong and its so frustrating. My goal is to make a basic AI agent / assistant who can remind me, relieve some of the stress and give me clear structure
that isn't allowed <@&1159293140440723499>
yo
Thanks
hi
if u have a 4060 u should be using Vonovox for realtime
also if u are using it in games turn down the graphics of the game to see if that improves performance
why is it glitching and sounds robotic
not sure, what GPU do u have? (Nvidia or AMD) and what are u using the voice changer for
ok cool, u should use Vonovox, you probably downloaded something off a very old youtube tutorial, what kind of voices are u using btw, Anime characters or maybe Spongebob etc
I use okada and yeah lets just say spongebob
wdym let's just say?
like I use it
you're not using it for anything weird right?
wdym?
idk u made it sound sus lol
nah just for trolls
eh?
alr what do I use
wdym troll
spongebob
what voicechanger should I use
this is the newest and best realtime voice changer for Nvidia
second link is a virtual audio cable like VB cable to connect it to discord or games
if u have one already u don't need it then
how good is it?
it's the best currently. Quality is much better, pretty much no robot sounds unless you're using an old model that wasn't trained good, and it's very fast
how do I open it
run the file called Start, make sure to extract it
does anyone know how to get a better ollama UI than cmd?
LM Studio?
what kind of trolling?
I mean what kinda models are u using
like anime, Spongebob, something cool like that
oh
it's not allowed to do that here <@&1159293140440723499>
should have read the rules
Okay, so I noticed that all, not most, ALL online AIs are greedy paid services requiring paid monthly plans to even use them properly, and even if local AIs exist their quality isn't as good AND worst of all they require 64GB+ RAM PCs which obviously not a lot of people can afford.
As someone who wants to use AI as a tool without paying yet has a modest gaming PC what is the best way to use AI for game dev and animation(bonus points if I can train AIs as well as use references)
For training the voice model
Vonovox is only a voice changer to use the voice models, Applio is what is used to train them
In mp3 i was able to download a 30minute video in 64mb
But in youtube to wav online converter the result is 285mb
Which should i pick
mp3 aren't suggested, it's best to use Wav and download with yt dlp
file size isn't what matters it's quality
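A quick back-of-the-envelope check on those two file sizes, as a Python sketch. The WAV settings here (44.1 kHz, stereo, 16-bit) are assumptions, since nobody in the chat said what the converter used, and both helper names are made up for illustration. Uncompressed audio is just seconds × sample rate × channels × bytes per sample, so a large WAV for a 30 minute clip is expected.

```python
def pcm_wav_bytes(seconds, sample_rate=44_100, channels=2, bits=16):
    """Approximate size of uncompressed PCM audio (ignores the small WAV header)."""
    return seconds * sample_rate * channels * bits // 8

def avg_bitrate_kbps(size_bytes, seconds):
    """Average bitrate implied by a file size and duration."""
    return size_bytes * 8 / seconds / 1000

thirty_minutes = 30 * 60  # seconds

# Assumed CD-style settings: about 317.5 MB for 30 minutes
wav_size = pcm_wav_bytes(thirty_minutes)

# A 64 MB mp3 over 30 minutes works out to roughly 284 kbps on average
mp3_kbps = avg_bitrate_kbps(64_000_000, thirty_minutes)
```

Under those assumed settings the reported 285mb WAV is in the expected range for a clip of about that length, while the mp3 is smaller only because it discards audio information to compress.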
I noticed the video's audio has background noise, what's the best tool u guys use to clean the voice, or is having background noise fine?
I personally use this site https://mvsep.com/en/home
I made a cleaning tutorial on how I usually go about my models
Whats your youtube channel linkk
this is the tutorial I don't use yt for much
That is the best one right? Can i use that offline like local download
For training what epoch is the best quality
there isn't one, just train it until it sounds good
If i export the trained one can i retrain it again with more audio if the result export wasnt good
wdym?
The trained voice model in applio
Is there an alternative that is free or has free credits/trial
uvr is free, also mvsep is free too
But for mvsep the export to wav needs credit
Oh wait
Its only the wav32bit
But now my wav file it says file is too large
it doesn't tho
convert the audio to flac
Is uvr as good as mvsep
Also for the voice model from applio
If i export the first trained voice model can i retrain it again with more audio if the result export wasnt good
Also what do u use to download youtube links to wav?
I use Yt Dlp
What command do u use via cmd to download the wav file of a youtube link?
yt-dlp.exe -x --audio-format wav "<yt link>"
it requires the app tho so you need to download it
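If you would rather script it than type the command each time, here is a minimal Python sketch that wraps the same yt-dlp flags (`-x` to extract audio, `--audio-format wav` to convert). The function names and the example URL are placeholders made up for illustration, and it assumes yt-dlp is already installed and on your PATH (it also needs ffmpeg for the conversion step).

```python
import shutil
import subprocess

def build_wav_command(url, exe="yt-dlp"):
    """Build the argument list for: yt-dlp -x --audio-format wav <url>."""
    return [exe, "-x", "--audio-format", "wav", url]

def download_wav(url):
    """Run yt-dlp if it can be found; returns True on success (hypothetical helper)."""
    if shutil.which("yt-dlp") is None:
        print("yt-dlp not found on PATH, install it first")
        return False
    # Passing a list (not a single string) avoids shell quoting problems
    # with characters like & in video links.
    result = subprocess.run(build_wav_command(url))
    return result.returncode == 0

# Placeholder link, swap in the real video URL before calling download_wav()
example_cmd = build_wav_command("https://www.youtube.com/watch?v=VIDEO_ID")
```

Calling `download_wav(...)` with a real link should leave a `.wav` file in the current folder.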
Goal: Roleplay, Specific Issue: Not seeing virtual devices, GPU: Nvidia 4090, OS: EndeavorOS (Arch Linux), Tutorial link used: tg-develop rvc setup. More detail: I'm trying to get this to work in arch linux, and while I feel I'm almost there, there's just no sound that ever comes out of here and I don't have any virtual audio devices showing up in discord.. I've tried everything I know how, but I'm surely doing something wrong here. I followed all the instructions on the site for the tg-develop rvc setup (cuda) but am just stuck now because i think its just not hearing my mic, and it's not seeing my virtual cable despite pipewire being installed. Any ideas?
I'm not too familiar with linux myself, so this may not be very helpful
Can you see if the microphone is working outside of the voice changer?
Does the browser have permissions to use the microphone and speaker?
Of course, is it possible to use the virtual cable you created with pipe wire at all, as in; did you test it outside of pipewire and the voice changer?
Basically, just try testing things like and narrow down under what exact circumstances things stop working, because then it will be easier to fix.
I should add that, one thing I don't understand about the guide is that, for linux it makes people download portaudio, however you can already 'wire' audio (send it from one application to another) using pulseaudio which is in most linux distros. Also this utility will let you work with that more easily https://github.com/theRealCarneiro/pulsemeeter
where can i download the latest vonovox software
It's best to get it from the official discord server but this is the most recent beta, much better than the current full release
https://huggingface.co/dr87/vonovox/resolve/c8034f5f6d50648a8109bb4f847182362e2b779b/Vonovox_beta_17_11.zip
Hi Rumi, sorry Discord did not ping me. I got this working actually, just now. I am not sure why but I had to completely reset my config. I destroyed all the mics/etc and started with a clean slate and it worked. Portaudio, in my case, wasn't necessary at all since my distro already had the capability here. I did have to create a virtual mic for discord though. I used Helvum to view things better which also helped. So sorry for the earlier message. Perhaps I should have tried to just start from square one again. XD
hey can i ask u a question about vonovox
how can i use like the models i find here
wait nvm i figured it out
i think
Helpppp what model do you guys use in UVR5 to get the cleanest vocal with no background noise?? Latest
I need a music/audio extend tool
I'm using an RTX 2080 16gb
and I'm searching for an extender with no credits, no limit, and uneditable lyrics (you can't extend with your own lyrics). my friend has one but he won't share it
Yeah, idk any. I did a quick search, and all of them have some sort of limit. Example: https://github.com/ace-step/ACE-Step-1.5 (limit: 10 minutes)
I recommend waiting for when more people are online.
How would you know if it gets worse
So does that mean you can retrain a smaller epoch to make it bigger epoch?
32khz
-issue: My issue is my terminal is not running it just shows text, I have a RTX 5070 TI 32 GB Windows 11 and used this tutorial https://www.youtube.com/watch?v=agDg2kDEwX4
What are you trying to use the voice changer for?
To change My voice for my friends and play around with it
You're following an outdated tutorial video that shows how to use the older voice changer with an "E-girl voice model". Just letting you know, using an E-girl model is against the server rules, in case you're wondering.
I cant even use it in different servers ?
Im just here to set it up and leave
Well, there are options. Try a more recent voice changer and use a different funny voice model.
is there any other videos that walk through it i cant find any new ones
Better not to follow any YouTube tutorials about voice changers. There's Vonovox.
how should I set that up
Helppp, I have an i7 13700 and an RTX 3090. What chunk length should I use? Do I check noise filter and noise reduction (I already processed the 41 10-30 minute audio files in UVR5)? Which normalization mode do I pick: none, pre, or post? And overlap length? I want the best voice quality possible
@hallow thistle
For batch size should I go for 24, since I have 24gb VRAM? What do I set "save every epoch" to? And total epochs? I have 41 wav audio files. Do I check cache dataset to GPU? How about overtraining? What do I check in advanced settings?
Can help in an hour sorry
16 is enough
In advanced settings what else do I check? Also I set epochs to 1000
A higher epoch count doesn't mean greater quality
Do i check fresh training? Cache Dataset to gpu? How about overtraining detector?
No, just overtrainer
What do I pick then? They said they play around with the epoch numbers to see what's best. So if I set it to 1000 and it saves every 10, can I check the lower-epoch results, or nah, do I retrain it again with a new epoch number?
Every 10 epochs it saves a model and you can listen to it
So what epoch number should i set it as
With 1000 and the overtrainer on, you won't even finish them
This is what i have 41 10-30minute wav audio mostly in 25-30 minute each though
What do i do then what epoch do i set it
What
Set total epoch at least 500.
You said if i pick 1000 i wont finish it , does that mean the training would take forever
No
Like weeks sjgdians
It ends earlier
What does might not finish mean
I don't understand this one.
It ends earlier
If you set 500 and the overtrainer to 50, then if there is no improvement after 50 epochs, it ends
Sorry for any misunderstanding
It's hard for me to text rn
alright thank you so much!!!
How about index algorithm should i set it to auto?
You prob wont use it
Its up to you
Do i generate index first or nah start training
Also you prob already split, but splitting in the software you are using doesn't even matter
How about thiss
Start Training
The index is even more useless if you have a large dataset
Does this need internet?? Why did it take a second
No it doesnt
For what?
What took a sec
It says this immediately once i clicked start training
It says trained successfully
You have a cmd opened?
Yes
The tutorial said i should zip it
You should not
Alright wait upp
Sure thing
Also since you are aiming for max quality there may be more things to cover
Im preprocessing audio its saying this what does it mean
Its saying errors
Actually no
You said you have chunks of 30 mins
41 of them
Right?
It now says this
Yea but answer
Okay
I thought it was 41
How do i do that
It now says preprocess completed in 55.26 seconds on 08:19:11 seconds of audio. Or nah, still split it?
Yeah i can use python code to split the audios, you mean it needs the length to be smaller
If there something was processed is not complete
Yea
What length should i cut the audios then so there will no longer be errors
How much RAM do you have?
31.8 gb ram
Yea i need to guess by graph like
It doesnt show how much used in numbers
Can you take a better pic
why dont you take screenshots?
Thanks
He sends stuff, at this point i ask what i need
Is a problem?
no but i mean
Alright, 5-minute segments for all the audio then. With my 3090 GPU and i7 13700, how long do u think it might take to train on 42 wav files ranging 10-30 minutes (mostly 25-30 min), split into 5 minutes each? Cuz it's night here right now and I hope my PC won't blow up while I leave it running as I sleep
It's 7:23pm here right now
There are videos online about GPUs catching fire on their own, so aaa, could many wav files cause that or nahhh
It wont blow
Dw
A bit
It will take a bit
You will use the model on your voice?
Yes, realtime voice too. I'm gonna do all the voices when creating music. I'm very broke, I gave all my savings to the PC
Man goodluck
It might not sound as good as you hope
Really what does that meann
Your voice means a lot for quality
Lets say your voice is far away from target it might sound off
Yes, so if I train the voice in Applio and then put the trained file into Vonovox, will the sound be accurate?
Maybe
To my exp
Can u send me a sample , now im nervous
Alright thank you
Also, for example, once I've trained the model, can I retrain it with new audio files if I'm not satisfied with the sound it creates?
Yea
I can provide way better help later tho
Can I still get my hopes up by adding more audio files, so it can adapt if the singing involves screaming and lots of quick tone changes? Voice quality matters a lot to the listener, especially in music. I just need the voice to sound different from mine so it sounds like different people singing in the music
Meh
What does that mean
Which is more convincing those ai tts or the vonovox one
Vonovox
But the model itself does the lifting
Really but ai tts are like so advanced now only small audio files and now theyre convincing
Is vonovox really better
Is vonovox also what those vtubers used
Idk
Once again, model does the heavy lifting
If that doesnt work well on your voice the model is the bottleneck
Like for me, its a pain, i am not even a native english speaker
My speaking style makes model sound a bit off, almost anything
Any target pretty much
I have tons of samples
They sound perceptually not right
Chat with you soon then imma make a python code for cutting the audios to 5 minutes
Okay
Ping me when you do, though
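For reference, the 5-minute cutting discussed above can be sketched with Python's standard `wave` module. This is only a minimal sketch, assuming uncompressed PCM `.wav` input; the function name `split_wav` and the output naming scheme are made up for illustration and are not part of Applio.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, chunk_seconds=300):
    """Split one PCM WAV file into consecutive chunks of at most chunk_seconds.

    Hypothetical helper for illustration; returns the list of written paths.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        # Number of audio frames that make up one chunk at this sample rate.
        frames_per_chunk = int(reader.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # end of file reached
            out_path = out_dir / f"{src.stem}_{index:03d}.wav"
            with wave.open(str(out_path), "wb") as writer:
                # Same channels / sample width / rate as the source;
                # the frame count in the header is corrected on close.
                writer.setparams(params)
                writer.writeframes(frames)
            written.append(out_path)
            index += 1
    return written
```

To process a whole folder, something like `for f in Path("dataset").glob("*.wav"): split_wav(f, "dataset_5min")` would work, where both folder names are placeholders. The last chunk of each file is simply whatever is left over, so it can be shorter than 5 minutes.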
Is this normal
Yes
Is this for just one 5-minute wav? I produced 180 5-minute (and some shorter than 5 minutes) videos
You mean audio?
What video
The wav file , yes
No thats not for 1
1 epoch = ai saw full dataset
What's epoch 2? Is this like the same 5-minute wav, and it will count up to 1000 (I set the epochs to 1000 out of curiosity)?
Hold on there is a misunderstanding
5 min video what
Like is the dataset made out of all the files each long 5 mins?
I guess so no?
Yes the 42 wav files ranging 10-30 minutes turned to 5 minute (and the left over less than 5 minute) cuts
Okay
Yea it will count to 1000 epochs, but since you have the overtrainer on it most likely won't count up to 1000
You set it to 50
Yes i set it to 50
So after 50 epochs of 0 improvements it will end regardless of the 1000
Like, say the last improvement was at epoch 430: with 0 improvements over the next 50, it will end at 480
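The stopping rule described here (end once 50 epochs in a row bring no improvement, regardless of the 1000 total) can be sketched in a few lines of Python. This is an illustrative patience-based early stop, not Applio's actual overtraining detector, which may track a different metric; the name `stopping_epoch` is hypothetical.

```python
def stopping_epoch(losses, patience, max_epochs):
    """Return the 1-based epoch at which training would stop.

    losses[i] is the loss after epoch i + 1. Training stops early once
    `patience` consecutive epochs fail to beat the best loss seen so far.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0  # improvement resets the counter
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch  # stopped early
    return min(len(losses), max_epochs)  # ran out of epochs instead
```

With a loss that improves until epoch 430 and then stays flat, `stopping_epoch(..., patience=50, max_epochs=1000)` gives 480, matching the example in the chat.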
Do you also train locally too? Is this gpu temp normal? Hopefully it doesnt go higher than that since i saw videos of like people's gpu smoking all of a sudden
Ye its fine
I just saw it go to 80°C
90 would be odd but 80 is okay
Though wait
May you tell me vram usage please
Just write it
It doesn't show what I need though
But
Doing some guess work you are fine
why dont you take screenshots tho i dont get it
How to see vram usage
Its written as dedicated memory for gpu
In the gpu tab
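As an alternative to reading the Task Manager graph, VRAM usage and temperature can be read as plain numbers from `nvidia-smi`, which ships with the NVIDIA driver. A minimal sketch, assuming `nvidia-smi` is on the PATH and reports numeric values for every field; the function names are made up for illustration.

```python
import subprocess

# Fields requested from nvidia-smi (memory values are reported in MiB).
QUERY = "memory.used,memory.total,temperature.gpu"


def parse_gpu_line(line):
    """Parse one CSV line such as '18342, 24576, 80' into a dict."""
    used, total, temp = (int(field.strip()) for field in line.split(","))
    return {"vram_used_mib": used, "vram_total_mib": total, "temp_c": temp}


def read_gpu_stats():
    """Run nvidia-smi once and return one dict per GPU."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_gpu_line(line) for line in result.stdout.splitlines() if line.strip()]
```

Calling `read_gpu_stats()` while training is running gives numbers that can be pasted into chat directly instead of a photo of the screen.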
Same
Also if you want you can already listen to a sample, it saved 1 model
So now it's at epoch 3. Does that mean one epoch is the whole 180 wav files? And it has to go to like epoch 200 or 500, or wherever it feels the training is enough?
Yes
Also you asked earlier if i train locally too and the answer is yes
It takes like 6 to 7 minutes to finish one epoch. Hopefully this PC doesn't blow up while I sleep
No it wont
You can sleep safe
What's the max temp you got?
Old gpu, 80 (3060 12 gb)
You say that but your pfp says something else lol
New gpu, 70 (5070 ti 16 gb)
Never really reaches that temp tho
Max usually is 60
Important: just because yours is higher doesn't mean it's bad or harmful
I saw people in other forums who have the latest 5060 series GPU with 8gb, and they said they don't think they can use Applio and stuff
Idk about them
Sounds weird
Anyway the dataset is a single person right?
Yes
Yes
Cool
If you do music with it please pay attention, i dont know where you get it from, just telling you
Also i heard about this seedvc vs vonovox one what do u think , im seeing this in like youtube comments
And you publish ofc
Trust me, seed vc is not better
I tried it and it's quite bad if you want max quality
(If i remember right, i tried many things)
Whats your top 3?
Rvc (v2), thats the only worth mentioning tbh
And I don't remember the overall quality of the others since it's been quite a while. Unless they updated stuff, you should use this
What you are already using
Is RVC v2 the same thing as Vonovox? Do they both do realtime voice and voice file to changed voice?
See vonovox as an engine that can run your models, yes it supports the v2 variant
Good luck getting it to sound good
Also with 17 hours of audio, training might take a whole day
Maybe 20 hours
Whats the link to this, the only ones they kept mentioning to me is applio and vonovox
In short you are already using the best
Also about training you might want to use a tool to look how training is doing
And its called tensorboard
Is it the one in the applio file the run tensorboard one
Yea
Doesn't work for me though
So I do it manually
If it doesn't work for you, I recommend just installing tensorboard from pip and then running in cmd: "tensorboard --logdir=[path to the training folder]"
It wont what
To me it looks like it's running fine
Can you go to scalars and look there
Also no need for all images
If really needed, you could send just 1 image of all the scalars
May you write the numbers please and not all images
What does that mean is it bad
In norm_g
You can see in the graph 2 arrows going up
At the beginning
If you hover with the mouse it should say NaN; you should be good
Also I recommend smoothing the graphs (in scalars, all the way to the left)
Set it to 0.987 or 0.999
What do I pick? It says 0.6. Also, if we change the scalar, does it affect the training?
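On the last question: the smoothing slider only changes how the curve is drawn in TensorBoard; the logged values and the training run itself are untouched. Roughly, the slider applies an exponential moving average to the plotted points. The sketch below is an illustration of that idea with a bias correction, not TensorBoard's exact code, and `smooth` is a hypothetical name.

```python
def smooth(values, weight):
    """Exponential moving average of `values`, with 0 <= weight < 1.

    Higher weight = smoother curve. Returns a new list; the input
    (the logged data) is never modified.
    """
    smoothed = []
    running = 0.0
    for step, value in enumerate(values, start=1):
        running = weight * running + (1.0 - weight) * value
        # Bias correction so the first points are not dragged toward 0.
        smoothed.append(running / (1.0 - weight ** step))
    return smoothed
```

At 0.6 a single noisy epoch still moves the line a lot; at 0.987 or 0.999 the line mostly shows the long-term trend, which is easier to judge by eye.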
Hey, it's been a long time since I used AI voice models, and I've forgotten most of it.
What's the best way to export voice models to an MP3 file?
Export a model to an mp3?
translator was tweaking xd
i meant put a voice model on an mp3, like voice lines from a song or smth, and export it to mp3
Converting the vocals of a song with a model then putting it on top of the track?
yeah but i dont know anything anymore how to convert the vocals with a model like programs websites like that xd
I would like to help but i dont have much time sorry :(
So, converting sound files to sound files with RVC? Use Applio.
Yes I read
hey guys anybody know how to get an indian female rvc model or voice
why does it open a website whenever i open MMVCServerSIO
how do i open the application/pop up? i dont want to use the website
Most likely you have either wokada deiteris or Wokada tg fork
They only open on browsers but they work the same
yea and that cant be changed
Ye
If they have AMD they're stuck with it on browser since no Vonovox
I don't get the problem tho besides I guess the browser eating up gpu or cpu
Especially if it's google
do u have the download link of the other version that doesnt use website
Do you have an Nvidia gpu?
yea
Is it a 1660, 3070, anything like that?
4060 RTX
Ooh ok, Vonovox runs on an app and is much better than those two I mentioned
I'll get the links rq
is it real time voice changer aswell vonovox?
it is yes
u can use the same models u used in the other one
second link is a virtual audio cable for connecting it to games or discord, if u have one already u don't need it
could someone help me with wokada?
what gpu do u have (Nvidia or AMD) and what are u using the voice changer for?
3080, vr chat
i clicked on start but no one is hearing me
what kinda models do u wanna use? like Venom, Anime characters, Spongebob ect
girl voices
why?
wanna test these
why girl voices in particular tho, any reason?
if its better than dubbing ai or not
i have a voice disorder and don't like my regular voice
Vonovox does the same thing but it's much better optimized, better quality and such
Okay
but it still doesnt fix my issue
nothing comes out when i select virtual audio cable
are you able to send a screenshot?
i need download both?
the second link is a virtual audio cable so if u have one already u don't need it
i have it already
alrighty
downloaded it
your mic should be your default mic, is that it?
yes it is
some settings i need to do?
and adding models
Good night! Where can I find the link for the collabs?
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
Google Colab notebooks by: IA Hispano, Eddy, Eddy, Tg-Develop, Deiteris & Hina, Shiro & Eddy, Nick088, Nick088, Jarredou & Makidanye
here ya go good sir
where i can find good models
ok ty
you're welcome :D
Thanks!!
the Real E-girl (RVC 2 - 60 Epochs) voice
:(
i saw it got updated
use something normal instead of an egirl voice
i'll try it first
just need a starting base
there are normal female voices on female characters
you know how to use the newer version?
there is old and new
old and new what
I don't use egirl models. so idk
will do! just unsure if i should use the effects like noise gate
is gpu delay 60 ms good?
I've got my vonovox like this, I at times use noise gate but it's very sensitive depending on the model
other effects I use in FL studio
which would require a complex setup
I think so
if u turn the graphics of vrc down some it could increase performance
I play the game as well and could test to see if u got it working btw, my user is the same here as on vr
basically the index makes the model sound more like itself
aah thank you
hi all, i am hoping to get some advice on how to connect my main computer to my RVC computer:
both are connected to the same network
RVC computer specs is 8700K with rtx 2080 super
i am using deiteris voice-changer
u should use wokada tg fork instead unless you've gotten bad results from the lower gpu on that pc
Quality from vr chat to discord makes so much difference haha
i see , thank you for the reply . i was not clear when asking . i am trying to connect my main pc to my second pc which will be running the rvc but do not know how to
ah, I'm not sure about that in particular, try askin one of the helpers by pinging the @ helpers role
ah i see , thank you very much
hi @tame oracle, i am hoping to get some advice on how to connect my main computer to my RVC computer:
both are connected to the same network
RVC computer specs is 8700K with rtx 2080 super
i am using deiteris voice-changer
are you free and able to advise me?
hey, what's the website y'all use to use voice changer?
i know the voice changer in real time but not like premade
or even voice changer in real time, what do y'all use please?
hey I was wondering, if I want to do TTS, should I go with Fish s2? Currently running a gguf version of it on a 2080 ti with 11gb VRAM. I don't mind spending money on runpod or smth to train models, but I'm trying to accurately create TTS voices with human-like emphasis, almost like voice acting
It's just that getting the prompt right for it seems impossible. Is that the best way to go about doing it?
also is there a reverse speech-to-text that can tell me what tags a certain audio would fall under?
so i can learn what i need to do
tyty
Does weights website use vonovox? I recall back then the voices there are very well made
Hello an update its counting down now
It says g/total:24
If it goes to zero does that mean it got the best quality even if it reached the threshold? Its only at epoch 136
But it has been running for 16 hours now though
One epoch is like 7 to 8 minutes max, but it's at 6:30 min per epoch rn
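The numbers in these messages line up: 136 epochs at roughly 7 minutes each is just under 16 hours. A tiny sketch of that arithmetic, with `training_hours` as a made-up helper name:

```python
def training_hours(epochs, minutes_per_epoch):
    """Total training time in hours for a given epoch count and per-epoch time."""
    return epochs * minutes_per_epoch / 60.0


# Elapsed so far, using the figures from the chat (136 epochs, ~7 min each).
elapsed = training_hours(136, 7)
# Upper bound if all 1000 epochs ran at 6.5 min each (the overtrainer
# would normally stop the run long before this).
worst_case = training_hours(1000, 6.5)
print(f"elapsed: {elapsed:.1f} h, worst case: {worst_case:.1f} h")
```

The same function gives a quick estimate of how much longer a run can last: multiply the epochs still allowed by the current per-epoch time.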
Woah 60 hours of voice? Thats more than mine! Is it very convincing and not robotic?
Weights has been gone for a while, it never used any of the voice changer apps
You will never reach 0
But yea, the lower the loss is, the "better" it should sound
That is an old thing and i never ended up doing
You tried some of them?
Also looks like it may end soon due to the overtrainer
Yeah
The g/total i meant like the overtrainer since if it ends does that mean it got the best sound out of the dataset?
Probably
You can already listen to how it sounds, just go in applio, inference tab, select your model, record real quick audio and upload it or record it from applio and listen
Its at g/total 16 now
What
What does the numbers show now does that mean the result will be good
Where do you read 16
Mate maybe
No thats not loss
That's the number of epochs left before it ends
If there is no improvement
Use tensorboard for loss
Not the cmd
The scalars rightt
Dont look at norm
Smooth it
Also can't you send the full page with all the scalars
And not 1 at a time
Is this it
Yes
What does it say is it good
Mate go and listen how it sounds, trust your ears
You dont need to stop training to listen
If it doesn't sound good, can I retrain it with the same dataset, or does it have to be a different one?
It will be pretty much the same
But wont that break my gpu its already at 100% and 80°C
You can retrain and keep training in future but it wont change much
Its fine
It wont explode
For reference mine doesnt reach 80 but today is the 5th day its running ai training
Non stop
Wow
How many hours of wav is it
But how many wav audios do you max out on each model?
Usually more than 10 hours atleast
15-30 or 20
Wow so how did the result go
Thats more than mine
Not that great
As i said my voice is a bit of a pain
I am working on a solution
Lemme hear cmonn
How is it a pain
I cannot send nothing, i am on phone
Is the model a pain or your voice have to be close to the model's
The model
In short my voice is never close to target
So to get really high quality its very hard for me, i need to speak in some unnatural ways for me
To match the target
This is my experience atleast, prob some other will not have those issues
5th day does that mean you cant use your pc for 5 days? How many gpus do u have
1
Actually I have 3 desktops tho, but yea, I don't use the PC a lot
3 desktops? What are the specs? Wow
Why asking specs?
I dont remember the cpus tho of 2 of them
They're all Intel, all with 32gb of RAM; main has a 5070 ti 16gb, second has a 3060 12gb and third a Quadro K1000
Wow
You can use it without stopping
81 degrees you mean
Do this
Best server settings ? not using client
but im kinda confused with vonovox's settings, and vonovox uses a lot more vram
Hello guys, I have a project for which I need good voice models in my language (German). They don't need to be specific ones. But no matter which platform, I just can't filter out the German ones, so I joined AI HUB again (I thought it was still dead). Can someone help me find actually good quality models that I can browse and try without wasting hours only finding different languages? Thank you.
Some more details
Please don't ignore the bot. What is your goal? What is your hardware, etc.
'best settings' will depend at the very least on the hardware, but also on what is running.
Which voice changer? There are multiple. This is a general AI server, not one for a specific tool.
Try filling out the template; it will help helpers figure out how to help you properly.
It doesn't sound near the target voice. It should be adapting to my voice and just changing the tone and pitch to replicate it exactly, but it's not doing that
It just sounds like my voice but the pitch is lowered and not a completely different person
can i send u ss
Target is a guy?
Yes

