#✨│ai-help
I was suggested to use this, but I don't know what else I should be using
-rvc
Suggestions for @median island
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
Thx
I'm now on the new one, but I'm getting this error and I have no idea what this could mean
AttributeError: 'NoneType' object has no attribute 'dtype'
If anyone could help me with resolving the error I'd be grateful
full error message
if data.dtype in [np.float64, np.float32, np.float16]:
AttributeError: 'NoneType' object has no attribute 'dtype'
Is that good?
i mean the full error stack
the entire error message that shows the entire chain of the failing modules
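For anyone unsure what "the full error stack" looks like, here is a minimal Python sketch. `load_audio` and `convert` are hypothetical stand-ins, not functions from any RVC program; the assumption is only that something upstream returned `None` instead of audio data, which is what the quoted `AttributeError` suggests.

```python
import traceback


def load_audio(path):
    # Hypothetical loader that silently returns None when reading fails.
    return None


def convert(path):
    data = load_audio(path)
    # Same shape as the failing line quoted above: data is None here.
    return data.dtype


def capture_full_traceback(path):
    """Return the whole traceback text, not only the last line."""
    try:
        convert(path)
    except AttributeError:
        return traceback.format_exc()
    return ""


if __name__ == "__main__":
    print(capture_full_traceback("song.wav"))
```

The printed text starts with `Traceback (most recent call last):` and lists every function on the way to the failure, which is the part helpers need in order to see where the `None` came from.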
Are you sure your RAM is large enough to handle that?
depending on your ram capacity, try processing a shorter file, or split it first
Hey guys! I'm looking to find/create an rvc voice that is free to commercially use. Idc if I have to collect massive sets of text-to-speech voice lines in order to create the rvc, I just want to know what options are out there.
the best bet is to use your own voice or your friend's and have him/her consent
it was a regular RAM allocation, running on CPU
So I do already have my own voice, but is there any other options? Like I know you can use the voicevox characters as an example.
voices have copyright, and people can sue you if you use them without their permission
u should try asking the VAs for permission
it's safer to pay some random fiverr vas
is there an easier way to search for vas that would be down for that? or do you just gotta reach out to each one and hope they say yes pretty much?
basically the second one, most of vas hate ai tho
good luck
🦈 🤙
thank you.
if you don't tell them for AI use even though you pay them, you could risk getting sued
I understand
Like I said, I just know there are options out there like voicevox or elevenlabs voice to voice stuff
you can just merge a bunch of voices and bam og voice
was wondering if anyone else had other options
true lol
u can merge rvc models to create a original voice
buuut i know that merging too much can cause unstability
with some notes that it could only be done on the same sample rate and could be hit or miss in terms of quality
yeah
yea 😔

VB-Cable cannot be used with the regular RVC program, unless you mean W-Okada, which you're supposed to ask about in #🔍│help-w-okada.
already answered there
That's why I told him to go there.
Is the Hina mod colab not working anymore?
Yes, it is bugged from now on. There's another better one available, Deiteris' W-Okada, but using that one on free Colab can get you terminated from the service.
For W-Okada, go to #🔍│help-w-okada.
I'm trying to make covers, not ai voices
Has Hina ever made the "RVC" Colab? Never heard of this information. I've only heard of Hina making W-Okada Colab notebook.
Still, if you mean an RVC that can do AI covers, there's another better one available.
Which is what RVC
Applio. https://docs.applio.org/applio
What is your PC GPU?
What does that mean? You only have a phone and no PC?
Laptop is still a PC.
Oh, thought they were considered different
there's no reason not to use the laptop over the phone
Using a Colab notebook on mobile phone is much harder than using it on desktop/laptop PC, you know.
I know
But most of the time I'm not at home, and when I am I pass out from exhaustion
So I multi task and do what I can on my phone, so I was using hina mod
Since it was easier, but it's not working now
There's various different Hina mod colabs, Hina is just the name of the creator, mod means modified
Which are you talking about? Link?
And what's your PC GPU?
And what do you want to do?
He said he has a laptop, aside from using a phone, but never told us the name of its GPU.
It's Hina_Mod_AICoverGen_colab.ipynb, I was using it to make AI covers. As for the GPU, idk the name of it, I'd have to look
check at task manager
or the laptop package with spec details
that’s never coming back
@hollow thunder #📰│dev-updates message aicovergen got replaced forever, and it’s abandoned
always check that channel
Also You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
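If you prefer a script over Task Manager, here is a rough sketch that asks `nvidia-smi` for the GPU name. It only covers NVIDIA cards and assumes `nvidia-smi` is on your PATH (it normally ships with the NVIDIA driver); with integrated graphics it simply returns an empty list.

```python
import shutil
import subprocess


def parse_gpu_names(output):
    """Turn nvidia-smi CSV output (one GPU name per line) into a list."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def list_nvidia_gpus():
    """Return NVIDIA GPU names, or [] if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (subprocess.SubprocessError, OSError):
        return []
    return parse_gpu_names(result.stdout)


if __name__ == "__main__":
    names = list_nvidia_gpus()
    print(names or "No NVIDIA GPU found via nvidia-smi")
```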
Cloud (remote good PC) services like Google Colab should be the second option, only if you have a bad PC, not the first
Now that I messed around with the voices a bit, how do I train my own voice?
Ok
did u check ur pc gpu?
what’s ur pc gpu
GeForce RTX 3060 Ti
Good
it's recommended to train locally, you should check out this guide: https://docs.ai-hub.wtf/essentials/how-to-make-voice-models/
In the context of RVC, the dataset is an audio file containing the voice the model will replicate. It can be either speaking or singing.
Thank you.
Thank you.
lmk for any issues
Which RVC are you looking for? And what is your PC GPU?
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
Thats the thing.. Im lost and dont know what to do.
Is this too weak to handle?
There's only just GPU 0? That's real weak.
Does that mean its not compatible for my computer?
welp its over
it would run in cpu mode
oh
it's just much slower but still doable
I see. Thank you!
Didnt know just having to change your voice could be so demanding 😔
This is Applio the RVC. https://docs.applio.org/applio
So I could use this?
It is the best RVC program out there.
"best" welp
it'll probably lag
but ill check it out
you can't compare anything else not based on rvc
How do I know if I did it right? XD
Are you having any issues? If you did it all right the program should open normally and the model would sound good
Hey guys! I made a post about this on their Reddit, but I never got a response. Wondering if anyone here would have an answer.
I was wondering if it is within elevenlabs terms of service to record a large set of audio from a voice I made on their site, to then turn it into a RVC model that I can use in real time with a program like w-okada. Other than having to stay subscribed, would this go against their TOS at all?
(also sorry if this is confusing at all, I'm still very new to AI voices and am unsure if I'm using the correct terms here lol)
using synthetic audio as a source for a voice model training is not ideal
if you're the one who only crosses the road at a designated pedestrian crossing, then violating ToS is very bad
same as using adblock on youtube
/s
Haha I appreciate that.
Didn’t think about using a synthetic voice and how that might affect things tho
you can use elevenlabs to earn money but only if you pay for their subscription which has a commercial license
iirc they have a non audible watermark in the results and thats how they can know if you're using the free plan
That's interesting
but ye, converting it to an rvc model might be against their tos, and even if it isn't, the rvc results will actually be worse
I love how I do my recording with my new mic with 48k sample rate and voice ends up being pixely as always.
#🔍│help-w-okada message for realtime voice changer, that's wokada not RVC
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI models (on v2). It runs inference (uses models) on pre-recorded audio (AI covers) and trains (makes) models
Wokada = uses RVC for realtime inference
oh damn, integrated graphics, pretty bad
you got any other gpu?
I told you I have a laptop, not a desktop
No
well in this case it's better to use cloud (remote good pc), #📰│dev-updates message here it's explained what's the replacement for AICoverGen
you can use RVC AI Cover Maker UI instead
What is the best program to clone a voice to then use in Applio?
Applio can do that
this one good ?
Don't ask in other channels, Im talking to you in #🔍│help-w-okada
yh i forgot
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
anyone else getting the problem with applio google colab right now? everytime i try to click on the public url, it just takes me to a website error
Same
I was just enjoying it an hour ago
But then error
huh, thats weird
-audio
- Creating Datasets for RVC using iZotope RX11, by Cauthess
- Gathering and Isolating Audio, by SCRFilms ❄
- Instrumental and vocal & stems separation & mastering guide, by deton24
- Vocal Mixing Tutorial, by Roomie
- https://mvsep.com/
The problem with Colab seems to be coming from pydantic. Before clicking on "Start Applio", run this:
!pip install pydantic==2.10.6
And it should work
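To confirm the pin actually took effect, a small check like the one below can help. It only reads package metadata with the standard library; `2.10.6` is the version quoted in the workaround above, and the helper names are made up for this sketch.

```python
from importlib import metadata

PINNED = "2.10.6"  # version quoted in the workaround above


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(installed, pinned=PINNED):
    """True only when the installed version is exactly the pinned one."""
    return installed == pinned


if __name__ == "__main__":
    found = installed_version("pydantic")
    print(f"pydantic installed: {found}, matches pin: {matches_pin(found)}")
```

In Colab you would run this in its own code cell after the install cell and before starting Applio.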
most models I put sound really robotic... I tried so many models, even adjusted my settings over and over again
but nothing
at this point... I think its just my voice
I use the w-okada fork voice changer
try another model, and you should go to #🔍│help-w-okada for the voice changer discussion
need help
got this error in applio kaggle
prob gonna make a new acc on kaggle to test
idk how to do all that
can the kaggle space be updated so it doesn't do that, literally didn't do that yesterday
also remove the old bugfix code
are hugging face links still currently used by users or is there a newer way to?
I don't quite get what u mean, but it should be publicly available unless set to private
nono, it says a voice model should have a hugging link with it right? but i didn't find any voice model with a link of it
im available to commission for a voice model or smth, need it for my vtuber project
where do i look for artists?
yes, one of the libraries that gradio uses got updated and it broke the UI
welcome to python development

Sure, is it possible to fix the error or should I just wait?
dms ok?
as I said, you make a new code cell using +Code button
paste the pip install command, then run the cell
see I'm slow and kinda need a visual example
which cell do I put that before
the one that +Code creates
which one
applio is working now?
read the pydantic workaround above
Should I create a cepda and put this?
create what
Can you show me where is that?
Cell*
I alr follow the instruction but nothing seem to work
It says TypeError: argument of type 'bool' is not iterable
what's the output of the cell?
I'm sorry, it worked when I clicked the public link again
-realtime
Do I need to set cable as default for realtime? or no?
nothing is coming thru cable atm
I just realized that after doing some searching 💀 me and VB Audio Cable have a long history
Glad to be rid of it so soon again. What cable should I be using?
Nvm I see which one to use
They need to update the guide. I was wary of using VB because it’s been pretty shitty in the past
For W-Okada, go to #🔍│help-w-okada or stay here for a regular RVC program.
I'm not using Okada
Do I need to restart PC after installing muzychenko virtual cable
I'm not getting any voice through it rn
Do you wanna use the virtual cable with RVC or something?
Yea, realtime rvc
I have the gui
I'm just getting no sound thru it
wait
im dumb
The realtime mode of old RVC is far more outdated than W-Okada.
W-Okada is literally the realtime voice changer.

the guide didnt say that!!!
rvc realtime gui is pretty outdated
You were too confident in yourself on this one.
Cut me some slack...
anyways thats wokada
this is all I got.
There's a better one available. That's why I'm telling you to go to #🔍│help-w-okada for the realtime voice changer.
how do i test out my model using sample?
hi guys any alternative to applio to generate text to speech and be able to download and load custom models from huggingface?
is applio not working also for you on colab at the moment?
what settings are recommended for FL Studio Export for dataset (I used it to fix harsh frequencies).
dunno which sample rate to use
There are different Text To Speech (TTS) AIs:
GPT-SoVITS: RVC isn't as good as GPT-SoVITS for TTS, but GPT-SoVITS (few-shot TTS, meaning it needs just a little training for models) can't use RVC models (and vice versa), and it's limited to English, Chinese & Japanese. If you wanna check out GPT-SoVITS instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/
ElevenLabs (freemium): An easy way to do TTS is https://elevenlabs.io/. You can't use RVC models with it, but it's a mostly premium, easy way to get good quality TTS
FishSpeech: FishSpeech is a zero-shot (no explicit training needed) TTS. If you have a good PC you can use it locally, else use their site
You can check TTS in our tts index
With RVC Models:
RVC is natively for Speech To Speech, but forks such as ilaria rvc mainline & applio have built in tts (using Microsoft Edge TTS to make a generated tts audio, which i suggest you to choose a tts model that is the same gender and language of the rvc model you wanna use, and then convert it with rvc)
If you wanna do tts locally with RVC Voice Models (if you got a good pc):
If you don't got a good pc you can do tts with RVC Voice Models on cloud:
- Ilaria RVC Zero (running on an A100 GPU, the fastest free RVC on cloud) and the guide
- Use Applio UI Colab (with Google Colab's free daily T4 GPU limit)
- You could try another TTS from our tts index and use the output as an input in RVC
Elaborate the issue
the pydantic library got updated, so it's not working for anyone
Perfect
@simple ore another day, another colab broken (applio)
Google colab breaks soooo much
I'm guessing? It's better to ask @flint solar @verbal oasis
!pip install pydantic==2.10.6 ?
yes
err... I guess it needs to be !uv pip install pydantic==2.10.6
since that was changed to uv
U should ping dev updates so that everyone sees it
Friend, I still get this error
hey is this the fix for applio?
for UI colab, yes
so need to run that after installation and before starting?
yes
It got already fixed anyways, and there's already a text about broken colabs
This isn’t a question about RVC so much as it is about the weights bot. All of the models that I have uploaded to the server have never been credited to me on weights. Am I doing something wrong? I have no idea why or what to do.
I need help. I have been using AI cover for 1 month, and now when I use it again it gives an error
Screenshot of your interface + error log on cmd
!howtoask
That still doesn't give enough context of the problem you encountered.
Sorry but my laptop can't send pictures, does it matter if I send it to you separately
No need to hop into my direct message to send that. A mod here will give you the image permission.
Can I write?
Can you tell me the name of RVC program name you're trying to use?
the rvc i use is Hina_Mod_AICoverGen_colab.ipynb
The error reads as follows: "Traceback (most recent call last):
File "/content/Hina_RVC/src/webui.py", line 8, in <module>
import gradio as gr
ModuleNotFoundError: No module named 'gradio'"
This one is quite abandoned at this point. What is your PC GPU?
GPU is my pc or gg colab's?
GPU = Graphics processing unit. It is a graphic card in your PC.
To check your GPU name, open Task Manager, go to Performance tab, and spot where GPU 0 or GPU 1 is in the left side.
My GPU is 0
So what is the name of it?
That one is outdated and replaced, check #📰│dev-updates to know what replaced it
Ok. An integrated GPU isn't so good for AI. So let's go to online option.
It's bad integrated graphics so yeah you should use cloud
Check what I told u
ok thank
yes, I know my GPU is not suitable so I use online services, for example Google Colab
Yeah lmk

Can I ask one more question?
I really like AI training but the AI training website I often use is gone. Is there any solution for the server?
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest free way, use https://weights.com/, which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.com: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio UI Colab: RVC Fork with some extra features like TTS
- RVC AI Cover Maker UI: Automatically Separates the Vocals and Instrumentals, converts the voice and mixes them back
wow

I struggled all day and when I entered this server everything was easy, thank Nick088 by Weights and the server very much

how many epoch should i do if i have 55 minutes of talking no silence only words and no background noise
You're welcome and lmk
No one can know
Use the tensorboard
wheres that
local mainline
the one i found in guide channel
first one
it doesnt mention how many epochs i should do for my dataset
Share a link and tell your PC GPU
What do you mean? You wanna train an RVC model or something?
yea im trying to train my own
but im wondering how much epochs i should use for my dataset
i dont wanna overtrain it or anything
Yup
Last update: Dec 24, 2024
Also I would suggest Applio personally
More like easier to use.
Starting preprocess with 16 processes...
100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:38<00:00, 38.59s/it]
Preprocess completed in 38.60 seconds on 00:23:34 seconds of audio.
Starting pitch extraction with 16 cores on cuda:0 using rmvpe...
0%| | 0/1 [00:00<?, ?it/s]An error occurred extracting file C:\Applio\logs\tep piseth AI\sliced_audios_16k\0_0_0.wav on cuda:0: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.
100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:24<00:00, 24.75s/it]
Pitch extraction completed in 31.58 seconds.
Starting embedding extraction with 16 cores on cuda:0...
100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:03<00:00, 3.52s/it]
Embedding extraction completed in 10.07 seconds.
Starting training...
Loaded pretrained (G) 'rvc/models/pretraineds/pretrained_v2/f0G40k.pth'
Loaded pretrained (D) 'rvc/models/pretraineds/pretrained_v2/f0D40k.pth'
Process Process-1:
Traceback (most recent call last):
File "C:\Applio\env\lib\multiprocessing\process.py", line 314, in _bootstrap
self.run()
File "C:\Applio\env\lib\multiprocessing\process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "C:\Applio\rvc\train\train.py", line 497, in run
reference,
UnboundLocalError: local variable 'reference' referenced before assignment
An error occurred extracting the index: need at least one array to concatenate
If you are running this code in a virtual environment, make sure you have enough GPU available to generate the Index file.
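A note on the `UnboundLocalError` in that log: in Python this happens when a variable is only assigned inside a branch or loop that never ran. The sketch below is a generic illustration, not Applio's actual `train.py`; the guess that the earlier pitch-extraction error left no usable samples is an assumption.

```python
def pick_reference(samples):
    # 'reference' is only assigned if the loop body runs at least once.
    for sample in samples:
        reference = sample
        break
    return reference  # raises UnboundLocalError when samples is empty


def pick_reference_safe(samples):
    # Same idea, but with an explicit default and a clearer error message.
    reference = None
    for sample in samples:
        reference = sample
        break
    if reference is None:
        raise ValueError("no training samples found; check the extraction step")
    return reference


if __name__ == "__main__":
    print(pick_reference_safe(["0_0_0.wav"]))
```

So when an earlier step reports an error, it is usually worth fixing that first and re-running the preprocessing and extraction before retrying training.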
@low shard
More simple and optimized
Elaborate
What are you using
What's your PC gpu
What do you want to do
okay thanks 👍
3070
train my own model
Are you using the latest version of Applio?
Have you checked if your GPU driver is up to date?
Update windows and your drivers
hi how do i make my model show up in tensor board
do i have click train model then it will show up?
Tensorboard shows when u start training
no like
how do i get a pth file
do i have to start training for tensor board to work
ik where tensorboard is but theres nothing here when i train will something appear?
i already have a dataset and i processed it do i just train now
then it will appear in tensorboard?
wait till it saves some model checkpoints during training, then click refresh
oh okay
thanks
does it matter what i have the epoch at
i left it at 500
how many epoch should i let it run for
they should also appear in the logs folder
Mhm
oh my bad i was just finishing up on this
what should i put for saves
This is the TensorBoard.
hi i got an error with codename rvc and i have no idea what to do 😭
it says:
TypeError: argument of type 'bool' is not iterable
and also
unable to find files for the specified model(s)
pip install Pydantic==2.10.6
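For anyone wondering what that `TypeError` means in plain Python: it is raised when code does `something in value` and `value` turns out to be `True`/`False` instead of a dict or list. The sketch below is a generic illustration with a hypothetical `has_const` helper; it is not the real code from Gradio, pydantic, or any RVC fork.

```python
def has_const(schema):
    # Hypothetical helper that assumes schema is always a dict.
    return "const" in schema


def has_const_safe(schema):
    # JSON Schema also allows a bare True/False as a schema,
    # so check the type before using 'in'.
    return isinstance(schema, dict) and "const" in schema


if __name__ == "__main__":
    print(has_const_safe({"const": 1}))
    print(has_const_safe(True))
    try:
        has_const(True)
    except TypeError as error:
        print("TypeError:", error)
```

Pinning the library version, as suggested above, avoids the situation without touching any code.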
So, I know this isn't exactly an RVC issue, but I've recently decided to try out "f5tts" model in the AllTalk web-ui, and uh... the actual voice just generates gibberish. Like, it sound as if it's speaking in reverse at an increased speed or something. Any idea what's going on there?
f5tts does not support long texts
Well, just a short sentence produces the same result.
mmm my rvc webui doesnt even attempt to make an index
it just spits out a couple of these
install normal requirements, then run the pydantic cell
It worked for me now, I had actually entered the pydantic code wrong, thank you very much for everything
im trying applio now, looks betta
does the applio gradio google colab not work anymore?
every time i try to run it i get an error
oh wait nvm this works ty
click 'show code' under the install applio section, and then add !uv pip install Pydantic==2.10.6 after all the other pip install lines, then run the install again
then when you run the start applio section it should give you the gradio links without errors
i trained till 450 epoch should i keep going
look at avg loss graphs and you should test several model checkpoints
this happened to me too
@tropic garden do this workaround
thanks i'll try again
Can the STT api do real time?
When creating a dataset, should I upload the audio as one file or separate it into multiple files of 10 seconds each?
@knotty moth
hi guys, is there a way to make a generated audio more human?
I'm using applio, I generated my audio file and I'd like to make it sound more human after I generated it.
Is it possible?
Get a better model and clean the audio as much as possible
there are many things you'd have to learn, do experiment, and improve on: quality, consistency, variations, dynamic range, etc. as well as training stability and convergence
and note that the source quality and consistency may vary, and cleaning/restoration methods have their limitations
one file, use slicer in the preprocess.. or use simple slicer as explained in the doc
I didn't quite understand. Is it 32k? 40k? 42k? Which one?
32k
thx
how good is laughing in a dataset?
would that ruin the model
i recorded my voice in audacity while playing cards against humanity and was laughing so much
it's very expressive this one unlike my other dataset
it's ok but doesn't really help, rvc will still output unrealistic laughing results
yeah ik that i'm just wondering if it'll ruin the dataset and if i should remove it
like if it'll make the model worse
as long it's not excessive, should be fine
but idk what could happen if the dataset has excessive laughing
how much is excessive
20 mins of laughing in a 30 minute set
eh i'll just experiment
well i would be laughing and reading out questions and answers so i don't think it'd be that bad
what matters in the set is using the whole vocal range, i confirmed that yesterday when i trained a model of my voice
10 mins and was using my whole voice range without problems
also helps speech models have decent singing
yuh, go for 30 minutes of expressive audio and you'll be fine
10 mins still felt a bit unnatural
looks like the sweet spot for a natural model is rlly 30 mins
rvc does not stand for realtime voice changer, it means retrieval-based voice conversion
use help wokada for realtime troubleshooting/questions,etc -> #🔍│help-w-okada
rvc and w-okada are two separate things
rvc is an ai voice cloning software
w-okada allows the usage of rvc models in realtime
one is for training and inference of audio recordings
the other is for realtime inference
WOAHHH
i started training
and it shows previous model graph too
that's so cool
bro why is the new model (blue) taking so much longer 💀
your dataset is bigger?
the new one
i forgot by how much
but shouldn't be THAT much
22 mins
total
check model_info.json of the old model
it was originally over 3 hours but i cut it shorter
"total_dataset_duration": "00:29:53"
30 mins
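To compare two models without eyeballing it, a small sketch like this reads the duration from `model_info.json`. The `total_dataset_duration` key and the `HH:MM:SS` format are taken from the value quoted above; the file path in the example is a placeholder.

```python
import json
from pathlib import Path


def duration_to_seconds(text):
    """Convert an 'HH:MM:SS' string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def dataset_minutes(model_info_path):
    """Read model_info.json and return the dataset length in minutes."""
    info = json.loads(Path(model_info_path).read_text(encoding="utf-8"))
    return duration_to_seconds(info["total_dataset_duration"]) / 60


if __name__ == "__main__":
    # Placeholder path: point this at your own model's folder.
    path = Path("logs/my_model/model_info.json")
    if path.exists():
        print(f"{dataset_minutes(path):.1f} minutes")
```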
the newer one is shorter
hol up
old dataset
new dataset
both are batch size 8?
yup
idk then lol
😭
that's so weird
anyways i remember i put me singing thru my ai model cause i was curious
it made me have an american accent
like a country accent
i do not have an american accent why'd it give me one
contentvec main language is english
original pretrain trained only in english
this is the voice of the original pretrain, if you're curious
your model is training using this as a base
the point of training using a pretrain is to change this voice ^ to the one in your dataset
find 50+ hours worth of audio
then you have to do a pretty weird folder structure since every speaker requires its own folder
right
and train for like 1 month idk
there should be different ones with different accents
make a british one 🗣️
well contentvec was trained using multiple languages
but it had more english data
oh
so the english data contains british accents too? or nah
i'm assuming it had a lot more american accents
i have no idea lol
Hii
Im using kaggle on applio
this is a model with a dataset of 7 minutes and 4 batch size
but it gives me just 12 steps

On local training it gives me about 43 steps
but my pc explodes 
same dataset, same slicing, same pitch extraction
kaggle is 2 gpus
so batch 4 means 2x4
7 minutes = 7 x 20 = 140 3s segments... so it should be ~35 steps with batch 4
so perhaps you did not slice audio properly and a bunch of segments are too big or too small
batch size 4 on kaggle is batch size 8 locally
because every gpu runs the same batch size
so if you put batch size 4 in kaggle, the two gpus will run at batch size 4, which makes it bs 8
try batch size 2
in kaggle
that's the equivalent of batch size 4 in local
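The batch size reasoning above can be turned into a rough estimate. This is a back-of-the-envelope sketch, not how any trainer actually counts steps: it assumes evenly sized slices with no overlap, that each GPU processes its own batch per step, and that a partial final batch still counts as a step.

```python
import math


def steps_per_epoch(dataset_seconds, slice_seconds=3.0, batch_size=4, num_gpus=1):
    """Estimate steps per epoch from dataset length, slice length and batch size."""
    segments = math.floor(dataset_seconds / slice_seconds)
    effective_batch = batch_size * num_gpus  # every GPU runs its own batch
    return math.ceil(segments / effective_batch)


if __name__ == "__main__":
    seven_minutes = 7 * 60
    print("local, batch 4, 1 GPU:  ", steps_per_epoch(seven_minutes, 3.0, 4, 1))
    print("kaggle, batch 4, 2 GPUs:", steps_per_epoch(seven_minutes, 3.0, 4, 2))
    print("kaggle, batch 2, 2 GPUs:", steps_per_epoch(seven_minutes, 3.0, 2, 2))
```

If the number your trainer reports is far below the estimate for your settings, the slices are probably longer than intended or some of them were dropped.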
it is a simple check - if the train loader has less than 3 samples it means you f'd something up
show the extract features log
and preprocess too for a good measure @edgy tangle
I said show the logs
uh
sorry
what do you mean by logs
console or files?
im not native english speaker

anyways, ill send both
that seems like 5s slices were used?
the training script should handle that, but it is weird
start anew
delete the folder with the prepared files
I've tested myself
5s slices
Ok
but you should not use 5s, it is too much
I didn't
Just forgot to set it at 3s
hehe
I don't know what I did
but I deleted the notebook and created a new one
And now is working
hehe

Anyways, thanks for the help!

now with 2 batch size gives me 22 steps
I think it is better(?)
@analog obsidian hi
looks like the slicing is still messed up
you're still using 5s slicing
is this good
really?
or overtrained or need more training
can you show me the same thing but with ignore scalars on and 0.6 smoothing
because from what i'm looking at, looks like it got stuck
but could be because smoothing 0 + the second box enabled
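Why the smoothing value matters: with smoothing at 0 you see the raw, noisy loss, which makes trends hard to read. The sketch below uses a debiased exponential moving average, which to my understanding is close to what the TensorBoard slider does; treat it as an approximation, not TensorBoard's exact code.

```python
def smooth(values, weight=0.6):
    """Exponential moving average with bias correction for the first points."""
    smoothed = []
    last = 0.0
    for index, value in enumerate(values, start=1):
        last = last * weight + (1 - weight) * value
        debias = 1 - weight ** index  # corrects the start-at-zero bias
        smoothed.append(last / debias)
    return smoothed


if __name__ == "__main__":
    raw_loss = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
    print([round(v, 3) for v in smooth(raw_loss, weight=0.6)])
```

A higher weight flattens the jitter so you can tell whether the average loss is still going down or has really gone flat.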
something like this
very flat line pattern
should be like mine
can you
disable the second box
and click the third one
maybe it's just that
if still looks like a flat line
then uhh the model got stuck
third box?
bro
😭
yeah the model got stuck
do i change settings? delete model? restart?
honestly i have no idea, last time i had that issue was because my set had eq and compression filters
but yours didn't have those
i know that too high batch size may cause the model to get stuck
the dataset was too hard for rvc
probably caused by the excessive laughing
but im not 100% sure
there are rare cases where rvc really hates one voice in specific
and refuses to learn it
could be a lot of things really
hmm
probably that
not many voice ai's can learn my voice
i was really expressive in this dataset
i believe the issue could be that the dataset is too hard for rvc
maybe you can try increasing the batch size
9, 10, 11, 12, etc
hm i'll do it another day
see if the dataset is damaged in a way too
would i need to start from scratch
maybe the denoise was too much
it had a lot of mic noises so i did have to filter them out
things like that
probably the dataset got too damaged during the cleaning process
really?
and that could be also the real problem
yeah
eq and compressors kill rvc training because they remove vital information needed
excessive cleaning does that as well
right
really?
and the model will of course sound noisy
yeah
but it will work
lol
idk why it happens either, £5 mic stand fault ig
i think probably something went wrong during the cleaning process
natural plate reverb whenever i shout too 🔥
or it could be that the set was too expressive and hard for rvc to learn
it was difficult to get rid of
i had to use an aggressive model for it
not many models recognised the reverb noise
rvc can also train room reverb
im gonna try the model anyway lmao, even knowing it won't be good
but it adds clicks to the output and honestly sounds bad
distorts the voice too much
it can't really handle room reverb
in my tests of training on noisy datasets, natural noise also randomly adds clicks to the output
but on very rare occasions
i trained that with this
and rvc outputs something like this
i recorded my dataset with my cheap 20$ usd mic
so far the only problem i noticed is that the model sounds metallic
im gonna add more samples to the dataset and if that doesn't fix it, it could be the noise thats causing the problem
HOLD ON
ok some parts
it sounds really weird
but when i first clicked play i was blown away
sounded just like me
and then it sounded weird after that
like
some words are good
Hi everyone, I need some help please.
I'm trying to run Codename-RVC-Fork v3.0.4 on my PC, but I’m getting errors when launching it.
The terminal shows this error:
TypeError: argument of type 'bool' is not iterable
It also says:
Information: unable to find files for the specified model(s)
Then a bunch of errors follow related to Gradio and FastAPI, and the app crashes.
It also tells me: "When localhost is not accessible, a shareable link must be created. Please set share=True or check your proxy settings to allow access to localhost."
Does anyone know what could be causing this? I'm a bit stuck.
Thanks in advance.
imo the best you can do is to record the dataset while reading a script and pretending to speak to someone else, using your whole voice range
thats what i did, and it sounds like me in every audio i use
oh woah
what script?
i asked chatgpt to do it for me haha
i've asked it to generate random text
and then pretending to talk to someone
maybe ask it to generate a dialogue between two people
and pretend you're them lol
lmao i just listened to the model
it has my mic noises
the de echo didn't remove it 😭
uvr de echo doesn't remove room reverb
rx11 dialogue isolate does
actually now i think that the reason why the model got stuck is because the dataset had room reverb
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
it's not room
it's mic stand
it genuinely creates reverb noise 😭
my room is tiny
example (amplified a bit)
Can someone help me, please?
?
Why do you lo-o-ove me, when i'm walking away 🗣️ 🗣️
gotta update gradio version @glacial pollen
I’ve already updated Gradio. Is this a general issue?
yeah, one of the libraries it depends on got updated last week and broke 5.13.x
newer gradio works fine
I’ll be patient then.
i mean you can just update it manually
How do I update it manually? Can you guide me through the steps?
conda
ok thx
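For anyone following along, a quick way to see which Gradio version is actually installed in the active environment, before and after updating. This is only a sketch: the package name `gradio` comes from the messages above, while the helper name `installed_version` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "gradio" is the package discussed above; add other names as needed.
    for name in ("gradio",):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

Run it with the same Python interpreter (conda env or venv) that launches the app, otherwise it reports on a different environment.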
he tested it and in fact they had a problem
nvm it's the room reverb thingy
tbh im not sure how well rvc tolerates the little reverb that may be left by anvuew mono dereverb or RX dialogue isolate
it can also depend on pretrain used
a
I need a ai hub website so I can use the ai
Pretrained RVC voice models are voice models that have been trained on a large amount of audio data; they are typically larger than the average RVC voice model (53MB).
-rvc
Suggestions for @south orchid
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
click the second one
Duh, you simply just called it "the AI" again. 
Talking about pretrains, for the newer Refinegan models can we use the cloud Applio?
So is it the lowest point on the graph or the lowest spike?
you should test several model checkpoints
I can't do ts by ear. My hearing's ass 😭
refinegan isn't available at the moment due to some quality issues
my ears are rather sensitive to robotic voices, esp through headphones
it is a relatively flat chart for a small(?) model
The dataset's 42 mins..
Small is just a "little" understatement 
hey, so I'm having a bit of a problem
I just started this stuff up, and I'm not sure why, but the voice I imported doesn't seem to be working. It's just playing my voice through my headphones.
I'm not quite sure what I'm doing wrong, or if passthru is what's screwing me, but still
This is the error I seem to be getting on repeat
To the point where the client itself is telling me that it's getting frequent errors
[Voice Changer] VC PROCESSING EXCEPTION!!! Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (0, 20) at dimension 2 of input [1, 128, 12]
It wants me to check if the model is loaded, which I'm pretty damn sure it is
Fuck it, I'll just figure this out in the morning
This is giving me a headache
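The wording of that exception matches the check PyTorch performs for reflection padding, where each pad amount has to be smaller than the size of the dimension being padded. Here the last dimension has 12 frames and the right pad is 20, so the audio chunk reaching the model is probably too short. A minimal sketch of that rule in plain Python (the function name is hypothetical, and which client setting controls the chunk length is not stated in the chat):

```python
def reflect_pad_is_valid(input_len: int, pad_left: int, pad_right: int) -> bool:
    """Reflection padding needs both pad sizes to be strictly smaller than the input length."""
    return pad_left < input_len and pad_right < input_len


# Values taken from the error message: padding (0, 20), last dimension of size 12.
print(reflect_pad_is_valid(12, 0, 20))  # False: 20 is not smaller than 12
print(reflect_pad_is_valid(32, 0, 20))  # True: a longer chunk would pass the check
```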
for the voice changer discussion, please go to #🔍│help-w-okada
is it possible to even run rvc gui on mac?
-gui
what do you recommend i use?
Applio https://docs.applio.org/
I thought that applio is only for windows
I suppose it should be similar to the installation for linux
I use pyenv on linux, so it is pretty straightforward
install python 3.10 or 3.11, activate local env, pip install requirements
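Before installing the requirements it can help to confirm that the active environment really runs one of those Python versions. A small sketch; the version pair comes from the message above and the helper name is an assumption:

```python
import sys

SUPPORTED = {(3, 10), (3, 11)}  # versions named in the message above


def python_is_supported(version_info=sys.version_info) -> bool:
    """True when the interpreter's major.minor version is 3.10 or 3.11."""
    return (version_info[0], version_info[1]) in SUPPORTED


if __name__ == "__main__":
    print(sys.version.split()[0], "ok" if python_is_supported() else "not 3.10/3.11")
```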
finally got it to work thanks to chatgpt and terminal
Only this time, but you can't expect ChatGPT to help you with everything about the RVC program itself. 
trying to get applio to work, finally got a voice model and an audio file uploaded but the conversion seems to have a problem. How long does it normally take to convert?
what problem? it mostly runs in cpu mode, or supposedly with mps acceleration on apple silicon M-series
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
Is this the correct channel you're asking about W-Okada?
For W-Okada, go to #🔍│help-w-okada. This channel here is all about RVC programs.
Eh... Thought w-okada was rvc (real time voice changer)
sorry bout that
rvc stands for retrieval-based voice conversion, it's an ai voice cloning software that can train voice models and do speech-to-speech conversion (convert the voice in a recording to another voice)
Nope, it's not.
W-okada is for realtime usage of models.
RVC is for model usage with prerecorded audio/inference
Hi Lyery!!!!

Ah.... I see, thanks for the clarification
is there a colab notebook where I can input audio files to train an rvc model of a particular singer?
-colab
Suggestions for @glass rivet
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
(the applio link)
Thank you so much. omg
does anyone know why the kaggle notebook doesn't work? I get some kind of error with gradio and local tunnel
no one could know your problem without showing the error messages and screenshot as needed
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
Which Kaggle link are you using?
Also make sure you screenshot the problem you encountered somewhere on Kaggle for more context. The "Kaggle notebook" doesn't refer to a specific RVC notebook link.
@empty knoll what kaggle link? What issue? What u want to do and what’s ur pc gpu?
im training this green model and it just keeps getting better
why are you pushing crazy epochs? and you seem to have messed up some training configurations 
thats only like 500
I did give it like almost 2 hours of material
and it was all split into sentences by hand
well you should test the model and find out what's wrong on it
idk how
idk why you said the green is "better"
where did you get that kind of misinformation from?
I got a feeling it's gonna be good
the red one was a lil lackluster, I didnt use any pretrain though
it wasnt bad
how do you know it without the actual testing by inference?
you have baked a cake, how would you think it is "good" without tasting it?
Hey everyone, I'm training an AI model with Applio and ran into a TensorBoard error on the Colab page that says:
'Data could not be loaded. The TensorBoard server may be down or inaccessible. Last reload: Mar 31, 2025, 2:27:49 PM. Log directory: logs.'
I'm not sure if this is a solution or not, but since I can't access TensorBoard, I stopped the training on Applio, enabled 'Overtraining Detector,' and set it to stop after 50 epochs without improvement—just so the training doesn't run indefinitely.
Does anyone know what might be causing this issue or how to fix it?
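The "stop after 50 epochs without improvement" setting is a patience rule. Applio's actual Overtraining Detector may be implemented differently; the sketch below only illustrates the general rule, with a made-up function name and toy loss values.

```python
def stop_epoch(losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it never does.

    Training stops once `patience` consecutive epochs pass without a new best (lowest) loss.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Toy values: the loss improves for three epochs, then stalls.
print(stop_epoch([30.0, 28.0, 27.5, 27.9, 28.1, 27.6], patience=3))  # 6
```

With a patience of 50, a run only stops after 50 epochs in a row fail to beat the best loss seen so far, so it bounds the run but says nothing about which checkpoint sounds best.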
the graphs are named "loss" because they measure the errors in the ai calculations
so graph going down = less errors
graph going up = more errors
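Raw loss curves are spiky, so one way to read them is to smooth first and then shortlist a few saved checkpoints around the lowest smoothed values for listening tests. A rough sketch using an exponential moving average; the smoothing weight, the toy numbers and the function names are all assumptions, and a low loss is only a hint, not proof that a checkpoint sounds best.

```python
def ema(values, weight=0.6):
    """Exponential moving average, similar in spirit to TensorBoard's smoothing slider."""
    smoothed, last = [], values[0]
    for v in values:
        last = weight * last + (1 - weight) * v
        smoothed.append(last)
    return smoothed


def candidate_epochs(losses, saved_epochs, weight=0.6, keep=3):
    """Pick the `keep` saved epochs whose smoothed loss is lowest."""
    smoothed = ema(losses, weight)
    ranked = sorted(saved_epochs, key=lambda e: smoothed[e - 1])
    return sorted(ranked[:keep])


# Toy loss values for epochs 1..10, with checkpoints saved every 2 epochs.
losses = [30, 26, 24, 22, 21, 20.5, 20.8, 21.5, 22, 23]
print(candidate_epochs(losses, saved_epochs=[2, 4, 6, 8, 10]))
```

The shortlisted epochs are the ones worth running inference on and comparing by ear.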
you need to start training and reach at least epoch 1, then click refresh button
also don't lose the tfevents file in model's eval folder
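To check whether the event files are still there, something like the following lists every file with `tfevents` in its name under a log directory. The folder name `logs` is taken from the TensorBoard error message; the function name is an assumption, and the exact folder layout depends on the fork.

```python
from pathlib import Path


def find_tfevents(logs_dir):
    """Return every TensorBoard event file found under `logs_dir`, sorted by path."""
    root = Path(logs_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*tfevents*"))


if __name__ == "__main__":
    # "logs" is the log directory named in the TensorBoard error message.
    for path in find_tfevents("logs"):
        print(path, f"{path.stat().st_size} bytes")
```

An empty result means TensorBoard has nothing to load from that folder, either because no epoch has been logged yet or because it is pointed at the wrong directory.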
the green one isnt getting any better bro
its quite the opposite
the graphs are named "loss" because they measure the errors in the ai calculations
so graph going down = less errors
graph going up = more errors
yeah but this isn't a conventional voice
you need to look at other chart to see which one is messing the whole thing up
fm and mel
its probably fm
I did reach an epoch and refreshed but it doesn't work, im at epoch 65 and it still doesn't work. how do I check that I haven't lost the tfevents file? plus does the overtraining detector do any good?
idk what these are
listening to the inference results could tell you what's wrong, so stop judging it by yourself
im not at a checkpoint for another hour
why arent u using a pretrain
yeah its fm
because i wanted to try it raw
the pretrain has an accent
d/total approaching zero means it's suffering mode collapse, which means the model produces nearly the same results every time, i.e. static noise
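As a rough illustration of that warning sign, the sketch below flags a run whose recent discriminator loss values sit close to zero. The window size, the threshold and the toy numbers are assumptions chosen for the example, not values taken from RVC or Applio.

```python
def discriminator_collapsed(d_total, window=5, threshold=0.1):
    """Flag a run whose last `window` discriminator-loss values average below `threshold`."""
    if len(d_total) < window:
        return False
    recent = d_total[-window:]
    return sum(recent) / window < threshold


healthy = [3.9, 3.7, 3.8, 3.6, 3.7, 3.8]
collapsing = [3.5, 2.0, 0.8, 0.2, 0.05, 0.02, 0.01, 0.01]
print(discriminator_collapsed(healthy))     # False
print(discriminator_collapsed(collapsing))  # True
```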
that's good
that's because you didn't use pretrain for the green graph model too
no I did
why the hell is mel at 50
it's 2x as fast
so, faster means better, right?
it means the pretrainer is active
you still haven't figured out what's wrong, what are you smoking? 
This is so weird.. even without pretrain fm and mel should go down
this is complete fuckery
like trying 32 batch maybe with a small set
funny ass graphs
with default mainline configurations and an insufficient dataset for training from scratch, it will never be good, the results will be the same static noise, so the discriminator easily identifies the generated samples as fake (hence d/total keeps approaching zero)
it has pretrain
8 batch
somewhere between 90min-120min of content
and 2000+ files separated into sentences (by hand)
i see the old graphs, but it could also be the compiled version of applio
disabling the pretrain in mainline is a bit tricky (just dont put anything in the pretrain path)
that's why you should use an existing pretrain, and the process is called finetuning
training from scratch would need massive amount of dataset containing several speakers (for example, VCTK dataset)
imo the option to disable the pretrain should be hidden by default because this is not the first time something like this happens lol
it is
you have to go to advanced options and uncheck it
okay, so using pretrain and 2 hours of audio
my expert analysis just tells me we have to push this to 2000 epochs or beyond
can you send a piece of dataset to check?
with a pretrain, training to 2000 epochs would wipe most of the pretrain's knowledge, especially if you've got a 2hr set
he said its 500 epochs
here's a random selection
what the fuck is this bullshit
it's my uncle
are these recordings from the 60's
and why is it .wav.mp3
😭
so much shit going on
the audio files have all been converted a couple times...
6khz from shitty chinese answering machine
#✨│ai-help message I still can't believe ts 
i mean... this is real
i've not tried to run training with shitty recording like that
💀
no, it is from an answering machine
but it is not even 12k
i've resampled my set to 12k and it still sounds better than this shit
the worst thing I ever had was when trying to train a dataset from some mobile games in 96/128 kbps mp3 quality, yea it sounded robotic as hell but the graph pattern with the default mainline pretrain was not really much different
I think it would be better to just get good 30-60s and then use that as reference for XTTS
and make some tts outputs
I had a really good model but I accidentally deleted all my AI stuff
😢
it was GOOD
excellent
i dont remember how I made it really
how about microsoft sam in windows xp
i mean to clone the voice
it will be less shitty, even considering that synthetic audio is not good for training
rvc miku is going real
make sure tensorboard points to the logs folder, and if ur using latest applio it should log every epoch
how do i do this??
refer to the guide as im lazy to explain rn
yes
Go back to #🔍│help-w-okada.
This channel #✨│ai-help is about RVC programs, not W-Okada.
sorry
Which RVC program/Colab are you using?
I used three fork models
Mangio-RVC
applio
RVC1006Nvidia
Applio is the best one. The rest is outdated.
What is your PC GPU?
3080ti
ApplioV3.2.8
Yes, but the problem is the embedders: there isn't one for our language, and with the one I used the sound became extremely awful.
while the embedder is mostly trained on english, your sibilant and breathing problem is not related to that
Do you think if I add more pronunciation of s and breathing sound to the data set, the problem will be solved?
the sibilants sounding bad are prob due to them being overfitted or you only have those bad esses in the dataset
https://www.youtube.com/watch?v=Ts_kPyK8mek
found u a video
I don't know what to do, I'm confused
why that random video?
How long is ur dataset
https://www.kaggle.com/code/deiant/applio
I'm training a model there because my gpu is not Nvidia, so I have to use kaggle
rx 7800 xt
Btw you could try locally with applio on AMD https://docs.applio.org/applio/getting-started/installation#amd-gpu-support-windows
I had already used it a week ago but for whatever reason now it doesn't work 
Also, have you tried deleting the Kaggle notebook and refollowing the guide steps?
yes
OK U GUYS WERE RIGHT THE MODEL IS NOTHING BUT SINE WAVES
I think I mismatched the sample rate or something
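A mismatch like that is easier to catch before training by listing the sample rates stored in the dataset files. A minimal sketch using only the standard library; `dataset` is a placeholder folder name and `wav_sample_rates` a hypothetical helper. The `wave` module only reads plain PCM WAV files, and the header rate will not reveal low-bandwidth audio (such as the 6 kHz recordings mentioned earlier) that was upsampled into a higher-rate file.

```python
import wave
from collections import Counter
from pathlib import Path


def wav_sample_rates(dataset_dir):
    """Count how many .wav files in `dataset_dir` use each sample rate (in Hz)."""
    rates = Counter()
    for path in sorted(Path(dataset_dir).glob("*.wav")):
        with wave.open(str(path), "rb") as wav_file:
            rates[wav_file.getframerate()] += 1
    return rates


if __name__ == "__main__":
    # "dataset" is a placeholder folder name.
    for rate, count in sorted(wav_sample_rates("dataset").items()):
        print(f"{rate} Hz: {count} file(s)")
```

If more than one rate shows up, or the rate differs from the one selected for training, that is worth fixing before the next run.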
!uv pip install pydantic==2.10.6
Try running this command before starting the gui
Send a screenshot
In my 2 years of making rvc models I've never seen anything like this before
@simple ore wtf is this
its bec the dataset is horrible
yeah it shits out between 20 and 40 epoch
has to be on purpose
I didnt see it
U won’t because it’s censored in ai hub
I think I’m getting timed out lmfao
Can u send a portion of the dataset
flac 
Send one file
and I deleted the mp3s...
fixed thanks friend 
Why the HELL are u training ts
ts?
This shit
just how
4k
Who is this racist fella
how does someone get this bad audio
i dont think it's even possible to re-create accurately
Noobies said it’s audio from Chinese answering machine
6k audio is funny asf