#✨│ai-help
it's pretty good but can't remove delay echoes
I know, to finish eliminating the echoes, I use the normal UVR De Echo at maximum
same here, but I did it in the opposite order
I found the aggressive one is better
did it give you a better result?
which ?
maybe you can try melband dereverb mono by anvuew or uvr de echo, but uvr de echo has some cutoff output
oh, any modifications to his aggression or are you leaving it intact?
sometimes
yeah the aggressive one did it better imo
dereverb mono is good for room reverb in recordings, though you may also need to apply RX 11 Dialogue Isolate
unlike the latter, the former doesn't damage breaths
The way I do it usually removes the back notes and adlibs well, although sometimes it eats up the high notes too. It's quite clean. I should try your method.
yeah I usually use anvuew's dereverb after extracting the vocals, if there are still background vocals left after dereverb I usually go to uvronline (xminus) to remove them using melband karaoke or uvr bve v2
but if the background vocals still aren't removed, then i don't add the audio to my dataset
Removing the back and leads from Billie Eilish's songs is very difficult, but that method helped me a lot and my voice is very clean. Now I'll try your method and see how it goes.
most of billie eilish's songs are available in dolby atmos, maybe the stems can help you get clean lead vocals
True, but I don't know why the voices in DA are low.
which song?
all of them, meaning the volume is generally low and it is necessary to turn it up, that is where quality comes into play.
have you opened it in audacity?
Yeah
just normalize the volume or use amplify
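A minimal sketch of what "normalize" does to a quiet stem, assuming the audio is already loaded as floats in the range -1.0 to 1.0 (the function name and the 0.98 target are illustrative, not Audacity's internals):

```python
def peak_normalize(samples, target_peak=0.98):
    """Scale samples so the loudest one reaches target_peak (full scale is 1.0)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # pure silence (or empty input): nothing to scale
    gain = target_peak / peak  # the same gain is applied to every sample
    return [s * gain for s in samples]


quiet_vocal = [0.1, -0.5, 0.25]  # hypothetical low-volume samples
print(peak_normalize(quiet_vocal, target_peak=1.0))
```

Because every sample gets the same gain, this only raises the level. Any noise in the quiet stem is raised by the same amount.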
I'll try, thanks for the advice
thats a breath
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
by IA Hispano
Google Colab
by Hina
Google Colab
by Eddy
Google Colab
by Eddy
Google Colab
by Deiteris & Hina
Google Colab
by Shiro & Eddy
Google Colab
by Nick088
Google Colab
by Nick088
Google Colab
by Jarredou & Makidanye
Google Colab
Whenever I try to use Inference tool on Applio, and press Convert - it just says processing and the amount of seconds next to it - but it never completes it, just displays "Error" - I haven't gotten it to work a single time
How long does it usually take to convert a 2 min audio file?
pitch extraction is taking forever, is that normal?
thread:7, f0ing, Hop-Length:64: 100%|█████████████████████████████████████████████████| 22/22 [54:02<00:00, 147.36s/it]
thread:3, f0ing, Hop-Length:64: 100%|█████████████████████████████████████████████████| 23/23 [56:09<00:00, 146.52s/it]
thread:10, f0ing, Hop-Length:64: 100%|██████████████████████████████████████████████| 22/22 [1:46:29<00:00, 290.43s/it]
thread:1, f0ing, Hop-Length:64: 83%|██████████████████████████████████████▊ | 19/23 [1:46:12<35:03, 525.98s/it]
thread:3, f0ing, Hop-Length:64: 100%|██████████████████████████████████████████████████| 23/23 [56:03<00:00, 65.50s/it]
thread:4, f0ing, Hop-Length:64: 64%|███████████████████████████████▏ | 14/22 [55:19<36:17, 272.20s/it]
thread:4, f0ing, Hop-Length:64: 68%|████████████████████████████████ | 15/22 [1:08:59<50:59, 437.06s/it]
thread:5, f0ing, Hop-Length:64: 73%|████████████████████████████████▋ | 16/22 [1:43:21<1:21:19, 813.17s/it]
thread:6, f0ing, Hop-Length:64: 50%|██████████████████████▌ | 11/22 [1:46:07<2:48:07, 917.04s/it]
thread:9, f0ing, Hop-Length:64: 32%|███████████████▎ | 7/22 [37:58<1:40:01, 400.07s/it]
thread:8, f0ing, Hop-Length:64: 64%|████████████████████████████▋ | 14/22 [1:35:28<1:47:45, 808.22s/it]
thread:9, f0ing, Hop-Length:64: 36%|████████████████▎ | 8/22 [1:42:16<5:47:19, 1488.57s/it]
thread:10, f0ing, Hop-Length:64: 100%|██████████████████████████████████████████████| 22/22 [1:46:13<00:00, 124.93s/it]
thread:11, f0ing, Hop-Length:64: 68%|█████████████████████████████▉ | 15/22 [1:43:07<1:24:01, 720.25s/it]
thread:12, f0ing, Hop-Length:64: 68%|█████████████████████████████▉ | 15/22 [1:43:07<1:00:33, 519.13s/it]
thread:13, f0ing, Hop-Length:64: 59%|██████████████████████████ | 13/22 [1:46:00<2:09:05, 860.64s/it]
it has been 3 hours
please check some error messages in the console window
processing it in cpu mode is supposed to be like that
I had it use my gpu,,
RTX 2060 shouldnt be insanely slow
and it is the lowest recommended gpu for training
be sure to check gpu usage during the process
hi i cant run applio colab and i get this message
ERROR: Failed to launch TensorBoard (exited with 1).
Contents of stderr:
Traceback (most recent call last):
File "/usr/local/bin/tensorboard", line 4, in <module>
from tensorboard.main import run_main
ModuleNotFoundError: No module named 'tensorboard'
Traceback (most recent call last):
File "/content/program_ml/app.py", line 1, in <module>
import gradio as gr
ModuleNotFoundError: No module named 'gradio'
how do i do that
restarting it seemed to fix pitch extraction for some reason, it's almost instant now
ask google or chatgpt
Applio colab is broken for now @blazing solar @woeful crow
This is a Google colab/uv issue
thanks
Yw, use an alternative for now
What kind?
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not many hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest free way, use https://weights.com/, which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.com: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio UI Colab: RVC Fork with some extra features like TTS
- RVC AI Cover Maker UI: Automatically Separates the Vocals and Instrumentals, converts the voice and mixes them back
Or do you have a PC? Did you try checking its GPU?
Is it fine to add the same song but in a different language to my dataset?
Is there any documentation on this?
Yes, the model will learn new sounds that come from different languages
ah okay, thanks for answering my question
For colab, atm you can replace all the instances of "!uv pip" in the install cell by "!pip" and it should work, although it might take longer
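The replacement described above can be sketched as a plain string substitution on the install cell's text. The cell contents below are hypothetical, and any uv-only flags in the real cell might still need manual cleanup:

```python
def use_plain_pip(cell_source: str) -> str:
    """Swap every '!uv pip' invocation in a notebook cell for plain '!pip'."""
    return cell_source.replace("!uv pip", "!pip")


# Hypothetical install cell, not the actual Applio notebook contents.
install_cell = "!uv pip install -r requirements.txt\n!uv pip install tqdm"
print(use_plain_pip(install_cell))
```

In practice the same edit is done by hand in the Colab editor. The function just makes the rule explicit.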
Hi, may I ask what was the way to clone the main repository into applio no UI colab so I can see the newer average graphs ?
-guides
in the clone cell remove --branch parameter
So just removing it will do the trick?
Also where's the guide for this colab, I lost it
What aggression do I put on the UVR Aggressive? 0.5 ? or 1.0?
What rvc I can use to create and train voice model?
You can use any version, but we mostly recommend Applio.
One question, have the creators said anything about this?
gatekept 
that ?
what rpc?
sry
The correct name is Ilaria RVC. I think you just misheard it somewhere.
nah just being dyslexic
Too crazy.
-rvc
This is the correct command for showing guide links to RVC and Applio.
What is this error on Kaggle
Which Kaggle notebook is this?
The one mentioned in Applio Kaggle Doc
can u help?
Try deleting the notebook you created for yourself and then create a new one from scratch.
Nope, tried it, same issue
@hallow thistle Please help
try doing this #✨│ai-help message
they reported it to uv and google
So what is the cause of static/crackle in silence gaps in resulting file from inference?
I convert the same piece of audio, and the model I created in Applio has it, while the model I created in RVC Mainline doesn’t have it.
that happens when changing envelope unlike mainline inference
Envelope? Im doing the inference in the same program, all I change is the .pth file between Applio created and Mainline created.
seems like you didnt use this slicing method
I used exactly the same files for model creation in Mainline and Applio.
They are already edited to remove excessive silence.
they have different slicing methods, that's why
so you might want to fall back to the old labeling method for use in mainline
The Mainline model is better, less noise.
you need better denoising perhaps and dont truncate silences too short
In my audio for model training? There is no noise. And only enough silence for breaths.
Also, Applio has more “electronic” noise on actual voice than Mainline model. Especially with “SSS” sounds.
can't really say that, the result may vary on different epochs
some training configurations are actually different tho
What is strange is I tried conversion in the Weights Replay app, and it works well, but it doesn’t ask for index file.
is applio main automatically set to fp32?
latest main branch? yep, fp16 got removed
in the latest compiled version fp16 is enabled by default
ah i see, i was confused because i didn't find any option to change precision
i downloaded applio from the code
nice, fp16 doesn't exist there
everything is fp32 now
okay, thanks
is using checkpointing really that slow? it's been 5 minutes after i click start training, and the cmd didn't show anything
uh no?
it's very slow for 48k tho and i haven't tried 40k
for 32k the speed is ok
maybe you've run out of vram even with checkpointing enabled
i know that 48k requires way more vram to train models
40k should also require a bit more vram than 32k
guys, 1 question, is it possible to train a model that can be smooth in my language?
oh really? tbh i can do 8 batch size without checkpointing on codename fork, but i'm just worried about my pc if i continue the training
yes because codename's fork still has the old inplace flag
that saves vram
but it got removed in applio because it was causing problems when training from scratch
Is there any recommended cloud colab for training?
I'm using a mobile here and I can't access any of these
oh, lemme set the batch size to 6
i've heard that rvc sometimes adds an american accent in models, but I'm not sure about that, I personally haven't had any problems with this.
however you can use the index to force the accent of the dataset in the results
somehow i trained it with data in my language but it ended up sounding British
yea you're not alone in that, idk why it happens
never had that issue personally
you're right, i just set the batch size to 6 and it worked
i guess i have to try kaggle sometimes
yes training is faster because there's a new flag named "benchmarking" that speeds up the training
oh wow, it's been a while since I've followed the updates
but noobies told me that the speed boost was more noticeable while using fp16, and that fp32 negated the speed gains
it's very noticeable here, it's like I was training with fp16 back when I was still using rvc disconnected colab
for me it's also very fast lol, especially when training 32k
for some reason i doubted that my applio is using fp32 lol
you can have fun and experiment with it if you want
go to rvc/train/train.py
find
torch.backends.cudnn.benchmark = True
set it to false to disable the speed gain
i have not compared how slow the training is with that setting turned off xd
is the 32k pretrain used if your dataset is mostly lossy?
it's used if your dataset has a 16k cutoff
there's no point to train 48k with a 16k cutoff
slower training + high vram usage for basically the same results as if you were training using 32k
what if my dataset is mostly lossless, but one audio has a 16k cutoff, is it okay to use the 40k pretrain?
if your dataset has one audio with a 16k cutoff but the rest doesn't, you have two options:
resample the entire set to 32k
remove that specific 16k audio and train only using the lossless audio
the audio source is actually lossless, but after removing the vocals and reverb, some echoes still remained. then i removed them using a de-echo model, which caused this cutoff.
oh damn
you can either train 40k or 32k
i would use 32k
the dataset now has a 17k cutoff
and also is damaged due to separation models being used
trying with 40k rn
the frequency?
you won't notice a difference
the dataset doesn't reach 20k nor 18k
17k is pretty much 32k territory
is the sound detail still good at 32k? haven't tried 32k until now
rvc doesn't care about sample rate
you'll notice a difference if the dataset is actually true 48k/40k
but if the dataset is 32k instead, then no, training that in 48k/40k doesn't give you better results
it's just going to copy your frequency cutoff
i just thought 32k is less detailed than 40k all the time lol
it is but only hardcore audiophiles can hear the difference lol
40k and 48k are more clear than 32k
but again, only if the audios are true 40k or 48k
17k from separation models is not true 40k nor 48k
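The arithmetic behind the "16k cutoff means 32k" rule is the Nyquist limit: a sample rate can only carry frequencies up to half its value. A small sketch, assuming the three pretrain rates mentioned in this chat (the helper names are illustrative):

```python
PRETRAIN_RATES_HZ = (32000, 40000, 48000)  # rates discussed in the chat


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def rates_that_fit(cutoff_hz: float):
    """Sample rates whose Nyquist limit is at or above the dataset's cutoff."""
    return [r for r in PRETRAIN_RATES_HZ if nyquist_hz(r) >= cutoff_hz]


for rate in PRETRAIN_RATES_HZ:
    print(rate, "Hz sample rate ->", nyquist_hz(rate), "Hz max content")
```

By strict Nyquist a 17k cutoff sits just above what 32k can hold (16k). The advice here to still use 32k accepts losing that last sliver in exchange for faster training and lower VRAM use.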
ah okay okay
if your dataset had a frequency cutoff of 24k you can downsample it to 32k and compare them
see if you can hear some difference lol
i mean the original audios, with all of the reverb and sheet
if you still have them, try to convert one of the audios to 32k
and try to hear if there is a big difference
i did it rn and i cant hear the difference
me neither lol
see? don't be scared of 32k
48k has more fidelity than 32k, buuut, only hardcore audiophiles can hear the difference
i guess i don't have to worry about mixing lossless audio and audio from youtube in my dataset
no wait don't do this
the dataset needs consistent quality
dont mix stuff


i tried to make a singing model, tbh, my dataset is kinda repetitive if i only use lossless audio, while the singer is hitting high notes on youtube lol
well you can experiment and try training your current dataset, but it is a known fact that training from different sources can lead to bad results
I always do this when training the model, but for some reason people say my model quality is really good, i even got ai master role on some server
im just telling you what i know
if u want to continue training like that, do it lols
at the end what matters is if you like how your models sound
no, i mean i feel so bad for people who always use my model, of course i will change how i do my training now, thanks for giving me some advice
and now I feel like I don't deserve the ai master role on that server
ai is random
i also make mistakes while training models
this thing can be really confusing sometimes lol
im still learning more about rvc
it's all trial and error here
yeah, you're right
@viscid moss Why does only UVR-Deecho-Dereverb work? The rest of the models in the Dereverb tab just give errors
I'm working on that
-colab
no, it will only fry if your pc has bad airflow and bad thermal paste applied on the cpu/gpu
not sure it works anymore
details ≠ additional high ends by going higher than 32k
2 min dataset with consistent & fine quality could be better than the longer one with such mixed sources
it has been fixed
guys, is there any guide on how to use UVR UI ?
i've hit some errors but i have no idea where i went wrong

like this
or
not quite there
check the file path name
D:\path\to\your\file.wav
wait i need some kind of ngrok token for applio colab?
is this the voice path ?
or the uvr path lah
if I were you, I'd first check where the wav file you want to process actually is
ahhhhhhhh
i thought with uvr you just throw in the file and it's done

because i don't see any output path
or input path in the ui

try another path like D:\
I mean like C:\yourfolder\your.wav
also you can try converting the mp3 file to wav first using audacity
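A quick way to rule out a bad path before blaming the tool, sketched with the standard library only. The allowed extensions are an assumption for illustration, not a list taken from UVR:

```python
from pathlib import Path


def check_input_file(path_str, allowed=(".wav", ".flac", ".mp3")):
    """Return a list of problems with the path; an empty list means it looks usable."""
    problems = []
    path = Path(path_str)
    if not path.is_file():
        problems.append(f"not an existing file: {path}")
    if path.suffix.lower() not in allowed:
        problems.append(f"unexpected extension: {path.suffix or '(none)'}")
    return problems


# Placeholder path from the chat; replace it with the real location of your file.
print(check_input_file(r"C:\yourfolder\your.wav"))
```

If the list comes back empty and the error persists, the path is probably not the cause.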
roger that

yeah, that's why i stick with batch size 6
chat can you chek it's minus 1 minutes i cook https://www.youtube.com/watch?v=r_bGpHbdSy0&ab_channel=Airashi
sire, i failed you. it still doesn't work
i think i have to download the model and import it in the file somewhere
-colab
do you think it's overtrained after 8.7k steps?
Still having issues?
if still... can u show me this part of the error?
My DMs are also open for help
I think the problem is the installation path, it's directly in C:\ Try installing it in Downloads
Is anybody else having issues with Mainline? I keep getting error "ERR_NGROK_8012".
same
from the #📰│dev-updates it seems there are still issues with the damn colab so i guess we wait
Me too! I get the same error on Kaggle too.
if you're using UI colab you need a way to connect a browser to colab, it can be done with a public gradio link or with ngrok tunnel
are you training without a pretrain or something? 45 g total is very high
old g/total graph
https://ngrok.com/docs/errors/err_ngrok_8012/
tl;dr possibly incorrect/screwed code cell to run the webui through ngrok, or the previous tunnel session wasn't successfully stopped so the port to connect was in use
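When debugging this on a machine you control, one quick check is whether anything is actually listening on the port the tunnel points at. A minimal sketch; the port number 7860 is only an example and may not match the notebook's web UI:

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if a TCP connection to host:port succeeds, i.e. something is listening."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        return s.connect_ex((host, port)) == 0  # 0 means the connect succeeded


print(port_in_use(7860))  # example port, adjust to the one your web UI uses
```

If this prints False while the tunnel is up, the web UI is not running on that port. If it prints True before you start the UI, a previous session is probably still holding the port.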
@left flame add those lines at the beginning of the cell
.
hello, does the rvc gui also exist with RMVPE
-gui
^suggestion: the bot should show a link to the latest rvc or applio: https://docs.aihub.gg/essentials/whats-rvc/#forks
Last update: Oct 21, 2024
Don't expect RVC-GUI to get updated with rmvpe model; this particular RVC fork is too old. Use a better RVC program like Applio instead.
Oh okay thanks, I'm not up to date, which is the easiest?
Applio no UI doesn't want to preprocess dataset
I actually found the rvc gui quite good, audio in, audio out in the same folder
why good if there's no rmvpe?
That's exactly why I asked if the gui is also available with rmvpe, well I just have to switch
im using klm hifigan final
rmvpe is better than others in most cases
Applio. It's more up to date.
except that it may struggle on inferring polyphonic/choruses
I have already loaded trained voice models into rmvpe. Which folder do they have to go into in Applio?
rmvpe isn't an RVC program. It's a pitch extraction model used in RVC.
Applio no UI doesn't want to preprocess dataset
Traceback (most recent call last):
File "/content/Applio/core.py", line 15, in <module>
from rvc.lib.tools.prerequisites_download import prequisites_download_pipeline
File "/content/Applio/rvc/lib/tools/prerequisites_download.py", line 3, in <module>
from tqdm import tqdm
ModuleNotFoundError: No module named 'tqdm'
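A traceback like this means the install step did not finish. To see in one go which packages are missing, rather than hitting them one error at a time, a small check can be run in a notebook cell. The module list below is just the names that appear in the errors in this chat:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Names taken from the tracebacks above.
print(missing_modules(["tqdm", "gradio", "tensorboard"]))
```

An empty list means all of them are importable. Anything listed still needs to be installed before the later cells will run.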
@simple ore also in the "download models" sub cell
the voice is also weird, idk what is called, overlapping?
the installation seems screwed
Hm well then?
What do I use for training
What do I use for training is there something that works?
does it use rvc or nah?
do you know the purpose of this channel, since you're asking this here?
anyway you could ask chatgpt or claude
sr bro
I will delete the message
what’s your pc gpu and what do you want to do?
Jokes on me...
Actually I'm a phone user Applio no UI used to work fine, now what happened
What happened
yeah the issue is that uv and google colab are having issues with each other, applio reported this to both, so all we have to do is wait for uv or google colab to fix it
you can use alternatives
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not many hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest free way, use https://weights.com/, which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.com: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio UI Colab: RVC Fork with some extra features like TTS
- RVC AI Cover Maker UI: Automatically Separates the Vocals and Instrumentals, converts the voice and mixes them back
ofc don’t mind about applio since it doesn’t work right now
Mainline?
Cloud breaks often, especially google colab
because notebook devs need to adapt to every cloud site update and bug, which is tiring
Or the other option would be buying a good pc, but not everyone can afford that
Hmm I just want a working Colab
mainline is the og rvc project yes, it doesn’t have the same perks like easy ui and tts or some adjustments, but it should work
Yes but it doesn't have option to input pretrains.
The Colab
Mainline
I mean it’s on cloud and google decides what to update on those PCs, cloud is not as stable as local (running it on your own good pc)
google colabs break basically weekly, check also #📰│dev-updates
I’m also checking right now if mainline colab works
Okay
because hina, the creator of the mainline colab, is busy and some of their other colabs, like wokada, are broken, so i’m not sure if mainline is broken too
I hope mainline works at least but I find it difficult to use
Mainline colab does
Now?
i mean, if you really need to use applio, u could try the lightning.ai site version, which only gives 22 hours of gpu monthly, is harder to set up, and needs phone number verification
else, the only other working way to use applio is to buy a good pc
because applio kaggle is broken too
or you could try other RVCs like mainline or rvc ai cover maker ui or ilaria rvc
there’s a whole cell in the hina modified mainline colab about pretrains
Uh that's perfect
-colab
or pay the model master who opened the commission
nevermind, just tried it, mainline colab is broken too
Yes but the pretrain I wanna use isn't listed in
so yeah use other alternatives
who knows,
...
for applio i told you we need uv and google colab to fix it, it’s their problem that causes issues for applio
it’s not an applio issue, it’s related to another thing that applio uses for cloud
your other option is weights.com
I understand but when will it get fixed
or paying a model master commission, or wait for someone to do your free model request
Yes but it doesn't let me use pretrains
I know that
no one knows, there isn’t a specific date
Weights doesn't allow pretrain input
i’m not sure if weights uses its own pretrain, tho u could suggest that feature in their server
medleyvox really ruined the frequency, and I didn't realize I was training the model with this audio
@brittle wing your only options are:
- use weights
- use other cloud alternatives, like mainline or applio on lightning ai
- buy a good pc
- ask a model master or maker to make it
- wait for fixes
- pay for cloud
I wanna train it myself I have actually trained models but the notebook doesn't work
i understand, i told you the only options available
Applio colab?
i just told you it’s broken
no other ways than the ones i said
unless someone makes a script fix
the kaggle notebook is another option
Link?
https://docs.aihub.gg/rvc/cloud/applio-kaggle/ (how to setup)
Last update: Jan 13, 2025
that’s broken too iirc
This is so complicated
the interface is harder than colab but u just gotta read the guide
so the only working cloud notebook is lightning ai?
For rvc yes
Idk
Link?
sorry wrong reply
-lightning
Lightning AI is a Cloud (Remote Good PC) Service. The Free plan provides up to 22 hours of monthly usage.
by Shirou
Lightning AI
by Shirou
Lightning AI
by Nick088
Lightning AI
by Eddy
Lightning AI
by Eddy
Lightning AI
the first 2 should work
Does Ilaria rvc on Hugging Face work?
yes but only inference
For inference
That's what I'm asking
Is it the same result as Applio?
How do I use the lightning notebook
Last update: Nov 21, 2024
How do I use this
Wait it's similar to colab
idk, i haven't tried it. but we got the docs to know how to use it
Okay does it allow pretrains
you asked for training
training = make models
inference = use them
Yes...
it’s the same as applio, just different cloud site, on lightning ai the interface is a bit harder, needs a phone number verification which could take some days for them to verify u, like 2 or 3, then also gives u a max 22 hours of gpu monthly
But the program is the same
But this isn't okay can you train the model.
it is okay, it’s free, you can’t expect gpus 24/7 for free
I'm waiting for them to verify me
gpus are expensive
Colab was better
and yeah you can train the models on lightning.ai applio
22 hours a month aren't enough
colab gives more gpu time and is easier to use, but it's broken
When will the fix come...
i can’t do anything about that 
I know
you can pay for more gpu time on lightning.ai
I'm just saying
just like you can for colab too
don’t need to ask it again, we don’t know
you can’t expect to get the easiest service 24/7 all for free
Who do I ask
AI is expensive asf
I always thought buying a good pc (at least rtx 4060) is an easy thing in other countries. considering the salary from their job is at least bigger than mine which is only 2.6M IDR ($162.50) per month.
How do i fix it
i’m glad I got an rtx 4060 ti 16gb pc
Is it a python script problem or a general one?
We buy Macs instead and get stuck. 😥
Unless you know coding and are able to find a workaround, all you can do is wait for google colab and uv to fix it, it’s a problem of the site hosted machines and of the python dependencies

hey guys whats up
I give up
Macs are perfect for people who want easy to use pcs
But AI is mostly good on nvidia gpus, since they are more powerful and got more support
...
I mean you can just use the options I told you
Is this the right help channel for the voice-changer?
Weights and...?
I’m talking about those ones
which type?
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
tbh kinda regret getting a PC with an rtx 4060 8gb. now I can’t even train my models with a batch size 8
Its meant for simple to use RVC usage yeah
I know that
RVC, yup. I'm using a realtime voice changer client and wanted some advice on models since I don't fully understand something about it
i mean you can use batch size 4
rvc isn’t a realtime voice changer
rvc is for pre-recorded audios and model training
Did you read the whole text i sent?
please read it
nvm, I see. some words I didn't understand just flew by me
it's different hehe
i’m guessing u want a realtime voice changer, so wokada, **be sure to not use yt tuts**, elaborate ur issue in #🔍│help-w-okada
thanks then!
is the graph stable on batch size 4?
i pinged u in it
Can I dm you so you train the model...idk anymore
I know it will improve accuracy, but codename fork says the graph won't be as stable as batch size 8 (iirc)
i also trained my model now
I know I saw that
@low shard idk maybe someone found a fix
Yeah 
you need to have patience unfortunately
Look at this...
No that’s an old fix that already got added, the issue is related to something else now
all you need is wait or use the alternatives
i don’t know how to tell you but there’s no other way
WHAAT
you don’t have other options
like the fix was made in March, I'm not sure that's the way to fix clouds for now.
Exactly
there isn’t as of right now
it’s another type of issue
But how some other colabs still work
because they use different code
i understand you need colab, but there’s no other way
Please just follow the options i told you
What can I do to fix the bug
asking again and again won’t fix anything
nothing unless u work for UV and google and know coding
The fix I sent was uv fix
Since when are the colabs malfunctioning
Ah yes
Some for months, some for less, like applio for a few days
I see
It doesn’t matter if you keep asking, the answer will be always the same, cloud gets broken weekly
hopefully things get fixed soon
if it were that easy to fix, the devs would've handled it by now. The fact that it’s still broken probably means even they aren’t sure how to fix it.
then it's useless to tell you how to fix it
Exactly
the issue isn’t even applio’s fault, it’s that uv is having issues on google colab, which is used for installing applio requirements
What is uv
Python library
uv is a new requirements installer
the workaround is here https://github.com/IAHispano/Applio/issues/1025
Ima announce that
And?
Will this help with the fix answer please
guys, is the overtraining detection working ?
Yes if you follow it
You need to modify the code like in the issue discussion
I mean that manually for yourself
i have so many audio files with varying frequencies like this in my dataset, should I delete the audio that has cutoffs, or keep it and resample everything to 32k, then train it using a 32k pretrain?
and what
Nice but will it work in applio no UI also how do I input the dataset on there as a zip or a folder with audios?
lyery already gave me advice, but I just want to hear other users' opinions
If I'm being honest you're thinking too deeply into it.
Or you wanna know everything.
i just want to improve my model quality
Wow
so i need to know how to train properly
not really
Okay just a question how should be my dataset be In applio as a zip file or a folder with audios?
Cause sometimes it's different
idk if it's in the cloud, but on local I just need to copy my datasets folder to "assets > datasets"
The colab Noobies sent a fix there
what fix?
I know you have trained on Applio colab that's what I'm asking
Zip file, right? Not a folder in drive containing audio files
yeah compress the folder with the audio inside into a zip, so you will get jhope.zip\jhope
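The layout described above (a zip whose top level is the dataset folder itself, as in jhope.zip\jhope) can be produced with the standard library. A minimal sketch; the function name is illustrative and this is not taken from the Applio notebook:

```python
import shutil
from pathlib import Path


def zip_dataset_folder(folder, out_dir):
    """Create <out_dir>/<folder name>.zip with the folder itself as the top-level entry."""
    folder = Path(folder).resolve()
    base_name = str(Path(out_dir) / folder.name)  # make_archive appends ".zip"
    return shutil.make_archive(
        base_name,
        "zip",
        root_dir=folder.parent,  # archive paths are relative to the parent...
        base_dir=folder.name,    # ...so every entry starts with "<folder name>/"
    )
```

For example, `zip_dataset_folder("datasets/jhope", "uploads")` would write `uploads/jhope.zip` containing `jhope/...`, assuming both folders exist. Whether the colab expects exactly this layout is worth confirming against its guide.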
Oops how did you know...
hahaha
Wait what do I do
too easy
i mean i recognize you
But should it be a zip file uploaded into Applio
Well where's THE guide.
Last update: June 15, 2024
everything's here
useless for those who know how to properly use tensorboard and test checkpoints
do you have any screenshot showing that overtraining detector is detecting overtraining?
Yes but I didn't understand about the format zip file or folder with audios.
Nvm
I understood
Is "Load a backup" for resuming?
Where's the pretrains tab.
Nvm found in advanced settings
Hi guys im kinda a noobie how do i get my macbook voice changer to work
How do i connect the ai voice thing on mac to discord
Hmm how do I import dataset in Applio colab
is it normal for the loss/g/total starting from the bottom and not from the top?
Will merging all the dataset audios work for the Colab? Sorry for bothering
idk, i never tried applio colab
Well the zip archive didn't work it kept returning errors
Still doesn't work
I'm just wasting GPU for nothing
I have a question guys. I heard some voice changers which also go with your depths and highs of your voice without sounding so electronic all of a sudden. Is there a quality difference depending on the model and where can I get a high quality one?
it is not starting at the bottom. Turn off ignore outliers, turn off smoothing, you'll see
How do I input dataset into Applio colab so it doesn't return errors?
Like as a zip file, an audio file folder, or a merge of all audios??? Like, I need help.
I just wasted 2-3 hours of GPU just waiting for an answer...
you make a folder on google drive with wav files
then you provide a path for noUI colab
if you're using UI colab, then you make a dataset and upload wav files
No UI doesn't work.
it does
Any fix
Will the bug fix you provided fix that one too?
it is the same, you just need to run it before you run the install
I tried & it didn't fix
hello all 🤗
im searching for an open source or free voice cloning tool with emotions.
best would be if the emotion reference audio can be in a different language.
can you please give me some tips, im nearly blind from searching google 😦
i want to dub some old anime into my native language, so i used applio to train some models.
then tts to dub the new audio in my native language. currently im testing f5-tts for the emotional speaking.
but more automation in my workflow would help me a lot, so i started searching for something
that can help me speed up the process and is not limited to english/chinese only.
thank you!
can I get an actual download link without signing up for weights.com?
Download link of?
question, is it possible to stop training mid way ? because i accidentally set 500 epochs 
As long as you have save Checkpoints enabled (check if you have G and D files in the models folder) you can stop at the next checkpoint
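A quick way to do that check from Python, as a sketch only. It assumes the checkpoints are saved as `G_*.pth` and `D_*.pth` inside the model's folder; the exact file names and folder location depend on your install, and `has_checkpoints` is a hypothetical helper name.

```python
from pathlib import Path


def has_checkpoints(model_dir: str) -> bool:
    """Return True if at least one G and one D checkpoint file exist."""
    d = Path(model_dir)
    g_files = list(d.glob("G_*.pth"))  # assumed generator checkpoint naming
    d_files = list(d.glob("D_*.pth"))  # assumed discriminator checkpoint naming
    return bool(g_files) and bool(d_files)
```

If this returns False, stopping now would leave nothing to resume from.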
yay, thank you so much

ah okay
use loss avg50, it's more accurate than old loss graphs
this one?
yes
after some time the model will converge to a value
it'll have a flat line pattern
The monitoring method is the same as loss g total, right?
similar
the model then is going to stop going down and its gonna stay in a flat line
stay in a flat line means overtraining or what?
means the model is near convergence
its gonna start doing small adjustments, during that there is a risk of the model forgetting the pretrain
ohhhm where can i find this one
when you get to that point what you can do is to let the model train for maybe 1 hour more and choose the lowest point and see if you like how it sounds
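To make the "flat line" idea concrete, here is a rough sketch that averages a loss series over the last 50 points and looks for where it stops dropping. This is not how Applio computes its avg50 graph; `rolling_mean`, `plateau_start`, `lowest_point`, and the `span`/`tol` values are all assumptions for illustration.

```python
def rolling_mean(values, window=50):
    """Mean of the last `window` values at each position."""
    out = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]
        out.append(total / min(i + 1, window))
    return out


def plateau_start(avg, span=100, tol=0.01):
    """First index where the average falls by less than `tol` (relative)
    over the next `span` points, or None if it never flattens."""
    for i in range(len(avg) - span):
        if avg[i] - avg[i + span] < tol * abs(avg[i]):
            return i
    return None


def lowest_point(avg):
    """Index of the lowest averaged value (first one if there are ties)."""
    return min(range(len(avg)), key=avg.__getitem__)
```

The plateau index only tells you where to start listening; the checkpoints around the lowest point after it still have to be judged by ear.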
I didn't even set checkpoints to be saved per 1 epoch, kinda worrying about my ssd tbw if I do that
but thanks for explaining
at least i know what to do after this
just type "loss" on scalars
well if you dont save every epoch then it's mostly random and you have to hear the epochs until you find the one you like the most
i only have these
i used applio mainline, idk what are you using now
same applio, it's 1 8 3 something
hello, i downloaded okada but after extracting and opening the exe it opens up cmd but nothing happens after any idea on what I should do?
do you mind showing me how i can download the precompiled version?
they are not useful for choosing epochs, they are very inaccurate
in the case of having those graphs the best thing to do is to listen to the epochs and choose the one you like the most
or a way to update
oh, okay, i need to test it out right ? these are just for reference ?
yeah, that's what i always do
https://github.com/IAHispano/Applio
and place the applio folder in a NON onedrive location
after extracting it, run this
let it install the env, don't close the cmd until the installation is done
after that, run this and you can train like always
ah, that's the reason why I got bad results even though I chose an epoch with low points
mainline graphs only log the last step
which is most of the time, rvc learning the mute file

why?
🦈
idk bro, i’ve been training since everyone was still using svc, so I shouldn’t have any reason to be nervous, but somehow, there’s always that bit of worry whenever I do this
i might’ve been training for a long time, but it was never really backed up with the right knowledge on how to do it properly

I am having trouble with the Applio cloud it won't start applio. It will install but it will not start after installing it says error. any fixes
Thank you
always be sure to check #📰│dev-updates
if "const" in schema:
TypeError: argument of type 'bool' is not iterable
An error occurred launching Gradio: When localhost is not accessible, a shareable link must be created. Please set share=True or check your proxy settings to allow access to localhost. i keep getting this when running applio locally
that requires a gradio update, as one of its dependencies messed it up
if the requirements have gradio >5.0.0, then you can go for 5.23.1
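A small sketch for checking which gradio version is installed before bumping it. It reads "gradio >5.0.0" literally as "newer than 5.0.0", which is an assumption; `version_tuple`, `installed_gradio`, and `should_bump` are hypothetical helper names, and 5.23.1 is simply the version suggested above.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text: str) -> tuple:
    """Turn '5.23.1' into (5, 23, 1); only the first three numbers are used."""
    return tuple(int(n) for n in re.findall(r"\d+", text)[:3])


def installed_gradio():
    """Installed gradio version string, or None if gradio is not installed."""
    try:
        return version("gradio")
    except PackageNotFoundError:
        return None


def should_bump(installed: str, target: str = "5.23.1") -> bool:
    """True if the install is newer than 5.0.0 but older than the target."""
    cur = version_tuple(installed)
    return (5, 0, 0) < cur < version_tuple(target)
```

Run it inside the same environment Applio uses, otherwise it reports the wrong install.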
what bout 5.24
ah they Merged Voice Changer chat here
the time is about to come
may i ask the link and other stuff directed to W-Okada again, because no pinned message yet for it
yuh, you can try training for a little longer and then start listening to the epochs
you have very high values in your graph
thats weird uhh
small datasets usually converge to a value of 35 (or even less than this)
okie
mine is 22 minutes
you were training without a pretrain?
no, i used klm hifigan final ver
and thats even weirder because klm usually gives me lower values than og pretrain
why are you using overtraining detector 😭 😭 😭
because i can't really monitor the training properly, i thought it's gonna automatically stop my training if it was overtrained😭
should i start it again from the beginning?
no
ive asked him like 5 times to remove overtraining detector
and he always tells me "just dont enable it"
😭
🤣
lol
"If it doesn't work, just dont use it"

he knows it's bad
i know it's bad
we know it's bad
yet still there
anyways that thing stops the training when the graph fluctuates a bit
which is normal during training
but if i max out the smoothing, the graph seems to be going down
should i always max out the smoothing?
use 0.6
loss avg50 doesn't need much smoothing
a smoothing of 1 just blends everything together and makes it look like it is still going down
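A sketch of why heavy smoothing is misleading. It uses a debiased exponential moving average, the general kind of smoothing a TensorBoard-style slider applies; this is an illustration, not TensorBoard's actual code, and `smooth` is a hypothetical name.

```python
def smooth(values, weight):
    """Debiased exponential moving average; `weight` must be in [0, 1)."""
    if not 0 <= weight < 1:
        raise ValueError("weight must be in [0, 1)")
    out = []
    last = 0.0
    for n, v in enumerate(values, start=1):
        last = last * weight + (1 - weight) * v
        # Divide by (1 - weight**n) so early points are not pulled toward 0.
        out.append(last / (1 - weight ** n))
    return out


# A loss that drops once and then stays flat.
raw = [100.0] * 50 + [40.0] * 200
light = smooth(raw, 0.6)   # settles on the flat value quickly
heavy = smooth(raw, 0.99)  # keeps sliding down long after the raw loss is flat
```

With a weight close to 1 the curve is still visibly descending at the end, even though the raw values stopped changing 200 points earlier.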

it's nearing convergence
it stopped going down at 20k
when does it start overtraining?
at 2k epochs
now the model is doing small adjustments
are u locally training?
yes
did u do the preprocess dataset in a zip file or just a folder with wav files?
wait, if i have a data that 39 mins long, it will get overtrained around 200 right ?
2k? isn't that too early?
true overtraining happens at 2k epochs
2000 epochs
fork, i will train it to 200 and see then
what you are seeing in your graph is your model getting close to converging
thank you
meaning it has stopped making meaningful improvements
when it reaches the point of flattening you can keep training for a bit (1 hour or 30 mins) and then you can start hearing the epochs
just choose one that doesn't sound robotic
my preprocess taking ages
can you mark the sweet spot on my image?
at 20k steps it started to overtrain ||(i think)||
@analog obsidian Am I right?

stopped going down after that
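Since the graph is in steps but checkpoints are saved per epoch, a rough conversion helps find which epoch to listen to. This sketch assumes one step is one batch and that the dataset is cut into fixed-length segments; the trainer may bucket or drop samples differently, and the 3 s segment length and batch size 4 used here are only example values.

```python
import math


def steps_per_epoch(dataset_seconds, segment_seconds, batch_size):
    """Approximate number of training steps in one epoch."""
    segments = math.ceil(dataset_seconds / segment_seconds)
    return math.ceil(segments / batch_size)


def step_to_epoch(step, dataset_seconds, segment_seconds=3.0, batch_size=4):
    """Approximate epoch that contains a given global step."""
    per_epoch = steps_per_epoch(dataset_seconds, segment_seconds, batch_size)
    return math.ceil(step / per_epoch)
```

Under those assumptions a 22 minute dataset gives 110 steps per epoch, so step 20000 lands around epoch 182. Treat that as a starting point for listening, not an exact answer.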
yay im right
how long is it supposed to take when i preprocess my dataset on applio locally?
how long is the dataset?
6 mins 30
what's your hardware?
im still on gtx
ah okay
1660
It shouldn't take more than 5 minutes
is that enough to train on 1660?
im on 32gb ram
not ram, vram
it's 6gb right?
idk if this will help you but if you draw straight lines it can kinda help you spot where it starts to overtrain
yeah 6gb
6gb is not recommended, but it should work fine with checkpointing
and feature extraction shouldn't take too much time
owh nice advice, lyery was right, it's on 20k
cus ive been waiting for ages to preprocess the dataset
its been 15 mins
usually the graph stays in a flat line and then goes up
thats what im thinking
How many cpu cores did you set?
6
set it to 1
maybe you've run out of vram
how do i check
It starts using shared vram
ohhh
And becomes too slow
do i need to restart the preprocess when setting it to 1
check your gpu vram usage, if it's almost full then it's using your ram
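One way to check this on an NVIDIA card is to ask `nvidia-smi` for used and total memory. A sketch, assuming `nvidia-smi` is on the PATH; the 0.9 threshold for "almost full" is an arbitrary choice, and the helper names are hypothetical.

```python
import subprocess


def parse_vram(line: str) -> tuple:
    """Parse one 'used, total' line (in MiB) from nvidia-smi CSV output."""
    used, total = (int(x) for x in line.split(","))
    return used, total


def vram_usage() -> list:
    """Return (used_mib, total_mib) for each GPU reported by nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [parse_vram(l) for l in out.splitlines() if l.strip()]


def nearly_full(used: int, total: int, threshold: float = 0.9) -> bool:
    """True when used memory is at or above `threshold` of the total."""
    return used / total >= threshold
```

If `nearly_full` is True while preprocessing or training crawls, the GPU is probably spilling into shared memory.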
or just let it keep going
yeah could be out of vram like me who tried batch size 8 using checkpointing and nothing happens for 7 mins
restart the whole thing and use only 1 cpu core
nop
local?
put them in a folder
yeah ok
if you on local, no
If it is just a single file, use dataset creator hehe
you can also put the folder in assets > datasets
yeah
i did
Guys how to use kaggle?? Im newbie
i already sectioned out the dataset on audacity
do i keep audio cutting and effects on
uhh
merge all audio files into a big one
and use this in audacity
after that use simple mode slicing
with these settings
mine dont got this
ah you're using the compiled version
yeah
download main branch instead
the other version wasnt working
Download main branch file
can u send the link?
goat
Click code, then Download ZIP
download the zip, place it in a non onedrive location
is this a must? because i always cut my audio in fl studio before putting it to rvc
after that, use run-install.bat
don't close the cmd until everything is installed
when it's done, you can then use run-applio.bat
yes
slicing does impact training a lot
so that's the problem why i can't get lower value
okay ill let u know when its done
mainline auto slicer tries to slice the dataset into 3s chunks too but most of the time fails
3 second slicing ❤️
are u saying on local training its better to keep the file as 6 mins instead of slicing it?
no
look
first have one audio file
if you have multiple, merge them into one
mine
after everything is merged, use this
No, just put the single file on applio, and then slice it with simple slicing
then use simple slicing with the default settings
ohh okay
it's very important that the silence is truncated with the settings i sent
If you keep the silence, training takes longer because the model tries to learn the silence (not good for the gpu's health)
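For a sense of what truncating silence does, here is a bare-bones sketch that shortens long quiet runs in a list of samples. It is not Audacity's Truncate Silence implementation, and `threshold`, `min_silence`, and `keep` are placeholder values, not the settings shared above.

```python
def truncate_silence(samples, rate, threshold=500, min_silence=0.5, keep=0.1):
    """Shorten every quiet run of at least `min_silence` seconds to `keep` seconds.

    `samples` is a list of integer PCM values, `rate` is samples per second.
    """
    min_len = int(min_silence * rate)
    keep_len = int(keep * rate)
    out = []
    run = []  # current run of quiet samples
    for s in samples:
        if abs(s) < threshold:
            run.append(s)
            continue
        if len(run) >= min_len:
            run = run[:keep_len]
        out.extend(run)
        run = []
        out.append(s)
    # Handle a quiet run at the very end of the audio.
    if len(run) >= min_len:
        run = run[:keep_len]
    out.extend(run)
    return out
```

Short pauses below `min_silence` are left alone, so natural gaps between words survive.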
is there a way to automatically merge this audio? it's a lot
merge them in audacity
Don't you have the og file?
I ALREADY DELETED IT

all i have is sliced audio
just load everything into audacity
ah okay
select all tracks and use this
align end to end
export it as a single file
then open it, run truncate silence
and train that
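If there are too many clips to drag into Audacity, the end-to-end merge can also be scripted. A sketch using Python's standard `wave` module, assuming every clip is a PCM WAV with the same channel count, sample width, and sample rate; `merge_wavs` is a hypothetical name, and silence truncation would still be a separate step afterwards.

```python
import wave


def merge_wavs(paths, out_path):
    """Concatenate PCM WAV files end to end; return the total length in seconds."""
    params = None
    chunks = []
    for p in paths:
        with wave.open(str(p), "rb") as w:
            cur = (w.getnchannels(), w.getsampwidth(), w.getframerate())
            if params is None:
                params = cur
            elif cur != params:
                raise ValueError(f"{p} has a different format: {cur} vs {params}")
            chunks.append(w.readframes(w.getnframes()))
    if params is None:
        raise ValueError("no input files")
    channels, width, rate = params
    with wave.open(str(out_path), "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(width)
        out.setframerate(rate)
        for chunk in chunks:
            out.writeframes(chunk)
    total_frames = sum(len(c) for c in chunks) // (channels * width)
    return total_frames / rate
```

Pass the paths in sorted order, for example `sorted(Path("sliced").glob("*.wav"))`, so the clips join in a predictable sequence.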
pretty sure my pc would straight-up roast me if it could talk, all this non-stop training
what should i set the silent training files to?
Leave it at default
Pls help me with kaggle
1000 epochs for my file should be good?
6.5 minutes dataset right?
yeah
set it at 300-350
What should i do after this?
and do i need to turn checkpoint on
install everything
Where?
tf i've just added to my dataset😭
Last update: Jan 13, 2025
can rvc read that?
just follow the guide
i dont know, but I recommend you convert it to .wav
@edgy tangle
Im stuck at 2 B
like 5 or 1, so you can find the best epoch precisely
alright bet
you can only find the best epoch saving every 1
just scroll down the notebook sidebar
every 5 or more it stops being precise and you're just using random epochs

increase batch size
batch size >8
alright thanks
but for 10-15 minutes dataset its not "recommended"
a problem with increasing the batch size is the lack of generalization in the model
so keep that in mind
is there a specific batch size youd suggest?
4
try 8
but if you want dataset fidelity, then 8
noobies recommend 4 for everything below 30 mins
at the cost of worse generalization
After that?
this can be annoying
click session options
okay thanks
also use an index value of 0.5
the voice will sound even closer to the dataset
Then run?
is there any other setting i should play around with when training the dataset?
Done
im saving each epoch as 1
use batch size 4 for your small dataset
do you already have the ngrok token?
😭 i dont think adding 33 more minutes to the set helped my model
still need to mess with settings to see if i can get a better model
No
i was having the same problem as @burnt saffron cus im using the same vocals as him
uhh the dataset has room reverb? i know that messes up things
create a ngrok account
use batch size 8 then
aight bet
but your model will have bad generalization
no, it doesnt have room reverb
have you tried using a bigger batch size?
what u mean
so use batch size 8, but you'll get worse generalization if your dataset is too short
the model will randomly glitch
when trying to inference things too different from the set
like it will break at trying to create its own voice
oh i get u
good luck 🥹
would batch size 6 maybe help work around the problem?
lower batch sizes are able to do more stuff but usually they start to sound different
and sound horrible at inference when you use a voice too different from your dataset
yep
okay thank u
@analog obsidian is this the glitch you're talking about?
It just sounds with robotic sibilances
glitching as the voice literally breaking
oh i see
lower batch sizes generalize better but sometimes the voice tends to sound a bit different
Done
not just sounding robotic
my bum itchy
this is the last model it made the voice higher than the actual dataset voice
bigger batch sizes are more accurate to the dataset, but the model loses a lot of generalization
After this?
tf am i supposed to do?😭
?
get ai to scratch it for me
rvc is not just "perfect"
Is this the token?
no
smh it needs to be perfect 😠
PLZ 
boy i’m not reading all ts
I donno where is the toKen
ive spent like a week just training one voice and i cant get it good enough 😡
in the sidebar there is a section called my token or something like it
i trained my dog to sit down so
yeah i get u
probably sounds like the pretrain
@latent cypress @edgy tangle @twilit forge bs16 = batch size 16
bs4 = batch size 4
My authtoken?
yes
boy ts big ahh sa
this is generalization
Yeah, it is worse on batch size 16, because its too different
why sound so broken on 16? the dataset is not too long?
ohhhh
batch size 16 was too high for that set
the dataset doesn't get that note
yo how do i remove vocals
ah
too high for dataset
isaak this is a real problem
mvsep
uvr
can i send you a link and you do it
with bs4 it generalizes better and try to fake the voice to reach that note
Just copy the "your authtoken" ?
in the page there is a token
i got u 😈
just copy it and paste it on kaggle
blud
NO
what
I dont see this only command line & config file
i wanted the nair video but it didn’t work
bro
💀
i thought it was nair video
the picture i sent to you its the kaggle page
oh i forgot to change the lr back to 1e-4 
time to train this again but correctly
just copy authtoken and then paste it on kaggle
guys who is this?
i tried using f5-tts with german.
downloaded "marduk-ra/F5-TTS-German" from huggingface.
set the path to the files in the f5-tts webUI.
but the output audio sounds like it is spoken in reverse or maybe in chinese.
any idea what i did wrong? 🙏
wtf💀💀💀
thats me!
i see, this discord is more for kids 😂
my fault 🤦‍♂️
boy
no way this mf just left the server wtf
we made him leave boys 😎🥶
Where
Where to paste
read it
What is wrong with this?
you clicked start?
You need to install first
or it will not work
Install what
