#✨│ai-help
thanks man works great, do you have any substitute for uvr as well?
Which batch size you used and which was your dataset's length?
Yeah, my friend Eddy did make an UVR Colab.
thanks
You're welcome bud.
- BS roformer 1053 in UVR beta (drum+bass, other)
- demucs4_ft (drum, bass, vocals, other)
- some good models in x-minus (https://uvronline.app/) but it needs pro subscription
what are the best vocal remove app
MVSEP.
does it got limit like 2 times a day or...?
Nope, all u gotta do is register.
alr thanks a lot
how do i use bs roformer
oh i think i found an exe
how long is it gonna be stuck on 14%?
i forgot to select gpu
nvm
great.
note that 1296 and 1297 are usual voc-inst models, and 1053 is the drum+bass model as I said before
it was for vocals and instrumentals
ohh
me dumb
completely ignored it by accident
but anyway you should do separate vocals & inst first, then use 1053 on the inst stem
it doesnt have any vocals
why is it doing this
sometimes it cuts in and out
what are you using
epochs really vary, but for batch size, I'd say go high, 12+
lower will kinda be unstable really
idk how much VRAM you have so
i use mangio rvc is it still good
if you're using T4, go for 16 batch
yesn't
it might be more unstable and might not do well... but eh, it's all testing... if you wanna test that it's all up to you
yeah i guess
go for 250 (or 500 if it's fast enough) and check for overtraining... if not, keep going
if the training is fast
cause it's a 1 hour dataset 😭 can take a while

let's hope it ain't the dataset
does those thing affect the ai sound
@pastel oak My gpu is a 11th Gen Intel(R) Core(TM) i3-1125G4 @ 2.00GHz
you're welcome :)
yes
that's your cpu, btw
@proper shale Intel(R) UHD graphics
hi guys, in model training the number decreased to 6.8 but now nothing changes, can u help?
yeah, that isn't gonna run okada
what number are you referring to? give us more details 
Oh ok ,thanks '^^
You're welcome... sad it didn't work out for ya
Now I'm training my model, when I press the button, loading starts, but it doesn't reach the end and disappears
btw previously the final number decreased, but now it doesn’t
Check your command prompt 
I don't see any errors there :< or do I need something else?
like that
that's not a good sign
how did feature extraction go
and preprocessing as well
i see "ffmpeg error" in the output information of the 2a step, is this the problem? before that everything was fine..
okayokay
how's your path to dataset look like?
why did the ai sound like it got throat problems any tips
How long does it take RVC-GUI to unpack from zip? Is it a very long time process?
"C:\Users\Grish\Downloads\RVC-beta\RVC-beta0717\ihiwk" this ig?
Are you using RMVPE
if youre a guy and using a female model, you should increase the tune to maybe 10-12 and try around. else idk what you mean
i use mangio
Make sure to use RMVPE as the pitch extraction method/f0
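On the tune value suggested above: it is counted in semitones, and in equal temperament each semitone multiplies the pitch by 2^(1/12), so +12 is one octave up. A minimal sketch of what those numbers mean in frequency terms (the function name is made up for illustration, and RVC may apply the shift differently internally):

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift of `semitones` (equal temperament)."""
    return 2.0 ** (semitones / 12.0)


# Show how much a few typical tune values move the pitch.
for shift in (10, 12, -12):
    print(f"{shift:+d} semitones -> x{semitone_ratio(shift):.3f}")
```

So a male voice driving a female model at +12 is being shifted up by a full octave, which is why values around 10-12 are a reasonable starting range to try.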
anybody know where i can find a GOOD google colab or just online version of rvc?
i cant install rvc itself
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- AICoverGen-WebUI, modded by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], credits to Eddy, Hina and Gdr for translating and fixing Google Colab
- Ilaria RVC, by thestingerx Google Colab
- EasyGUI, by rejects Google Colab
- UVR5 NO UI for Google Colab, by Eddy Google Colab
anyone have crazy lag when using rvc with vb audio virtual cable?
it's super garbled and hard to hear but sounds fine if I'm listening to it using my headphones
but if i put it through the cable input it sounds terrible
thank you 🙏
Use VAC Lite instead
This interaction has expired, use the command /guides realtime if you wish to see it again.
but pls make sure you actually got a gpu that is not like 10+ years old
Is it an app?
a setting
I wanna use UVR locally but I am iffy about space, power efficiency
Can you calm my nerves about it
hi, i would like an up-to-date video tutorial on how to use the ai voices
last time i did anything with them was 2 years ago
i forgot everything
last time i made an ai cover was when the google colab thing still existed
stfu
Have the audio file of your song ready, & let's extract the vocals from it with an audio isolation software.
thanks
now i will be able to make a heavy x medic fanfic
honestly id just re-export your dataset in audacity as a wav
i think the audio is screwing up
I wouldn't suggest you watching videos, since these tend to be outdated.
Instead read the docs.
already did and i successfully made a voice cover
soo how do u cough without the ai tweaking
Don’t get sick
LMAOO
I don't know if it's intentional or not but Ilaria RVC isn't there anymore
my download custom pretrain thing in the applio colab keeps saying
HTTPError Traceback (most recent call last)
<ipython-input-6-962fec43dc94> in <cell line: 44>()
44 for url in pretrained_urls:
45 filename = os.path.join(output_directory, os.path.basename(url))
---> 46 urllib.request.urlretrieve(url, filename)
6 frames
/usr/lib/python3.10/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
641 class HTTPDefaultErrorHandler(BaseHandler):
642 def http_error_default(self, req, fp, code, msg, hdrs):
--> 643 raise HTTPError(req.full_url, code, msg, hdrs, fp)
644
645 class HTTPRedirectHandler(BaseHandler):
HTTPError: HTTP Error 401: Unauthorized
how do i fix it
change the custom pretrained cell code
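For context on that traceback: HTTP 401 means the server refused the download because it wants credentials, so `urlretrieve` raises before saving anything. A hedged sketch of how a download cell could skip refused links instead of crashing, and optionally send a token. The `token` parameter and both function names are hypothetical, and whether the host accepts a bearer token at all is an assumption:

```python
import os
import urllib.error
import urllib.request
from typing import Optional
from urllib.parse import urlparse


def build_request(url: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request, attaching a bearer token only if one is given."""
    req = urllib.request.Request(url)
    if token:
        # Assumption: the host accepts "Authorization: Bearer <token>".
        req.add_header("Authorization", f"Bearer {token}")
    return req


def download(url: str, output_directory: str, token: Optional[str] = None) -> Optional[str]:
    """Download one file; return its path, or None if the server refused it."""
    filename = os.path.join(output_directory, os.path.basename(urlparse(url).path))
    try:
        with urllib.request.urlopen(build_request(url, token)) as resp, \
                open(filename, "wb") as out:
            # Copy in 1 MiB chunks so large pretrained files do not sit in memory.
            while chunk := resp.read(1 << 20):
                out.write(chunk)
    except urllib.error.HTTPError as err:
        # 401/403 usually mean the link is private, gated, or no longer public.
        print(f"skipped {url}: HTTP {err.code} {err.reason}")
        return None
    return filename
```

If every URL in the list comes back 401, the links themselves are probably dead or private, and swapping in working URLs in the cell is the actual fix.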
Table Of Contents Introduction Model Loader (Download & Upload) Inference (use RVC AI Voice Models) Settings (Inference) Ilaria TTS Introduction Ilaria RVC Zero, is an RVC (Retrieval-based Voice Conversion) Fork made by Ilaria & mikus, running only on Hugging Face Spaces, it’s called this way...
porcodio <3
How big of a dataset is too big? Right now, I am sifting through a 4 hour Twitch stream. (don't judge my source it is my best option rn)
1 hour (59.9 minutes), more than that and it will not start training
sometimes i just trim it down to 55 minutes
you would be lucky to have 30 minutes because most of the lines get removed during cleaning. Depends on your twitch vod source too
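To see how much audio is actually left after cleaning, the clips can be measured with Python's built-in `wave` module. A small sketch, assuming the dataset is a flat folder of PCM `.wav` files (the function names are illustrative, and `wave` may reject non-PCM files such as 32-bit float exports):

```python
import wave
from pathlib import Path


def wav_seconds(path) -> float:
    """Duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def dataset_minutes(folder) -> float:
    """Total duration, in minutes, of every .wav directly inside `folder`."""
    return sum(wav_seconds(p) for p in Path(folder).glob("*.wav")) / 60.0
```

Calling `dataset_minutes("path/to/dataset")` on the cleaned folder gives a number to compare against the lengths discussed here.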
YouTube....
If you're curious, I am creating a voice model of Gabriel from Ultrakill. I am sourcing from the VA's Ultrakill Twitch VOD from YT where he speaks like the character almost the entire time. I would use the original voice lines from the game, but the echo is messing with the quality. Assuming all of the voice models of Gabriel uses the game audio files, it is precisely why they all sound like shit. It's the echo and reverb that are messing with the output. The VOD, however, only uses reverb. I know how to remove reverb. I just can't remove echo. I've decided to make a hopefully higher quality voice model of him using this exact source.
We do have a tool called Dialogue Isolate for echo. I'm assuming you're using RX11 already? I'm doing a twitch vod/youtube myself and I can't bring myself to review the whole entire audio so it's split into parts. At least 1 hour and 45 minutes -> currently being brought down to whatever length because of unremovable mic noises
oh yeah, that echo is easily removable if you use that. I looked at one of Gabriel's video
I see. I'm just gonna go through the entire thing and knock out similar phrases.
And no, I am not familiar with RX11. I am using the deverb plugin for FLS.
I'll just plug this https://rentry.co/RVC-dataset-RX11#modules-overview. We have entire guide if you exhausted all your options on fl studios
It's a "Paid" DAW like audacity. It's been used by model masters for a long time and there's currently no alternative workflow thats being used to make models
it just gives you plugins, which you can also use on FL
By "paid," you mean it's got a free trial or something?
free trial? 
does anyone know why voice ai keeps crashing everytime i try to speak with my voice changer?
ok your/mother
you can bypass minibatch k-means by removing the code part if you wish
there's no noticeable diminishing return until about 2 hours ig, besides the amount of time and effort you gotta have, but 10-30 mins (without silences) is usually enough to have variations
when i download a girl voice model it doesnt sound good at all
Are you using RMVPE
are you sure you aren’t using a troll model
Make sure you're using a good tune
I can retrain a voice?
with dataset yes
I only have the .pth and .index file
guys, while i`m downloading model at Google Colab it says JSONDecodeError no matter what model i use. What am i supposed to do to fix it?
then no
probably outdated colab
use -colab in this channel for working ones
Ok thx
thx i`ll try
any way to apply voice to a recording?
its called inferencing, check the last pinned message here and open Ilaria Zero + the guide
ty..i'll check
Guys, does anyone know why the breathing in some voice models sounds like a robot?
- lack of breathing samples in set / damaged breathing due to denoise
- Crappy models / over or undertrained models
alright, thanks!
I trained model for 500 epochs and found out that 1 epoch = 7 steps. 3927 / 7 = 561. How?
a form of rounding.
provided you use a manual sync approach
tho, that should not happen
never in my training career encountered such issues
maybe RVC Disconnected bug
Cause you did sync it on your own, correct?
or is disconnected using " auto syncing " bullshit
most people don't
and no you can insert your own value of steps per epoch to sync manually
but it's Mangio RVC and I've heard it ain't accurate when it comes to tensorboard logs
if it's Mangio then ye, sadly.
But it's not because of the mangio itself but rather related to what kalo added, auto syncing. it is hardcoded.
so any " manual inputs " are overwritten
@red kayak Hmmm, I think Imma update my fork - specifically, move from mainline base to fumiama's base
noticed some changes here n there, esp for training pipeline itself
after I'm done, will need some testers. you up for it maybe? (will probs add some different gradient clipping in case of fp16, hence why I'd need testers as I only have so many sets at my disposal)
Hello, I want to use V1's voice from ultrakill and I don't know how to use it, can somene explain me how to sue this AI concepts?
Time to use mixed precision and/or bfloat16
see the model checkpoint file names with suffix _eXXX_sXXX (epoch and step, respectively), or you can see it in the checkpoint saving process, also too few steps per epoch may be "inaccurate"
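The `_eXXX_sXXX` suffix can be read back with a small regex to check how many steps each epoch really took. A sketch under the assumption that the suffix sits right before the file extension; the example file name in the note below is hypothetical:

```python
import re
from typing import Optional, Tuple

# Matches a trailing "_e<epoch>_s<step>" with an optional extension after it.
SUFFIX = re.compile(r"_e(\d+)_s(\d+)(?:\.\w+)?$")


def epoch_and_step(filename: str) -> Optional[Tuple[int, int]]:
    """Return (epoch, step) parsed from a checkpoint name, or None if absent."""
    m = SUFFIX.search(filename)
    return (int(m.group(1)), int(m.group(2))) if m else None
```

For a name like `mymodel_e561_s3927.pth` this gives epoch 561 and step 3927, so dividing step by epoch yields 7 steps per epoch, which is the same arithmetic as in the question.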
btw you should set 0.987 smoothing
bfloat16 already tested. Mixed precision is already a go-to for fp16 training (AMP after all)
bfloat might be a lil more stable than fp16 cause better hparam compatibility but sacrifices a lil bit of performance (model quality wise), but it's too small to notice for casual users and there aren't any gains
fp16 has its own fair share of problems: exploding gradients, NaNs at the start (so, an awful training initialization stage), and it's more prone to single-mode collapses
fp32 is obviously the best choice for quality models, sadly it's naturally the slowest (unless.. well, you can use tf32, but let's be real, not a lot of people here can use it)
sue??? lol
uhh barely any1 uses v1 models anymore but it should still be useable if i'm not wrong
-docs
- How to use RVC Mainline Colab by Cauthess
- Full AI Voice Model Training Guide (local) by Christopher Villanueva
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
And can you help me figure out where or how to use these models? please, I just- dont know lmao
there's tuts on how to use the model in the docs
yes d
do
thats how ull get the latest stuff
rvc boss isnt interested in updated rvc really
:p
thanks a lot
so let the community do its thing
is anyone else having problems training with applio?
Is the RVC v2 architecture multilingual/language neutral?
You can train any language and its gonna work just fine
Thanks!
yup, heard the stuff and read everything
Just concerned a lil about my own integrations and 'older' ported mechanics but we'll see.
but guess people do be taking inspirations from me lol
it shouldnt be that hard
you see, a lot of stuff I had in was slightly rewritten mangio code
to have it compatible with new structures
fumiama further did split a lot of stuff even further and changed some integrations but ye, should be fine
just wonder why they disabled the vhq sox
hm
Fumiama saving rvc development
nonsense
what is that
sox resampler
ahh
set to very high quality
it's commented out for whatever reason
I personally do use it
other than that, envelopes
so most of stuff Imma keep in
but ye, can't wait to test everything out
sox vhq is the best algo you can get really
is fp32 training possible in fumiama rvc?
it's possible in every rvc
yay
if u change the .json file u are going to use it'll always work
and then also the config.json too
ye, cause it's set once per machine
from then onwards, only " inuse " ones are used
so if you change it on your own, that's what it is kept
oh
I still am not sure what was causing optimizers with adaptive mechanics to not work properly
so hopefully it's somewhat fixed in fumiama's revision
Really want to utilize Ranger ngl
or AdamW with a lil lookahead support n gradient clipping for fp16
those things basically help with stability u say?
yea, gradients won't explode as they're clipped
and so, NaNs also
and that's actually a huge deal
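For anyone following along, this is what clipping by global norm does, written in plain Python so it runs anywhere. It is only an illustration of the idea, not code from any RVC fork. In PyTorch the usual call is `torch.nn.utils.clip_grad_norm_`, run after `scaler.unscale_(optimizer)` when training with AMP:

```python
import math
from typing import List


def clip_by_global_norm(grads: List[float], max_norm: float) -> List[float]:
    """Scale all gradients down together so their L2 norm is at most max_norm."""
    if any(math.isnan(g) or math.isinf(g) for g in grads):
        # Clipping cannot rescue a non-finite gradient; the step should be skipped.
        raise ValueError("non-finite gradient")
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)  # already small enough, leave untouched
    scale = max_norm / norm
    return [g * scale for g in grads]
```

The direction of the update is kept and only its length is capped, which is what keeps one huge batch from blowing the weights up.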
mm yeah
cause sometimes you get NaN during epoch 0 / 1
gradients also wont explode?
that on its own can happen in fp32 too
so that's just a matter of param updates
or re-sitting
yeah but will the optimizers help
that's a huge "?"
like with everything
Lookahead is meant to be looking ahead with small weights
compared to normal ones
and sorta guide the convergence
but then, yea
a lot of adaptive stuff I've tried before
won't cooperate
which is a shame, cause even simple stuff as warmup or lr warmup was out of options
ig a new gan can help ><
I think it's more bound to how the schedulers were written in there
huge mess of a code ngl
a lot of stuff being " hardcoded "
and taken directly from hparams ( from file
rather than dynamically accessed n stored
iirc, I saw some changes related to that so, having big hopes
ye its meant for the community to contribute too
oh, wait
you meant the optimizers?
or what repo
repo
yea but, what repo
funima
oh, that
mhm
yuppie
mh mh
@brittle wing imma make when im bored enough
it happens cuz of the learning decay, it has happened once to me b4
😎 👌🏿
basically the length it will use to check for pitch variation. lower means it'll get more but it might crack, higher will get less but will be faster and less prone to cracking
if you're using rmvpe it doesn't change anything
learning rate decay ( scheduling ) does not affect steps whatsoever
steps are directly related to the batch, dataset and so the data loader
it doesnt?
the more u know
steps are directly related to batches and amount of data really
yup
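The relationship described above can be written down directly. A rough sketch assuming a plain data loader that cuts the dataset into fixed-size batches; RVC's own sampler may group segments differently, so real counts can be off from this:

```python
import math


def steps_per_epoch(num_segments: int, batch_size: int, drop_last: bool = False) -> int:
    """Optimizer steps in one pass over the data."""
    if drop_last:
        return num_segments // batch_size  # the incomplete final batch is discarded
    return math.ceil(num_segments / batch_size)


def total_steps(epochs: int, num_segments: int, batch_size: int) -> int:
    """Steps after `epochs` full passes, with the final partial batch kept."""
    return epochs * steps_per_epoch(num_segments, batch_size)
```

With 100 segments and batch size 16 this gives 7 steps per epoch and 3500 steps after 500 epochs. Nothing in the formula depends on the learning rate schedule.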
but again, still can't get more " into it " cause the data loader works internally so
i'll ask gpt for more info abt the parameters then
Good idea
-overtraining
You can detect if a model is overtraining if the TensorBoard graph starts to rise and never comes back down. An overtrained model will sound robotic, muffled, and won't be able to articulate words well.
Check these resources to learn more about this topic
- Epochs & TensorBoard from AI HUB Docs
- TensorBoard from 🍏 Applio Docs
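The "rises and never comes back down" check can be roughed out in code by smoothing the loss values and seeing how long ago the minimum was. This is a sketch only: the smoothing is a debiased exponential moving average in the spirit of TensorBoard's slider (0.987 is the value suggested earlier in the channel), and the `patience` threshold is an arbitrary assumption:

```python
from typing import List


def smooth(values: List[float], weight: float = 0.987) -> List[float]:
    """Debiased exponential moving average; requires 0 <= weight < 1."""
    out, last = [], 0.0
    for i, v in enumerate(values, start=1):
        last = last * weight + (1.0 - weight) * v
        out.append(last / (1.0 - weight ** i))  # correct the bias toward 0 at the start
    return out


def rising_since_minimum(values: List[float], weight: float = 0.987,
                         patience: int = 50) -> bool:
    """True if the smoothed curve bottomed out at least `patience` points ago
    and the latest smoothed value sits above that minimum."""
    s = smooth(values, weight)
    if not s:
        return False
    lowest = min(range(len(s)), key=s.__getitem__)
    return len(s) - 1 - lowest >= patience and s[-1] > s[lowest]
```

A `True` here is a hint to go listen to the checkpoint saved around the minimum. The graph and your ears still have the final say.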
Does anyone know any tips to make the ai sound less glitchy and make it not voice break a lot?
-uvr
This guide shows how to separate the vocals and the instrumental of songs using UVR locally
Link: https://docs.aihub.wtf/vocal-cloning-guide/vocal-and-music-separation/uvr
Link: https://docs.google.com/document/d/1M3rEj5RPSE60f17RdM9MaIKKkS2bLrj-dwwuCUOdRaY/edit?usp=drivesdk
what happened to the uvr guide
how to install rvc models? theres many files
i used this
Aliyah Ortega (Irmã da Jenna Ortega) (Influencer)
https://huggingface.co/Pablao0948/Aliyah_Ortega_RMVPE/resolve/main/Aliyah Ortega.zip
Author: walker0384_28183 (@Pachwco)
Created: November 5, 2023 2:52 AM
Type: RVC
Algorithm: Rmvpe
Epochs: 300
where do you want to use it on? because there's a lot of tools tbh
id just use weights.gg if you don't wanna handle more complex stuff, involves no installing shit and is simple af to use
well i downloaded https://github.com/w-okada/voice-changer/, im not sure i came from yt vid but it js said ab index and pth files
oh i see
just unzip and use the .index and .pth files
ignore the others
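If you'd rather not dig through the archive by hand, the same step can be scripted with the standard library. A minimal sketch; the function name and folder layout are illustrative, and it flattens any folders inside the zip:

```python
import zipfile
from pathlib import Path
from typing import List

WANTED = {".pth", ".index"}  # the only files the voice changer needs


def extract_model_files(zip_path, dest_dir) -> List[Path]:
    """Extract only .pth / .index entries from a model zip into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            name = Path(info.filename).name  # drop any folder prefix inside the zip
            if info.is_dir() or Path(name).suffix.lower() not in WANTED:
                continue
            target = dest / name
            target.write_bytes(zf.read(info))
            extracted.append(target)
    return extracted
```

If the returned list only contains a `.pth`, the model simply ships without an index file.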
theres just pth file no index one tho, i was gonna send a pic but it seems like i aint allowed here
just use the pth then
dw
alright, thank you so much!
index is optional after all
you're welcome!
Sorry i got another question, how does widgets.gg work or how to use it? specifically with discord
weights gg is for covers n stuff. they host basically all our models and a bit more
to use okada anywhere else, you need a virtual audio cable
check https://rentry.co/VoiceChangerGuide virtual audio cable section
EOL - No further Updates
Github - Blanc-dot
Discord - Blanc_dot
Despite being end of life, most if not all information has not really changed, so should be very accurate until actual new stuff comes out.
Other Links
Antasma's Local Error Fixes
Antasma's Colab guide
Sushi's useful Links - You need...
alright tysm
guys, what can i do, if some sounds are kind of looping? i already increased chunk, still have this problem
im using radeon rx5700xt graphic card
-audio
- 🆕 Creating Datasets for RVC using iZotope RX11, by Cauthess
- Perfecting Audio Isolation on Low-End Rigs, by Litsa The Dancer and Faze Masta
- Gathering and Isolating Audio, by SCRFilms :snowflake:
- Vocal Mixing Tutorial, by Roomie
Audio Separation/Isolation
Hey guys, anyone knows if RVC will run on 3.10.14 since im getting this message even tho i always used 3.10.12 and never had that before or just straight up ignored it lol.
2024-07-04 17:45:28,316:WARNING - WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.3.0+cu121 with CUDA 1201 (you have 2.0.1+cu118)
Python 3.10.14 (you have 3.10.12)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Ye dont think it runs with 3.10.14
should use 3.10.12
-hf
- Ilaria RVC Zero, by thestingerx Huggingface Spaces
- RVC⚡ZERO, by r3gm Huggingface Spaces
- Applio, by IA Hispano Huggingface Spaces
- AICoverGen, by r3gm Huggingface Spaces
- Advanced RVC Inference, by r3gm Huggingface Spaces
- RVC v2 Huggingface version, by Clebersla Huggingface Spaces
thank ser
oh, sorry I just needed the uvr guide
Oh I thought u were replying to me
So sorry
does anyone know how to add downloaded voices into rvc?
local or colab
local
thank you
hi
what is the recommended amount of epochs to train for a audio sample that is only like a minute long?
where can i find some audios that i can just dump into applio to test them.
varies a lot, but I'd say 500/1000 epochs on 4 batch is where you'll get good results probably... ofc check the graphs for overtraining
just generate stuff on ElevenLabs or test with memes... idk
oh and Applio has their TTS as well .... if that works
Can i set mvsep to not to have long ass file names
i have a good voice model, but when i try it out, it sounds like shit...
what settings do i need?
in what aspect specifically
making songs
pitch? pronunciation?
pronunciation
mess with "Search feature ratio" in case the accent ain't similar to the person
if that ain't enough it's probably the audio you're using
ok thanks
you're welcome, lmk how that goes
it sounds a bit better, but still not so good. It's probably because of my audio input file. I extract audio from songs and try to replace the vocals with ai vocals, but I think the vocals have a bad quality
Make sure you use wavs instead of MP3s
whats the source of the flac
Yes, just make sure you are getting them from a source where it’s lossless
YouTube will be lossy
It should be okay if I just feed the RVC training Colab an hour long wav file, right? Or should I break it up?
so im trying to download my ai covers but when I click download it redirects me to a cloudflare 404 page does anyone know how to fix this
What tool do you use
weights ai cover tool
The site? Mmm @willow lichen have any idea?
-uvr
yes the website but my first few times I used the discord bot to make the covers
Mmm, the site seems working fine, i pinged the creator he may knows more
You mean making an rvc voice model?
Check https://docs.aihub.wtf/essentials/how-to-make-voice-models/, use the cloud ways such as rvcdisconnected or applio
Does it work on mobile
The cloud ways yes, its google colab, you are using a google pc gpu to run it so not locally on your phone
Btw can i sing when i train my own voice
Or have a complete high quality
Of a conversation
Wdym with sing? Like using it later for ai covers?
I think you mean dataset, yea in the rvc model you can add CLEANED singing wav files
Be sure that the audio is cleaned and only vocals
It gave me pc orders
Which im mobile
Isnt there a mobile version of this
Pc orders? Where? You sure u are using the cloud google colabs?
The google colabs can be used completely fine on phones
In Datasets
"Press CTRL + A to select the whole audio."
Could you send a screenshot of where you see this? Or a link of what part of the guide u see this
I am training a model of my voice, and was just wondering what the best training settings I can use, mainly the amount of epochs and batch size... i got a lot of audio 7 files in approx 2hrs and 50mins in wav.
How do i do this
Ob mobile
On mobile
Hello i was wondering if all the voice models were safe or only the popular ones
I would send an image but it won't let me
Ho how do i make a zip file
In mobile
??
How do i clean my database with using phone apps
I dont have pc
@low shard sorry to ping you but I made a new cover and it let me download it pls don't ban me
You can make a folder on your phone and zip it, you can just go to whatever storage manager your phone has built in, and for example for mine i hold the folder and click the 3 dots to compress it to zip
I mean no risk
ok nice
You can use any model you want, it's fine.
Do you mean cleaning datasets in mobile?
Is there any app where it can clean my vocals
ok cause I wanted to try some models in my native language but nobody tested them it seems so idk if it was safe
You can use UVR on PC.
Dw abt ban, also i meant to say that's a problem of the weights.gg site so i dont really know how to help you with that sorry, that's why i @ the creator of it
Which is your native language?
Im in mobile
french 🥖
In that case.. Emm, i'm not sure. Maybe there are some french models you can use.
Not app, sites, theres mvsep, cloud uvr https://docs.aihub.wtf/rvc/resources/vocal-isolation/ and also uvr ui #📰│dev-updates message
Last update: Feb 29, 2024
I mean as long as there's no virus it's good for me
Can uvr make my audio without any disturbing noises only the vocal of my own noise
Well, as i've said no model should contain viruses.
its ok maybe it was just an error or the file got corrupted
but it lets me download new covers
You're welcome bud.
All models are safe, for example on hugging face they are checked to see if there's anything bad
Weird then
bye
How do i do the Noise Gate step
Which requires pc
how do I continue training the same model in the same colab I started training it in if I got the Error
Connection errored out error
use Renegate free vst plugin for that purpose
Yo your here for a moment
huh?
anyone have any tips or tricks for getting an AI voice to work well with natural inflections and pitch changes while talking (realtime), assuming thats even possible
Would an rx 580 be fine for running rvc?
does anyone know how to deal with this error in paperspace? "root@nzdhhlx08h:/notebooks/Mangio-RVC-Fork# make run-ui
python infer-web.py --paperspace --pycmd python
Traceback (most recent call last):
File "/notebooks/Mangio-RVC-Fork/infer-web.py", line 26, in <module>
import faiss
ModuleNotFoundError: No module named 'faiss'
make: *** [Makefile:59: run-ui] Error 1
root@nzdhhlx08h:/notebooks/Mangio-RVC-Fork# "
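That traceback just means the `faiss` package is not installed in the Python environment that `make run-ui` uses. A quick, hedged way to check which imports are missing before launching (the helper name is made up for illustration). On PyPI the CPU build is published as `faiss-cpu`, and the fork's requirements file is the place to check which version it expects:

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # "faiss" is the import name from the traceback above.
    for name in missing_modules(["faiss"]):
        print(f"missing: {name}")
```

Run it with the same `python` that the Makefile calls, otherwise it may report on a different environment.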
Does anyone know where A RVC model of ajitani hifumi?
path could have issues if there are spaces or OneDrive or special characters
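Those three things are easy to check automatically before pointing RVC at a dataset folder. A small sketch; treating "special characters" as anything non-ASCII is an assumption made here, and the function name is illustrative:

```python
from typing import List


def path_warnings(path: str) -> List[str]:
    """Return human-readable reasons a dataset path might cause trouble."""
    warnings = []
    if " " in path:
        warnings.append("contains spaces")
    if "onedrive" in path.lower():
        warnings.append("inside a OneDrive folder")
    if not path.isascii():
        warnings.append("contains non-ASCII characters")
    return warnings
```

An empty list means none of these patterns were found, so the problem is probably somewhere else.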
You cant train voice models with amd gpu on windows
You can inference voices
Realtime voice changer will probably have a 1 sec delay maybe slightly more
no yt
Last update: Mar 10, 2024
hey so, it might be a stupid question but i'm trying to make my first voice model for the first time and i trained my model and i can't find it, where does it go to?
Good day everyone! Can somebody please advise, there used to be a desktop app, within which you could convert audio using voice models you downloaded and even search for other voice models within the software. Had a nice black interface, please advise if it's not up anymore or if you know which one it was, struggling with finding it online
if you trained local its in assets -> weights
colab maybe somewhere similar i dont know much about it
you most likely mean old rvc gui or something of that sort
if that's the case, it's heavily outdated and nobody uses it / shouldn't be using
tho wait.. I read it again, hmmm.. it had a search for models built in.. hmm
Applio is the only thing I can think of
it has built in search
thanks! i also wanted help on how to remove voice models to not waste space incase it is terrible
if you saved G and D paths for every x epoch. then go into logs -> yourmodelname and delete any d and g epochs you dont need
else if you saved every x epoch for models theyre also in assets - weights
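To see what would be freed before deleting anything, the old G and D files can be listed first. A dry-run sketch that assumes the checkpoints are named `G_<step>.pth` and `D_<step>.pth` inside the model's log folder. That naming is an assumption, so check your own folder first:

```python
import re
from pathlib import Path
from typing import List

CKPT = re.compile(r"^([GD])_(\d+)\.pth$")  # assumed naming: G_<step>.pth / D_<step>.pth


def stale_checkpoints(log_dir, keep: int = 1) -> List[Path]:
    """List G_/D_ checkpoints except the `keep` newest of each kind. Deletes nothing."""
    by_kind = {"G": [], "D": []}
    for p in Path(log_dir).iterdir():
        m = CKPT.match(p.name)
        if m:
            by_kind[m.group(1)].append((int(m.group(2)), p))
    stale = []
    for items in by_kind.values():
        items.sort()  # oldest step first
        stale.extend(p for _, p in (items[:-keep] if keep > 0 else items))
    return stale
```

Look over the returned list, then delete those files by hand or with `Path.unlink()` once you are sure none of them are needed to resume training.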
thank you so much man really appreciate it 🙏
You can search rvc ai voice model:
- #1175430844685484042
- #🔍│find-models
- https://weights.gg/
- https://huggingface.co/models (but watch out cus in hugging face there arent only rvc ai voice models)
- https://applio.org/models
- https://voice-models.com/
if there isnt one, you can:
- #1159289738314919936
- #1191429836321849435
- make it yourself with our docs guides https://docs.aihub.wtf/
You can't convert the voice model itself, but even though it isn't native, you can use tts with rvc models: some forks such as ilaria rvc mainline & applio have built in tts (they use Microsoft Edge TTS to generate the tts audio and then convert it with rvc; i suggest choosing a tts voice that is the same gender and language as the rvc model you wanna use)
NOTE: RVC isn't as good as GPT-SoVITS for tts, but GPT-SoVITS can't use rvc models (and vice versa), and it's limited to english, chinese & japanese
If you wanna do tts locally (if you got a good pc):
- You can find Ilaria RVC Mainline in #📰│dev-updates message
- While Applio in our docs
If you don't got a good pc you can do it online:
- Ilaria RVC Zero (Running on A100 GPU, free fastest rvc on cloud) and the guide
- Use Applio Colab (with google colab T4 free daily limit gpu)
- if you don't wanna use edge tts, you could try another tts ai in https://discord.com/channels/1159260121998827560/1212368971127590922 and use the output as an input in rvc
Last update: Apr 01, 2024
Last update: June 15, 2024
When finetuning or training a pretrain from scratch, the large dataset doesn't have to all of the same speaker right? I checked out the pretrains guide but still need some clarification on this point. Thanks!
will see if that's it now, tysm anyway!!!
Hi, I just started using this today and have been messing around with various voice models with the intention of recording some stuff, everything was working as expected, however all of a sudden in my recording the voice isn't changing, I have tried to remove the application and such but am unable to fix it, does anyone know how to fix this?
I guess Ilаria RVC Zero isn't confidential
Wdym
Which application
It does not have to be of the same speaker correct
when you load your model, dO other users see it too?
Not sure
Voice-changer-native-client or realtime voice changer client v1,5,3,18a onnxgpu-cuda is the application
Sometimes, the output comes out choppy and laggy and even with trying out different chunk sizes, I don't get any desired result
Until I restart and spend more time, and then it works? but once I restart the program and try again, it's still choppy so I'm not sure why
is there a way to download Ilaria RVC to run locally on my Mac?
I'm not sure if Ilaria Mainline is supported on mac.
are there any alternatives to it then?
Just download Applio or original RVC.
could you by any chance drop a link? :)
-rvc
Suggestions for @idle hedge
- How to use RVC Mainline Colab by Cauthess
- Full AI Voice Model Training Guide (local) by Christopher Villanueva
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
your gpu may not be strong enough
sorry, dont want to be a bother, but i only see the windows and linux installation guides, nothing for macs
Like Intel(R) Iris(R) XE Graphics isn't going to cut it? :"")
I'm not sure if Applio is supported on mac.
You can just use Applio Colab, all you need is a google account.
ah man, i kind of want to have all these things locally, maybe ill find some other option, thank you for your help though, i appreciate it
You're welcome bud.
only for inference
besides, the installation is hell
Currently, KaraFan gets a lower SDR compared to the BS models. Roformer could be added (it would be somewhat difficult) but the SDR will be slightly lower than using the model directly in UVR5 NO UI or UVR5 UI. So... why not use those?
For a reason: it gives good results. I wanna ensemble hq4 and bs roformer along with other MDX models, and I don't have a computer
Hi, I have changed both my desktop input and Discord input to Dubbling Virtual Devices, but when I tested sound and voice in Discord, it could not detect anything
Other people are asking for that too.
it should also be okada's output
@viscid moss
The input output settings should be like this in W-Okada:
Input - Your mic
Output - CABLE Input (line 1 if you use VAC, or whatever your virtual cable output is)
And in Discord:
Input - CABLE Output (same thing as okada output as well)
Output: Your headphones
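If it helps, here is the same routing written out as a tiny Python check. This is only a sketch: the device names are placeholders, the helper names are made up, and it assumes VB-Cable style naming where the two ends of one cable are called "... Input" and "... Output" (VAC's "Line 1" naming would need a different comparison).

```python
# Placeholder device names; use whatever your own system actually lists.
routing = {
    "w-okada": {"input": "Your mic", "output": "CABLE Input"},
    "discord": {"input": "CABLE Output", "output": "Your headphones"},
}


def cable_name(device):
    """Strip a trailing ' Input'/' Output' so both ends of one cable compare equal."""
    for suffix in (" Input", " Output"):
        if device.endswith(suffix):
            return device[: -len(suffix)]
    return device


def routing_ok(cfg):
    """True when Discord reads from the same virtual cable that W-Okada writes to."""
    out_dev = cfg["w-okada"]["output"]
    in_dev = cfg["discord"]["input"]
    return (
        out_dev.endswith(" Input")
        and in_dev.endswith(" Output")
        and cable_name(out_dev) == cable_name(in_dev)
    )


print(routing_ok(routing))
```

The point of the check is just that W-Okada's output and Discord's input must be the two ends of one cable, while your mic and headphones stay on the outer ends.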
basically
Let me try
But you can use UVR5 NO UI or UVR5 UI to separate them using Colab or HF. The ensemble is not ready yet but you can separate several audios at once with one model and then separate all those audios again with another model and so on.
This is not the case, other people are also asking for bs roformer on karafan. I have done that already, its not the same. So you're planning on adding ensemble to your UVR no UI Colab? When?
Karafan is ensemble
@viscid moss
The problem is that KaraFan has not been updated in 8 months, the results are quite good but it uses the UVR5 code from 8 months ago, which will require more effort to implement the new models (Roformers).
KaraFan is a good tool but it is not receiving updates, for now I can only tell you that I will review it but I cannot promise that I will fix everything completely.
I know that KaraFan makes an ensemble with UVR5 models, that's what I'm working on for UVR5 UI and UVR5 NO UI.
Will you add an ensemble cell to the UVR colab you've made?
It's what I'm working on, along with other projects
That's useful
When will it be added? Soon?
So you have to update the code first ah yes I get it
Soon, after the implementation of audio separation using YT and batch separation for UVR5 UI
Cat explotano
In a month or days?
month
you input audio
and it changes the audio
there is a text to speech button too
but that just uses a basic text to speech

Which colab are you using?
anyways what does the align inputs feature do in uvr v5?
outdated colab
use the inference colabs u can find in our docs https://docs.aihub.wtf
Last update: Mar 10, 2024
Well then he should use these colabs provided
He can easily use Applio Colab
hmm which collab best for using public? 🤔
How to make text to speech fit song?
📚 All-In-One English documentation
Full AI Voice Model Training Guide (Local)
Link: YouTube
Credits: Christopher Villanueva
Model training with Mainline RVC
Link: Rentry
credits: Raven (ravencutie21)
AICoverGen Colab Guide
Link: Google Docs
Credits: Eddy (Spanish Helper)
Create a model with RVC disconnected (colab)
Link: Google Docs
Credits: Angetyde
The fun part is going to be figuring all of this out, in terms of knowledge of any of this I'm at like...0.5/10 lol.
I've used Applio/have a model downloaded that I want to convert to TTS. It's just figuring all this out. And this could be used for an AI assistant that reads ChatGPT/etc in real time right?
how can i install hina rvc on my pc?
not working 🤔
oops, I need help, I've already downloaded the voice change app, but when I press it to start, the voice doesn't come out, can anyone help me?
I don't mean to comment over the person above (sorry) but I just wanted to ask if anybody has any experience doing batch conversions on Hugging Face using RVC V2? I do a lot of conversions but I cannot work out how the batch converter works - I can never find the files to download once they're done
do u separate vocals and instrumentals on this site or no
You'd probably need to use Colab for that since you can't see temporary files on HF iirc
Right, understood. Thank you!
rvc cli stopped working for some voice models for no reason
do models need replacing every now and then?
it'll generate but then it won't make a wav file
guys, do you know anything about applio app? is it better than other apps that use RVC?
lol, and yet you use " ignore outliers "
that's not the approach
mark it off, then evaluate
set the smoothing to 0.70
or less
100 ( smoothing at 1 ) is for general trending recognition
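For anyone wondering what the smoothing slider is doing to the curve: as far as I know it is an exponential moving average over the logged values. The sketch below is my own approximation of that (a debiased EMA), not TensorBoard's actual code, and the loss numbers are invented.

```python
def smooth(values, weight=0.70):
    """Debiased exponential moving average; weight must be in [0, 1)."""
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be >= 0 and < 1")
    smoothed = []
    last = 0.0
    for step, value in enumerate(values, start=1):
        # Blend the running average with the new point.
        last = last * weight + (1.0 - weight) * value
        # Correct the bias from starting the average at zero.
        debias = 1.0 - weight ** step
        smoothed.append(last / debias)
    return smoothed


losses = [30.0, 10.0, 28.0, 9.0, 26.0]  # made-up noisy values
print(smooth(losses, weight=0.70))
print(smooth(losses, weight=0.0))  # weight 0 returns the raw values
```

A higher weight hides the spikes and only shows the general trend, which is why a lower value is more useful when you are looking for the point where things start going wrong.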
looks like a collapse
you should be fine using the most recent ones though, probably
there’s no reason to use over RVC
Use Harmonify Colab
hello, when i use rvc a message appears and i cant use rvc, can someone help me pls? Error opening Stream: Illegal combination of I/O devices [PaErrorCode -9993]
yes! That's the only reference I had from your conversation so i'm kinda glad I can pull from that
Yea and I found it fascinating discovering it. It's making me want to start experimenting with those multi-models
To see what would happen if I trained a base model on tiny's baritone then did another model on his falsetto
I dont know how that works though at all
It's all AI, like if you change the distribution of a particular sample then the accent and quality varies
Wouldnt that be good though?
Because no matter what his resonance and accent between falsetto and normal singing is very different
And RVC seems to just kinda mash them together between those ranges
I tested a model with monotone speaking and including expressive sample improved the accent by a lot. Variety is needed so yes
On top of that, the baritone voice can't ever belt out anything higher up in his range because its trying to mash into the quiet falsetto
But too much variety seems to be a huge problem too. I just dont think it understands the tonal difference between headvoice (quiet) and chestvoice (booming)
Tiny can hit the same notes in chestvoice as he can headvoice
But the AI tries to force everything past a certain limit into headvoice
Or some weird inbetween
Yup that's just the limitations with the current RVC model (Hifigan) I have no clue about the Bigvan, evagan stuff
What are those two? ive never heard of them
v3 soon, no guarantees it will improve from v2 but you never know
Oooooo V3!!!
heh it doesnt even look like itll do anything
Why the pretrains?
actually ignore what I said 
well maybe I was onto something because the pretrains have to be trained on hifigan? So if we switched then it would be way better
Whats wrong with hifigan
Also whats hifigan
that, I would have to research more into it for the problems. It's the generator and discriminator that makes up RVC rn
Im not sure what we'd move to though if we stopped using it
You can search up BigVgan or evagan in the server logs. It's fun to read up on stuff that probably wouldn't happen soon
i spelled it wrong 
My only worry is that with things advancing my GPU will become too old to use
Its already going on 6 years old
It was top-end at the time but now it's just old
It does the same things as the modern RTX cards but is just slower
But if they introduce anything that minimally requires more VRAM ill have issues
oh, it will definitely increase. I don't know where I read that discussion from
(i got the RTX 2080super card)
maybe in model-maker chat
Here's a tiny clip of tiny singing https://youtu.be/8B2qxVbKgV8?si=5T6ehCItdoubLRLZ&t=7
"never hit your grandma with a shovel - it leaves a bad impression on her mind!" XD
Never knew he was an actor too, huh cool
Oh yea he did all sorts of stuff. He was even a clown in a certain horror film
holy shit its the joker
He's the perfect mix of comedy and scary
And goofy
He was plain goofy he couldn't be scary
But his looks could be
This is him when he was young
I wonder if that was a heavy inspiration for Joaquin Phoenix since he's a more grounded guy
I wonder that too
Tiny was very grounded though
He had his acts but if you listen to him talking he's a very real guy
Catholic and very oldschooly conservative XD
Hello, I'm having issues with discord where the virtual audio (or the rvc voice changer) sounds very glitchy. However, when I listen to the actual audio of the virtual audio, it sounds perfectly fine on playback on my system. I've turned off all noise cancellation and it still sounds glitchy, so I suspect it's the transfer from the virtual microphone to discord playback.
Use VAC Lite instead of vb audio
Got it, let me try it out. Thanks
This fixed my problem, thanks!
np
is there any free alternatives?
I just said vac lite
VAC Lite is free
Make sure you download the lite version not trial
I see, I downloaded the VAC trial one
I noticed a lot of the RVC models are singing models - do they still work for speaking?
Yes they should
Is it better to use a model that was trained on talking for talking vs. one trained on singing?
Singing models can have a better range sometimes
Try both and see which one works better
good to know! Thank you
How do I resume a training session?
(non local RVC)
I've started over countless times because I was inactive for too long.
I need someone to train this for me. Please.
can someone help me with the ai voice?
i need help
im trying to use the google collabs thing
when i click on a model
it says
Frequent errors occur. Please check if the model of the framework being targeted is loaded.
can anyone hop into vc with me to help me with starting the training
everythings set up and i have my audio ready fully
lite is free and enough, wont spawn some kind of annoying popups tho
Hi. Is there any way to use RVC with an amd gpu (rx 6600). I want to use inference locally, but I don't need to train my own models.
https://rentry.co/RVCRealtimeGuide
perhaps, despite not mentioning offline inference
-local
Suggestions for @fair onyx
- 🍏 Applio, by IA Hispano GitHub
- Mangio-RVC-Fork, by Mangio621 Huggingface
- RVC Studio, by SayanoAI Huggingface
- AICoverGen, by SociallyIneptWeeb GitHub
- Replay, by Replay Team Website
- Original RVC, by the RVC-Project team GitHub
- GPT-SoVITS, by RVC-Boss GitHub
Credits to Faze Masta and Antasma for compiling these links.
It will use your cpu
what do yalls use for tts?
how can i save my weights.gg outputs
Yes it is sufficient enough
I made heavy from TF2 sing rainbow connection from the muppets movie and it kind of sounds like he's actually singing it
how can i fix "No interface is running right now"
each time i load a local url in google colab it say that
-guide
- How to use RVC Mainline Colab by Cauthess
- Full AI Voice Model Training Guide (local) by Christopher Villanueva
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
-spaces
- Ilaria RVC Zero, by thestingerx Huggingface Spaces
- RVC⚡ZERO, by r3gm Huggingface Spaces
- Applio, by IA Hispano Huggingface Spaces
- AICoverGen, by r3gm Huggingface Spaces
- Advanced RVC Inference, by r3gm Huggingface Spaces
- RVC v2 Huggingface version, by Clebersla Huggingface Spaces
-audio
- 🆕 Creating Datasets for RVC using iZotope RX11, by Cauthess
- Perfecting Audio Isolation on Low-End Rigs, by Litsa The Dancer and Faze Masta
- Gathering and Isolating Audio, by SCRFilms :snowflake:
- Vocal Mixing Tutorial, by Roomie
Audio Separation/Isolation
Any help with this? (I tried searching the server but couldn't find anything)
I made sure to reinstall Python just in case, delete and redownload the zip, moved the folder to my main drive instead of my 2ndary, nothing
It's been awhile since I've done this (last time I tried making a model was when Google Colab was still the top option lol)
I downloaded RVC and want to use this voice https://discord.com/channels/1159260121998827560/1258842728813559909 but when I talk, I can't hear myself and the voice isn't being detected.
HELP
using RVC GUI for the first time. How long does it usually take to convert the voice after clicking convert button?
Depending which GUI you're using and how long the audio is. For me the average song takes around 30 minutes, but all my stuff is lower end so it may be faster for you
ok. I have a low end laptop, no gpu. Model is Taylor Swift Red Era v2, the song file is an acapella of "Empire" by Of Monsters and Men. I set 'crepe' as the F0 method. crepe hop is at 128
felt like it's been 15 to 20 minutes so far and it's still converting.
what settings do you recommend for low enders?
Not sure about rvc-gui, but at least rmvpe as f0 would speed up the process a bit
ok it just finished
it seemed like 40 - 45 minutes. The next one I do I'm going to time it.
i finished training i think, but my assets/weights folder has no added_(smth).index file
i only see the .pth
click "train feature index"
tysm
i have a question what does the segment and overlap do?
Is there any way to make the AI voice sync with your laughing without it sounding like wind
Or is it just not possible
Apparently laughing is kinda not possible with RVC rn...
But I'm not sure if that's a limitation or just the lack of laughter in datasets
when i launch the tensorboard in applio rvc it just keeps reloading, is there something else i need to install for it?
Do you think it would be possible in 2 years time
With GPUs like the RTX 5090 already released by then
Well... only the future shall tell
Nvidia is pretty much carrying AI
the link is above
This has nothing to do with gpus and generally there's a big misunderstanding here in terms of laughing or such sounds in rvc or models I see
Laughing compared to speech / singing is more airy, it contains more " noise " type of data compared to phonetics, so, formants and so on
the base models rvc uses were trained on the vctk dataset, which barely, if at all, had laughing-like samples in it
rvc is just not accustomed to handling such well
And so, it is not bound to tech or gpus, it is just how the base models were made
Laughing is definitely do-able but not in all cases. It's a lil more tricky than that
Oh yea, not to mention contentvec rvc uses for " recognition " of what is the content of input audio ( for inference ) - I believe - has some harsh time or issues detecting such " unusual " sounds like that
i cough, laugh and moan daddy bulldozer on rvc 🔥
So really, it all comes down to many variables.
Contentvec, base models' dataset used and obviously, quality of models
Yea, it is do-able like I said, but it's def on a harder difficulty spectrum compared to speech or singing
So in theory if someone add laughs to contentvec and a pretrain with laughs, would be possible to get perfect laughs in inference?

there'll always be artifacts or irregularities
maybe if some mid-way step was to be added
such as diffusion denoising or whatever
then maybe it'd be very very close to what we'd call perfect
I just feel like people expect too much from tech that's been with us for barely few years
ye ai is new tech
not really AI, but voice cloning and reconstruction in a way we do it
normally, in fact, you'd want to literally train from scratch your own voice or a dataset
not using any " universal " pretrains
but having your own end-to-end finetuned
so that's also to be added to the mixture
People expect way too much from something that was meant to just work for broader audience
O yea i remember someone said our models are actually all finetunes
That's right
that's the reason we use pretrained bases
our models are just " snapshots "
of finetuned generator and discriminator networks - kinda, simplifying it.
which were previously trained from scratch on vctk dataset
What I think could work is doing a 2 run training but I haven't tried it myself yet.
but that'd require you to have a rather big dataset for your model
first training is the fine-tune training, you'd then take your G and D from model's folder, and use them ( provided you can adapt hparams properly ) and train again
on same voice / person but different set - and that'd be your main training but let's be real, not a lot of people here can do it
This would require a lot of audio hours?
Not really
1st run you'd do like normal, but preferably 10-25 mins
2nd run would be 15-35
but I think 10-15 could work again
just each run has different set ( or different section of the set provided the consistency is kept )
but if I had to be honest, first run should be as good as you can get it to be
it'll be your foundation for the main model
This sounds interesting, uhm, i believe this would be the best thing to do if you’re recording a dataset yourself
Definitely
Imma try this and see what I can get. Sounds very interesting
Go for it
tho when I mention that 2nd run should be a different set ( in ideal conditions ) I mean it
as the first run ( first model, so, your base ), provided you legit trained it to the best of your possibilities, will be already accustomed to the voice
so the risk of overtraining is drastically increased
Different in length and what's in it? Or just length
content of it
the voice should match but the content should be, ideally, new
or at least, much more diverse than the base
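A rough sketch of how the two sets could be kept separate, assuming the dataset is already cut into clips with known lengths. The function name, the clip names and the 15 minute target are all made up for illustration; none of this is part of RVC itself, and it only handles the "different content" part, not the G/D and hparams handling.

```python
def split_for_two_runs(clips, first_run_minutes=15.0):
    """Fill run 1 up to the target length; every clip that does not fit goes to run 2."""
    budget = first_run_minutes * 60.0  # target length of run 1 in seconds
    run1, run2, used = [], [], 0.0
    for name, seconds in clips:
        if used + seconds <= budget:
            run1.append(name)
            used += seconds
        else:
            run2.append(name)
    return run1, run2


# Hypothetical clips as (file name, length in seconds).
clips = [("a.wav", 600), ("b.wav", 400), ("c.wav", 300), ("d.wav", 500)]
run1, run2 = split_for_two_runs(clips, first_run_minutes=15.0)
print(run1, run2)
```

Each clip lands in exactly one run, so the second training never sees audio the first model was already fitted on.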
Oh i believe this would also work in characters that have hours worth of talking
Quite possibly
Ight, will post my results here when it's done 🙂
Neat
worth to try whenever I finish preparing my dataset 
yey. More testers the better.
Im getting an error: 'FeatureInput' Object has no attribute 'device' when trying to extract features
whats the fix for this?
are you choosing rmvpe_gpu during the step?
and foremostly, what's your gpu
i chose crepe and have a 2060ti
gpu's detected?
mentions running half precision / fp16 in the log?
says running fp32
just in case, did you try this on applio?
yea its applio rvc
ahh
use codename's fork if u want to use mangio
ight is mangio better than mainline
Mangio is just crepe but allows for an adjustable hop
ight cool
nono, what u are using is called mangio-crepe
You'll find more about it in the fork's faq tab
- some more technical info
if that's within your interest
applio renamed it to just "crepe" for simplicity
btw, I doubt Imma fully move to " new " fumiama's base
yes pls
the hash / encryption and such, is a pain in the ass
it worked fine on 3.0.9/3.1.1 and the cpu threads set to 2
it tries to read info from models despite most of em made on other rvc
and some of that, such as author or few others
aren't present
yeah its bugged only in the recent versions
and if they're not, it spills errors
and I can't make it cooperate with older mangio's fork code structure, the hash
hey as long you can add some of the new changes of fumiama mainline is fine
So most of the changes, including training part and few others, will be added
but Imma stick to the core of what I have
ye codename has them
there u go xD
Yea, made quite a lot of these back in the day
was tired of constantly repeating myself lol
yeah sure, yw
tho if I had to be honest with you
aside of some qol changes, code restructure and few differences here n there, there's not that much of new stuff
sad
so with that being said, gonna stick to new fumiama's structure
but most of what I deem necessary, in the core, will be kept as it is
its fine
after fp32 gave me better results i stopped using fp16 haha
also i noticed that im not getting those NaNs at the start
after switching
totally worth waiting a bit more
Yep, exactly what I meant
ah gl
Good evening
I have a question, what is a "Strong Version" of a model? Like, is it just better vocals or smth?
wdym " strong version "
there's no such a thing
In https://huggingface.co there's some versions "strong". Like Shawn Mendes, for example.
Perhaps it's meant to represent the style of a particular voice
so like, strong vocal, soft vocal etc or maybe speech
cause as long you're referring to rvc models, there's no official or community accepted / agreed terms as " strong " version of model or such
so it must most likely refer to the model's speech / singing style it was trained on
Does anyone know where I can find that typical AI voice that’s used for Instagram philosophy quotes/ideologies?
How do I remove harmonies in UVR?
Or something like that, i searched and tried but got nothing
How do I remove harmonies in UVR?
Could I get help with this?
Fixed the previous problem, but now this happens, and then crashes shortly after when trying to load Ilaria RVC Mainline
I had a question, how can I make a voice model?
In the context of RVC, the dataset is an audio file containing the voice the model will replicate. It can be either speaking or singing.
It makes sense, thanks so much for your help ❤️
yw : D
I got it to work but MAN is this slow
Maybe I did something wrong but I'm only getting 1 epoch every 6 minutes, meaning it would take me 5 whole days to train this model!
I'm trying to use the voice changer but it says I don't have a gpu, is it because i have a radeon gpu?
I dont think it should matter too much since i have a pretty decent cpu
but my gpu is way better
I think you may have installed the wrong version for your GPU
EOL - No further Updates
Github - Blanc-dot
Discord - Blanc_dot
Despite being end of life, most if not all information has not really changed, so should be very accurate until actual new stuff comes out.
Other Links
Antasma's Local Error Fixes
Antasma's Colab guide
Sushi's useful Links - You need...
Get the AMD version
You're welcome! Let me know if you face any issues :)
For the gpu options, should i choose gpu 0?
Most likely it's your GPU, but check Task Manager > Performance tab, it should have the number there
ohhh, thank you
You're welcome
Do you know how to help my issue?
I got the RVC Mainline and am trying to train, but it's taking forever, nearly 6 minutes per epoch. Do you know of any way to fix that?
What's your GPU, current batch_size in training and how long is your dataset?
GPU is NVIDIA GeForce GTX 1060 6GB, current batch_size is 6, and my dataset is a little under 7 minutes long (was going to make it longer but wanted to use the shorter version for a test thinking it would go faster)
Verify whether you see " half precision " or " is half " during rvc opening ( in log )
You could probably cache dataset
if it doesn't say anything of that sort / says fp32 / full precision
or anything like that
Won't do
Tbh the 10xx cards are slow as heck sadly
Someone said it was because of the FP stuff 😭
GDR iirc
That too but caching additionally shouldn't be done on less than 12 gig vram cards
Ah
I don't see that anywhere no
oh yea, then that'd make sense
You're training in FP32 mode ( single precision / full ) better quality models but performance ( speed wise for training ) takes a hit
1000 and 1600 series have horrible fp16 performance anyway. It shouldn't be used on them and I think rvc mainline disables fp16 for them
How do I change it? The last time I trained a model was about a year ago when it was as simple as importing the sound you wanted, pressing a button, and have a pretty accurate model in an hour
it's already in fp32 mode
If you do not see half precision or anything like that in the log ( which you do not ) then it falls back to fp32, which is the case for you
tl;dr.: Just bite through it ~ it's worth it.
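To make the fp16 vs fp32 difference a bit more concrete, here is a small standard-library sketch that stores the same numbers in both formats (`"e"` is half precision and `"f"` is single precision in Python's `struct` module). It only shows what the number formats can hold; it is not RVC's training code and says nothing about speed.

```python
import struct


def roundtrip(value, fmt):
    """Store a Python float in the given binary format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


tiny = 1e-8  # a very small value, e.g. a tiny gradient
print(roundtrip(tiny, "<f"))  # fp32 still holds a nonzero value
print(roundtrip(tiny, "<e"))  # fp16 rounds it to 0.0

try:
    roundtrip(70000.0, "<e")  # above the largest finite fp16 value (65504)
except OverflowError as err:
    print("fp16 overflow:", err)
```

Very small values vanish and large ones overflow in half precision, which is the kind of thing that can be behind NaNs early in training; fp32 avoids that at the cost of slower epochs.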
is it better to have the sensitivity threshold higher or lower?
see what works better for u really
Guess I'll wait here for about a week for this to process then
I have it at max right now
what's the amount of time per epoch for ya
but sometimes it will pick up noises of me like sighing etc
enable sup2 as well
About 6 minutes, with the last one taking 7:20 and the previous ones taking about 5:30 each
Then you're fine
fp32 for me on rtx 3060 ( 12 gig ) - 10 min set - 16 batch size takes 4-6 mins
(( but then, I am using cuda memory fallback so ))
But still, you're fine
Alright (the reason I say about a week is because I did the math and it would take at minimum 4 whole days to process the model at its current rate)
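The math behind that estimate, as a quick sketch. It assumes a target of 1000 epochs (my reading of the conversation, not a recommendation) and a constant time per epoch, which real runs will not hold to exactly.

```python
def training_days(epochs, minutes_per_epoch):
    """Total wall-clock training time in days, assuming every epoch takes the same time."""
    return epochs * minutes_per_epoch / 60.0 / 24.0


# 1000 epochs at about 6 minutes each is a little over 4 days.
print(training_days(1000, 6))
# Stopping earlier, e.g. at 250 epochs, is roughly a quarter of that.
print(training_days(250, 6))
```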
I mean, you're not blindly " Imma train for 1000 epochs to have the best model 😄 ! "
right?
I wasn't making that decision blindly, that's the settings I used back then when, like I said, it was as simple as just throwing an audio file at it and having a pretty damn good model in an hour
Then it is blindly " calculated "
unless you're doing an exact same dataset, same parameters
is there any optimization for response time i can do while maintaining quality
Each dataset is unique, each model is unique, each needs different hparams, each has own training curve and initialization
and each requires own tensorboard supervision
because its taking about 7000ms right now
It isn't the same exact dataset but it's the same length as most of my other datasets that only took an hour back then
Like I said, this is not how it works my dude
RVC or Machine learning is not linear
you can't train X1 with similar data length and expect X2 to follow the pattern
in fact, in rvc, 1 slight hiccup in initialization ( during model's initial stage of training ) or a taking away of few secs from the set can drastically change the outcome
But anyways, you do you.
All I can say, your time per epoch is fine for fp32 and your hardware
Maybe I'm just getting confused then because of those older methods which seemed quite linear (not on the inside I know, but on the outside it was simple to calculate how long it would take)


