#✨│ai-help
there's a guide at https://rentry.co/VoiceChangerGuide
I've duplicated the space, got the voice model I want, and put an mp3 as my track, but it still says error? (Hugging Face Space)
But you should have a .index file with your .pth model
you can keep it empty if there somehow isn't one
@small wolf
what is the error
How much training do you do on a 16-minute long dataset? (RVC2)
it's still uploading the mp3
how big is it?
Once I get my server working I have an idea I'd like to try... trying to train the base RVC model from scratch but adding my voice to the dataset
OH
how do i know when its done?
even after that it says error
¯_(ツ)_/¯
the space might be broken, not sure
try using colab
since the RVC base model is actually trained only on VCTK...
or maybe I'll just mess with my own multi-shot architecture ¯_(ツ)_/¯
this isn't for rvc but I'm trying to split the background vocals from a song and it just won't work, can someone please help me. pwetty pwease 🥺
UVR?
help me pls, What's the problem?
[VCClient] wait web server...230 http://127.0.0.1:18888/
How does RVC work, like how can it learn a target voice without any source voice as input?
surely at least the base model was trained with some kind of source voice input at some point?
like during training, what is it even using as a source voice to derive loss from???
I'm totally new to this whole RVC thing, but I can't get to the part where I can put my voice in. I just wanna make a voice model of my voice and make Christmas covers, but every time I download the RVC file from GitHub, there is no go-web.bat file, and if there is one (which there has been, since I tried different ones) it says "server not found" and "press any key", and if I press a key it just closes the window. Please help, I have a Windows computer. Am I dumb or something like that? Am I missing something?
Oooh I think I know why it screwed up.
You should've downloaded it from Huggingface and not GitHub
Therefore you only got the code and not the usable application
rvc is built on hubert, which was trained on a huge dataset (i dunno how much, prolly a couple hundred hours at least), and it uses pretrained models that are trained on a 50-hour dataset
^
interesting... so do you know if there is a way to fine-tune the base RVC model or even a trained RVC model but using audio pairs instead of just target audio files? (like older parallel voice cloning models)
uhh, i dunno actually, but i think pretrained models are kinda doing that. Like hubert is just a huge mess of different harmonies on which we can build sound models, and pretrains usually help rvc to understand what the sound should sound like. our pretrains are trained on voices, so rvc can easily understand the voice of a human being even without enormous amounts of data.
yeah... but problem:
RVC models consistently turn out horrible when using my voice
(Yes I've tried pitch shifting, all the advanced parameters, etc etc)
ugh, at this point I might just make my own model from scratch 🤣
hmm, can you send an example?
because i've made multiple voice models of my own voice, and they all sounded fine
probs in like ~10 mins
away from main PC atm
notably weights.gg sounds somewhat better
also I'm using applio, if that impacts performance (it probs shouldn't)
I meant when using my voice to run inference on a model, ie: Cave Johnson
it just sounds a lot worse for most models than it should
nah, all rvc's have the same bases
yg
yeahhhh :/
maybe you have a bad mic?
possibly? Just using standard laptop mic...
any preprocessing I could do to my input audio to try to improve?
nah because I've had a friend use it and it sounds spot on lol
oh, yeah that's problematic xd
XD
to get a good inferencing you also need to provide an rvc with a good audio input
so it's better to buy a microphone and use that
oh thanks ill try that
- have good volume settings, so that it won't pick up much background noise and will pick up your voice
fair...
but what exactly makes sound like... "good"? (opposite of an audiophile here)
I can guarantee there's no background noise in the recording lol
complete silence here
and + you yourself should have a good voice with correct pronunciation of words, and have good intonation control
what's intonation control? (yes I tried googling it)
ngl I'm kinda leaning towards just mangling together two HuBERT models in an autoencoder config atm...
it won't help
and it will take a lot of time
little to no noise (check the spectrogram for that 💀) and no silences between words/phrases (you can do this after recording, in Audacity)
I mean, it could, given matching audio samples from my mic and desired output it could probably overcome the noise
Anything else just works
right... lemme check
our russian community has already tried to implement a russian hubert base with pretrains for a month
still no good results
how's uhhh... how's it going?
:/
how big is their dataset atm?
also what are they training on?
it's not as easy as training a regular model
obv lol
Ayo? @small wolf level 5 !!! 
finally
1300 hours of russian speech for hubert and around 300 for a pretrained model
but
yeah
you won't be able to make your own pretrain
or hubert
it's easier to just buy a microphone and practise your voice, because the program works completely fine
ok so I tried it now but it didn't work. I downloaded it from Hugging Face and put the file into the windows file (system), and I opened it with the "breezip" extension, and it still doesn't work. I'm sorry, please help
ok... so maybe training HuBERT from scratch not good idea LOL
but surely there must be a way to finetune existing model with audio pairs?
fair
nah, probably there's no way
makes no sense tho
like how can RVC even calculate loss when training?
what is the input that it's getting when calculating loss
like any other machine learning stuff
well yeah but from what I know loss calculation basically takes some of the dataset (or from the test split), puts it into the AI and compares it with the expected result
download the second one if u wanna train
except the RVC dataset consists only of the target voice, no input so to speak
probably uses the pretrains to compare? hmmm i actually should research about this lol
lol
the discriminator takes content from the given data and shows it to the generator; the generator tries to fool the discriminator by generating stuff based on the dataset. it keeps doing that until the discriminator believes it and then raises its expectations, and they go on like that until they reach some immovable point of learning
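A toy sketch of the back-and-forth described above, in plain Python. It assumes a least-squares GAN objective, which is one common formulation; whether RVC uses exactly this one is an assumption here, and the score lists are invented numbers, not real model output. The two functions correspond loosely to the `loss_disc` and `loss_gen` terms that training consoles print.

```python
def discriminator_loss(real_scores, fake_scores):
    """Least-squares GAN loss for the discriminator (assumed formulation).

    The discriminator wants real clips scored near 1 and generated clips near 0.
    """
    real_term = sum((1.0 - r) ** 2 for r in real_scores) / len(real_scores)
    fake_term = sum(f ** 2 for f in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores):
    """The generator wants its clips scored near 1, i.e. mistaken for real."""
    return sum((1.0 - f) ** 2 for f in fake_scores) / len(fake_scores)


# Hypothetical discriminator scores, made up for illustration only.
early = {"real": [0.9, 0.8], "fake": [0.1, 0.2]}  # fakes are easy to spot
later = {"real": [0.6, 0.5], "fake": [0.5, 0.6]}  # discriminator can't tell

for name, scores in (("early", early), ("later", later)):
    d = discriminator_loss(scores["real"], scores["fake"])
    g = generator_loss(scores["fake"])
    print(f"{name}: loss_disc={d:.3f} loss_gen={g:.3f}")
```

As the generator improves, its loss drops while the discriminator's rises, which is the tug-of-war the message describes.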
makes me wonder if I could use
matching audio pairs
to calculate loss
[that's my endgame here it's just somehow training or finetuning using my voice as an input...]
what is "content"?
part of the given dataset
ok I see a problem here
if the dataset consists only of the target voice
then it's just learning to turn target voice into target voice
right?
then I assume it "just works" because of the base model, which was trained with the other source voices
either way I suppose my goal now would be to figure out how to change this... get it to calculate loss using some other source...
ye
uhh, no, it tries to generate something resembling that data, remember i told you that rvc is built on a hubert which is basically a mess of harmonies
generate...
from nothing?
hubert is like a dough, the dataset is like a recipe to bake a pancake
dataset is the pancake
like loss is basically calculated by inferring the model as-is?
RIGHT?
I feel like I've misunderstood something fundamental here...
what i meant was, it's trying to fool discriminator with pretrain samples
that'd make more sense
ye if it's basically running inference with the audio samples from the VCTK dataset that would make more sense idek
traditionally GANs just input random data for training, is this... also the case for RVC?
Hello, this TikToker makes his covers with RVC disconnect, what is rvc disconnect?
And does anyone know how to use Applio
RVC disconnect is a colab Notebook
RVC Disconnected is for training voice models.
There's a guide in #1159513888199540817 if you wanna use it.
Can you get models from there?
You make your own models there.
There are also guides for this too.
You can train your own models here by uploading suitable dataset
Cool!
To train your own models, you have to prepare a dataset!
Btw, I have made my first cover, and I feel like it could sound better, how do I make the voice stay as the same tone?
As in like the movie/show that it’s from?
I used replay btw
I might try applio
What do you guys use?
To make a Ai voice cover. i use RVC
yeah both replay and applio use rvc
there is a transpose/pitch change option in there
You can also do that with any DAW software(s) . U can change the pitch!
just increase or decrease the pitch according to the voice you wanna replicate and the voice of the song
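For a rough idea of what a transpose value does: each semitone multiplies frequency by the twelfth root of two, so 12 semitones is one octave. A small sketch, where the function names and the 110 Hz / 220 Hz example pitches are illustrative assumptions and not values from any RVC tool:

```python
import math


def transpose_ratio(semitones):
    """Frequency multiplier for a shift of `semitones` (12 per octave)."""
    return 2.0 ** (semitones / 12.0)


def semitones_between(source_hz, target_hz):
    """Semitones needed to move a pitch at source_hz onto target_hz."""
    return 12.0 * math.log2(target_hz / source_hz)


# Illustrative only: a singer averaging ~110 Hz converted to a voice
# model that sits around ~220 Hz is about one octave up.
shift = round(semitones_between(110.0, 220.0))
print(f"suggested transpose: {shift:+d} semitones "
      f"(frequency x{transpose_ratio(shift):.2f})")
```

In practice people usually try a few values around the estimate and pick by ear.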
In replay?
But there are many variations of rvc, like google colab, huggingface rvc
Yes ... Use anyone. U like
Do all of them have the same sounding quality?
I think it depends on your model.
im so sorry if this is a dumb question but if i use rvc disconnected to make voice model what is a dataset?
That is true, but I meant like how it would just convert in general
Thanks!
Its the voice data.. audio files
np
Just glad I’m not using one that lessens the quality
hahahah thanks!
idea:
synthetic dataset
But there is something which you have to remember
I'm trying to use model extraction from checkpoint processing but I'm getting an error code, does anyone know what the error is or how to fix?
```
  File "C:\Users\Public\Mangio-RVC-v23.7.0\train\process_ckpt.py", line 64, in extract_small_model
    ckpt = torch.load(path, map_location="cpu")
  File "C:\Users\Public\Mangio-RVC-v23.7.0\runtime\lib\site-packages\torch\serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "C:\Users\Public\Mangio-RVC-v23.7.0\runtime\lib\site-packages\torch\serialization.py", line 271, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "C:\Users\Public\Mangio-RVC-v23.7.0\runtime\lib\site-packages\torch\serialization.py", line 252, in __init__
    super().__init__(open(name, mode))
OSError: [Errno 22] Invalid argument: '"C:\\Users\\Public\\Mangio-RVC-v23.7.0\\logs\\Depeche-Mode\\D_2333333.pth"'
```
huh
used the wrong file? D and G are training files
not the usable pths
where would i find the right file? the website only says "Model extraction (enter the path of the large file model under the ‘logs’ folder). " and i don't know what file it's refering to
wait is this a sovits model?
do you not have a pth named "depechemode.pth"
or something similar
oh no should use the one in weights
thanks ill give it a try
none for now, thanks for the help
np! good luck ;) hope the model's great
actually sorry to bother again, it's still giving me the same error, i tried both the topmost file and the bottom one
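One thing worth ruling out when the error stays the same whichever file is picked: in the `OSError` above, the path is wrapped in literal double quotes (`'"C:\\...D_2333333.pth"'`), and a quoted string like that is not a valid filename. A minimal sketch of stripping them before the path is used; `clean_pasted_path` is a hypothetical helper, not something in Mangio-RVC:

```python
from pathlib import Path


def clean_pasted_path(raw):
    """Drop surrounding whitespace and one pair of wrapping quotes."""
    text = raw.strip()
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        text = text[1:-1]
    return Path(text)


# A path pasted with its quotes still attached.
pasted = '"logs/Depeche-Mode/D_2333333.pth"'
print(clean_pasted_path(pasted))
```

The simpler manual fix is to delete the quote marks from the text box before clicking the button.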
wait what the hell is this
i have never used this
checkpoint processing tab because i didn't finish doing all of the epochs
but uh
i think i'm supposed to have a .ckpt file somewhere
you already have all of the models in between the epoch you stopped on
so I don't see why it's useful
how do i turn them into a usable model? i don't have any index files
the pths in weights are already usable models
u can go back in the train tab to uh
train the index
yeah?
It should be in .wav format
It should be of minimum 10 minutes
It should be clear of music and background noises!
There should not be any pause(s)
You have saved my life, I understand how all of this stuff works now 🙏 thanks for your patience, the model needs a bit more training but now I know where to get all the files I need and where they go
ok but how no pauses?
A pause means: when someone speaks or sings, they may stop for a couple of seconds, which increases your dataset length without any need. Your main goal should be to cover the maximum amount of data in a short length.
Cut it using a DAW (such as Audacity or FL Studio)
Also remember there should not be any other speakers in your dataset !
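The format and length points in the checklist above can be checked mechanically. A minimal sketch using only the Python standard library; the function names are made up for illustration, the 10-minute threshold is the one quoted in the chat, and the `wave` module only reads plain PCM `.wav` files, so other encodings would raise an error here:

```python
import wave
from pathlib import Path


def wav_seconds(path):
    """Length of one PCM .wav file in seconds, read from its header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def dataset_report(folder, minimum_minutes=10.0):
    """Total up every .wav in `folder` and list files in other formats."""
    files = [p for p in sorted(Path(folder).iterdir()) if p.is_file()]
    wavs = [p for p in files if p.suffix.lower() == ".wav"]
    others = [p.name for p in files if p.suffix.lower() != ".wav"]
    total_minutes = sum(wav_seconds(p) for p in wavs) / 60.0
    return {
        "total_minutes": total_minutes,
        "long_enough": total_minutes >= minimum_minutes,
        "non_wav_files": others,
    }
```

Calling `dataset_report("my_dataset")` on a folder of clips returns the total length and any files that still need converting. It cannot tell whether the audio is clean or single-speaker; that part still needs ears.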
Is it normal to get triangles?
No thats a mode collapse
Perhaps your dataset has too many silences?
Maybe, I'll cut more silences from it then
Should I restart training from 0 or just continue it?
probably your training session got interrupted and resumed? mode collapses are supposed to be a hard dip to a value <15, and silences aren't always the only cause of collapses, it can also happen if the dataset is too short, etc.
The dataset is 10 minutes long
there's like a 1-2 second silences between sentences
you can do noise gate & truncate silence using Audacity tho
Can you tell me where's the truncating?
Audio is clean, there's no noises so I don't think I need noise gate
why would it get interrupted?
INFO:saturn:Train Epoch: 57 [0%]
INFO:saturn:[3080, 9.927757679628145e-05]
INFO:saturn:loss_disc=nan, loss_gen=nan, loss_fm=nan,loss_mel=21.914, loss_kl=7.541
INFO:saturn:====> Epoch: 57 [2023-11-30 15:21:35] | (0:01:20.316749)
INFO:saturn:Train Epoch: 58 [2%]
INFO:saturn:[3136, 9.926516709918191e-05]
INFO:saturn:loss_disc=nan, loss_gen=nan, loss_fm=nan,loss_mel=21.711, loss_kl=8.367
INFO:saturn:====> Epoch: 58 [2023-11-30 15:22:59] | (0:01:24.328599)
INFO:saturn:Train Epoch: 59 [4%]
INFO:saturn:[3192, 9.92527589532945e-05]
INFO:saturn:loss_disc=nan, loss_gen=nan, loss_fm=nan,loss_mel=21.252, loss_kl=8.694
INFO:saturn:====> Epoch: 59 [2023-11-30 15:24:24] | (0:01:24.975206)
INFO:saturn:Train Epoch: 60 [5%]
INFO:saturn:[3248, 9.924035235842533e-05]
INFO:saturn:loss_disc=nan, loss_gen=nan, loss_fm=nan,loss_mel=22.173, loss_kl=7.824
Console logs
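A small sketch for spotting this in a long console log without scrolling: it pulls the `loss_*=value` pairs out of a line and reports which ones are NaN. The regular expression is written against the line format shown above; other forks may print losses differently, so treat the pattern as an assumption.

```python
import math
import re

# Matches pairs such as "loss_disc=nan" or "loss_mel=21.914".
LOSS_PATTERN = re.compile(r"(loss_\w+)=([^\s,]+)")


def parse_losses(line):
    """Turn one log line into a {loss_name: float} dict."""
    return {name: float(value) for name, value in LOSS_PATTERN.findall(line)}


def nan_losses(line):
    """Names of the loss terms on this line whose value is NaN."""
    return [name for name, value in parse_losses(line).items() if math.isnan(value)]


line = ("INFO:saturn:loss_disc=nan, loss_gen=nan, loss_fm=nan,"
        "loss_mel=21.914, loss_kl=7.541")
print(nan_losses(line))  # ['loss_disc', 'loss_gen', 'loss_fm']
```

Here the adversarial terms are NaN while `loss_mel` and `loss_kl` still have finite values.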
alright
which RVC are you using? neither of my latest original RVC, mangio and applio have that issue
I remember having the all-nan-values issue before upgrading my GPU from a 1660 Super to a 4070 and before updating the driver, but now it works completely fine
I have a 1060 lol
how about try original RVC or applio, the former uses a bit less VRAM for the same settings and batch size tho
Okay, i'll try
gtx 1xxx issue
Fix is turning off "half precision" or "fp16_run". Unsure if you'd be able to train on a 6GB card because of that though
@severe rapids
I don't recall how you're supposed to do it though
In stable diffusion I believe you use --no-half or something similar
does it also affect on 4 gb GPUs (e.g. 1650)?
Will probably be the same
Any gtx 1xxx series cards so yes
Here you gotta edit one of the config files by hand before doing anything related to training the model
Or straight up edit all of them at once, but I can't check where the files are atm
I think I found it
I set is_half to false, will try now
I believe it'll be reset according to settings defined in other .json files
Also that might not even be the correct toggle but unsure
RVC Guides (How to Make AI Cover)
It's been working fine for the past 10 minutes, I guess I'll just start over with the new dataset and see how it goes
So how do you import models from weights.gg to huggingface?
u can upload them to your drive and copy that share link (make sure everyone w link can see else it won't work) and it should work
nvm, it's at it again....
is this g/total? what the hell
yeah
Great! So which area of the discord do I upload it to?
wait is it ur model?
No. I downloaded it from weight.gg and uploaded it to my Google Drive.
no need to upload it here then
Except the model isn't available in the voice-models area of this discord.
You could ask the model maker to upload it then
why i do get this in RVC V2 Disconnected
Basically, I have a 1060 and 10xx series don't support half precision.
Just now I found a way to turn it off, but I had to sacrifice batch size, from 3 down to 1 (maybe 2 might work, I'll try now).
Now it looks like it works fine, yet instead of a minute, it takes 2 per epoch
nvm, i got the solution
the program has been updated, it's the first time I've used it, I want to use it for TTS, but it's changed, does anyone know how to start it?
oof
I think installation failed, because the name "applio tts" has a space.
no, I created the "applio tts" folder myself and did the installation inside it
- Applio is not a TTS - RVC needs sample audio to work
Either way, remove the space on the name.
Done! but how do I start the program?
I guess you should try reinstalling Applio, delete the files on the tts folder and then try again
Might work now
I'll try, thank you very much!
Np! ;)
@proper shale Is it supposed to flatline that soon? It's been training only for 1h, 34 epochs
does Ilaria still work?
are mac (Apple silicon) users cursed with RVC? only trying to do inference as colab seems better for training than doing it local on mac.

is W-Okada the way to go in this case?
Yes
for inferencing, no.
hmmm yeah, sadly
depending on batch_size
hmm so there is a mac silicon version on the w-okada page. it's not good/working?
i may just try it and see. if it works are there any drawbacks from using the webUI inference? maybe less tweakability?
it's lower quality because it's made for realtime.
therefore, an inferencing fork would be better, like mangi
mangio
i see, makes sense yea. ill check that out thx!
Alrighty, any more questions?
How good is it for the first model?
How can I improve it?
Should include more of a different pitch in the dataset?
im having error
it gives me this error. I ran the fix option in go-applio-manager.bat but it remains the same; I even reinstalled everything and updated the app. does anyone know the problem? I have AMD, with Intel as the processor
```
[VCClient] waiting for the web server...200 http://127.0.0.1:18888/
[VCClient] waiting for the web server...210 http://127.0.0.1:18888/
Traceback (most recent call last):
  File "MMVCServerSIO.py", line 258, in <module>
  File "subprocess.py", line 1209, in wait
  File "subprocess.py", line 1506, in _wait
KeyboardInterrupt
[10884] Failed to execute script 'MMVCServerSIO' due to unhandled exception!
Terminate batch job (Y/N)?
```
help :(
is it known whether or not setting is_half to false in the configs actually makes everything run in full precision or just the training?
training a model locally vs google colab on the same dataset to the same epochs, the locally trained model sounds worse and has a really thick accent
It didn't change anything for me (not for the training at least), had to manually go in the configs directory and change fp16_run to false in the jsons
It might actually run inference in full precision and leave training as is
Just a guess tho
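The manual edit described above (setting `fp16_run` to false in the config JSONs) can be sketched as a script. This is an illustration under assumptions: the directory is whatever "configs" folder your install has, the key may sit at any nesting depth (the sketch searches recursively for that reason), and as noted earlier in the chat some forks may regenerate these files. Back the folder up first.

```python
import json
from pathlib import Path


def disable_fp16(node):
    """Set every 'fp16_run' key found anywhere in parsed JSON to False.

    Returns the number of keys that were changed.
    """
    changed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "fp16_run" and value is not False:
                node[key] = False
                changed += 1
            else:
                changed += disable_fp16(value)
    elif isinstance(node, list):
        for item in node:
            changed += disable_fp16(item)
    return changed


def patch_configs(config_dir):
    """Rewrite each .json under config_dir that had fp16_run enabled."""
    patched = []
    for path in sorted(Path(config_dir).rglob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        if disable_fp16(data):
            path.write_text(json.dumps(data, indent=2), encoding="utf-8")
            patched.append(path.name)
    return patched
```

`patch_configs("configs")` returns the names of the files it changed, and an empty list means nothing had `fp16_run` enabled.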
Hello, is there a version of the voice changer for windows 11?
How do I not let rvc disconnect when training
A model
Cuz I don’t want it to disconnect when AFK
any good and fast rvc colabs for inferences? i was trying to use Ilaria's but it takes too long to inference, like 10 minutes for a 2 minute inference?
Bet
Do I do that rn while training?
Or before
you can rn
Thank u 🙏
doesnt really matter when you do it, its an autoclicker in javascript lol
you're welcome
btw if u ever do it on mobile there's also a message in #📑│making-models on how to put in the code from the msg i sent before, so the model training process doesn't disconnect
wdym? do what?
did the model disconnect already?
then its not too late
Ctrl+ Shift + i to open inspector view . Then goto console and paste:
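// Click Colab's connect button once a minute so the runtime isn't treated as idle.
function ClickConnect() {
  console.log("Working");
  const button = document.querySelector("colab-toolbar-button#connect");
  if (button) button.click(); // do nothing if the button isn't on the page
}
setInterval(ClickConnect, 60000);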
after u pasted it js do enter
I think I did it
okay good then
I clicked enter
And thank u bro
Hopefully this model comes out good
you're welcome, js next time ask in #📑│making-models if you need help related to models as this channel is for rvc (making ai covers)
Fs Fs
Gotchu
right fp16_run is what I meant
hey guys how can i use a downloaded model in rvc?
pth in weights, index in logs
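The placement described above can be sketched as a small copy script. The folder names `weights` and `logs` come from the message; where exactly they live differs between forks, so `rvc_root` and `install_model` are hypothetical names and the layout is an assumption to check against your own install.

```python
import shutil
from pathlib import Path


def install_model(download_dir, rvc_root):
    """Copy .pth files into weights/ and .index files into logs/."""
    destinations = {".pth": "weights", ".index": "logs"}
    copied = []
    for src in sorted(Path(download_dir).rglob("*")):
        subdir = destinations.get(src.suffix.lower())
        if subdir is None or not src.is_file():
            continue  # ignore config files, readmes, folders, etc.
        target_dir = Path(rvc_root) / subdir
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target_dir / src.name)
        copied.append(f"{subdir}/{src.name}")
    return copied
```

A model with no `.index` file still copies fine; only the `.pth` is placed.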
attempting applio cli
```
Traceback (most recent call last):
  File "/home/default/Applio-RVC-Fork/infer-web.py", line 1686, in cli_navigation_loop
    execute_command(command)
  File "/home/default/Applio-RVC-Fork/infer-web.py", line 1667, in execute_command
    cli_infer(com)
  File "/home/default/Applio-RVC-Fork/infer-web.py", line 1377, in cli_infer
    split_audio = True if (com[16] == 1) else False
IndexError: list index out of range
You are currently in 'INFER':
```
no idea what next
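Reading the traceback above: the code indexes `com[16]`, so the argument line needs at least 17 whitespace-separated values, and the `IndexError` suggests fewer were given. A minimal sketch for counting them before pasting; `check_infer_args` is a hypothetical helper, and the CLI may read indexes beyond 16 that this traceback does not show.

```python
def check_infer_args(command_line, highest_index=16):
    """Count whitespace-separated arguments and compare with what is indexed.

    highest_index is the largest list index seen in the traceback (com[16]).
    """
    args = command_line.split()
    needed = highest_index + 1
    return len(args), len(args) >= needed


# A deliberately short, made-up argument line.
count, ok = check_infer_args("model.pth input.wav output.wav model.index 0 -2 harvest")
print(f"{count} arguments given, enough for com[16]: {ok}")
```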
there is no index file in the zip, just a .pth and config.json
I cannot figure out whether the roboticness is due to not enough training, or something in okada
I downloaded it from hugging face. Where do u get ur models?
or here
lemme change that since idk if that link is good or not lol
it didn't embed.
so i got mangio running on a mac and am trying to infer but it sounds cursed af. its getting the pitch right but something is scuffed af. does this sound like a familiar artefact of any bug or wrong settings?
is it free
trigger warning 💀 squidward bussing a nut
GUYS
DOLLS
for many years it has
I can do it on phone?
Probably not. there aren't that many good ones
😭😭😭
weights.gg models are quite bad no?
weights come in 2 varieties, wysiwyg and diy
Elevenlabs @radiant flare
I can do it on phone
But i need for anime voices
In french
did the a.... show up for me for everyone. or does it show my full thing to you
isnt it better just to train ur own model?
the models on weights.gg all sound so weird
I mean. this is a model i found on weights and sounds pretty dang good.
mind you i am a male
You can hear the roboticness yes. and it would be best to do it yourself
my only problem. i don't know how lol
check dis guide https://www.youtube.com/watch?v=Y8IxVVQBEpc&t=1s
its simple
do you have a sample of the one you tested? i just want to hear how different it sounds.
I just started out but yes sure. The quality depends on the quality of the input files as well as the settings you use
I see
so
This guide is just getting out clear audio from samples...
here is the result
nvm im dumb
it's pretty good tbh
this is the original vladimir voice I used to train the model: https://www.youtube.com/watch?v=sMJjT2Teeyo&t=1s
I dont know why the first thing that came to mind from there was "DECEPTICON"
what do u think of the quality / resemblance?
But now i hear the source it sounds good. it sounds a little higher pitch if that makes any sense
following the guide. I dont have just plain "powershell" only windows powershell
Yeah. Following this guide. I do not have plain powershell, only windows powershell: https://www.youtube.com/watch?v=Y8IxVVQBEpc&t=1s
yk how to add normal powershell?
windows powershell is fine too
crap.
i dont know if installing this under my Administrator user profile while I use a default user profile will cause problems
attempting / fighting with applio cli
go infer
logs/weights/TomWaitsRV.pth assets/audios/vocals.wav wav logs/added_IVF558_Flat_nprobe_1.index 0 -2 harvest 160 3 0 1 0.95 0.33 True 8.0 1.2 1 0 50 1000
error of the moment
FileNotFoundError: [Errno 2] No such file or directory: 'assets\\rmvpe/rmvpe.pt'
Hey i was wondering if someone could help me because i found this tutorial on youtube for making models in google colab but whenever it makes a model it just has such a bad quality and i dont know if i just have bad audio samples or is this google colab just bad. This is the Tutorial i was following: https://www.youtube.com/watch?v=79K9k8OSpIA&list=LL&index=8&t=1273s&ab_channel=LearnwithDev
What epoch count are you using?
well i first tried 250, then another one at 300 and then another one at 500, and they all sounded very bad
Which colab
Now i have problems
Trying to just test out the removal of vocals
Keep getting an output error
Is your dataset clear
what do i need to download on uvr5 to separate harmonies and background singers in vocals
this is a 38 min hq dataset, batch size 4, trained up to around 1250 epochs so far. If anyone can recommend me some points on where to test the model, I'd really appreciate it. Pretty sure it's in overtrained territory right now
I think it's getting close to overtraining now.
Mel and KL haven't gotten down in a while
200-220k steps is good I'd say
Here is the output error. Any ideas?
rap god eminem clip for testkdqgole6.aac.reformatted.wav->Traceback (most recent call last):
File "C:\Python310\lib\site-packages\librosa\core\audio.py", line 155, in load
context = sf.SoundFile(path)
File "C:\Python310\lib\site-packages\soundfile.py", line 658, in __init__
self._file = self._open(file, mode_int, closefd)
File "C:\Python310\lib\site-packages\soundfile.py", line 1216, in _open
raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening 'D:\\AI\\RVC\\Retrieval-based-Voice-Conversion-WebUI\\TEMP/rap god eminem clip for testkdqgole6.aac.reformatted.wav': System error.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\AI\RVC\Retrieval-based-Voice-Conversion-WebUI\infer-web.py", line 374, in uvr
pre_fun._path_audio_(
File "D:\AI\RVC\Retrieval-based-Voice-Conversion-WebUI\infer_uvr5.py", line 64, in _path_audio_
) = librosa.core.load( # in theory librosa may hit bugs reading some audio and ffmpeg should be used instead, but that was too much trouble so it was dropped
File "C:\Python310\lib\site-packages\librosa\util\decorators.py", line 104, in inner_f
return f(**kwargs)
File "C:\Python310\lib\site-packages\librosa\core\audio.py", line 174, in load
y, sr_native = __audioread_load(path, offset, duration, dtype)
File "C:\Python310\lib\site-packages\librosa\core\audio.py", line 198, in __audioread_load
with audioread.audio_open(path) as input_file:
File "C:\Python310\lib\site-packages\audioread\__init__.py", line 127, in audio_open
return BackendClass(path)
File "C:\Python310\lib\site-packages\audioread\rawread.py", line 59, in __init__
self._fh = open(filename, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: 'D:\\AI\\RVC\\Retrieval-based-Voice-Conversion-WebUI\\TEMP/rap god eminem clip for testkdqgole6.aac.reformatted.wav'
(dont mind the song name its what i had on hand)
is it because i have spaces in my filename?
what do i need to download on uvr5 to separate harmonies and background singers in vocals (if you can't get rid of them lmk as well)
thank you
try removing it
but i wouldn't say that would solve it
Here is what caused me to believe that
What is the most straightforward way to make these AI voice models into like a song for example?
well i think
Yep. that was it
The dataset should have no background sounds, 2nd speakers, or music
oh yeah maybe
for singing stuff, check AICoverGen
in the pins of the channel, the one posted by @glad zealot
Alright thanks mate
yep it does not have any of those
this all seems so confusing. what do i do after i downloaded them all?
-colab Try RVC Disconnected
Suggestions for @honest veldt
- AICoverGen-NoWebUI, by Ardha Google Colab
- RVC Disconnected, Kit Lemonfoot Google Colab
- easyGUI, by rejects Google Colab
- RVC-HF, by r3gf Huggingface Spaces
- AICoverGen, by r3gf Huggingface Spaces
- RVC v2 Huggingface version, by Clebersla Huggingface Spaces
- Advanced RVC Inference, by neuclya Huggingface Spaces
You can find more info on the #1159513888199540817 channel. If you can't find your answer, feel free to ask for help in #✨│ai-help. Credits to Faze Masta and Antasma for compiling these links.
Read the guides in #1159513888199540817, and be specific about questions so I can try my best to help
alright I'll try it tomorrow
Applio cli / go infer :
FileNotFoundError: [Errno 2] No such file or directory: 'assets\rmvpe/rmvpe.pt'
rmvpe.pt file exists in assets/rmvpe/
no clue how to fix this
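One possible cause, not confirmed anywhere in this chat: the path in the error is relative (`assets\rmvpe/rmvpe.pt`), so it is resolved against whatever directory the CLI was started from, and the file can exist in the Applio folder while still not being found. A sketch that shows where such a path actually points; `explain_relative_path` is a hypothetical helper name.

```python
from pathlib import Path


def explain_relative_path(relative, cwd=None):
    """Show where a relative path points from a given working directory."""
    base = Path(cwd) if cwd is not None else Path.cwd()
    # Normalise the mixed separators seen in the error message.
    parts = relative.replace("\\", "/").split("/")
    resolved = base.joinpath(*parts)
    return {
        "working_dir": str(base),
        "looked_for": str(resolved),
        "exists": resolved.exists(),
    }


print(explain_relative_path("assets\\rmvpe/rmvpe.pt"))
```

If `working_dir` is not the Applio folder, starting the CLI from inside that folder is the first thing to try.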
anyone can help me? "contains nan"
how do i turn a huggingface zipfile into a link?
@daark how long is the sample?
something about updating a py and changing a bit of the python code?
¯_(ツ)_/¯
30 minutes
idk if that is long enough 😄 i just know the posted issue says 6 minutes is too short
time
no, I didn't change any code
i use python 3.9 from microsoft store
¯_(ツ)_/¯
anyone?
how do you mean link?
so you need to upload it
or serve it
the zip is a model?
to be uploaded to some collection?
you mean you want to upload your zip file or js tryna get the download link zipfile of a model?
hi there ❤️ I was curious if any of you have found a program that works on Mac. My PC is busted atm, so when it's fixed I'll just use RVC, but I'd like to practice with something in the meantime.
When I go to do my extractions, it ends up detecting empty audio, giving the error "contains nan". Can someone help me?
I tested it on Mangio and Applio, both are giving the same error
did you try the code change?
reinstalled mangio 2 times (mac) and all outputs still have this weird disgusting sound 💀 its like the pitch is translated well but everything else not. what could be the issue, this is weird af since it seems to be running well?
No, I don't do that
its what i would try
Ah
-rvc view the guide for making a model with RVC Disconnected
rvc on deez nuts
i need a copy of the old applio notebook before it got deleted
and I need it asap from anyone
Won’t work
Google will detect gradio and remove your runtime
why not im not informed and this ngrok shit is pissing me off
There's a working version of applio colab but idk when they are gonna release it
Just use the colabs without a ui
-colab
Or use hinas
Well yes but no, most of the time yes.. sometimes no
Colab is weird
im limiting my options by this but I dont like the non web uis
Wanna try the kaggle version?
Found my applio colab
im a simple man that does not like to relearn something when i have feasible and simple things to use, local stuff is fine but never gave good quality when I used it, kaggle is so bloody confusing, i only really need to do inference as opposed to making models cuz i really made what i need ngl
its just by preference
im a visual guy not words
do i need this ngrok shit
Not really confusing at all
Yes cause colab hates gradio sometimes
Just go to the ngrok website and register, on the top left there should be "your auth token", just press that and copy the token it gives, then put it in the part where it asks in colab
What GPUs are compatible? I have an MSI RTX 3050
aight
I was hoping i could have used my GPU because this thing is going to kill my CPU
and my AIO cant keep up
How'd you get it running?
It should be detecting it
Im separating vocals rn
Im not training yet
ok the vocals separated but it passed some crap to console
nvm i know why
pip install torch without telling it to fetch from the cuda releases ?
opt/instrument_My no such file or directory
but it worked. And the reason for that is i gave it a sample that had a filename of My Video.mp3
anyway. How can i get training to work if it refuses to take my GPU 0 as valid
For context I have a RTX 3050
what is the best python version for the applio or mangio?
anyone know why my volume is showing up as 0.0000? im using the correct input device.
3.9
Did you click start
yes
AMD or NVidia GPU?
Got a question for ya. What are supported GPUs
I thought that a 3050 which is still fairly recent would be supported
Yep
Should be
Not sure
Is there a way to fix it?
This is what console says right before launching the site
As soon as I go to train and select "0" as the GPU. It says unsupported
Is your RTX 3050 GPU 0?
Thanks for trying to help though. Maybe it's due to complications where I have RVC installed under a Data drive and not the C drive.
It's the only GPU I have. So I would assume so. That's is also what device manager and task manager tell me
Although process explorer doesn't recognize a GPU at all
Should work?!
I don’t know why it’s not working
Nope. It just says unsupported and launches in CPU mode. And the site says it is unsupported
Maybe I should just try installing it again?
You can try
There was something that I got recommended to download. cuDNN
I didn't since I'm not part of that program to be able to install it
Even though I'm part of Nvidia developers
Hi I need some help with Applio. I keep getting this error when trying to make a model
It like...shows that little "Error" oval in the window where the output should be.
From the looks of it. subprocess.py either is corrupted or doesn't exist.
Can I ask though. Do your sample voice files have spaces in them?
Yes, they both do.
I know a problem for me was that the spaces needed to be replaced with underscores.
Maybe try that and relaunch.
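A minimal sketch of the rename step, assuming the sample files sit together in one folder. The function name `replace_spaces` and the folder argument are made up for illustration, they are not part of RVC or Applio:

```python
from pathlib import Path


def replace_spaces(folder: str) -> list:
    """Rename every file in `folder` whose name contains spaces,
    swapping each space for an underscore. Returns the new names."""
    renamed = []
    # sorted() materialises the listing first, so renaming is safe mid-loop
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and " " in path.name:
            target = path.with_name(path.name.replace(" ", "_"))
            if target.exists():
                continue  # never overwrite an existing file
            path.rename(target)
            renamed.append(target.name)
    return renamed
```

Calling `replace_spaces("path/to/samples")` would turn a file like `My Video.mp3` into `My_Video.mp3` and leave everything else alone.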
kk we'll give it a shot
Okay I got rid of the spaces, but now the Train process is throwing a TypeError, and the output box on the site just says it finished training.
And I don't know where train.log is.
Is there a folder called training where you installed RVC?
I don't have applio if I remember
checking
The Applio-RVC-Fork folder that was installed contains a folder called "train", but not one called "training."
Check that folder
k
I don't think there is anything useful in here.
Unless there's something in one of these files that gives more information on what the error was about.
@boreal grotto hello?
My fault. I thought my message sent
I don't know much about Applio to be able to help more. It could be in the models folder or something. I would just use File explorer to search the entire Applio-RVC-Fork folder for a file called train.log
Hey guys. I need some help here. I want to get the model maker role so I can get access to submit my model on voice models. But something is wrong. When I put in my audio file in the demo part, It does this. It’s not working for some reason. Can I please have help?
Check weights folder maybe
Not possible
You don’t
@violet heron heres what im working with
ive uploaded like 5 models and they all sound like complete robots
That’s w-Okada
Change crepe to RMVPE
is this the worse program
Okada is used for real time conversion
RVC is for covers
Not really
ty
ty
Do your audio files contain a space?
The space character
in name?
File name. Yes
no ;/
Hey guys um
That's weird. I guess reboot the computer, rename the file and try again
it's 30 minutes
Is there a Google Colab link which I can use to make a model for RVC?
I would trim it in various files of 5 minutes each
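If the recording is a plain WAV, the 5-minute trimming can be sketched with Python's standard `wave` module. This assumes uncompressed PCM WAV input (an mp3 would need converting first), and `split_wav` is a hypothetical helper name, not something from RVC:

```python
import wave
from pathlib import Path


def split_wav(src: str, out_dir: str, chunk_seconds: int = 300) -> list:
    """Cut a PCM WAV file into consecutive chunks of `chunk_seconds`
    (300 s = 5 minutes). Returns the paths of the files written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = reader.getframerate() * chunk_seconds
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the recording
            target = out / f"{Path(src).stem}_{index:03d}.wav"
            with wave.open(str(target), "wb") as writer:
                # same channels / sample width / rate as the source;
                # the frame count in the header is corrected on close
                writer.setparams(params)
                writer.writeframes(frames)
            written.append(str(target))
            index += 1
    return written
```

A 30-minute file would come out as six chunks; the last chunk is simply shorter when the length is not an exact multiple.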
-colab
oh
Suggestions for @timber girder
- AICoverGen-NoWebUI, by Ardha Google Colab
- RVC Disconnected, Kit Lemonfoot Google Colab
- easyGUI, by rejects Google Colab
- RVC-HF, by r3gf Huggingface Spaces
- AICoverGen, by r3gf Huggingface Spaces
- RVC v2 Huggingface version, by Clebersla Huggingface Spaces
- Advanced RVC Inference, by neuclya Huggingface Spaces
You can find more info on the #1159513888199540817 channel. If you can't find your answer, feel free to ask for help in #✨│ai-help. Credits to Faze Masta and Antasma for compiling these links.
oh ok
I just remember the old one made both models and covers, before it was banned
Thanks
You are welcome
same with 5 minutes
Maybe is the model that is giving the errors
Try using another one
If not then reboot the computer
I'll go to bed now
I've already tried restarting many times
the worst I tried ;(
ok, thx
You don’t have to
All my datasets are just 1 big audio clip
I tried both ways
how can i make models with pitch guidance disabled?
Can anyone tell me about any RVC that is working free via colab today??
There should be a checkbox on top
portersona, you have a great name
Applio soon idk when they are gonna release the working version
i mean a colab or something to train a model, i tried rvc v2 disconnected but it still has pitch guidance
Applio I don't think it's in disconnected but also it just sounds weird most of the time without pitch guidance
Do you know anything that could be causing this?
I've been trying to fix it for 3 days
i want to make talking models
No idea
Yes you don't really need to disable it
I know, but it sound robotic
That's probably on your dataset
All my audios are talking and without noise, the only thing i need is disable pitch guidance but i cant find a method
You also removed the silence right?
Yes
How many epochs did you train it with and how long is the dataset
tried with 300, 500 and 1000, same result
5:38 min dataset
Weird no clue ngl
hmm maybe overtrain?
do you have any method to train without pitch guidance?
Wanna try with applio colab?
it works?
I tried 2 months ago but it wouldn't let me disable the pitch guidance
I made a working version
how?
what did u change in ur colab? 
stuff
can u give me link for ur colab?
thanks
rmvpe
any way to fix nan loss during training? ive had this randomly with certain datasets and it seems to cause the model's output audio to become horribly distorted
i assumed it might be due to the current version of mangio RVC fork being a bit buggy, bc it didnt happen when i still used the june 18 version
im using a gtx 1080
there's a way to disable "half precision" somewhere
Since gtx 1xxx cards have issues with it
not sure where though
yeah i already did that and it still shows nan loss occasionally
Honestly I think you are better off running on colab at this point
the performance loss from not being able to run fp16 on a card that's drastically slower than the tesla t4 is probably not worth bothering
like i changed all the fp16_run to false in all of the JSON files in the config folder, and it still does that
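For anyone doing the same edit by script instead of by hand, here is a rough sketch. It only assumes what was said above (JSON files in a config folder with an `fp16_run` key somewhere inside); the function names and the folder path are placeholders, and it searches nested objects because the exact layout of the files isn't shown here:

```python
import json
from pathlib import Path


def disable_fp16(node) -> int:
    """Recursively set every `fp16_run` key to False.
    Returns how many values were actually changed."""
    changed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "fp16_run" and value is not False:
                node[key] = False
                changed += 1
            else:
                changed += disable_fp16(value)
    elif isinstance(node, list):
        for item in node:
            changed += disable_fp16(item)
    return changed


def patch_configs(config_dir: str) -> dict:
    """Apply disable_fp16 to every *.json file in `config_dir`.
    Returns {file name: number of changed values}."""
    results = {}
    for path in sorted(Path(config_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        count = disable_fp16(data)
        if count:  # only rewrite files that needed a change
            path.write_text(json.dumps(data, indent=2), encoding="utf-8")
        results[path.name] = count
    return results
```

Running it a second time should report 0 changes everywhere, which is a quick way to confirm the edit stuck. As noted above, this alone did not stop the nan loss, so treat it as one step, not a fix.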
yeah ik the 1080 is kinda slow but i can live with that. it takes like 2-3 hours on average to train a model until the loss starts to crawl back up
planning to sell the card and get a 3060 12gb instead but that's still months away
google drive link doesnt work
needs to be a direct link to the download, like when you press it, it automatically downloads
uploaded to huggingface and resolved
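On Hugging Face, a file's web page uses `/blob/` in its URL while the direct download uses `/resolve/`, so a page link can be turned into a direct link with a small rewrite. A sketch, with `to_direct_link` being a made-up helper name:

```python
def to_direct_link(url: str) -> str:
    """Turn a Hugging Face file-page URL (/blob/) into a direct
    download URL (/resolve/). Other URLs are returned unchanged."""
    prefix = "https://huggingface.co/"
    if url.startswith(prefix) and "/blob/" in url:
        # only the first /blob/ is the route marker
        return url.replace("/blob/", "/resolve/", 1)
    return url
```

For example, a link ending in `/blob/main/model.zip` becomes one ending in `/resolve/main/model.zip`, which downloads as soon as it is opened.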
is it ok if i use audio sample with duration more than 3 hours? or just less but represent the 3 hours?
from what i've tested it seems like too long of audio length will simply give you diminishing returns. i've tried that with 18 hours of paimon's VO extracted from genshin's game files (basically longer than RyanSpeech dataset at 10 hrs but shorter than LJSpeech at 24 hrs) and it really didnt do much, and more extreme pitches are still inaccurate.
IMO 15-60 mins is the sweet spot. i hardly have issues with audio files with that length.
how much epoch was that?
i actually only tested it with 3 epochs bc the dataset was so long that it took so many steps just for a single epoch to finish
is it true too much epoch cant do out of the sample?
it already hit 1755 steps on the 3rd epoch
if you overtrain it (i.e. too many epochs) then the model simply doesnt get any better. usually i just stop as soon as the loss goes back up (use tensorboard to check)
is there any sweet spot for steps?
steps per epoch are determined by the audio duration. IIRC for 15 mins of audio you just have to do 200 epochs, 30 mins you can do 100, and so on
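The two numbers given (15 min → 200 epochs, 30 min → 100) fit an inverse relationship, roughly 3000 divided by the dataset length in minutes. A sketch of that arithmetic, where the constant is just extrapolated from those two remembered data points and is not an official RVC figure:

```python
def suggested_epochs(dataset_minutes: float) -> int:
    """Rule-of-thumb epoch count: epochs * minutes stays around
    15 * 200 = 3000, per the two examples quoted in chat."""
    if dataset_minutes <= 0:
        raise ValueError("dataset length must be positive")
    minute_epochs = 15 * 200  # assumption taken from the 15 min example
    return max(1, round(minute_epochs / dataset_minutes))
```

It is only a starting point; checking the loss in tensorboard is still what decides when to stop.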
no wonder its better when i tried 60 epochs with 1hour+ than 500 epoch with 10mins sample
there's a workaround you can do: in infer_web.py, find the code part with the kmeans thing like this and remove it, so the index file won't shrink (it will be 600 MB+ for a 1h+ dataset, as it's supposed to be)
i tend to just eyeball it really, i dont usually worry too much about overtraining bc if you save every like 10 epochs or so, and it starts to overtrain too much you can just fall back to the previous one before it overtrained and use that model
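The fallback idea can be written down as a tiny helper: given the loss noted at each saved epoch, return the last save before the loss went up. This is a simplified sketch with a hypothetical function name; real loss curves are noisy, so a single small uptick does not necessarily mean overtraining:

```python
def pick_checkpoint(losses: dict) -> int:
    """`losses` maps saved epoch -> loss at that save.
    Returns the last saved epoch before the loss first rises,
    or the final saved epoch if it never rises."""
    if not losses:
        raise ValueError("no checkpoints recorded")
    epochs = sorted(losses)
    for previous, current in zip(epochs, epochs[1:]):
        if losses[current] > losses[previous]:
            return previous  # loss went back up after this save
    return epochs[-1]
```

With saves every 10 epochs and losses of 30, 25, 22, 24, it would point at the epoch-30 save.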
ah, so thats why if i try to put audio files that are too long, the index file becomes very small (i did this with stanley parable narrator, and then hu tao VOs and they switched to this other feature index training method when it does that). thanks for the tip
does anyone know a TTS for local vrc?, I have this: Applio-RVC-Fork, but booting doesn't work, do you know another one? Thank you
really thank you very much, thank you for your availability
no, I'm just trying to have some voices to dub a film I made myself, Lego Movie style, but my voice isn't beautiful and doesn't fit any model I cloned, so I try to use tts, thanks
Thank you
sorry, those links don't work. i cant see the image
check pinned
how to use rvc nvidia?
elaborate
how do I use the unpacked dataset zip?
Any idea why I get this?
google drive link doesnt work
Yeah but why this can be?
imma try adding it soon ig
cause google drive link isnt a zip file
doesnt automatically download as a zip file
Okay, so what should I copy paste in this?
If not a google drive link?
huggingface model
Thank you
Can I help you?
anyone can help?
nah its good it shouldnt take 30 mins

@meager terrace can you send the google drive link? need to test
aight works now, btw i dont think that model will work
why that ? 😮
whats the best model for removing background vocals in uvr?
vocal FT i think
But thats strange cause I received only D & G, nothing else
its in the weights folder
look for a weights folder
Yeah didn't find it, will try once more
thankss
Why is it so glitchy
w-okada?
What's that
the voice changer
This thing
yeah i think that's w-okada
No but when I say something it doesn't come out as anything
Me or you. What
me
How the absolute hell
me
I HAVE NO IDEA
either i like misclicked and steam opened
but like
install nvidia cuda toolkit then try again
idk
cuda is installed though
Should i just start reinstalling with the script i did?
this is the only local one ik how to use and i'm too lazy to download other ones
ok then ¯_(ツ)_/¯
try it
kk
I get to this
so Cuda was reinstalled. But what the hell is cuDNN
I don't really know what that is, but I think you should
NVIDIA CUDA Deep Neural Network (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. It provides highly tuned implementations of routines arising frequently in DNN applications.
so its something from Nvidia for neural networks
Huh. access denied
nvm I just applied for DOCA and got it
But yeah. This honestly is a very straightforward process going by the video I was recommended
yo @proper shale how much do you reckon batch size affects quality? For example like batch size 4 vs 20?
im currently doing a 15min hq dataset with 20 batch size and getting like 9 seconds per epoch, so it is definitely a massive upgrade on speed, but not sure if this'll make the overall model turn out like shit
mind sending the script ?
might be borked
Reinstalled it. Still no GPU detected and cuda AND cuDNN are installed
any more ideas?
I'm guessing cuda version mismatch ?
script downloads 11.7
You have 11.8
But damn that script looks good
Except for the run as admin part
The script installs 11.8...
.
Yeah but it installs AI stuff from https://download.pytorch.org/whl/cu117
Which expects 11.7
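To make the comparison concrete, the CUDA version a PyTorch wheel index targets can be read off its URL tag (`cu117` → 11.7). A small sketch; the helper names are made up, and whether this mismatch is really what breaks GPU detection here is not confirmed:

```python
import re
from typing import Optional


def cuda_version_from_index_url(url: str) -> Optional[str]:
    """Extract the CUDA version from a PyTorch wheel index URL,
    e.g. .../whl/cu117 -> '11.7'. Returns None if there is no tag."""
    match = re.search(r"/whl/cu(\d+)", url)
    if match is None:
        return None
    digits = match.group(1)
    # last digit is the minor version, the rest is the major version
    return f"{digits[:-1]}.{digits[-1]}"


def index_matches_toolkit(url: str, installed: str) -> bool:
    """True when the wheel index targets the installed CUDA version."""
    return cuda_version_from_index_url(url) == installed
```

With the URL above and an installed toolkit of 11.8, `index_matches_toolkit` returns False, which is the mismatch being described.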
I know how you could maybe fix it, requires modifying a few lines by hand
Nah it is not
Im still a bit new to python but if I am reading that properly. That is searching for allocated GPUs with the numbers of 16?
or am i dumb
It fails at the start which just attempts to see if there are any nvidia cards that it can support
It does that by first checking if CUDA is operational and then if there are any devices
If it fails either checks it'll print out "No supported nvidia gpu found"
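The same two checks can be run by hand from the Python environment RVC uses. This is a sketch of the logic as described, not the script's actual code: `torch.cuda.is_available()` and `torch.cuda.device_count()` are real PyTorch calls, while the function names and message wording are illustrative:

```python
def gpu_status(cuda_available: bool, device_count: int) -> str:
    """Mirror the two-step check: CUDA operational, then any devices."""
    if not cuda_available:
        return "No supported nvidia gpu found (CUDA not operational)"
    if device_count < 1:
        return "No supported nvidia gpu found (no devices)"
    return f"{device_count} CUDA device(s) visible"


def probe() -> str:
    """Run the check against the locally installed PyTorch, if any."""
    try:
        import torch  # imported lazily so the helper works without it
    except ImportError:
        return "PyTorch is not installed"
    return gpu_status(torch.cuda.is_available(), torch.cuda.device_count())
```

Running `print(probe())` in that environment shows which of the two checks fails; if CUDA is reported as not operational, a CPU-only torch build is one possible cause.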
the "16" check is for some other stuff. GTX 1XXX cards have issues if something isn't changed
Ah that makes sense. Since in the video Im watching the person is using a 30 series card as well (RTX 3080 ti to be specific)
so should i just force an install for cuda 11.7 with npm?
Just got the readme.md file.
want me to send it to you? it is partly in Chinese
Sure
Oh yeah
Alternatively, you can just get the releases from the project directly
without any third party scripts
Link is found under the "spaces" icon on the github repo, it has some of the files and also releases for people to download and run
So that would be what I need?
and if I were to install stuff there. would it cause problems?
sorry. But why the hell is this file Larger than RVC itself
per epoch, 20 is worse than 4
How big is RVC itself for you ?
4.8gb
but if g/total has a big drop its okay ig
The zip contains the code, as well as all the python libraries, and AI model files
I'd recommend using lower though
Back to this. I did find something in the ps1 file thats called with run-rvc.bat
Im assuming this is where you saw it and what you were talking about?
That's something also different
That's the code related to the displayed GPU list in the UI
It does some weird stuff to see if it thinks a GPU is supported, and if it decides it isn't, it won't display it. But the GPU can still be used, it's just that it refuses to display it in the UI
Ah. but i believe my issue is the problem where it refuses to even launch and run with a gpu
Yeah
upon startup it says that no GPU is available
Same script. I havent tried the other one yet as it is still downloading
Can anyone send me a complete step-by-step tutorial?
I hate my "1gbps" cable speeds when i only get 10mbps from it
for what?
fr bro
local rvc
check here #1159513888199540817
my router gets 400 mbps. and my modem returns 1200 mbps. and i get 10mbps over supposed "wifi 6" wireless
https://www.youtube.com/watch?v=Y8IxVVQBEpc&t=1s This is what I have been following.
as it was recommended to me
I just opened the requirements text file
joblib>=1.1.0
numba==0.56.4
numpy==1.23.5
scipy==1.9.3
librosa==0.9.1
llvmlite==0.39.0
fairseq==0.12.2
faiss-cpu==1.7.3
gradio==3.14.0
Cython
pydub>=0.25.1
soundfile>=0.12.1
ffmpeg-python>=0.2.0
tensorboardX
Jinja2>=3.1.2
json5
Markdown
matplotlib>=3.7.0
matplotlib-inline>=0.1.3
praat-parselmouth>=0.4.2
Pillow>=9.1.1
resampy>=0.4.2
scikit-learn
starlette>=0.25.0
tensorboard
tqdm>=4.63.1
tornado>=6.1
Werkzeug>=2.2.3
uc-micro-py>=1.0.1
sympy>=1.11.1
tabulate>=0.8.10
PyYAML>=6.0
pyasn1>=0.4.8
pyasn1-modules>=0.2.8
fsspec>=2022.11.0
absl-py>=1.2.0
audioread
uvicorn>=0.21.1
colorama>=0.4.5
pyworld>=0.3.2
httpx==0.23.0
#onnxruntime-gpu
torchcrepe==0.0.20
huh
where do you see the fusion model in the applio UI? or any colab link? i want to try it out 😄
SO should i Delete my RVC download and use this other archive instead?
You need mainline RVC for that
Yeah
Unless you really want to fix the script
i see, is it in colab?
I dont really feel like it. My only thing. does the other archive have a web ui
alr. time to uninstall and reinstall lol
You can find the download link here: #1163883017446629508
It's just that you are using the developers' own release rather than a script someone made
Yeah. so bugs are expected if it is actively being committed to, right
Bruh. im going from my C drive to my Data drive. an M.2 to a SSD
rmvpe
without GPU?
Yeah. I dont know what the rmvpe_gpu is but what I assume it is, is that it will run across multiple gpus instead of one
ok lets try
My god
i just got like 40 of those messages for all different files
mainly .pyd files
Could it be that the windows archiver for 7z is broken ?
I dont even know when I installed 7zip
Yeah that's the default windows support for it then
It's fairly new, it might be broken
Oh wait. I think im unzipping through windows and not 7zip
yeah I wasn't I was using default windows file explorer
Abort and retry with 7zip?
Yeah
Is this 7zip or still windows explorer ?
extract files
i got to make it work in applio! but the resulting model is raspy and breathless XD
hey i have a rvc voice model, how can i make my voice recording sound like it??
which pitch algorithm sounds the best?
rmvpe
@molten pecan
What is this that i got when I installed https://huggingface.co/lj1995/VoiceConversionWebUI/blob/main/RVC1006Nvidia.7z
The YouTube tutorial teaches you how to fix that issue too
-egirl
And also how do I run the above link i sent for that distro
