#✨│ai-help
U prolly was using mangio fork
How long is ur dataset
8 would work fine
kk im gonna try it again now, Im pretty sure it was the audio sample rate or something
30 min
when you download audio from 15 year old answering machine with shitty dac
Hello guys, Im new here and Im just wondering about the best app or open source AI that I can use to clone some voices
more like 20-year old
i remember I had one and could not even understand the messages people were leaving on it
Hey guys i need help ... what is this effect called ??? Where can i generate similar kind of effect for my images ... Pls help TT
https://youtu.be/o1OsDWT_DUc?t=29
Hina Mod doesn't seem to be working, it's not giving me the gradio link
Like it's not working at all
idk what this means
nvm
@potent badger @odd shale Can someone help me with this?
it means the requirements have not been installed
How do I fix it?
i imagine this would happen if you just ran a clone of the app and skipped the step that installs the required libraries
but since this colab is not in the list of working colabs, I assume it is dead
I tried hitting Run all and I even tried them in order.
There aren't other cloud-based solutions that have mangio-crepe. Only plain crepe.
mangio crepe is crepe with flexible hop length
RVC erroring in inference.
“AttributeError: 'NoneType' object has no attribute 'tobytes'”
Any ideas why this is happening?
which does not fking work anyway
How so?
full error dump pls
What does that include?
the whole stack of errors so we can see what started it
I can only guess you're trying to infer a file that ends up as None object that gets sent to the pipeline and that is not good
pitch, pitchf = self.get_f0(
File "/Users/me/pinokio/api/rvc.pinokio.git/app/infer/modules/vc/pipeline.py", line 149, in get_f0
self.model_rmvpe = RMVPE(
File "/Users/me/pinokio/api/rvc.pinokio.git/app/infer/lib/rmvpe.py", line 563, in __init__
self.model = get_default_model()
File "/Users/me/pinokio/api/rvc.pinokio.git/app/infer/lib/rmvpe.py", line 544, in get_default_model
ckpt = torch.load(model_path, map_location="cpu")
File "/Users/me/pinokio/api/rvc.pinokio.git/app/env/lib/python3.10/site-packages/torch/serialization.py", line 986, in load
with _open_file_like(f, 'rb') as opened_file:
File "/Users/me/pinokio/api/rvc.pinokio.git/app/env/lib/python3.10/site-packages/torch/serialization.py", line 435, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/Users/me/pinokio/api/rvc.pinokio.git/app/env/lib/python3.10/site-packages/torch/serialization.py", line 416, in __init__
super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'assets/rmvpe/rmvpe.pt'
Traceback (most recent call last):
File "/Users/me/pinokio/api/rvc.pinokio.git/app/env/lib/python3.10/site-packages/gradio/routes.py", line 437, in run_predict
output = await app.get_blocks().process_api(
File "/Users/me/pinokio/api/rvc.pinokio.git/app/env/lib/python3.10/site-packages/gradio/blocks.py", line 1349, in process_api
data = self.postprocess_data(fn_index, result["prediction"], state)
File "/Users/me/pinokio/api/rvc.pinokio.git/app/env/lib/python3.10/site-packages/gradio/blocks.py", line 1283, in postprocess_data
prediction_value = block.postprocess(prediction_value)
File "/Users/me/pinokio/api/rvc.pinokio.git/app/env/lib/python3.10/site-packages/gradio/components.py", line 2586, in postprocess
file_path = self.audio_to_temp_file(
File "/Users/me/pinokio/api/rvc.pinokio.git/app/env/lib/python3.10/site-packages/gradio/components.py", line 360, in audio_to_temp_file
temp_dir = Path(dir) / self.hash_bytes(data.tobytes())
AttributeError: 'NoneType' object has no attribute 'tobytes'
FileNotFoundError: [Errno 2] No such file or directory: 'assets/rmvpe/rmvpe.pt'
you dont have the rmvpe model for some reason
using mangio(?) instead of applio on mac?
and pinokio of all things?
yeah, good luck with that
Ha! Well. I’m trying to run mainline RVC on Mac.
okay, so whatever reason it failed to download the f0 extractor model
As I’m having issues with Applio sound quality on Mac.
Only file in the rmvpe folder is rvmpe_inputs.pth
you can grab it here and put where it needs to be https://huggingface.co/IAHispano/Applio/tree/main/Resources
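A quick way to confirm the file actually landed where the traceback expects it is a small check like the one below. This is only a sketch: the relative path `assets/rmvpe/rmvpe.pt` comes from the error above, while the function name `check_asset` and the "empty file" test are assumptions added for illustration.

```python
from pathlib import Path

# Relative path copied from the FileNotFoundError in the traceback.
RMVPE_PATH = Path("assets/rmvpe/rmvpe.pt")


def check_asset(app_root, relative_path=RMVPE_PATH):
    """Return (ok, message) describing whether the asset file looks usable.

    check_asset is a hypothetical helper, not part of RVC.
    """
    target = Path(app_root) / relative_path
    if not target.is_file():
        return False, f"missing: {target}"
    size = target.stat().st_size
    if size == 0:
        # A zero-byte file usually means an interrupted download.
        return False, f"empty file: {target}"
    return True, f"found: {target} ({size} bytes)"
```

Calling `check_asset(".")` from the RVC app folder should report whether the model is missing, empty, or present.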
Okay, I have done that. But still giving me an error message.
Send the error message
I don’t get an error message. It just stops at “2025-03-31 22:36:58 | INFO | infer.modules.vc.pipeline | Loading rmvpe model,assets/rmvpe/rmvpe.pt”
Did u install all the dependencies
Not sure.
Recheck
And if they’re all installed it might be a vram issue
How do I verify they all installed correctly?
as long as it does not throw 'module not found' it is fine
How so
The file it produces is crackly and not of as good quality as running the same inference model in RVC Mangio on Windows.

[W socket.cpp:697] [c10d] The client socket has failed to connect to [localhost]:28625 (errno: 99 - Cannot assign requested address).
i recommend using one big file as a source and let Applio cut it
because whatever you're doing is not working
show me the contents of dataset folder, there must be something wrong
screenshot the folder contents in imjoy file manager pls
there should be like /program_ml/assets/datasets/YourModel that should contain wav file(s)
ok I died
let me do everything again
as said above, a single wav file is recommended for the applio slicing method
yeah, I just simple slicing
using one big file
I think its a problem with kaggle or idk, because I do the same on local and works fine
I just notice that in kaggle doesnt create .spec.pt files
I think that its the problem
spec pt are created on the fly as the files get loaded into the loader
in your case nothing got loaded
oh
no that is in the processed model folder, I meant your dataset folder
show the content of filelist.txt
ok wait me a sec
but if like so, it should be fine
I just put this on dataset creator
can you show kaggle ui for preprocess and extract features?
and log for extract features cell
seems like you forgot to do feature extraction
or see the log if you have done it
Im using crepe, but with rmvpe gives me the same error
when training
then check if the index file has already been created
I can create index
not "can", but whether the index file has already been created
no, in the model folder in file manager
there it is
okay, so extract features runs, but the filelist is still empty?
well
then try start training
Yeah still empty
is it in index training process or not?
no
aaah sorry i dont understand
when I press start training gives me this
are u using applio main branch? (without -3.2.8)
yeah
eh
wait me a sec
I removed it
but I think I removed something else with it
--branch 3.2.8-bugfix
y removed this
It is ok?
it is the release branch, not the main one which is more recent
repeat after me - you can not train with an empty filelist.txt
but it should not be empty
if the extracted file is full of files, it means the process ran
just did not save the list for some reason
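To check the filelist without opening it by hand, something like the following could be run before starting training. It is a minimal sketch: `count_filelist_entries` is a made-up helper name, and the location of `filelist.txt` inside your model's folder is an assumption you would need to confirm in the file manager.

```python
from pathlib import Path


def count_filelist_entries(filelist_path):
    """Count non-blank lines in a training filelist.

    Returns 0 when the file is missing or empty, which is the case
    where training cannot start.
    """
    path = Path(filelist_path)
    if not path.is_file():
        return 0
    with path.open(encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())
```

If this returns 0 while the extracted folder is full of files, the list was not written and preprocessing or feature extraction needs to be rerun.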
Can someone help whenever i open go-web.bat it says "D:\RVC 2025>runtime\python.exe infer-web.py --pycmd runtime\python.exe --port 7897
'runtime\python.exe' is not recognized as an internal or external command,
operable program or batch file.
D:\RVC 2025>pause
Press any key to continue . . .'
there is a space, go rename the folder to RVC2025
Now what?
there shouldn't be empty filelist.txt in the main branch
Yeah, but I dont know why applio its keeping filelist empty
still saying the same thing though

try doing manually
How? Im new to this
please show the error message again
after you rename the folder to D:\RVC2025
D:\RVC2025>runtime\python.exe infer-web.py --pycmd runtime\python.exe --port 7897
'runtime\python.exe' is not recognized as an internal or external command,
operable program or batch file.
D:\RVC2025>pause
Press any key to continue . . .
if the python.exe doesn't exist, redownload the whole installation
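Both suggestions in this exchange (the space in the folder name and the missing `python.exe`) can be checked in one go. The sketch below assumes the folder layout implied by the command in the error, `runtime\python.exe` and `infer-web.py` next to the batch file; `diagnose_install` is a hypothetical helper, not something shipped with RVC.

```python
from pathlib import Path


def diagnose_install(install_dir):
    """Return a list of human-readable problems found in an RVC install folder."""
    root = Path(install_dir)
    problems = []
    if " " in str(root):
        problems.append("path contains a space; try renaming the folder")
    if not (root / "runtime" / "python.exe").is_file():
        problems.append("runtime/python.exe is missing; redownload the installation")
    if not (root / "infer-web.py").is_file():
        problems.append("infer-web.py is missing; the archive may be incomplete")
    return problems
```

An empty list means none of these three issues apply and the cause lies elsewhere.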
k
how should I generate the index from colab correctly?
I don't get what you mean, please show the screenshot with error message
and make sure you have done previous steps without issues
hello, I don't mean any error, my question is what is the correct way to get the index when you are training a model for the first time.
pd: i cant send images
please read the guide in this section https://docs.aihub.gg/rvc/local/applio/#3-train
Last update: Apr 01, 2024
thanks a lot!
whenever I create a model, and generate its respective index, I get an error when I use it in playground for example, in the applio web. Is it because I generated it after starting the training?
you didn't mean any error and then you mention the error? 
!give-media-perms 1h @atomic sparrow
no, the doubt before was precisely because, from previous times, the index had given me an error XD
this is the error, any message, only this
Generate index
It's download model
no, I am showing the section to add files, which is just below, the top is something else.
because you asked how to "generate index" I suppose you mean for training models
for existing models to add, it should have the pth file, and the index file within the zip
some older models might not have index file, and there's no way to generate it, except by the model maker with the model's preprocessed files
the inference can work without index file tho
the pth, I add it in the normal file and the index I have to put it only in a compressed zip?
ok I didn't know about this, thank very much
How long does it usually take for the colab to hold without finishing the execution with the training? or if there is any way to lengthen that time?
yo guys what voice changer for this is the best
depends on time per epoch and total epochs (the cell can be stopped during the training session whenever needed)
go to #🔍│help-w-okada then read the pinned guide there
no, it just needs to have the index file name in a specific format
MyModel.pth and MyModel.index should work, I think
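For anyone who does want the zip, a sketch of packaging the two files is below. It assumes the matching-name convention suggested above (`MyModel.pth` plus `MyModel.index`); refusing mismatched names is an extra safety check added here, not a documented Applio requirement, and `package_model` is a made-up name.

```python
import zipfile
from pathlib import Path


def package_model(pth_path, index_path, zip_path):
    """Write the .pth and .index files into one zip, at the archive root."""
    pth, index = Path(pth_path), Path(index_path)
    if pth.stem != index.stem:
        # Assumption: both files should share the same base name.
        raise ValueError(f"name mismatch: {pth.name} vs {index.name}")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.write(pth, arcname=pth.name)
        archive.write(index, arcname=index.name)
    return Path(zip_path)
```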
im gonna do this but i'm gonna feel insane
like i'm literally talking to myself
it's easier in vc cause i'm actually talking to someone 😭
yo i cant find the pth file using Applio, how do i generate it or where do i find them?
Can u elaborate
i have absolutely zero pth files on logs
Are u training a model or just to infer
training
should they be in the Applio backup folder
Ss the colab file explorer
If ur using google colab
They should be at program_ml/logs/modelName
In ur case it’s the nett1-PossoMigliorare-1
which one is the right one?
Use tensorboard to find out the best epoch
Lowest point on the g/total graph
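If the graph is hard to read, the same "lowest point" rule can be applied to exported numbers. The sketch below assumes you already have the g/total values as `(step, value)` pairs, for example copied out of TensorBoard; how you export them is up to you, and `best_step` is a hypothetical helper.

```python
def best_step(points):
    """Return the (step, value) pair with the lowest value."""
    if not points:
        raise ValueError("no scalar points to inspect")
    return min(points, key=lambda point: point[1])


# Made-up numbers purely for illustration.
example = [(100, 32.1), (200, 27.4), (300, 29.0)]
print(best_step(example))  # (200, 27.4)
```

The lowest value is only a starting point; listening to the checkpoints around that step is still the real test.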
because i got this error so i thought it would help me
what should i do?
I don’t see an error
Restart the training without the overtraining thingy
Click refresh
Yea
ok but the tensorboard doesnt show me anything even when i reload it
ayt bet
Are u sure
yeah im going to restart now and imma tell u if it works
Word
there was a bug that if you ran training with save every epoch = max epoch, the final model did not get saved
so basically don't use overtraining detector?
the overtraining detector is pretty much useless
but it was not the issue
the issue was that you did not save models every 10 or 20 epochs
anyway, I've fixed it in the latest code
so what should i do differently sorry bro but im a dumbass
and dont bother with overtraining detector, use the tensorboard
what if i max out to 1k the total epoch is it bad?
What's a recommended amount of training data for a voice model? 10 minutes?
depends in how expressive and diverse the dataset is
a 15 minute expressive dataset can beat a 30 minute monotone one
but also a 30 minute expressive dataset can beat a 15 minute expressive one
do i need to download this if im using the collab?
Already. I got like an hour total of training expressive good quality audio, I just thought it'd be too much lol
1 hour is already enough to train models, it is not recommended to use more than that 🦈 🤙
bro help me out too
U don’t need to download shi if u using colab
Everything is run in the cloud
ok its working now bro, Big thanks fr
y'all last thing is it possible to OT at epoch 38 with 20 mins of content?
guys i got this error again
idrk how to fix bc i just came accross this but try kaggle if its not working on colab
unlikely but depends on ur batch size too
I'm not sure but I only use kaggle one with working tensorboard
can someone teach me how to make an ai voice model?
start from this guide https://docs.aihub.gg/essentials/how-to-make-voice-models/
In the context of RVC, the dataset is an audio file containing the voice the model will replicate. It can be either speaking or singing.
what services do you guys use to implement the zip file
guys what iss the best custom pretrained
the default pretrain if not sure, or this https://discord.com/channels/1159260121998827560/1339155300720054316
😮
oh and
if i already split my dataset
into smaller files
how can i prevent applio do it again
does disabling cut_preprocess help
10 minutes is a recommended minimum
our docs have a good amount of useful info https://docs.aihub.gg/rvc/resources/dataset-isolation/
15 minutes of audio, where the person uses their full vocal range to get decent results
increase that to 30-40 minutes to get even more natural results
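To see where a dataset sits relative to these numbers, the total length of the WAV files can be summed with the standard library. This is a sketch under two assumptions: the dataset is a flat folder of `.wav` files, and they are plain PCM (Python's `wave` module rejects other encodings such as 32-bit float). `total_minutes` is a made-up helper name.

```python
import wave
from pathlib import Path


def total_minutes(dataset_dir):
    """Sum the duration of every .wav file in a folder, in minutes."""
    total_seconds = 0.0
    for wav_path in sorted(Path(dataset_dir).glob("*.wav")):
        with wave.open(str(wav_path), "rb") as wav_file:
            # Duration = number of frames / frames per second.
            total_seconds += wav_file.getnframes() / wav_file.getframerate()
    return total_seconds / 60
```

Length is only half of the picture; as noted earlier in the chat, an expressive 15 minutes can beat a monotone 30.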
better not do so, or join into a whole file then use the applio slicer
I made vosk stt api and don't know how to change voice like anime girl =))
I use hugface as api
thx guys
err,
errm, i forgot where can i find the output of the trains
so that i can download in case overtrained
i used rvc disconnected before, but this is my first time use applio
Applio/logs/yourmodel
and the tensorboard file (tfevents) is at the eval folder
wait till it saves some checkpoints
in Applio it should be in the same model logs folder
you didn't enable save_every_weights
you have to start over training (delete D_2333333.pth and G_2333333.pth files first)
and also eval folder
;-; damm i have already trained like 3 hours.. are there any chance that i can continue train from the current status, but with save every weight ?
switch to another account, then load the preprocessed model files
again, since you missed the model checkpoints you should start over training
its okay, i mean i just want to prevent OT in the future
it still ok for now
if now i press this button, can i export current model and continue training after that ?
the better way to tell if the model is overtraining or not is to test several checkpoints besides the tensorboard graph
😮
can i do this ._. ?
dammm i stopped and now need to train from 0 again
there's resume training cell
that loads files from back-up back into the training folder
so you can resume from the last saved weights
he didn't check save_every_weights and he missed saved model checkpoints in the last training session, so I'd recommend starting over
can always resume from d/g weights and do more epoch with the setting checked
or simply extract the model from g
-colab
Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
by IA Hispano
Google Colab
by Hina
Google Colab
by Eddy
Google Colab
by Eddy
Google Colab
by Deiteris & Hina
Google Colab
by Shiro & Eddy
Google Colab
by Nick088
Google Colab
by Nick088
Google Colab
by Jarredou & Makidanye
Google Colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
link rvc?
can rvc be used real time in voice chats like discord or gta?
nope, only prerecorded ones
but if you want that, go to #🔍│help-w-okada and read the pinned guide there
thanks
10x!!!! Worked perfectly!
well, inference works, training won't
hey I have two AI voice models (.pth and .index) that I trained separately, and I was wondering if there’s a way to merge them into a single model in Applio.
Is it possible to combine both .pth files and .index files to create a single voice?
?
pth is the model
index is the accent
you cant mix them
if you have two pth files then yes
i want to mix the 2 pth files
U can use the index of ur choice
Between the 2 pths
So, how can I merge two .pth files to create a single model? Can it be done within Applio?
the two models must have been trained on the same sample rate
the first model is the main model
the second model is the secondary model
blend ratio is how much data of the main model do you want to keep, for example a blend ratio of 0.5 will retain only 50% of the information of the main model, the other 50% will come from the secondary model
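The blend ratio described above can be pictured as a weighted average of matching parameters. The sketch below works on plain dictionaries of floats and assumes simple linear interpolation; it illustrates the idea only and is not Applio's actual blending code. `blend_weights` is a hypothetical name.

```python
def blend_weights(main, secondary, ratio):
    """Mix two parameter dicts: ratio of main, (1 - ratio) of secondary."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    if main.keys() != secondary.keys():
        raise ValueError("the two models do not have the same parameters")
    return {
        name: ratio * main[name] + (1.0 - ratio) * secondary[name]
        for name in main
    }


# With ratio 0.5 each value ends up halfway between the two models.
print(blend_weights({"w": 2.0}, {"w": 4.0}, 0.5))  # {'w': 3.0}
```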
How do I upload my rvc model here?
anyone have a tutorial on how to create an ai song with these models?
nvm i got it
Not available yet
Does anyone know what's the best ai cover maker that I can run locally?
update gradio to 5.23.1
what is the difference between crepe and crepe tiny im confused, i was told crepe is best for male voice but only crepe_tiny is avalible, how do i download crepe?
Who have link for Collab training?
First, what's your PC GPU?
Anyone else having trouble getting Ilaria to work?
Because every single time I use it with any model, I get an error instead.
I've uploaded my own voice models five times in a row now, testing them with different files to try and get it to work, yet nothing has changed.
Phone
you didnt mention the error message, so I'd have to speculate:
- "GPU quota exceeded", try using smaller audio file, or wait for another hour, login your account
- audio file not readable, try re-exporting using audacity or use another software with "include metadata" disabled
- invalid/corrupted model pth/index file?
- 500 internal server error or random error in some internal code, you're cooked at this moment
Looks to be a 500 or a random error, because I tested the model with Applio and it worked just fine.
tell:
- your pc gpu
- what u want to do
- what link are you using
oh damn
It's Ilaria, not local.
Train (make) RVC Models on cloud:
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
- Be sure to know about the tensorboard
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest way and for free, is using https://weights.com/ which ofc uses RVC
RVC Inference (use models) on pre-recorded audio on Cloud
You can use either:
- Weights.com: Easiest Possible Ever Automatic
- Ilaria RVC Zero: Fastest free on cloud
- Applio UI Colab: RVC Fork with some extra features like TTS
- RVC AI Cover Maker UI: Automatically Separates the Vocals and Instrumentals, converts the voice and mixes them back
@low shard Yo something wrong with Kaggle applio I'm not sure what
yes, and I asked for your pc gpu, and what link are you using
i asked your pc gpu so i could tell you if it’s good enough to run it
and i asked your link to check if u are using the updated one
I meant good enough to run it locally, since local doesn’t have cloud limits
-spaces
Huggingface Space by Eddy and Ilaria
Huggingface Space by thestingerx
Huggingface Space by r3gm
Huggingface Space by IA Hispano
HuggingFace Space by Nick088
Suggestions for @fathom reef
- UVR5 UI, by Eddy and Ilaria Huggingface Spaces
- Ilaria RVC Zero, by thestingerx Huggingface Spaces
- RVC⚡ZERO, by r3gm Huggingface Spaces
- Applio, by IA Hispano Huggingface Spaces
I'm not even using Colab.
I'm using the HuggingFace space, exactly as you mentioned.
When restarted
"If you are going to train models, upload your dataset to your Google Drive storage" what is a dataset
How I get it
Is the audio file?
you should understand the term from the guide
https://docs.aihub.gg/essentials/how-to-make-voice-models/ https://docs.aihub.gg/rvc/resources/training/
In the context of RVC, the dataset is an audio file containing the voice the model will replicate. It can be either speaking or singing.
Last update: Dec 24, 2024
Thx
@simple ore
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/gradio/queueing.py", line 625, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/dist-packages/gradio/route_utils.py", line 322, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 2137, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1663, in call_function
prediction = await anyio.to_thread.run_sync( # type: ignore
File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 2470, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 967, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 890, in wrapper
response = f(*args, **kwargs)
File "/content/program_ml/core.py", line 605, in run_model_blender_script
message, model_blended = model_blender(model_name, pth_path_1, pth_path_2, ratio)
ValueError: too many values to unpack (expected 2)
got this error when i tried to blend
I just tested it works
what’s your model download link and pc gpu? I’m asking your model download link so i can test it myself, and your gpu in case it’s good enough to use local tools instead of cloud which is limited
Sorry, that model is private.
there is another program running your ngrok token, maybe delete the notebook and retry all the guide
And for the last time, I'm not giving my PC GPU, since Ilaria runs on the cloud.
models have to be the same sr/type
they are?
I exactly did that 😭
there can be an issue if one model has sr 40000 and other older model has 40k
show the end of it
it looks like gradio/pydantic issue, a fresh clone should work
It has a .index and .pth right?
you sure there aren’t other useless files?
i can’t help much with little info and without testing but alright
Of course it does.
I'll show when I got home
I know what I'm doing, and it just started randomly working all of a sudden.
I don't know what happened, but that was weird.
I'm sorry if I'm being rude, I'm in a really bad mood rn.
I just asked so you could run it locally if it’s good enough, there’s local tools that run on it and don’t have limitations, unlike, for example, huggingface zerogpu used by ilaria rvc which limits how long your file input can be, and google colab that has max 4 hours daily of gpu that can be even less. There are limitations because those run on a remote good pc gpu, which are expensive
Saying the gpu name publicly isn't a unique identifier, there are thousands of PCs with a gpu like yours. I'm saying this in case you were worried about being tracked by your gpu name
Anyways, your choice
i hope I was clear enough, anyways I will respect your choice and understand
does it not upload at all, or just not inference?
Fine. I don't think I'll ever be able to run anything at all locally, considering my GPU is an RX 580.
it’s fine, everyone has a bad day, sorry if i seemed rude too
It's resolved now, but there was a random error when using inference.
it worked?
are these public models?
it could actually run it locally via applio on zluda iirc (which is a cuda emulator) on windows, except that it will be slower,
I asked you because we have seen users with high end rtx or 1k dollars gpu using google colab, because many think that they need 100k dollars pc to run it locally or think ai is only in sites
ohh, that’s great to hear
I hope you have a great day then, for any other issues be free to ask there
I hope you understand I was just trying to be helpful, and sorry for any inconvenience,
About that random error, it might be zerogpu sometimes having issues, it’s a service that ilaria pays for her huggingface space, which basically uses shared high end gpus for multiple huggingface spaces
tl;dr the zerogpu spaces are usually meant for "trying stuffs/demos". so as said above it's recommended to use applio zluda or run in cpu mode (slower but still feasible)
Ohh it’s prolly dat cause one is trained on applio and the other on mainline
There isn’t a way to make it work?
delete these two lines in rvc/train/process/model_blender.py
```
if ckpt1["sr"] != ckpt2["sr"]:
    return "The sample rates of the two models are not the same."
```
it may still fail with a very old model that has no settings saved
need to check the .pth values
somebody forgot to connect the notebook to the gpu
I’m getting a lot of static /metallic/rattling sounds on my model. Any ideas why?
Ss sounds in particular.
Apart from that, it sounds good.
robotic esses happens due to either overtraining or the dataset not having enough sibilants
I did 350 epochs. 15 to 18 mins of clean audio.
help its not working and i can upload a photo to show my problem
The system cannot find the path specified.
a bit like this @simple ore
ot?
click off the middle button, then fit the whole thing in the chart
how download RVC on MAC OS ?
RVC as in realtime voice changer or as in retrieval-based voice conversion?
suboptimal epoch, "overtraining", poor lossy source and/or muddy voice
Source is good and clear. I recorded it myself through a condenser mic.
As I said, I did 350 epochs with 15 to 18 mins of audio.
Too much epochs? Not enough audio clips?
try comparing it with lower epochs
what speed do yall think my 3060 ti can run on without having a monitor plugged in or anything
only the voice changer
Mate, this isn't the channel for W-Okada. For W-Okada, go to #🔍│help-w-okada. This #✨│ai-help is all about RVC programs and related tools.
How much lower?
im having trouble when dropping a index file in applio it just say error can someone please help me
not a good idea to ask anyone before trying to see
error what? pls show it
just says error
error what?
but it was just that model i think
screenshot pls?
!give-media-perms 1h @wind fulcrum
don't forget the console window
im really new to this
So there isn’t a recommended average of epochs to use? Because at the moment, I’m just guessing.
the tensorboard graph can be the guidance but something unpredictable could happen
So I have to run again, and then look at the tensorboard?
There's a lot more to it but basically yes
Best amount of audio to use?
Like The mod said, use tensorboard. You can ask some mods how to 'Find the best epoch / detect for overtraining'
Which you mainly do with tensorboard and by ear
This is the console window for Applio. It can be either CMD or Windows Terminal.
This is the actual Tensorboard. https://cdn.discordapp.com/attachments/1159290139609137264/1355196249476960519/image.png
how many do I need to do?
what should i do my voice just disappears and sounds strange?
Are you talking about realtime voice changing?
If so
Wrong channel
yes
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
and a voice delay of 5+ seconds
Then elaborate your request in #🔍│help-w-okada
Are there websites like Weights.gg or anything cloud based stuff that i can do for free since my GPU is a bit poopookaka
Where you can train a model using the cloud?
I remember there being a command but i forgot it xd
- Applio Notebook, by Vidal Kaggle
- Applio Notebook, by Shirou Kaggle
- Music Source Separation, by Shirou Kaggle
- UVR5 NO UI, by Eddy Kaggle
- Original W-Okada's Voice Changer, Kaggle
- Modified W-Okada's Voice Changer, Kaggle
- 🆕 UVR5 UI, by Eddy, ArisDev & Nick088 Kaggle
- 🆕 RVC AI Cover Maker UI, by Shirou & ArisDev Kaggle
- 📖 How to use RVC Mainline on Kaggle by Cauthess
Note: Kaggle limits GPU usage to 30 hours per week.
Kaggle is a Cloud (Remote Good PC) Service that offers 30 hours of GPU weekly, but needs a phone number verification
by Vidal Kaggle
by Hina & Deiteris
Kaggle
by Eddy, ArisDev & Nick088 Kaggle
by Eddy Kaggle
by Shirou & ArisDev Kaggle
by Shirou Kaggle
Hi, I have a generated audio file that is sometimes read fast and sometimes slow. Do you know if there's any tool, possibly AI-based, that can automatically recognize those parts and retime their speed? Perhaps by first establishing what a normal speed should sound like, and then making the fast and slow parts coherent with that standard
there's google colab, kaggle and lightning ai, those aren't as easy as weights.gg tho
all 3 got a free tier
btw weights has a free tier too
if you're looking for a site that gives you remote good pc 24/7 for free without any limitations, nope that doesn't exist, gpus are expensive
Hey! I tried blending two voice models in Applio, making sure both .pth files use the same sample rate. I set the first one as the main model and the second as the secondary
But I’m getting this error:
File "C:\aplio3\Applio-3.2.8-bugfix\core.py", line 605, in run_model_blender_script
message, model_blended = model_blender(model_name, pth_path_1, pth_path_2, ratio)
ValueError: too many values to unpack (expected 2)
the two models were trained in applio? or one was trained in applio and the other in mainline?
back then someone had the same problem due to one model being trained in applio and the other in mainline
I'm not sure actually. I trained my model in Applio, but the other one I downloaded from weights.gg, so I don’t know which training method was used for it
if the model was trained using weights trainer then it's using mainline
use extras/model info
on both files
and see what they show
oh yeah thats a good way to tell the trainer used for each model
mainline is RVC1006Nvidia.7z ?
yup
not what I asked
in applio there's a screen that you put the model path in and it shows the info
?
looks like the first one was trained using mainline
makes sense why u can't merge it with the applio one
32k vs 32000
So, do I need to retrain one of the voices with mainline? can I merge them using mainline ?
would be better to train both voices in applio because mainline hasn't gotten an update in almost 3 years now
but just in case, don't use ai outputs as dataset
get the actual real voice and train that instead
Ty, but I can’t retrain the voice since it’s an AI model specific from weights.gg 😦
What do you mean by that?
for example, training a dataset using audio made by AI will give mediocre results compared to a dataset with real human voice audio
she's the voice of the duckus model
train that instead of using the ai model's outputs
option 1) fix model blender to remove the sample rate check
option 2) fix the model and change the sr value in the model
you need to do the opposite with '32k' model
^ this also works btw
How can I do that
edit this file
it works, ty🫶
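For anyone hitting the same '32k' vs 32000 mismatch, option 2 (changing the sr value stored in the model) comes down to putting both values in the same form. The snippet below is only a sketch. It assumes each checkpoint is a plain dict with an `sr` entry, which is what the mismatch above suggests; `normalize_sr`, `same_sample_rate` and the two sample dicts are made up for illustration. Loading and saving the real .pth file is left out, and which form the blender expects is not stated in this thread.

```python
def normalize_sr(value):
    """Return the sample rate as an integer number of Hz.

    Accepts labels such as "32k" as well as plain numbers such as 32000.
    """
    if isinstance(value, str):
        text = value.strip().lower()
        if text.endswith("k"):
            return int(round(float(text[:-1]) * 1000))
        return int(text)
    return int(value)


def same_sample_rate(ckpt_a, ckpt_b):
    """Compare the (assumed) "sr" entries of two checkpoint dicts."""
    return normalize_sr(ckpt_a["sr"]) == normalize_sr(ckpt_b["sr"])


# Hypothetical stand-ins for the two loaded checkpoints.
model_a = {"sr": "32k"}
model_b = {"sr": 32000}

print(same_sample_rate(model_a, model_b))

# Rewrite one entry so both checkpoints store the value in the same form.
model_a["sr"] = normalize_sr(model_a["sr"])
```

Keep a copy of the original file before changing anything in it.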
Hi. I saw a guy on youtube that did some fun, and really accurate ai voice trolling in overwatch with overwatch characters voices. And he said that the tech used is RVC/so-vits-svc-fork. I thought i wanted to try to do something similar. How?
can someone give me the best examples of setups rvc. I want to see its full potential bc it seems to me there are way better voice swap AIs(not free) - even 5$ elevenlabs does a way better job and requires no setup
sovits is so old
Any tts recommendation?
English. Not really, I'm gonna use voice model if available
I just want to create a voice tag that uses my fav character voices
for Orpheus TTS, also needs LM Studio to run the model
nice expressive reads
less expressive, but very fast - kokoro tts
Thank you
what's this server about, guys selling voice ai models?
whats wrong with my rvc??
-gui
what do i use then bro
i found rvc to be better than weights for that specific model
guys, I'm using a GTX 1650 Super.. can you give me a link to the latest version for my GPU?
-rvc
- How to use RVC Mainline Colab by Cauthess
- AICoverGen Colab Guide by Eddy (Spanish Helper)
- Create a model with RVC disconnected (colab) by Angetyde
Anyone has tutorial for RTX 50 series (blackwell)? All of the RVC are not updated yet.
for mainline rvc runtime\python -m pip install torch==2.7.0.dev20250311 torchaudio==2.6.0.dev20250312 torchvision==0.22.0.dev20250312 --index-url https://download.pytorch.org/whl/nightly/cu128
but it wont work until you go around all modules and also fix torch.load calls and add 'weights_only=True' parameter
^@brittle wing
thanks
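Going around all modules to find the torch.load calls can be done with a small scanner instead of by hand. This is a sketch, not part of RVC: `find_torch_load_calls` is a made-up helper that uses Python's `ast` module to list every `torch.load(...)` call in a source string and whether it already passes `weights_only` explicitly. It does not decide which value a call needs. Since PyTorch 2.6 the default for `weights_only` is True, so calls without the argument behave differently than they did on older versions.

```python
import ast


def find_torch_load_calls(source):
    """Return (line_number, has_weights_only) for each torch.load(...) call."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if is_torch_load:
            has_kw = any(kw.arg == "weights_only" for kw in node.keywords)
            results.append((node.lineno, has_kw))
    return sorted(results)


# Made-up module text; the code is only parsed, never executed.
example = '''
import torch
cpt = torch.load("model.pth", map_location="cpu")
hubert = torch.load("hubert.pt", map_location="cpu", weights_only=True)
'''

for lineno, has_kw in find_torch_load_calls(example):
    print(lineno, "ok" if has_kw else "missing weights_only")
```

To check a real install, read each .py file and pass its text to the function. Calls made through an alias (for example `from torch import load`) are not caught by this simple check.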
what happened here?
try the applio one
alr
Try Applio. Not really suitable for voice model training with this GPU, but fine for doing AI cover. https://docs.applio.org/applio
Try Applio instead. 
I'm using Windows, not macOS
tell me sir, which voice changer is suitable for my GPU, a GTX 1650 Super?
What do you mean? Are you looking for W-Okada the realtime voice changer and not RVC program?
!howtoask
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
go to #🔍│help-w-okada and read the pinned guide there
Yes am looking for real time voice changer
#🔍│help-w-okada is for voice changer
For W-Okada, go to #🔍│help-w-okada . But if you're looking for RVC the regular voice changer, stay here.
Okay sir thank you
No, W-Okada and RVC (Applio for example) are two different programs.
So? w-okada for real time voice changer?

😭🙏🏻
Next time, better write more context on what you're looking for instead of "I have a good GPU, but where can I download this thing?".
instead i asked what voice changer is suitable for ("my gpu is gtx 1650 super")
Am a new ,so i dont know
Still.
still not reading the guide in #🔍│help-w-okada ?


which voice changer is good for real time changing with GPU 4070ti?
can u provide link download please thank u
please go to #🔍│help-w-okada and read the pinned guide there
can someone help me?
I want to back up the model and resume training, but when I start training it doesn't continue the model, it makes a new one... what should I do?
delete the existing model folder before loading the backup
unless u mean to save to a backup
okay
oh sorry, wrong translation. I mean I want to continue training the model. I already have the backup, but when training starts it doesn't load the backup and starts making a new model
you just need to do the "load backup", but skip the preprocessing & feature extraction
since those would create a new model folder
alr
i keep having this problem when opening the voice changer it keeps saying failed to load URL
again please go to #🔍│help-w-okada and read the pinned guide there
and you another one
#🔍│help-w-okada
mbmb i didnt see channel
if there are 3 inputs and I have 3 hidden layers, will one neuron, for instance, take all 3 inputs but increase the weights of 2 inputs and not the third, while the second neuron focuses on increasing the weights of the first and third inputs and reduces the weight of the second, and so on? Is this correct?
I checked and I have 11.26 minutes of good audio. Can I train with that? @knotty moth
hidden layers are layers between the input layer and the output layer
talking about a "perceptron" or a neuron in a neural network. If you have 3 inputs (x1, x2, x3), for example, one perceptron might focus on the first and third inputs (x1 and x3) and give them high weights (e.g., 0.9 and 0.8) while giving the second input (x2) a very small weight or zero (e.g., 0.1 or 0). Meanwhile, another perceptron might focus on the second and third inputs (x2 and x3), giving them high weights (e.g., 0.7 and 0.9) and reducing the weight of the first input (x1) to something close to zero? is that right?
weights are usually learnable, you only control which neurons on the input layer feed into the weighted sum of a neuron on the next layer
so what i said is right?
technically you can set the initial weights
so training an AI means changing the weights?
the training means the model adjusts the weights so you get the result you want from the input you provide
so what i said is right
?
not exactly
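The weights discussion above can be made concrete with a tiny fully connected layer. This is an illustrative sketch in plain Python, not code from any RVC tool. The weight values are the ones from the question, and `dense_layer` and `sgd_step` are made-up names. The point it shows: every neuron receives all three inputs, the weights decide how much each input counts, and training nudges those weights by itself rather than you assigning which neuron focuses on which input.

```python
def dense_layer(inputs, weights, biases):
    """Each neuron sees every input; its own weights decide how much each counts."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs))
        outputs.append(total + bias)
    return outputs


def sgd_step(neuron_weights, inputs, target, lr=0.01):
    """One gradient-descent step on squared error for a single linear neuron."""
    prediction = sum(w * x for w, x in zip(neuron_weights, inputs))
    error = prediction - target
    return [w - lr * 2 * error * x for w, x in zip(neuron_weights, inputs)]


x = [1.0, 2.0, 3.0]        # x1, x2, x3
weights = [
    [0.9, 0.1, 0.8],       # neuron 1: currently leans on x1 and x3
    [0.0, 0.7, 0.9],       # neuron 2: currently leans on x2 and x3
]
biases = [0.0, 0.0]

print(dense_layer(x, weights, biases))

# Training moves all three weights of neuron 1 toward whatever lowers the error.
updated = sgd_step(weights[0], x, target=1.0)
print(updated)
```

Which inputs a neuron ends up relying on is a result of training on the data, so the pattern in the question can happen but is not guaranteed.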
how do i make models
Any idea why Applio on Kaggle keeps erroring on me? @simple ore
Need to update gradio, I guess?
"An error occurred launching Gradio: When localhost is not accessible, a shareable link must be created. Please set share=True or check your proxy settings to allow access to localhost."
no idea
Not sure what I'm doing wrong. 🤷♂️
i dont understand
isnt there like a website to put my audio file and make it make the model
!rvc websites
#websites
bro please someone help, I want the link for training in colab
-collab
wtf
-applio
Not available yet
-kaggle
- Applio Notebook, by Vidal Kaggle
- Applio Notebook, by Shirou Kaggle
- Music Source Separation, by Shirou Kaggle
- UVR5 NO UI, by Eddy Kaggle
- Original W-Okada's Voice Changer, Kaggle
- Modified W-Okada's Voice Changer, Kaggle
- 🆕 UVR5 UI, by Eddy, ArisDev & Nick088 Kaggle
- 🆕 RVC AI Cover Maker UI, by Shirou & ArisDev Kaggle
- 📖 How to use RVC Mainline on Kaggle by Cauthess
Note: Kaggle limits GPU usage to 30 hours per week.
Kaggle is a Cloud (Remote Good PC) Service that offers 30 hours of GPU weekly, but needs a phone number verification
what?? u dont need to overreact
?
tnx tho
It's with 1 l
Also -colab exists
Have you tried checking if ur PC is good enough first btw
AI training never worked like that, this is open source ai, not ChatGPT
The only site that does this is https://weights.gg/ which uses RVC with a simpler version that automatically detects overtraining
Btw RVC can also be used locally on your PC GPU if it's good enough, weights.gg is a cloud remote good PC service
mb gng
hello do we have to cut the sample vocal to 59 sec for better results while converting (infer) the sample vocals to the model anymore?
the Ilaria RVC Mainline Google Colab is permanently broken (see the #📰│dev-updates message) and got replaced by the ZeroGPU Hugging Face space, Ilaria RVC Zero
what is it based on?
@simple ore can you look at my tensor graphs and tell what you think?
Hey 😊
I have a, probably really dumb, question.
I can hear myself talking when I have the program open, but it's not being "piped" to the actual output?
I assume I activated some kind of setting by accident, considering I haven't had this issue before.
Does anyone have a quick idea ? ):
Why does my Applio TTS result have breathing sounds? The source is TTS, so there's no breathing in it; it's not audio from a real person.
example?
I can't send a file in this channel, mind if i DM you to send the audio?.
Nevermind the advanced settings has filter radius to decrease respiration and it works just fine with other model.
breathing may be coming from the model, usually there should be no such thing with tts
Sparse connections wire a group of inputs to a specific neuron in the hidden layer, for example when you know the domain well. But if you don't know the domain and make the network fully connected, meaning every input connects to every neuron in the hidden layer, will the fully connected network then learn to behave like sparse connections? Can someone tell me if I'm right or not?
how do i change the batch someone send me a video please
!howtoask
Which RVC program/link are you using?
That's not RVC program. That's a link to an RVC voice model.
oh well it keeps making a weird noise well whatever ill just stop using it
I'm asking you which RVC link you're using. Applio or W-Okada for example.

Crazy. Instead of continuing to solve the problem, you're just giving up at this point. 
ye more likely the model itself, perhaps trained using the old mainline/mangio with improper silence truncating and lack of denoising/noise gate, so it could hallucinate silences with some random breaths and noises
you should denoise the input audio before inferring
what are you trying to do? anyway, a fully connected network can set some weights to 0 during training, thus acting like a sparse network
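The point about a fully connected network acting like a sparse one can be checked with a few lines. This is a sketch with made-up numbers: a neuron wired only to x1 and x3 gives the same output as a neuron wired to all three inputs whose weight for x2 is 0.

```python
import math


def weighted_sum(inputs, weights):
    """Output of one linear neuron without bias."""
    return sum(w * x for w, x in zip(weights, inputs))


x = [0.5, -1.0, 2.0]   # x1, x2, x3

# Sparse neuron: connected to x1 and x3 only.
sparse_out = weighted_sum([x[0], x[2]], [0.9, 0.8])

# Dense neuron: connected to everything, but the weight for x2 is 0.
dense_out = weighted_sum(x, [0.9, 0.0, 0.8])

print(math.isclose(sparse_out, dense_out))
```

Training is free to push a weight toward 0, but nothing forces it to, so a dense layer does not always end up sparse.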
I'm using Mainline on Kaggle. Is there a more up to date version I should be using?
Thanks. I'll try that again.
But last time I tried it, it kept giving me errors.
Where do I put the dataset to train with?
"TypeError: argument of type 'bool' is not iterable
An error occurred launching Gradio: When localhost is not accessible, a shareable link must be created. Please set share=True or check your proxy settings to allow access to localhost."
I keep getting this error with Applio on Kaggle.
mine is still rock solid, so try copying these codes to the Install and Start cells respectively (in the Start code, don't forget to set your ngrok token)
u can also try to delete the kaggle and redo the tutorial
all youtube tutorials are outdated, use ONLY written guides
what tutorial link did you use?
what's your pc gpu?
what do you want to do?
Thanks. That worked.
300 epochs about right for 12/13 mins of audio?
Also, is there a custom pretrained I should use, or will standard be okay?
unfortunately you can't rely much on experts for predicting the next stock prices
so you have to test the results and decide
Got it. Okay, I have started training and turned on overtraining preventer.
Bro gave up before I even ask him about his GPU.
@low shard are there anyone else having the similar issue on applio kaggle? if so, you could have my working script pinned here or in #📰│dev-updates
Just checking I understand this correctly -
"New best epoch 300 with smoothed loss_g 21.384 and loss_d 3.517
Training has been successfully completed with 300 epoch, 6900 steps and 33.725 loss gen.
Lowest generator loss: 20.398 at epoch 276, step 6325
Saved model '/kaggle/working/program_ml/logs/my-project40kapp/my-project40kapp_300e_6900s.pth' (epoch 300 and step 6900)
Saved model '/kaggle/working/program_ml/logs/my-project40kapp/my-project40kapp_300e_6900s_best_epoch.pth' (epoch 300 and step 6900)
my-project40kapp | epoch=300 | step=6900 | time=13:15:31 | training_speed=0:00:35 | lowest_value=20.398 (epoch 276 and step 6325) | Number of epochs remaining for overtraining: g/total: 50 d/total: 99 | smoothed_loss_gen=21.384 | smoothed_loss_disc=3.517"
Meaning 276 was the best epoch before overtraining and so I should use "my-project40kapp_300e_6900s_best_epoch.pth?"
I haven't seen anyone else having issues on applio kaggle
did you modify anything on the code?
it looked to me like the user just made a copy of an old applio kaggle notebook a while ago
it was since the first 3.2.8 release was out and I did some tweaks
why still using overtraining detector?
What do you mean?
I used the overtraining detector in order to avoid overtraining.
what?
@fading vault @unreal lichen do -colab to check a list of google colabs in #🤖│bots ,and you will have sapphire's message
also, reminder that google colab is a cloud service (remote good pc) for people with a bad pc, and you should prob check your pc gpu first to see if it's good enough to run RVC or Wokada
google colab is a cloud service, an entire platform where anyone can share notebooks that contain code to run ai programs
the site itself is up, but i don't know which google colab notebook you're talking about, you should send the link
there can be colabs that are old, some that don't work, and some that do
send the link here, also remember to check #📰│dev-updates
it does not work the way you expect it to work, it only checks that g total does not go up for x epochs
use tensorboard and test the actual model
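A check of the kind described above ("g total does not go up for x epochs") is usually a patience counter. The sketch below is a generic version of that idea, not Applio's actual code. `epoch_where_training_stops` and the loss values are invented for illustration.

```python
def epoch_where_training_stops(losses, patience):
    """Return the 1-based epoch at which a patience check would stop, or None.

    The counter resets whenever a new lowest loss is seen and stops training
    once `patience` epochs in a row fail to beat it.
    """
    best = float("inf")
    epochs_since_best = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            epochs_since_best = 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                return epoch
    return None


# Invented generator-loss values, one per epoch.
losses = [30.0, 25.0, 22.0, 23.0, 24.0, 22.5]
print(epoch_where_training_stops(losses, patience=3))

# A loss that keeps creeping down never triggers the check.
print(epoch_where_training_stops([30.0, 29.9, 29.8, 29.7], patience=3))
```

A check like this only looks at one number, so it cannot tell whether a checkpoint actually sounds good. That is why the advice is to look at the TensorBoard charts and listen to several checkpoints.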
@simple ore could u check if the kaggle applio works
Hasn’t applio selected the best epoch at the end?
i dont use kaggle
no
it selected "best" using the metric above, which does not work properly
Okay, so what do I check?
you check the tensorboard charts
For?
inspect the graphs and also test several checkpoints
so it looks like this
norm_g - stays below 1000, fm - flat or going down, mel - should always be going down, total g - converging at some value
if it looks like that, then you train until total g stabilizes
use charts from the scalar tab, avg 50 ones
since your fm is weird, you may want to play with smaller or larger batch size
or a different pretrain like klm 4.9 for hifigan
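The "avg 50" charts mentioned above are read here as a mean over the most recent 50 logged values. That reading is an assumption, and the exact smoothing Applio logs may differ. Under that assumption, the smoothing can be sketched like this, with `moving_average` as a made-up helper:

```python
def moving_average(values, window=50):
    """Trailing mean over the last `window` points (fewer at the start)."""
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the point that left the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


# Small window so the effect is easy to follow by hand.
print(moving_average([1, 2, 3, 4], window=2))  # [1.0, 1.5, 2.5, 3.5]
```

Smoothing removes step-to-step noise, which makes it easier to judge whether a curve such as mel is still going down or has flattened out.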
I did 4 batch size.
where do I get this?
Thank you.
So I should try again with KLM pretrain. And different batch size, higher or lower?
how big is your dataset?
could you update it with @knotty moth 's tweaks? i just checked myself and the kaggle notebook is acting weirdly
13 minutes.
not only that but also how diverse the dataset is (rather monotone or more expressive)
expressive as the person using their whole vocal range
It’s of me singing. It covers all my ranges , high and low. For example, me singing Iris by the goo goo dolls is one of the files.
Also, other songs with good ranges.
did you change learning rate or not?
I set batch size to 4.
I assume you didnt modify the other configurations
the only other setting I changed was to 300 epochs.
Which graphs are these?
does this look like a good dataset to start working on??
So I'm going to try batch size of two with 250 epochs.
very interesting
this is the raw!
does the raw also look chewed?
if yes then I got huge problem
you have noise at the bottom
yep under the 100 you mean right ?
but I don't understand how you guys can tell whether it's mp3 or compressed, because both were .WAV
need to use a basic highpass filter to cut the noise
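A basic high-pass filter of the kind suggested above can be sketched in plain Python. This is a first-order (RC-style) filter. The 100 Hz cutoff is taken from the "under the 100" remark, and `highpass` is a made-up helper. For real datasets an audio editor or a DSP library with a steeper filter is the more practical choice.

```python
import math


def highpass(samples, sample_rate, cutoff_hz=100.0):
    """First-order high-pass filter over a list of float samples."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # Passes fast changes between samples, lets slow drift decay.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


sample_rate = 16000
# One second of a constant offset (0 Hz) and one second of a 1 kHz tone.
offset = [0.5] * sample_rate
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(sample_rate)]

print(abs(highpass(offset, sample_rate)[-1]) < 1e-6)              # offset is removed
print(max(abs(v) for v in highpass(tone, sample_rate)[-1000:]) > 0.9)  # tone is kept
```

A first-order filter rolls off gently, so noise just below the cutoff is reduced rather than removed.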
the difference is mp3 eats low level values
I'll try to find examples for this
cus I dunno what other Applio staffer to ask 
you can see chunks eaten out of the sibilant
it is okay to use mp3 for inference, but for training it is not ideal
yeah , I'm into training
I think my raw data already compressed that's why
like here it has the same gaps you marked
just denoise it
@knotty moth @simple ore batch size 2/250 epochs/KLM pre-train
How is this?
anyway to make the local version of AICoverGen work on m1 mac?
batch size 2? do you lack VRAM, or are you aiming for realistic audio?
@serene horizon
bs 2 is worse
so I go higher?
minimum 4
so you're using klm 4.9?
Yes.
yeah, dont go less than 4 batch
not sure if sounds a bit muddy but not bad after all
idk how muddy sounds like so I can't really tell
😂
pls help voice cuts
Are you talking about realtime voice changer for calls/games?
If so, wrong channel
oh yea sorry
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
Use #🔍│help-w-okada and elaborate it here
It's fine dw brother
@knotty moth @simple ore This is with batch size 5, 250 epochs, and the custom KLM pretrain. Is this an improvement?
give it a test
with the 250 epoch file?
yeah
Okay.
I'm getting a lot of static before and after each line of singing, but the actual singing sounds clean.
is it okay if i add audio from youtube that has been upscaled using AudioSR to the dataset?
no
just use the original audio
so, is it fine if my dataset is a mix of lossless and lossy audio?
no idea, i never tried that
why was MPS for mac ever canned on applio? i downloaded the older version that still has mps support and set the threads to 6 and everything works flawlessly (even after the first inference)
im getting 30 second inferences on an m4 with 24gb of unified memory (usage peaked at about 22gb which might be a concern? started at 13gb)
i think i was chatting with you two about it last time @simple ore @low shard
sorry for the @, but in case you wanted to know for anyone else whos wondering how to run these in apple silicon or something
regarding overheating i cant say that happened to me either, the fans didnt kick up after 4 inferences in a row
mps torch blows
and whatever the math libraries there are hang up
there were multiple attempts to fix it, but the best we could do was just set omp_num_threads to 1 to prevent it from hanging up
you can manually re-enable mps
right okay i see whats going on:
Hanging after the first inference: Memory leak/lack of garbage collection. During an inference python reserves 7-13gb of ram which keeps getting bumped up by about 4-5gb per run, which would cause weak 8gb mac configurations to hang right after the first one. On my 24gb machine i get to 20 ish on the 4th inference
Overheating: People think they can do AI on a fanless laptop and then complain when it gets hot, works fine on a mac with a fan
theres no inference running rn and its staying at 20 with nothing going on after starting sub 10
hope this info helps anyone who comes across it 
I did add torch.cuda.empty_cache() after each inference, but it did not help
something else is leaking
manually unloading the voice does make it go down by a fair bit again
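For reference, a cleanup step after each inference could look like the sketch below. It is not taken from Applio. `release_cached_memory` is a made-up helper that runs Python's garbage collector and then clears whichever torch cache is available. On Apple Silicon there is likely no CUDA cache to clear, so the MPS call is included as well. As noted above, this kind of cleanup may not fix the growth if something else is holding the memory.

```python
import gc


def release_cached_memory():
    """Run garbage collection and clear torch's cached memory if torch is present.

    Returns the list of cleanup steps that were actually performed.
    """
    cleared = ["gc"]
    gc.collect()
    try:
        import torch
    except ImportError:
        return cleared  # torch is not installed, nothing more to clear

    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        cleared.append("cuda")

    mps_backend = getattr(torch.backends, "mps", None)
    mps_module = getattr(torch, "mps", None)
    if (
        mps_backend is not None
        and mps_backend.is_available()
        and hasattr(mps_module, "empty_cache")
    ):
        torch.mps.empty_cache()
        cleared.append("mps")
    return cleared


print(release_cached_memory())
```

Objects that are still referenced, such as a loaded voice model, are not freed by this, which matches the observation that unloading the voice brings usage down.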
afaik the mainline rvc inference doesn't spawn some child processes upon starting inference, so it would bump the memory usage after a long audio inference
Does Applio Colab work without the cell that needs to be created or do I still need to insert it?
!uv pip install pydantic==2.10.6
hey, I haven't used RVC in a long time, and I noticed I can't use it in Colab anymore. Should I install it on my PC with Python to try out models, not for training?
?
Should I just keep this the same?
Iirc it should be already fixed
You can use RVC in colab, it's just many colab notebooks are having issues
Google colab is not a platform dedicated for cloud (remote good PC) RVC, it's used for machine learning
The issue is related to google updates and the rvc notebooks creators, not colab
First, what's your PC GPU
You should use cloud only if your PC is bad
my gpu is NVIDIA GeForce RTX 2050
.-.
Can u send me a link of a rvc colab notebook that is working?
still enough for inference
and better performance than the comparable GTX ones
are you looking for training or inference?
Since Unwa's big mel roformer beta 4 and Mel roformer karaoke don't work as expected, does anyone remember the old recommendation of models for vocal isolation & cleaning?
Those new models just don't sound right
not the older one, but try Gabox vocal Fv4
there is no better karaoke model yet, or consider BVE v2 in uvronline.app (formerly x-minus)
in mvsep.com there is also male/female (duet) separation
is copying the latest release file enough to update Codename’s fork, or do I still need to run run-install.bat?
Well, the problem of Mel roformer karaoke is it removed some important parts of the vocals
That's why I need another model
what "important parts"? some choruses, keyboard synths or guitars?
The lyrics to be exact
I'm sure the vocals may not be centered
which is supposed to be removed after all
since it is called backing vocals
So which model should I use? I cannot upload the vocal here since it might be against the rule on uploading copyrighted material
you don't
So I just abandon it?
I remembered there was an old model that I can use that used to appear in the aihub docs
But I cannot remember what it is, really
Wait, is it HP Karaoke?
Yeah, I think that
Yeah, that worked
Hello, where can I train a better ai voice models?
read the docs https://docs.aihub.gg/
Last update: Oct 21, 2024
so i have an issue right my rvc used to work with a response time of 1 sec now it doesnt work at all but it says its responding in 2.6 sec but even if i use my mic it doesnt work and ive tried changing version but it didnt help
please go to #🔍│help-w-okada and read the pinned guide there
alright thanks
yeah nothing there seemed to help its just like when i start it my voice barely works and even after i stop it it keeps going for about 30 sec to a min
you need a capable gpu as mentioned in the system requirements
i have 4060 Ti it was working great a couple days ago and for cpu i have i5 13th gen 13400F
uh mb if it is rvc inference
if the ui seems stuck, try restarting the program
its not that its like if i try to start it right and i talk its really laggy no matter what settings i use and when i stop it it keep going for about 1 min or 30 sec and i cant do anything with the program while its still going on
make sure you're not running any cpu intensive game/program
also close/debloat unnecessary background processes
already tried that, didn't really help. I think it might be because of something with my PC, because a couple of different apps have been weird as well, so I don't think it's an app issue. I'm planning on factory resetting soon to test that as well
ill dm u
i dont know how to download voice to my pc for okada any helps?
3060ti - i5 12400f
from weights
RVC = Retrieval-based-Voice-Conversion, the best Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models
Wokada = uses RVC for realtime inference
which one do you want? wokada?
because this is help-rvc, so if you want wokada, you should ask in #🔍│help-w-okada
oh mb

Why does it say no file exists with the name gradio?
elaborate
- what's your pc gpu
- what you want to do
- what tutorial link are you using
- what's the specific issue
How?
real
replying to what I asked you
Colab on applio
elaborate to all my questions please
requirements were not installed properly
i assume you're using the UI colab
hi, does anyone know the answer?
what latest release file?
this file
it is old and not compatible with latest refinegan
but if you want to just update what you have, yes, you can download the source and unzip it over
it is very unlikely you need all the advanced stuff from it anyway
use latest applio main
so i don't need to run the run-install.bat right?
no
in most cases if you already have an installation, you just need to overwrite the code
in rare cases an extra module may need to be installed with pip
okay, thanks for answering my question, kinda confused to update the fork, but now i got the answer
if that happen to me, i'll just go straight to the applio main branch😅
i cant get rvc gui no matter what i try
we need to make this a command
rvc gui is outdated and old use something more up to date
whats your gpu and what are you trying to do
is it normal that kaggle uses less steps despite using the same batch size?
first pic: batch size 12 local
second pic: batch size 6 kaggle
2 gpus?
yes
so you use batch x2
yes
thus less steps
but i used batch size 6 on kaggle
hm
wouldn't that have the same steps as bs12 local?
please check you did
so i've found that kaggle does this if you want to use both gpus
first pic: 2gpus
second pic: 1 gpu
it doesn't extract the features of the whole set
unless you only use 1 gpu
local
kaggle
im using the latest main in kaggle, not 3.2.8
local is also the latest main branch
both have checkpointing enabled (because i only have 8gb of vram in local)
im gonna try using only one gpu
also got this when i stopped the training in kaggle
so you have 329 files in sliced_audios_16k on both local and kaggle?
yes
hm.. could be just some loader shenanigans
using only 1 gpu matches local steps
i hope this gets fixed
🙏
so it's only a visual bug?
I guess? is the result of the training much different?
yeah, I think it is fine
you have 15 steps on one node and 15 steps on another node working in parallel, total is 30 steps
which is rounded/ceiled from 329 / 2 nodes / 6 batch
thats good to know
but this does screw up tensorboard logging?
its less accurate?
only the main process logs the values
so technically it will be using every other step to calculate averages and skip even steps
not a big deal

just text to speech using a model but the models here are not safetensor so i don't know any other program, and my gpu is a rtx 4060, with 8 gb of vram
Last update: Dec 12, 2024
Sorry, I meant to say I want to make AI covers. I have found songs I want to have a different character sing
Last update: Apr 01, 2024
Ai covers is called Inference on there
K thanks I'll take a look
Man, this used to be as simple as just booting up AICoverGen
Now no matter what tool I set up, they just give one error or another and refuse to boot up
Doesn't matter if it's old or new, local or cloud, none of them work
Tried Codename, that's a whole mess in itself, would post a screenshot of the console but I don't have the perms
if you're using something not up to date, there's a very high chance some libraries got updated and are no longer compatible with 1-2 year old code
That's in reference to the fact things just worked in the past; this is all with the dependencies being downloaded fresh
Is HP Karaoke still usable?
nah
So which model works similar to that?
I don't wanna use any new karaoke model since it just meh
it's just your point of view. what kind of track do the older models work better on than the new ones?
remember, you don't need to send the audio, just tell me what kind of it is
if not sure, listen carefully whether the vocals are centered or panned, and see the spectrogram for pitch difference
btw I bet you haven't tried UVR BVE
Can I maybe name the song?
Well, to be honest, I don't know where to find that model
perhaps
It's Goodbye Sengen covered by Ayame
Can someone help me? Every time I download the voice changer, it doesn't open, and when I click on the start, the command prompt opens and closes instantly.
not sure but im used to most 70's-00's songs that dont have difficult harmonies and adlibs
unfortunately there are no decent polyphonic separation models better than melodyne afaik
please go to #🔍│help-w-okada and read the pinned guide there
For W-Okada, go to #🔍│help-w-okada . This channel #✨│ai-help is about RVC programs and related tools, not the realtime voice changer.
Make sure you don't ask a question, get asked for details, and then go away without responding.
melroformer dereverb by anvuew is really good at removing backing vocals, even though the model is called dereverb
I tried, and it doesn't work for the specific song
it actually extracts the centered part of lead vocals, though may not exactly work like center channel extractor
yes, every model can work perfectly depending on the song
it's very difficult to extract lead and background vocals, because what we have now is an old model that hasn't been updated.
the difficult parts are usually harmonies, adlibs, and perhaps some overlapped "medley"
^tell me if that song has those parts
at last, you can try melodyne (polyphonic detection mode) and do cover on it despite possibly damaged quality
is it the Mel Band DeRverb v2 ?
yes



AI HUB Docs