#✨│ai-help

latent kettle
#

i didn't understand that. Just tell me it matters or not. Is it good for model? Should I continue training

simple ore
#

you can discount the "rising" fm

#

it is not really rising, not with the average numbers like that

latent kettle
simple ore
#

when the average is mostly flat like that, or slowly creeping up (+1.0-1.5), it is nothing to worry about

hallow thistle
latent kettle
#

is it good ??

#

@simple ore

#

Should I worry about it ?

#

Other losses are still going down 📉

analog obsidian
#

as long as the rest are going down you should be fine

#

for g/total you should check that the graph is not too noisy
and also that it is not too flat

latent kettle
#

i think it is looking fine ??

#

Smoothing 0.999
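
For reference, a smoothing value like this roughly corresponds to an exponential moving average over the logged loss values. A minimal sketch, assuming a debiased EMA (the exact formula TensorBoard applies may differ by version, and `smooth` is a hypothetical helper name, not part of any tool mentioned here):

```python
def smooth(values, weight=0.999):
    """Debiased exponential moving average over a list of scalars."""
    smoothed = []
    last = 0.0
    for step, value in enumerate(values, start=1):
        last = last * weight + (1.0 - weight) * value
        # Divide by (1 - weight**step) so early points are not pulled toward 0.
        smoothed.append(last / (1.0 - weight ** step))
    return smoothed


if __name__ == "__main__":
    # Made-up loss values, only to show the call.
    noisy = [30.0, 34.0, 29.0, 33.0, 31.0, 35.0]
    print(smooth(noisy, weight=0.9))
```

With a weight as high as 0.999 the curve reacts very slowly, so a short upward drift in the raw values barely moves the smoothed line.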

analog obsidian
latent kettle
analog obsidian
latent kettle
#

Batch size is 6
Dataset length (in minutes) 23
Total trained epochs 449/500

analog obsidian
#

the best approach is not to make it harder and more confusing

#

keep things simple for finetuning

analog obsidian
worn river
#

can we train models in portuguese?

knotty moth
#

as it said it is for experimental purpose

patent musk
#

Hi, can someone tell me where to download the freaking GUI and package for the latest rvc inference? I can't find it

latent kettle
cinder grotto
#

Hello question I have a model that I want to continue training but when I put it and give it start apple I only get that and I can not continue training my model.

steel pivot
#

!help

dull ironBOT
#
Wally Commands

-# The prefix for commands is !

Select a category from the menu down below to view all related commands

analog obsidian
brittle wing
#

-guides

azure marshBOT
marsh coral
#

quick question, is it possible to use rvc to change already existing audio, like a youtube video or something? Or does it have to be a real-time speaker?

analog obsidian
marsh coral
analog obsidian
#

-rvc

azure marshBOT
analog obsidian
#

also rvc does not stand for realtime voice conversion

latent kettle
#

RVC is not a realtime voice changer. there is a different thing for realtime

latent kettle
analog obsidian
#

w-okada is just rvc inference in realtime

marsh coral
latent kettle
analog obsidian
#

w-okada is not rvc

latent kettle
marsh coral
#

merci

latent kettle
#

should i submit my model now ??

#

i think 500 epochs are enough

analog obsidian
#

and here a tutorial on how to convert audio files to your model's voice

#

pretty easy and quick

#

place your model in the logs folder of applio

marsh coral
#

awesome! Thanks

latent kettle
analog obsidian
latent kettle
analog obsidian
#

normal behavior

latent kettle
flint solar
latent kettle
#

i normally use UVR to process my data and i use the best models for isolating the vocals

analog obsidian
#

but if the model was trained with noise, there is nothing you can do

analog obsidian
latent kettle
#

lemme send a sample from dataset and an output

analog obsidian
#

you can try it by converting the og pretrain to a small .pth file and running inference on a clean audio
the result will have noise regardless of whether the input audio is denoised

latent kettle
#

please stay here ill be back in a minute

analog obsidian
#

having a bit of noise is not problematic anyways @latent kettle

#

a model being able to clone noise is good when inferencing noisy audio

#

too clean might cause the model to add weird sounds instead of noise

latent kettle
#

@analog obsidian @flint solar

flint solar
latent kettle
#

what to do now ?

#

i use MDX23CInstVoc HQ

flint solar
latent kettle
#

to isolate vocals

latent kettle
flint solar
latent kettle
flint solar
#

u using the wrong models to clean ur vox

latent kettle
#

guide said.. thats why i used it

flint solar
azure marshBOT
flint solar
analog obsidian
latent kettle
analog obsidian
latent kettle
#

should i send a sample of my dataset ??

analog obsidian
#

ah wait

flint solar
#

@latent kettle why are u training on 48khz

analog obsidian
#

nono don't send dataset samples here

#

only model output

#

and again don't be too afraid of noise, is not really a bad thing

latent kettle
flint solar
latent kettle
latent kettle
flint solar
analog obsidian
latent kettle
analog obsidian
#

😭

analog obsidian
analog obsidian
#

as long as he selects 32k training

#

for inference if you use applio you have to enable this option

latent kettle
latent kettle
analog obsidian
#

so is fine

knotty moth
latent kettle
#

ya i did it

latent kettle
analog obsidian
#

because is mp3

latent kettle
#

i used songs and then i put it into UVR 5 and selected FLAC

analog obsidian
#

yes but you converted it to mp3

#

so u compressed it

latent kettle
latent kettle
analog obsidian
#

but is your actual dataset in .flac?

latent kettle
analog obsidian
#

converting a mp3 to flac is not gonna remove the compression

#

hmm, as long as your dataset was always .flac and was never .mp3 converted to .flac, everything should be fine

#

training on compressed audio just makes the model a bit lower quality

knotty moth
analog obsidian
latent kettle
#

so basically from the start: i downloaded an audio from an XYZ website in MP3 320kbps and then i processed it in UVR and it gave me FLAC. and then I put that FLAC into FL Studio and just cut the bad-sounding portions. After that i exported it in FLAC

analog obsidian
#

what you did: you compressed the audio heavily (.mp3), then converted the same heavily compressed audio to .flac

#

you did not increase the quality by doing that

#

the same compression is in the flac

latent kettle
#

idk what to do.. i thought FLAC is Good. Because It Is LossLess 😭

analog obsidian
analog obsidian
latent kettle
analog obsidian
#

you need .opus

latent kettle
analog obsidian
#

after that you convert the .opus to .wav

latent kettle
#

wav is good or not ?

analog obsidian
#

and now you have an audio that is not heavily compressed

analog obsidian
#

convert the .opus to .wav

latent kettle
#

what if i download wav from cobalt tools?

#

is it same thing ?

analog obsidian
#

thats how youtube works

#

choose best audio quality

#

this downloads .opus or .webm

#

then you convert it to .wav
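
The conversion step can be sketched as follows. This is only an illustration, assuming `ffmpeg` is installed and on PATH; `build_wav_command` and `convert_to_wav` are hypothetical helper names. Decoding to .wav does not restore anything the original encoder discarded, it only avoids adding another lossy step.

```python
import shutil
import subprocess
from pathlib import Path


def build_wav_command(source):
    """Return the ffmpeg argument list that decodes `source` to a .wav next to it."""
    src = Path(source)
    dst = src.with_suffix(".wav")
    # `ffmpeg -i <input> <output>` picks the output format from the extension.
    return ["ffmpeg", "-i", str(src), str(dst)]


def convert_to_wav(source):
    """Run the conversion if ffmpeg is installed; return the command either way."""
    command = build_wav_command(source)
    if shutil.which("ffmpeg") is None:
        print("ffmpeg not found on PATH; would run:", " ".join(command))
        return command
    subprocess.run(command, check=True)
    return command
```

Usage would be something like `convert_to_wav("vocals.opus")`, where the file name is a placeholder.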

latent kettle
#

okay. now what about noise ?

analog obsidian
latent kettle
#

i also use De-EchoReverb to process my dataset

knotty moth
latent kettle
#

why there is noise in every model which i train 😭

analog obsidian
#

in inference

#

and also you can clean your inference audios, removing the noise from them

latent kettle
analog obsidian
#

that doesnt damage the model

#

the original pretrain was trained with very noisy audio

latent kettle
#

1st mistake: i used compressed audio. 2nd? how do i process my dataset

latent kettle
analog obsidian
#

but again is not a bad thing 😭

latent kettle
#

okay so my model is good to get Model Maker role 🥺

long dirge
#

how to make w okada work on whatsapp

latent kettle
long dirge
#

but in settings you can't change the mic

analog obsidian
long dirge
#

sad

analog obsidian
#

believe me, reviews are not harsh on new model makers

latent kettle
long dirge
#

😭

flint solar
long dirge
#

can someone gimme a realistic girl voice model for trolling 😭

latent kettle
analog obsidian
flint solar
analog obsidian
#

this channel is for rvc, not w-okada

analog obsidian
flint solar
latent kettle
#

is it normal ?

analog obsidian
#

you are supposed to first use dereverb

#

then de-echo

flint solar
latent kettle
analog obsidian
long dirge
crude flame
marsh coral
#

getting a:

RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.

Can anyone help?

azure marshBOT
marsh coral
#

I don't know, sorry 😭

analog obsidian
flint solar
latent kettle
flint solar
flint solar
#

here

latent kettle
flint solar
latent kettle
latent kettle
#

i have a decent GPU

analog obsidian
flint solar
#

never use dat bs

analog obsidian
#

big beta 5e is more clear but adds too much noise 😭

knotty moth
#

even worse than Apollo/Lew enhancer

analog obsidian
#

worst case scenario he should use the compressed audio

latent kettle
analog obsidian
#

rvc already does audio upscaling

#

no need to do it twice

analog obsidian
latent kettle
#

is it on UVR 5

analog obsidian
flint solar
latent kettle
#

i dont want to use any web

analog obsidian
latent kettle
#

is there any local alternative?

analog obsidian
#

i personally use mel for my models

flint solar
analog obsidian
latent kettle
#

no colab. the actual problem in uploading and downloading speed with limited internet. i dont have a wifi at home

analog obsidian
#

i repeat: rvc already upscales your audio

flint solar
analog obsidian
marsh coral
flint solar
analog obsidian
# latent kettle is there any local alternative?
GitHub

Repository for training models for music source separation. - ZFTurbo/Music-Source-Separation-Training

How to install & inference with ZFTurbo's Music Source Separation script (incl. GUI)

0:00 1. Install Python: https://www.python.org/ftp/python/3.11.6/python-3.11.6-amd64.exe
0:22 2. Install Microsoft Visual C++ 2015-2022 (x64): https://aka.ms/vs/17/release/vc_redist.x64.exe
0:38 3. Install Microsoft C++ Build Tools: https://visualstudio.microso...

analog obsidian
crude flame
knotty moth
analog obsidian
flint solar
crude flame
analog obsidian
crude flame
analog obsidian
#

it's the clearer one, legit sounds like an actual raw sample sometimes and you forget it's isolated lol

latent kettle
analog obsidian
marsh coral
analog obsidian
#

sorry

marsh coral
#

all good

analog obsidian
#

if it can run cuda applications it can do inference

flint solar
analog obsidian
analog obsidian
latent kettle
#

i will watch just DM the links and names of models and other things

analog obsidian
#

but more clear than bs roformer at least

knotty moth
latent kettle
#

if you are free yes you can

#

im confused now what to do 😭

latent kettle
#

💔

#

please help me

analog obsidian
latent kettle
analog obsidian
#

its normal

latent kettle
#

do i stop using UVR

analog obsidian
#

follow the tutorial

#

or

#

i can send you a more up-to-date version of uvr

flint solar
analog obsidian
#

this is the latest uvr update

#

delete your old version and install this one

flint solar
latent kettle
weak cipher
#

No

analog obsidian
shell holly
#

Hi, I’m trying to train an AI voice model but ran into this error:
C:\ia nvidia\RVC1006Nvidia/logs/testtest
load model(s) from assets/hubert/hubert_base.pt
move model to cuda
no-feature-todo

Any idea what’s causing it?

crude flame
analog obsidian
knotty moth
analog obsidian
#

noted, im going to remember this in case i need a dereverb

knotty moth
#

anyway both Sucial's and anvuew's are for stereo reverb

crude flame
#

sucial leaves some reverb in ( not much )

#

cant really hear a difference tho

#

so

analog obsidian
flint solar
#

I used anvuew mel dereverb

dim jewel
#

Hi guys, I have a question regarding dataset.
Let's say I have a singer. They have 1 song that I want to make a cover of, which would be the same song, with the same singer, but with different lyrics. Is there any point in training the model on the singer's other songs, if the model would be used only for this one particular song?

glacial pollen
#

In this scenario, generalization to unseen data ( aka. the model's capability to adapt to songs / content it wasn't exposed to throughout the training ) loses its meaning

#

pretty much

dim jewel
#

What if it's the same scenario, but the whole album?

glacial pollen
#

If you intend to use the model on the data it was exposed to ( again, during training ), having it " full of variety " ( the dataset ) kind of loses its meaning

silk hearth
#

/create

#

wow that worked

dim jewel
#

But wouldn't it still need to learn variety to adapt to changed lyrics?

glacial pollen
#

But it isn't 100% a strict rule

#

It all really comes down to how well your model generalizes ( its ability to adapt to stuff it did not see during training )
And generalization is a matter of: good training, a properly picked batch_size ( smaller typically promotes better generalization ) and, naturally, the dataset's diversity

#

But let's not make it any huge or extreme deal really

dim jewel
#

As always, thank you

glacial pollen
fair pelican
#

yo i need help with setting up the voice changer to work in discord, the output virtual cable is in input and the input virtual cable is in output how do i fix it help

valid stream
#

i put my chunk at 640 but its still like 3 seconds delayed how do we fix it

low shard
weary pond
low shard
#

Wokada is the program to use RVC (Retrieval-based-Voice-Conversion, Speech To Speech) Models in realtime for calls

There's the fork (modified version), the deiteris fork which has better performance

weary pond
#

my bad

low shard
#

Dw

unique rock
#

what is this for?
cache_all_training_sets

simple ore
#

it caches the training data on the GPU, which provides a small performance improvement as long as the dataset is not too big to fit into VRAM

#

@unique rock

humble river
#

anyone on rn

#

does not even mention its an error

#

just says this

#

2024-12-20 21:56:06 | INFO | fairseq.tasks.hubert_pretraining | HubertPretrainingTask Config {'_name': 'hubert_pretraining', 'data': 'metadata', 'fine_tuning': False, 'labels': ['km'], 'label_dir': 'label', 'label_rate': 50.0, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': 250000, 'min_sample_size': 32000, 'single_target': False, 'random_crop': True, 'pad_audio': False}
2024-12-20 21:56:06 | INFO | fairseq.models.hubert.hubert | HubertModel Config: {'_name': 'hubert', 'label_rate': 50.0, 'extractor_mode': default, 'encoder_layers': 12, 'encoder_embed_dim': 768, 'encoder_ffn_embed_dim': 3072, 'encoder_attention_heads': 12, 'activation_fn': gelu, 'layer_type': transformer, 'dropout': 0.1, 'attention_dropout': 0.1, 'activation_dropout': 0.0, 'encoder_layerdrop': 0.05, 'dropout_input': 0.1, 'dropout_features': 0.1, 'final_dim': 256, 'untie_final_proj': True, 'layer_norm_first': False, 'conv_feature_layers': '[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2', 'conv_bias': False, 'logit_temp': 0.1, 'target_glu': False, 'feature_grad_mult': 0.1, 'mask_length': 10, 'mask_prob': 0.8, 'mask_selection': static, 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'mask_channel_length': 10, 'mask_channel_prob': 0.0, 'mask_channel_selection': static, 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'mask_channel_min_space': 1, 'conv_pos': 128, 'conv_pos_groups': 16, 'latent_temp': [2.0, 0.5, 0.999995], 'skip_masked': False, 'skip_nomask': False, 'checkpoint_activations': False, 'required_seq_len_multiple': 2, 'depthwise_conv_kernel_size': 31, 'attn_type': '', 'pos_enc_type': 'abs', 'fp16': False}
2024-12-20 21:56:10 | INFO | infer.modules.vc.pipeline | Loading rmvpe model,assets/rmvpe/rmvpe.pt

#

then at 5.3 secs the gradio thing says error

simple ore
#

@craggy wyvern this is the right channel to ask your question

#

the answer is - depends on what you're trying to install. Likely some outdated build.

hallow thistle
#

-gui

hallow thistle
#

A batch file for launching up webui went missing from this most recent OG RVC GUI, but instead got this batch file for launching up the "realtime" RVC pre-installed. Of course, I don't think it gonna work well.

low shard
hallow thistle
#

The first time I clicked and ran this batch file back in 2023, I was surprised by how this GUI looked. Then later I found out it wasn't meant for "audio conversion", but rather the "realtime" one.

#

I'm surprised by how people still think RVC has another realtime program. Could it be this thing?

glacial pollen
#

not sure if there's any other tho

brave ermine
#

hello

#

i got an error while generating index

azure marshBOT
brave ermine
#

im using local applio and it says v2 extracted file is not found

#

anyone know what i am doing wrong

#

nah i found it i just didnt press this button how silly lol

humble river
#

3070ti 13700k btw

glacial pollen
#

ye

humble river
#

oh

#

ok

glacial pollen
#

what's up then

humble river
#

no error in the logs

#

just that it's loading rmvpe

glacial pollen
#

Well, 2 things

humble river
#

alr

glacial pollen
#
  1. You should provide logs using " > ", like so:

> asdfghjkl

#

and 2... what release / fork you use? ( of rvc / applio )

#

This is by far the most important part / info you didn't provide I believe

glacial pollen
#

Actually nah

humble river
glacial pollen
#

Just tell me what release you use

#

where does it come from

#

applio github? or official rvc one

humble river
#

official

#

sry not that good eng

glacial pollen
# humble river official

Yea, tbf, instead of trying to debug og rvc, which can at times be problematic or a headache,
I'd actually go for Applio

glacial pollen
humble river
glacial pollen
#

Applio is easier to run and generally to get running

humble river
glacial pollen
humble river
#

oh

#

do u have the link

#

wanna make sure i download the right one

#

or the user who made it

#

will do

glacial pollen
#

You're a newbie right?

humble river
#

yes

glacial pollen
#

Has some nicer descriptions / easier to get descriptions I added

humble river
#

alr

glacial pollen
humble river
glacial pollen
#

1st goes the install .bat and then run .bat files

humble river
#

oh

glacial pollen
#

just, simply run it with no " run as admin " or whatsoever

humble river
#

ok

glacial pollen
glacial pollen
#

yw man

#

( ps. Make sure to unpack the fork folder / the folder from archive to C drive / os drive directly, if you can afford some space on your drive )

#

Like so

glacial pollen
#

Neat, in that case best of luck~

humble river
#

YOO UI IS FIRE

humble river
#

@glacial pollen where do models go?

#

just directly in models

#

or

glacial pollen
#

logs folder and in there, per-model folder

humble river
#

ok

glacial pollen
#

This is also the case when you train a model, .pth models appear in there ( index too )

humble river
#

ty

glacial pollen
#

yea just saying in case

humble river
#

that can possibly help

#

there is no per model folder

#

should i make it

glacial pollen
humble river
#

so create folder

glacial pollen
#

per-model as in, each model ( pth and index ) gets a folder

humble river
#

OHHH

glacial pollen
#

per is like " for "

humble river
#

so should i make one with the pth and index

glacial pollen
#

Yea, applio searches up for models in logs location

humble river
glacial pollen
#

and organizing it in folders makes it easier

#

that's all there is to it

humble river
#

ah

#

alr ty

dense drift
#

Please help

tame mica
#

?

#

what are you trying to do

dense drift
#

To create an ai song

glacial pollen
#

That is not rvc tho ( I mean yea, but weights manages it )

dense drift
#

Soo where should i ask for help

glacial pollen
#

So like, if yt dls are disabled

#

you gotta dl the vocals on your own

#

Not sure how weights manages it tho

#

Whether they isolate ( the vocals ) or not

#

Well.. You'd have to dl the vocals / song, and if it's a song, isolate / separate the vocals using mvsep or uvr, and only then upload the vocals to weights

dense drift
#

There is no option to upload

glacial pollen
#

I ain't associated with weights nor do I use their services, so I couldn't possibly know how it's done in there

dense drift
#

Ohh alright

glacial pollen
#

Alternatively, screenshot the whole ui

dense drift
#

But thanks for helping

glacial pollen
#

and show me how it looks

dense drift
glacial pollen
#

alr then

torpid prairie
#

can i use a voice model with only pth file

low shard
#

may not sound the best though

torpid prairie
#

alr

severe sand
#

Does anyone have experience getting "sh", "ch" and "sch" sounds to be pronounced properly? like in the German "Ich"
Everything I try, every model completely fails at those sounds even if they exist in the dataset

#

Unfortunately its pretty important since we use them for music but I just can't find any solution

severe sand
#

about 15-20 minutes of talking and singing on average

crude flame
#

if possible you could try making it longer

severe sand
#

nvm the main model I'm looking at is trained on 32 minutes of talking and singing

#

mostly singing, I'd think thats way more than enough, especially considering the models are completely unable to pronounce "Ich" like its not even really close

#

they always turn it into "isch" if they pronounce it at all

crude flame
#

so it has "ich" in the dataset and in the inference audio?

severe sand
#

yes

#

it seems a bit like most of the time rvc even thinks that the ch is supposed to be breath noise and suppresses it entirely

crude flame
#

how noisy is your set? little bit is fine but if its loud enough it could make a difference

severe sand
#

No noise, studio environment

#

not sure if you can pronounce the german ch but do you think you could find a clip where it's being pronounced correctly? I can try making one showcasing the issue

crude flame
#

welp, you can blame vctk for not having any german in it making that "ich" suck

severe sand
#

whats vctk?

crude flame
#

the dataset the default pretrains were trained on

#

it sucks

severe sand
#

ah I see, yea I was worrying that this might be the issue

#

I've tested some other pretrained models but they seemed mostly terrible

#

so I didn't even keep those models

crude flame
#

if you want you can grab several hours of german speech and make a small little finetuned pretrain for german

#

and hope it makes it better

analog obsidian
severe sand
#

Anything specific I have to do for that? Like maybe I have to somehow reduce the learning rate or something?

crude flame
#

just train it like a normal model

severe sand
#

And do you mean using the pretrained models and finetuning it into a better one or make a new model from scratch

#

I do kinda hope that this is the issue now, else I will put a bunch of time into that without fixing it xD

#

I will try some other pretrained models first, this one could be interesting

crude flame
severe sand
#

And you are dead sure that those kind of noises are results of bad pretraining? Because I feel like even english noises like "shark" are kinda struggling with the sh

crude flame
#

it sucks

severe sand
#

welp damn, now where am I gonna find hours of german. It probably has to contain male and female voices right? Since I will need both

crude flame
severe sand
#

Okay I will try around with that a bit, thanks

#

Oh another question, do I have to fear overtraining when creating a pretrained model? And a general idea of how many epochs I will need for multiple hours of data? I assume it will take quite a while even for 50-100

crude flame
severe sand
#

okay thats much easier than I assumed then, good to know

carmine hearth
#

Hey guys, is there anything written about audacity dataset cutting settings? I'm looking for it for people using RVC Mainline or RVC Disconnected (that's me!), but my inexperienced searching skills have yet to find it...
I'm using machine translation. Sorry if it's hard to understand!

placid heath
#

i just had a question, to make songs with ai, like to rap as different things, what is recommended? like settings and everything. im new to this so please be patient with me!

simple ore
#

3-5 seconds, no more than that and overlap 0.3-0.5s
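
To make those numbers concrete, here is a minimal sketch that computes where the cuts would fall for one recording. The 4.0 s and 0.4 s defaults are just values picked from inside the suggested ranges, and `segment_bounds` is a hypothetical helper, not an Audacity or RVC function.

```python
def segment_bounds(total_seconds, segment_seconds=4.0, overlap_seconds=0.4):
    """Return (start, end) pairs covering the clip with overlapping windows."""
    if overlap_seconds >= segment_seconds:
        raise ValueError("overlap must be shorter than the segment length")
    bounds = []
    start = 0.0
    while start < total_seconds:
        end = min(start + segment_seconds, total_seconds)
        bounds.append((start, end))
        if end >= total_seconds:
            break
        # Step forward by less than a full segment so neighbours overlap.
        start += segment_seconds - overlap_seconds
    return bounds


if __name__ == "__main__":
    for start, end in segment_bounds(10.0, segment_seconds=4.0, overlap_seconds=0.5):
        print(f"{start:.1f}s -> {end:.1f}s")
```

The last window can come out shorter than 3 seconds, so it may be worth dropping it or merging it into the previous one by hand.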

carmine hearth
#

Thank you for your help, I will try that.

mental spade
#

Think I fixed it

hallow thistle
#

-colab

azure marshBOT
# hallow thistle -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

hallow thistle
knotty moth
#

also they could have video editing magic, so it'd look too good but not in reality

low shard
#

This is the wrong channel, and you're using old software

low shard
unique rock
#

Can someone explain to me the Tensorboard graphs? Following the Applio guide I only learned to see the total loss g, but they say that I should not only take into account that graph but also more.

simple walrus
#

For rvc-gui I create a voice model for Ai Cover. When I use the voice model I create in any song, whether male or female, no matter if I make the pitch negative or positive, the voice I use in the song is not the same as the voice model I created in the song. Why?

low shard
#

Do u mean this one

#

It's super OUTDATED

#

What's ur PC GPU

#

There's a better program for realtime voice changer

hallow thistle
#

-gui

low shard
#

@simple walrus You looking for ai covers or realtime voice changer?

simple walrus
#

@low shard I want to use my own voice in the rvc-gui but when I use it the voice does not change to my own voice in the song

unique rock
#

Can you tell me which is the best pre-train?

low shard
#

sooo, just do ai covers

#

or realtime voice changer for calls?

low shard
#

it depends

hallow thistle
#

RVC is the audio conversion program, while W-Okada is the realtime voice conversion program that uses RVC voice model.

low shard
#

^^^

simple walrus
#

@low shard There is no alternative program can you recommend me an ai cover program where I can install on pc without connecting to websites

hallow thistle
#

The "RVC-GUI" can refer to the OG RVC GUI program, which has been long outdated.

#

-rvc

azure marshBOT
low shard
#

if so, that's OUTDATED, DON'T FOLLOW YT TUTS

#

Tell me your pc gpu

hallow thistle
#

You still didn't answer us about what GPU your PC has.

#

Applio is a recently developed fork program of RVC GUI, one of the only RVC forks AI Hub by Weights recommended.

simple walrus
#

My notebook cpu is Intel® HD Graphics 500

hallow thistle
#

Is it only GPU 0? If so, that means your laptop doesn't have a dedicated GPU.

low shard
#

The program won't even run on integrated gpu, it will run on your cpu

#

making it very very slow

#

It's suggested to use Cloud, your CPU is SLOW

You can:

  • Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
    • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
    • Mainline: The original RVC
  • Cloud (remote good pc, easier and faster than ur PC but it's limited):
    • Ilaria RVC Zero: fastest and simplest that you can get for free
    • Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
    • Applio Colab: max 4 hours of GPU, not guaranteed

Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio

#

if you really want to do it slowly on cpu, you can do it locally via applio

low shard
#

but It's NOT suggested

simple walrus
#

I used Applio, but it is a bit complicated; it connects to a website, which is confusing

#

Thxx for Suggestions I will try

steel forge
#

applio hands down the best

#

imo

hallow thistle
#

Applio doesn't connect to the "internet", it hosts locally on your PC localhost port. Unless you set Gradio to share to around the world.

steel forge
#

yeah that too

#

runs without an internet connection but requires a network

#

all RVC GUIs will be web-based. Unless someone wants to cobble together a modern RVC-GUI

low shard
#

they use Edge TTS API

steel forge
#

no tts

low shard
#

without it yea u don't

steel forge
#

bet

#

based applio

hallow thistle
#

An internet connection is needed to download voice models online.

low shard
#

The local applio doesn't unless you do TTS

#

There are different versions of the same program, cloud and local

hallow thistle
#

-colab

low shard
hallow thistle
#

Google Colab is a cloud service.

simple ore
unique rock
placid stone
#

hi, I have a question, I can only switch between GPU0, CPU, GPU1, GPU2 and GPU3 in the voice changer and my voice changer is also lagging a lot. can someone help me?

azure marshBOT
hallow thistle
sage heath
#

my voice changer is capturing my pc audio too and it's echoing, what do i need to do

sage heath
hallow thistle
#

Please read my earlier message above.

#

There's no way your PC has four GPUs at once. Unless you've downloaded the old OG W-Okada, which can be tricky to cause its GUI to show more GPU than one, and each GPU could all be picking up CPU.

low shard
left crow
#

I have a question I wanna make ai covers but someone told me to use rvc is it only pc or is it on mobile too?

low shard
left crow
#

I have hp laptop but my mom uses it for work so I use mobile

#

And what's a GPU im very dumb in this kind of things lol

low shard
low shard
hallow thistle
left crow
#

Ohh

#

It's fine I only wanted to make lads ai covers

#

But I don't know the website lol

#

There's a lot of website idk what to choose

hallow thistle
#

Weights.gg is a website that can do AI cover for free.

left crow
#

Ohh I use that but idk some part of the song breaks but it's good ig

#

Maybe it depends on the song lol

low shard
#

Weights.gg uses RVC but in a easier way for users

Other sites use RVC too, but they make you pay for it

charred drum
low shard
charred drum
opal kelp
#

-colab

alpine bough
#

hi, i downloaded the ngrok file, but cannot open it, can u help me?

#

cannot find or open /Users/batokaraevzafarbek/Downloads/ngrok-v3-stable-darwin-arm64.zip, /Users/batokaraevzafarbek/Downloads/ngrok-v3-stable-darwin-arm64.zip.zip or /Users/batokaraevzafarbek/Downloads/ngrok-v3-stable-darwin-arm64.zip.ZIP.

#

i tried through colab, but there is a 403

#

can i write in russian

#

-rules

#

rules

#

@acoustic scarab

alpine bough
#

help me

#

pls

low shard
alpine bough
#

so i was trying to set up through the colab

#

and firstly i was getting 403 errors so i tried using ngrok

#

and getting errors there too

alpine bough
#

?

low shard
#

you shouldn't download ngrok, that gets downloaded on google colab, not your pc

upper tusk
#

Hello there, I have a question, I would like to try and make an RVC model based on sherry birkin from RE 2 remake, but I have only maybe 2-3 minutes of dialogue from her that is usable, is that enough to work with or is it just not going to work ?

brittle wing
#

Hewwo! I was wondering if anyone can help me to get my voice onto discord and games?

#

just struggling a lil, got the virtual audio cable installed etc, just dunno how to do it

brittle wing
#

also, when playing some games, the ping and total MS skyrockets, that normal?

low shard
#

RVC ≠ Wokada

tranquil raven
#

!help

dull ironBOT
#
Wally Commands

-# The prefix for commands is !

Select a category from the menu down below to view all related commands

woeful canyon
# tranquil raven !help

Available Commands

!ping - Check bot latency
!help - Show all commands
!status - Show bot status


tranquil raven
#

I'm tryna find the link for this version of rvc

hallow thistle
#

I'm not sure why you're looking for this specific RVC version when there's a recently developed real-time voice conversion program available that works better than that.

open stag
#

-colab

azure marshBOT
# open stag -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

hot sonnet
#

!howtoask

patent trellisBOT
# hot sonnet !howtoask

How To Troubleshoot

__**GIVE CONTEXT.**__ 📝
  • Don't simply mention your issue, like "my rvc is not working".
  • Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
  • The more context, the better.
__**BE POLITE.**__
  • Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
  • It's okay if you're frustrated, but don't take it into this server.
  • Don't DM without prior consent.
__**BE PRODUCTIVE.**__ 🤝
  • Don't ask for every little instruction. Put your own effort & test things by yourself.
  • Don't ask to ask.
  • Check if your answer is a Google search away/on our guides website.
hot sonnet
#

what is the best download for amd :(

low shard
low shard
hot sonnet
# low shard for what

to be honest I don't know, because i thought okada was rvc but i think rvc is okada

hot sonnet
#

idk

#

the difference

low shard
hot sonnet
#

i just want a smooth voice changer

#

idk ive used this before

low shard
hot sonnet
#

ok ty

tough stone
#

my audio is around 3 minutes long and its separated into like 10 shorter audios, is THIS normal?

#

Also my epochs are 250 and save time 30

#

rmvpe_gpu, no other settings touched

#

rtx 2060 S

#

its been around 20 minutes, no changes/updates in the console as well

latent kettle
tough stone
#

Nvm its solved, I just had to uninstall Python

flint solar
#

😭

tough stone
#

idk bro

#

It caused some issues

#

most important thing it works

supple salmon
#

Any idea how I should go about this? :o

#

I believe the page has been taken down

low shard
#

link me the discord post where u got that model

supple salmon
supple salmon
low shard
low shard
molten fog
#

is this option any good?

#

assuming not i havent heard any talk of it

pallid ocean
#

Is the applio-rvc colab ver. working for anyone? I seem to be getting an error related to circular imports upon training the model (the index seems to be training without any problem)

dire vapor
#

how to use rvc v3 in applio colab or other colab?

silent stratus
low shard
#

Officially

#

Since like 2 years

#

There's some unofficial forks that are experimental like @glacial pollen 's fork but idk if there's a colab

silent stratus
silent stratus
silent stratus
low shard
#

anyways i pingd codename, he prolly knows

low shard
#

looks like smt isn't installed properly

#

you told noobies about it?

silent stratus
#

noobies doesn't even know why it's happening

simple ore
#

i said you may need an update to colab

#

Vidal did fix some torch imports

#

you can just manually re-install torch

#

!pip uninstall torch torchvision torchaudio -y
!pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --upgrade --index-url https://download.pytorch.org/whl/cu121
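
After running the two commands above, it can help to confirm that the runtime really ended up on the pinned versions. A minimal sketch, assuming the three pins quoted in this message; `PINS`, `base_version` and `check_pins` are hypothetical helper names, not part of Applio or the Colab notebook:

```python
from importlib import metadata

# Versions taken from the pip command above.
PINS = {"torch": "2.3.1", "torchvision": "0.18.1", "torchaudio": "2.3.1"}

def base_version(version):
    """Strip a local build tag, e.g. '2.3.1+cu121' -> '2.3.1'."""
    return version.split("+", 1)[0]

def check_pins(pins, installed=None):
    """Return {package: (wanted, found)} for every package that does not match.

    `installed` may be a dict of versions (handy for testing); when it is None
    the versions are read from the current environment without importing torch.
    """
    problems = {}
    for name, wanted in pins.items():
        if installed is not None:
            found = installed.get(name)
        else:
            try:
                found = metadata.version(name)
            except metadata.PackageNotFoundError:
                found = None  # package is not installed at all
        if found is None or base_version(found) != wanted:
            problems[name] = (wanted, found)
    return problems
```

Calling `check_pins(PINS)` in a notebook cell returns an empty dict when all three packages match; if it does not, restarting the runtime and re-running the install cell is the usual next step.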

silent stratus
#

okay thanks

low shard
#

colab moment boohoo

molten fog
#

if my dataset has like multiple singing tones (calmer singing tones and then more almost "yelling" singing tones where the artist puts more emotion in their voice), is that what this is talking about when it says "diverse"? should i use a bigger batch size?

azure marshBOT
analog obsidian
#

for example a non diverse dataset would be one having monotone speech, with repeated words in between speech, similar sentences, basically 0 variety

#

if your singing dataset is in the same tone 90% of the time, then it is not diverse enough

#

for batch sizes you can try 4-8
4 for small datasets (10 minutes and below)
and 8 for 30 minutes and above

#

choosing batch sizes is more complicated than that but for training models these are the most used values in rvc
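
The rule of thumb above can be written down as a tiny helper. This is only a sketch of what was said in this message (4 for about 10 minutes and below, 8 for about 30 minutes and above); `suggest_batch_sizes` is a hypothetical name, and the in-between case is left open because the message itself does not settle it:

```python
def suggest_batch_sizes(dataset_minutes):
    """Return the batch sizes worth trying for a dataset of this length.

    Mirrors the chat's rule of thumb only; real choices also depend on VRAM
    and on how noisy the training graphs look.
    """
    if dataset_minutes <= 0:
        raise ValueError("dataset length must be positive")
    if dataset_minutes <= 10:
        return (4,)
    if dataset_minutes >= 30:
        return (8,)
    # Between 10 and 30 minutes the rule of thumb gives no single answer.
    return (4, 8)
```

For a set between 10 and 30 minutes the helper returns both candidates on purpose, so the choice stays with whoever is training.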

molten fog
#

it is pretty diverse in terms of tone

#

btw

analog obsidian
molten fog
#

its longer than 10 minutes but not thirty minutes long

analog obsidian
#

up to you

molten fog
#

i have no problem wating as im training locally and im patient

#

waiting*

analog obsidian
#

sure batch size 4 will work fine in this case

#

as long as the graphs are not extremely noisy

molten fog
analog obsidian
#

follow codename's suggestions

molten fog
# analog obsidian yup

do i need it enabled to 0 when i train the first epoch or do i enable it after and resume training

silent stratus
#

but it seems to be going up

silent stratus
clear oasis
#

How do I train a model? I'm just starting out

low shard
brittle wing
#

10k is the Lowest point mhm

low shard
#

Open "extracting vocals from songs"

flat yoke
#
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-bd2dc64d26a0> in <cell line: 2>()
      1 #@title Save the model
----> 2 from mega import Mega
      3 import os
      4 import shutil
      5 from urllib.parse import urlparse

ModuleNotFoundError: No module named 'mega'

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------
#

can someone tell me why i get this error?
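
The traceback says the `mega` module is not installed in the runtime that ran the "Save the model" cell, which typically happens when the install cell was skipped or the runtime was reset. A small sketch for checking this before running a cell; `missing_modules` is a hypothetical helper. As far as I know, the PyPI package that provides `from mega import Mega` is `mega.py`, but treat that name as an assumption and check the notebook's own install cell first:

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be imported right now."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# The save cell in the traceback imports these modules.
needed = ["mega", "os", "shutil", "urllib"]
print("missing:", missing_modules(needed))
# If 'mega' is listed, re-run the notebook's install cell,
# or (assumed package name) run: !pip install mega.py
```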

#

im trying to save this model:

#

nvm

low shard
flat yoke
low shard
#

And if possible, your PC GPU

#

(I'm asking about your PC GPU bc I've seen people with actually good PCs using cloud)

flat yoke
#

rtx 3050

flat yoke
low shard
flat yoke
#

no

low shard
#

6gb?

flat yoke
#

wait yeah it is the 4gb one

low shard
knotty moth
low shard
#
  • Cloud (remote good pc, easier and faster than ur PC but it's limited):
    • Ilaria RVC Zero: fastest and simplest that you can get for free
    • Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
    • Applio Colab: max 4 hours, not granted, of GPU

Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio

#

Ilaria rvc zero and weights.gg are prob your best options

flat yoke
#

yeah using weights rn

#

wait

#

are they using the same models tho

#

?

low shard
#

So unless you're comparing 2 different models, they prob are the same

#

They still are RVC v2 ofc

oak edge
#

hi, to get best quality in rvc model should batch size be higher or lower? (I'm using colab free, applio, and T4 gpu)

knotty moth
oak edge
#

i got a 20 minute single file dataset, also what is fp

knotty moth
oak edge
#

i mean where to configure it

knotty moth
#

go to Settings tab in Applio

#

I'd recommend the latest version 3.2.8 bugfix

oak edge
#

ohh see it

#

fp32 always better than fp16?

knotty moth
oak edge
#

okkkk

#

thanks a lot

mental spade
#

So it seems now virtual cable died?

#

got it to work last night

#

Computer crashed, now nothing

surreal spruce
#

-colab

azure marshBOT
# surreal spruce -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

lunar quartz
#

how can i download rvc? any link

low shard
lunar quartz
#

radeon rx580 and i wanna do covers

low shard
# lunar quartz radeon rx580 and i wanna do covers

Your AMD GPU is good enough to do inference (use models) locally (on ur pc), you won't be able to train (make models) but use them

You can:

  • Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
  • Cloud (remote good pc, easier and faster than ur PC but it's limited):
    • Ilaria RVC Zero: fastest and simplest that you can get for free
    • Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
    • Applio Colab: max 4 hours, not granted, of GPU

Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio

low shard
fathom reef
#

What is the most high quality pretrain type?

foggy cedar
#

What is vocoder and checkpointing?

#

I just saw it.

simple ore
#

you saw it, but you did not read the notes? 🙂

simple walrus
#

I have a question: How can I change words in a song? Is there a free application or program that can do this? What methods are there?

light pelican
simple ore
#

are there actual logs?

foggy cedar
simple ore
#

vocoder is a new generator (MRF HiFiGAN and RefineGAN), no pretrains for those yet

#

Checkpointing - save vram at cost of slower training, can use larger batch sizes
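
To illustrate the trade-off in that last line: assuming the option refers to gradient (activation) checkpointing, the idea is to keep only a few activations during the forward pass and recompute the rest during backward. The toy below is a pure-Python sketch on a chain of `tanh(w * x)` layers, not Applio's or PyTorch's implementation; all function names are made up for the example:

```python
import math

def forward_full(x, weights):
    """Run the chain layer by layer, keeping every activation."""
    acts = [x]
    for w in weights:
        acts.append(math.tanh(w * acts[-1]))
    return acts

def backprop(acts, weights, grad):
    """Multiply `grad` by each layer's local derivative, last layer first."""
    for w, out in zip(reversed(weights), reversed(acts[1:])):
        grad *= w * (1.0 - out * out)  # d tanh(w*x)/dx = w * (1 - tanh^2)
    return grad

def grad_full(x, weights):
    """d(output)/d(input) with everything stored: len(weights) + 1 values."""
    acts = forward_full(x, weights)
    return backprop(acts, weights, 1.0), len(acts)

def grad_checkpointed(x, weights, segment):
    """Same gradient, but the forward pass stores one value per segment."""
    checkpoints = []
    h = x
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints.append(h)  # keep only the input of each segment
        h = math.tanh(w * h)
    grad, peak = 1.0, len(checkpoints)
    for s in reversed(range(len(checkpoints))):
        seg_weights = weights[s * segment:(s + 1) * segment]
        acts = forward_full(checkpoints[s], seg_weights)  # extra compute
        peak = max(peak, len(checkpoints) + len(acts))
        grad = backprop(acts, seg_weights, grad)
    return grad, peak
```

With 16 layers and segments of 4, both functions return the same gradient, but the checkpointed version holds 9 values at its peak instead of 17, at the cost of running each segment's forward pass a second time. That freed memory is what allows a larger batch size.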

low shard
#

@craggy stratus this is the right channel for ai covers

  • Cloud (remote good pc, easier and faster than ur PC but it's limited):
    • Ilaria RVC Zero: fastest and simplest that you can get for free
    • Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
    • Applio Colab: max 4 hours, not granted, of GPU

Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero

#

You could also do it on phone cpu locally, but it will be harder and slow asf, not suggested

craggy stratus
#

the voice i want to achieve (AI cover of yukari i found on tt) and i just want to kinda figure out what settings to set, cause when i tried to do it, the voice was off

lavish spruce
#

enhypen

latent kettle
#

38 epochs, dataset length about 40 minutes, batch size 8. D loss is going up. do i stop training ?? and change the batch size ??

pure tangle
#

how to download model

latent kettle
simple ore
#

or 200

latent kettle
#

i set to 250

simple ore
#

keep going

latent kettle
#

so should i keep training ??

#

thank you sir

simple ore
#

as long as you have a good clean set with a variety of content, it will take more than 38 epochs

simple ore
#

i mean it takes longer than 38 epochs to extract everything useful from a 40 min set

#

provided the dataset is of a good quality

latent kettle
#

okay. thank you again

#

it started increasing again

simple ore
#

3.62 to 3.63 is not the increase that you should worry about

#

3 to 36, yes

latent kettle
#

okay.. i see

onyx crater
#

Guys im an ekitten now

#

ill get them eboys for money

latent kettle
#

Are there any signs of overfitting ??

#

D loss is going down but G loss seems little bit increasing @simple ore can you please help me ??

low shard
#

That's not RVC

#

RVC = Retrieval-based-Voice-Conversion

#

It's used only for inference on pre-recorded audio and training models

#

You have wokada instead

#

Did you download this from a YouTube video

#

You downloaded an old version then

latent kettle
#

@low shard can you help me

#

if it is overfitting

low shard
#

It doesn't seem really increasing much

latent kettle
#

This picture is on 175 and now it's 200

glacial pollen
#

especially given you're not using averaged loss

glacial pollen
#

yet you'll be able to recognize it once it happens, it's quite apparent
The behavior differs too much from normal graphs
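
On the averaged loss point: raw per-step loss values jump around, so small moves like 3.62 to 3.63 are easier to judge on a smoothed curve. A minimal sketch of a debiased exponential moving average, similar in spirit to the smoothing slider in TensorBoard (this is not TensorBoard's own code, and `smooth` is a hypothetical name):

```python
def smooth(values, weight=0.9):
    """Debiased exponential moving average of a loss curve.

    `weight` must be in [0, 1); higher means heavier smoothing.
    """
    if not 0.0 <= weight < 1.0:
        raise ValueError("weight must be in [0, 1)")
    smoothed = []
    avg = 0.0
    for step, value in enumerate(values, start=1):
        avg = weight * avg + (1.0 - weight) * value
        # Divide out the start-up bias toward 0 from initialising avg at 0.
        smoothed.append(avg / (1.0 - weight ** step))
    return smoothed
```

A sustained rise in the smoothed curve over many epochs is the thing to watch for, not a wiggle between two neighbouring points.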

charred portal
#

Hi, I was wondering. Does anyone know any tool to train AI models using samples that would support multilanguage? I want to use it for AI voice changer.

low shard
# charred portal Hi, I was wondering. Does anyone know any tool to train AI models using samples ...

Does anyone know any tool to train AI models using samples that would support multilanguage?
RVC (Retrieval-based-Voice-Conversion) is the tool used to train the best Speech to Speech models
But you can't make a model support EVERY SINGLE LANGUAGE; you would have to train it on that person speaking EVERY SINGLE LANGUAGE EXISTING IN THE WORLD

However, you can train an English model and technically use it for other languages too. In the RVC context, the model is made of a pth (the actual voice) and an added index (the accent); you could use it in the voice changer and lower the index ratio so it doesn't have the accent it was trained on

I want to use it for AI voice changer.
Be sure you're using Wokada Deiteris Fork and not some YouTube tut one

charred portal
#

Speaker is also Czech

low shard
#

Is it your first time

charred portal
low shard
#

what's ur pc gpu first?

charred portal
#

RTX 4060

low shard
#

and is it laptop?

knotty moth
#

should be the same 8 gb

charred portal
#

It is Laptop

#

Only 8GB

low shard
#

could be doable

#

For Locally (runs on ur pc):

  • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
  • Mainline: The original RVC
#

Train RVC Models on cloud:

  1. Prepare the Dataset
  2. Setup RVC:
    Choose a cloud way to use RVC,
  • Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
  • Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
  • Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
  3. Be sure to know about the tensorboard

Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
If you are looking for the easiest and free way, use https://weights.gg, which ofc uses RVC

#

Maybe you could try locally first

#

@simple ore did u ever hear someone train on a 4060 8gb laptop?

simple ore
#

You can, 8GB bs4 is doable

#

can be more with checkpointing turned on in the new build

knotty moth
charred portal
#

When I generated it before it generated a whopping 150 .pts files, is that correct or am I doing something wrong?

brazen skiff
#

Help, does anyone know what sample rate I should train at if my dataset has a sample rate of 44 kHz, should I train it at 40 or 48 kHz?

knotty moth
glacial pollen
#

those 4 kHz are actually important for sibilants and fidelity, regardless
a 40 kHz model would dampen the clarity, so 48 is the way to go. Luckily 48 works fairly fine with 44.1 kHz audio if you're careful with training and the data itself
(Though make sure it's truly 44.1 kHz, i.e. check the frequency spectrum itself)
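
A quick way to see what sample rate a dataset file declares is to read its WAV header with Python's standard `wave` module. This is a sketch with a hypothetical helper name, `wav_info`; note that the header only shows the declared rate, so an upsampled file can still say 44100 while its spectrum stops much lower, which is why the spectrum check above still matters:

```python
import wave

def wav_info(path):
    """Read basic header facts from an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        return {
            "sample_rate": rate,              # e.g. 44100 or 48000
            "channels": wf.getnchannels(),    # 1 = mono, 2 = stereo
            "seconds": wf.getnframes() / rate,
        }
```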

latent kettle
#

I got model maker, now it's time for model master, can someone please guide me ?

woven anvil
#

Hi there RVC beginner here,
Does anyone have a good tutorial that explains how RVC/Gradio works?
Can't find a tutorial I understand on YouTube.

latent kettle
woven anvil
#

No I just want to create a voice model for myself.

#

But it's awfully complicated : S

latent kettle
woven anvil
#

Oh one sec I'll check.

knotty moth
latent kettle
woven anvil
#

Nvidia Geforce GTX 1660 Super and then I also have a Quadro P4000.

latent kettle
woven anvil
knotty moth
woven anvil
latent kettle
woven anvil
#

Oh my bad.

latent kettle
knotty moth
woven anvil
# latent kettle VRAM, video memory, not RAM

Introducing The GeForce GTX 1660 SUPER
Making it SUPER is the addition of 14 Gbps GDDR6 VRAM, which boosts peak memory bandwidth to 336 GB/s (a 75% improvement over the GeForce GTX 1660's 8 Gbps 192 GB/s GDDR5 VRAM).

Quadro should be 8GB

knotty moth
woven anvil
#

I have two PC's, one has Quadro the other the super.

woven anvil
latent kettle
knotty moth
latent kettle
#

So p4000 will be a good option

latent kettle
woven anvil
#

Okay so I need to use the other Desktop to do this.

latent kettle
knotty moth
woven anvil
knotty moth
latent kettle
woven anvil
knotty moth
latent kettle
#

Read carefully

knotty moth
woven anvil
#

I know, but I don't know what they're telling me in the steps, there are lots of terms and file types I'm completely unfamiliar with.

latent kettle
latent kettle
woven anvil
#

How do I turn my audio sample into a .PTH file?

#

Says I have to drop it into the weight folder.

#

I only have wav type right now.

knotty moth
woven anvil
#

Dear gods...

latent kettle
#

So let's begin

#

1st you need a dataset ("dataset" is the audio samples of your character's voice). From me, the recommended length is a minimum of 10 minutes and a maximum of 30 minutes.

woven anvil
#

I have done that, I have almost 20 minutes of clean audio with lots of variation in pitch and emotions etc.

#

In WAV format.

latent kettle
#

To prepare a good dataset, you need to remove background music, reverb, echo and noise using UVR 5

#

Also cut the silence and pauses from dataset
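
Cutting silence is usually done in an audio editor, but the idea is simple enough to sketch: drop long runs of quiet samples and keep short natural pauses. The helper below works on a plain list of 16-bit sample values; `cut_silence`, the threshold and the run length are all assumptions chosen for illustration, not settings from UVR or RVC:

```python
def cut_silence(samples, threshold=500, min_run=4800):
    """Remove runs of quiet samples that last at least `min_run` samples.

    A sample counts as quiet when its absolute value is below `threshold`.
    Shorter quiet runs are kept so speech keeps its natural gaps.
    """
    kept = []
    run = []  # the quiet run currently being collected
    for s in samples:
        if abs(s) < threshold:
            run.append(s)
            continue
        if len(run) < min_run:
            kept.extend(run)  # short pause: keep it
        run = []
        kept.append(s)
    if len(run) < min_run:
        kept.extend(run)  # handle a quiet tail at the end of the clip
    return kept
```

At 48 kHz, `min_run=4800` corresponds to a tenth of a second; a longer value keeps more of the speaker's natural pacing.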

woven anvil
#

Was like half way with training a model with Google Colab before it completely broke and became unusable 🤭

woven anvil
latent kettle
woven anvil
#

Can't I do it with gradio?
Was kinda happy I got it working 🥲

#

Colab is completely unusable so even if I wanted to I couldn't get it to work again.

latent kettle
latent kettle
woven anvil
#

Okay so I have to install that first then?
And if it's based on the cloud, then I could use this PC with the super instead?

#

Is Kaggle free to use?

latent kettle
woven anvil
#

Doesn't matter as long as I get a voice model at the end 😬