#✨│ai-help


covert anchor
#

only train button

rare gobletBOT
#

Ayo? @covert anchor level 3 !!! lfg

knotty moth
#

sorry no idea cuz I'm not using the latest applio

covert anchor
#

oh :c

knotty moth
#

btw check the model name, should be only alphanumeric without spaces as safe bet
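A minimal sketch of the "only alphanumeric, no spaces" check suggested above. The function names are made up for illustration, and restricting to ASCII letters and digits is an assumption about what "safe" means here.

```python
import re


def is_safe_model_name(name: str) -> bool:
    """True if the name is made only of ASCII letters and digits."""
    return re.fullmatch(r"[A-Za-z0-9]+", name) is not None


def safe_model_name(name: str) -> str:
    """Drop spaces and every other non-alphanumeric character."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "", name)
    if not cleaned:
        raise ValueError("name has no alphanumeric characters left")
    return cleaned
```

For example, `safe_model_name("mezcla v2!")` gives `"mezclav2"`.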

covert anchor
#

well, dw, you tried and i appreciate that

#

this is the name

#

it's a champion from league, on spanish

#

the name of the dataset, "mezcla" means mix

#

and that's all

#

probably im just gonna wait until tomorrow and get help from a friend who uses the latest version of applio

#

anyways, thx!

simple ore
#

for 10+ hours, sure

#

slice the audio. You can't train with one unsliced 9 min file.

simple ore
#

because that's how it works. Either you slice the file yourself into 3-5s slices, or you let Applio slice it in preprocess
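For the "slice it yourself" route, here is a rough sketch using only Python's standard `wave` module. It assumes an uncompressed PCM .wav and cuts at fixed lengths, so it can split mid-word; it is not what Applio's preprocess does internally. `slice_wav` and its parameters are hypothetical names.

```python
import wave
from pathlib import Path


def slice_wav(src: str, out_dir: str, slice_seconds: float = 4.0,
              min_seconds: float = 1.0) -> list[str]:
    """Cut a PCM WAV into fixed-length slices; drop a tail shorter than min_seconds."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames_per_slice = int(params.framerate * slice_seconds)
        min_frames = int(params.framerate * min_seconds)
        index = 0
        while True:
            data = reader.readframes(frames_per_slice)
            n_frames = len(data) // (params.sampwidth * params.nchannels)
            if n_frames == 0 or n_frames < min_frames:
                break  # end of file, or a leftover tail that is too short
            path = out / f"slice_{index:04d}.wav"
            with wave.open(str(path), "wb") as writer:
                writer.setparams(params)   # frame count in the header is fixed on close
                writer.writeframes(data)
            written.append(str(path))
            index += 1
    return written
```

With the default 4 s slices, a 9 min file would come out as roughly 135 slices.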

covert anchor
#

how do i let applio do it?

#

i will do it tomorrow btw but i want to know

simple ore
covert anchor
#

okay okay

#

i will try it tomorrow

#

thk so much ❤️

boreal sluice
#

does anybody know how to use kaggle?

#

the modified version?

flint solar
boreal sluice
rare gobletBOT
#

Ayo? @boreal sluice level 1 !!! lfg

dusty bone
#

whenever i use any voice it rlly just cracks alot lol

#

is there a fix or is my mic the problem

knotty moth
gritty zinc
#

What is best ai changer for pc

#

Crash alot

odd shale
odd shale
#

It's kinda better than the OG version.

#

And if you ask, nope, we can't give support nor troubleshooting for voice.ai issues

odd shale
gritty zinc
#

Make sense

#

Ty

knotty moth
gritty zinc
#

Fr

#

Agree

#

Badly

odd shale
#

Tiny fact: models made with a tiny/moderate-size dataset prolly won't perform properly on W-Okada or Voice.ai

#

Tho there can be tiny/lucky exceptions

gritty zinc
#

Make sense

rare gobletBOT
#

Ayo? @gritty zinc level 1 !!! lfg

gritty zinc
#

So this One work on everything?

odd shale
#

Deiteris fork is kinda more optimized than others.

#

Also it's a matter of playing around with settings and reading the guide.

gritty zinc
#

@odd shale

#

i want use rick voice

#

soooo

#

hm

odd shale
#

I'm not sure which version of W-Okada you're using

gritty zinc
#

oh yh ty

pure pecan
#

I have f5-tts installed, how do I put weighted voices on it? it only seems to let me upload audio samples, and not .zip of weighted

dusty bone
#

i do have a deeper voice

#

soo ill find smth that fits more

pure pecan
low shard
# pure pecan Sorry, what is 0shot?

0shot = no actual training needed, just an audio file to work, inferior in terms of quality to few-shots
few-shots = training needed, an example is GPT-SoVITS (TTS) & RVC (STS)

#

You can't upload RVC models to GPT-SoVITS nor F5 TTS

pure pecan
#

huh, could have sworn I installed a version of f5-tts that could do training based on the github tutorial, oh well, thanks for the information

low shard
#

RVC models can be used only in programs that use RVC, such as W-okada

low shard
gritty zinc
#

hm

low shard
#

You can train the big model that's being used for 0shot, not train every voice you want to a model in F5 tts

low shard
# gritty zinc can i smh link it to disc?

You mean as a realtime voice changer for calls? you could technically do that for any TTS

#

But it's not really realtime

#

You'd have to type the words and let the audio play so

gritty zinc
#

eh beter than nothing

low shard
#

So it's Speech To Speech, rather than Text To Speech

#

What's your pc gpu?

gritty zinc
#

this is too complicated for my dump ass i just want act in vc for online class

#

thought it would be good idea

#

prob no

low shard
#

Leo gave you the wokada guide above

#

which has everything you need to know

gritty zinc
#

yes buttt

rare gobletBOT
#

Ayo? @gritty zinc level 2 !!! lfg

gritty zinc
#

it kinda NOT what im looking for excatly

low shard
gritty zinc
#

i use that prob

low shard
# gritty zinc smt like the text speech thing

Yea you can install any TTS in the guide and use it for calls

Really depends tho
If you want generic voices and easy: edge tts
If you want custom voices and easy: F5 TTS or Fish Speech
If you want custom voices and best quality (but more complex): gpt-sovits

#

the process for using each of those in calls is kind of the same

#

Also, depends if your pc gpu is good enough tho

gritty zinc
#

alr tyy

true ravine
#

pls help

rare gobletBOT
#

Ayo? @true ravine level 1 !!! lfg

pure pecan
#

ok I downloaded RVC off of pinokio, what tab and where do I put the weighted file?

rare gobletBOT
#

Ayo? @pure pecan level 1 !!! lfg

true ravine
#

why did you guys get ai voice changer

#

whats the point of this

pure pecan
#

wait, is RVC only for singing, I thought it was a text to speech

timid valve
#

it's speech to speech actually

#

so that includes singing

pure pecan
#

well I found a folder called weights, and put my model in there, but when im in the rvc ui, I don't see any option to select it.

pure pecan
#

nvm figured it out, had to take the pth file out

#

just wish i could use tts instead of voice...

#

I want to script something and have it read it all out

#

blah i just keep getting errors trying to convert voice, oh well, thanks anyway

pure pecan
#

what am i doing wrong

#

well crepe worked

rare gobletBOT
#

Ayo? @pure pecan level 2 !!! lfg

pure pecan
#

sounds awful with my voice though

brittle wing
pure pecan
brittle wing
#

Colab or local

pure pecan
#

local

true ravine
#

Hello

#

How to hide app?

crude flame
pure pecan
#

maybe the model was just bad, tried another one and it's really good (1000 epochs hatsune miku)

#

seems to not work with recordings that are longer than a minute

#

and rmvpe doesnt work at all

#

yeah i think im gonna need some 1 on 1 help in call or something i just dont get what im doing

neon hemlock
#

hirari

pure pecan
#

yeah?

blazing crane
#

Need help
No errors, just the RVC bugs out and stops whenever i try to train
INFO:torch.nn.parallel.distributed:Reducer buckets have been rebuilt in this iteration

rare gobletBOT
#

Ayo? @blazing crane level 1 !!! lfg

blazing crane
#

Using Ov2 pretrain at 40k

brittle wing
#

Last time, which one is the best alternative for denoising?

pure pecan
brittle wing
#

-colab

azure marshBOT
# brittle wing -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

crude flame
brittle wing
#

The one in the colab.???

pure pecan
crude flame
#

oh

#

misread

#

i thought it was several images

brittle wing
#

It's a collage...

#

Yh

#

Just asking

crude flame
#

for colab melband, it is the second one. for mvsep, you can use either standard or aggressive for the denoise by aufr33. and for xminus, you can use either of the melband

brittle wing
#

Cause results are different everywhere

crude flame
#

the first two are the same (mvsep and colab) and idk about xminus

brittle wing
crude flame
#

just use the one that sounds best

brittle wing
#

Which one is it

brittle wing
crude flame
#

2

brittle wing
#

I don't wanna waste time or GPU

brittle wing
crude flame
#

yes

#

thats the second one

#

thats the one with a 2

brittle wing
#

In the Colab?

crude flame
#

yes

brittle wing
#

With high Chunk size and overlap 16?

crude flame
#

thats fine

brittle wing
#

Is this dereverb model okay for future datasets?

#

Should I use overlap 8 or 16?

covert anchor
blazing crane
marsh schooner
#

what do u think about this?

simple ore
#

1hr+ - 8

#

<1hr - 4

#

10 hr+ - 16

#

something like that

#

more data makes a batch more uniform
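The rule of thumb above, written out as a small lookup. The thresholds are taken straight from these messages ("something like that"), so treat them as a starting point, not a hard rule; `suggest_batch_size` is a made-up name.

```python
def suggest_batch_size(dataset_minutes: float) -> int:
    """Rough batch size from dataset length, per the rule of thumb in chat."""
    if dataset_minutes < 0:
        raise ValueError("dataset length cannot be negative")
    if dataset_minutes >= 600:   # 10 hr+
        return 16
    if dataset_minutes >= 60:    # 1 hr+
        return 8
    return 4                     # under 1 hr
```

VRAM is a separate limit: a batch size the GPU cannot hold will not work no matter how long the dataset is.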

covert anchor
#

now i have another problem

#

i train too slow, idk why

#

52 sec every epoch

marsh schooner
simple ore
#

with a smaller batch those outliers have a more noticeable effect

marsh schooner
simple ore
#

sure you can try batch 8 for 30m+

marsh schooner
#

or is it just the time

simple ore
#

it does affect the training result

#

the model is trying to find optimal parameters to generate a voice, with different batches it may overshoot or undershoot the target or circle around local minima

marsh schooner
#

oh dang should i try to retrain then bc yesterday i made like a 22 min dataset on batch size 8 and it doesnt sound like the voice i was making but it sounds realistic

simple ore
#

just to give an idea

covert anchor
#

there's any way i can improve the speed on the training?

simple ore
covert anchor
#

i should be training at 2 or 3 secs on every epoch, or at least i think i should

covert anchor
marsh schooner
simple ore
#

it was training 2-3sec/epoch because you had an empty set lol

marsh schooner
#

lol

simple ore
#

with 2 mute files and discarded 9 min audio

simple ore
covert anchor
simple ore
#

but now that you've sliced it, the training actually uses it, and for a 9 min file on a 4060, 52 s/epoch is a good speed

covert anchor
#

oh

rare gobletBOT
#

Ayo? @marsh schooner level 4 !!! lfg

covert anchor
#

okay okay

#

thx again, i will train at that speed and check if it was good

marsh schooner
#

the goal is to get it as far down as possible right?

#

the further down means more realism most likely

#

and is it fine going backwards like this?

modern surge
#

-rvc

azure marshBOT
modern surge
#

-colab

simple ore
simple ore
#

mel loss is how close the spectrogram of the generated audio is to the original

#

less is better

#

fm is a more complex measure, generally it should go down or stay somewhat stable
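To make "how close the spectrogram is" concrete, here is a toy sketch that scores two mel spectrograms (as plain lists of rows) by mean absolute difference. Using an L1-style distance is an assumption for illustration; the exact formula and scaling the trainer uses are not stated in this chat.

```python
def mel_l1_distance(generated: list[list[float]],
                    target: list[list[float]]) -> float:
    """Mean absolute difference between two equally shaped spectrograms."""
    if len(generated) != len(target) or any(
        len(g_row) != len(t_row) for g_row, t_row in zip(generated, target)
    ):
        raise ValueError("spectrograms must have the same shape")
    total = 0.0
    count = 0
    for g_row, t_row in zip(generated, target):
        for g, t in zip(g_row, t_row):
            total += abs(g - t)
            count += 1
    if count == 0:
        raise ValueError("spectrograms are empty")
    return total / count
```

Identical spectrograms score 0.0, which is why "less is better".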

marsh schooner
sharp rampart
#

hey does rvc work in ableton or any daw for real time or no

#

like can you talk through rvc and get the outputs/inputs into your daw like fl studio, ableton, etc in realtime with no delay

#

or is that simply not possible

marsh schooner
sharp rampart
#

this

rare gobletBOT
#

Ayo? @sharp rampart level 1 !!! lfg

sharp rampart
#

or do i have the wrong thing to do it

marsh schooner
#

that i have no clue i use the voice changer i thought that was just text to speech though

sharp rampart
#

oh thats probably what thats for then

#

let me go check

marsh schooner
#

i run mine using my 3060 ti

sharp rampart
#

i need it to have no delay so i can make music with it

#

thats not possible yet is it?

marsh schooner
#

i have no clue i dont do music ai

sharp rampart
#

i got a 40 series

#

ok

marsh schooner
unique rock
#

For how many minutes of dataset can I use this batch size without the colab disconnecting and for how many times so that the model does not sound bad?

marsh schooner
#

isnt this slow for a 4090 22-25 min dataset

steady geyser
#

Where do i get a virtual mic

simple ore
chilly ridge
#

Hello everyone! I need advice from someone who knows RVC, please. I can't find what I'm looking for on the forums, or it's old (or I don't know how to search). I'm looking to train an rvc model; the problem is that I only have 2min of audio, in wav files of about 2s each. So I've tried 50, 100, 300 epochs, changing the batch size... I still get a "robotic" voice and all the Ss or CHs are "metallic". Is it possible to train my model with just these wav files, or is it better to increase the size of my dataset? All my audio files are studio quality, without reverb etc... Thank you!

#

I run it locally on a quadro RTX 4000 Ada 8 GB

analog obsidian
#

it does not really fix them but makes them appear way less

kindred swallow
#

"frequent errors occurred" anyone know how to fix

chilly ridge
analog obsidian
#

use audacity audio labelling

chilly ridge
#

All my files are around 2s long, is it better to manually increase the duration?

#

I can make a python script for that, 3s of full audio, without silence

#

3s average

analog obsidian
chilly ridge
#

Okay thx

analog obsidian
#

10 minutes is the bare minimum for okay performance
models hit the "realistic" tone at around 40 minutes

chilly ridge
#

I have another model with 8min of audio, also in 2s wav files, but it's a really "specific" voice, and it's really glitchy when i use it. (It's Pat from Mickey in french), even if i try to sound like the original voice to help rvc it's the same. But if i set the pitch to -12 it's super clean, but, tooooooo deep

oak inlet
#

yii

analog obsidian
oak inlet
#

yoo

#

wsg

chilly ridge
rare gobletBOT
#

Ayo? @chilly ridge level 2 !!! lfg

analog obsidian
oak inlet
#

vocal and audio

analog obsidian
#

or mel roformer kim

oak inlet
rare gobletBOT
#

Ayo? @oak inlet level 1 !!! lfg

chilly ridge
analog obsidian
oak inlet
#

oh

analog obsidian
#

you can keep a few laughs

#

but be sure to not add too much of them

#

might confuse ai believing thats how the person sounds

oak inlet
#

how do i add ffprobe and ffmpeg to root. Im kinda restarted

chilly ridge
analog obsidian
oak inlet
#

nah win

#

11

analog obsidian
oak inlet
#

ok

simple ore
#

so RVC needs about 5-10k attempts to make a proper 'S' or 'Ch' or similar sounds out of pure noise. That is why an undertrained model produces a metallic S.

chilly ridge
#

Ho i see

analog obsidian
simple ore
#

with a too small dataset you need to train for like 2000+ epochs

#

but chances are that while it may fix S sounds it may ruin voiced parts (those wavy lines)
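A back-of-the-envelope sketch for why a tiny dataset needs so many epochs. It reads "attempts" as optimizer steps (one per batch), which is one possible reading and not confirmed here; it also assumes the last partial batch is kept. `total_training_steps` is a hypothetical helper.

```python
import math


def total_training_steps(n_clips: int, batch_size: int, epochs: int) -> int:
    """Steps = batches per epoch (rounded up) times number of epochs."""
    if n_clips <= 0 or batch_size <= 0 or epochs < 0:
        raise ValueError("n_clips and batch_size must be positive, epochs non-negative")
    steps_per_epoch = math.ceil(n_clips / batch_size)
    return steps_per_epoch * epochs
```

With about 60 clips (2 min of ~2 s files) and batch size 4, that is 15 steps per epoch, so 300 epochs is only 4500 steps in total, and only some of those batches contain an S.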

chilly ridge
#

I'll use audio file from all kingdom hearts game instead of 1 to increase the dataset size

#

I think i can reach 8-10min

chilly ridge
simple ore
#

the example above took 5000 loops

chilly ridge
#

The image ?

simple ore
analog obsidian
#

he's trying to explain to you that the more steps you train, the less they appear

#

xD

simple ore
#

no, they don't appear less, they take a proper shape

chilly ridge
#

Wow

#

It's a big difference

simple ore
#

the noise column shifts into the right frequency range

#

so instead of metallic z it is a proper hissing s

analog obsidian
chilly ridge
#

Haha

#

Ok so, with 10min dataset. 3 batch size and 2000epoch ? 🙃

analog obsidian
chilly ridge
#

Ok, i'll try it tomorrow (it's 2am here)

#

Good night, thx for all

analog obsidian
simple ore
#

it would depend on how many times there's S in the dataset and if you're lucky enough for the training to hit that S slice of audio

knotty moth
simple ore
#

anyway, with a pretrain the requirements are not that high.. i've been testing it from scratch

simple ore
#

nowhere perfect, but like 50% of the work is done there

#

after that it is just small touches here and there that slowly shape up the voice

#

but anyway, rvc is not a real voice clone, there are very important characteristics it cannot reproduce, such as personal inter-phoneme microdelays and mannerisms

knotty moth
dull ledge
#

@analog obsidian Hi Lyery, are you there?

dull ledge
#

You have a really cool Goth Mommy model, unfortunately the link is not working, did you delete it? If not, mind to share it with me if its still public? Thank you 😄
The link is this one: https://voice-models.com/model/1ucea3z45g5

analog obsidian
dull ledge
#

Was the best one I heard 🤭 . No worries, thank you

oak inlet
#

yo

#

i got an error

azure marshBOT
oak inlet
#

File "C:\Users\vedant\Desktop\Retrieval-based-Voice-Conversion-WebUI-main\infer\modules\vc\modules.py", line 172, in vc_single
self.hubert_model = load_hubert(self.config)
File "C:\Users\vedant\Desktop\Retrieval-based-Voice-Conversion-WebUI-main\infer\modules\vc\utils.py", line 23, in load_hubert
models, _, _ = checkpoint_utils.load_model_ensemble_and_task(
File "C:\Users\vedant\AppData\Local\Programs\Python\Python310\lib\site-packages\fairseq\checkpoint_utils.py", line 423, in load_model_ensemble_and_task
raise IOError("Model file not found: {}".format(filename))
OSError: Model file not found: assets/hubert/hubert_base.pt

#

!howtoask

patent trellisBOT
# oak inlet !howtoask

How To Troubleshoot

__**GIVE CONTEXT.**__ 📝
  • Don't simply mention your issue, like "my rvc is not working".
  • Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
  • The more context, the better.
__**BE POLITE.**__
  • Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
  • It's okay if you're frustrated, but don't take it into this server.
  • Don't DM without prior consent.
__**BE PRODUCTIVE.**__ 🤝
  • Don't ask for every little instruction. Put your own effort & test things by yourself.
  • Don't ask to ask.
  • Check if your answer is a Google search away/on our guides website.
azure marshBOT
magic badge
#

/collab

#

-colab

wet fable
#

hi there, i'd like to ask whether RVC is better than weights.gg
RVC needs lots of time to generate, while weights.gg not
idk if there's differences. Thanks!

wet fable
#

I'm sorry so it's exactly the same thing?

knotty moth
crude flame
wet fable
#

Ok thanks!

low shard
#

I don't remember if it was an AI specialized GPU like A100 or an rtx 4090 tho

flint solar
#

hello chat

#

oh fuck wrong chat

prisma kite
#

help, I got No module named 'gradio'
and I don't know how to fix it

simple ore
#

but better just get a compiled version

#

with all the required packages included

gentle hollow
#

Guys plsss tell me wat Ai vocal remover y’all useee

young perch
#

Guys, what should I do if I experience a large delay in my voice changer? When I test it in the program, delay is about 3-4 seconds, but when I join the game, it shows around 30 000 ms. Please Help(((

sage wind
#

ummm guys

#

when i opened voice models channel theres literally 0 models

low shard
#

might be a discord moment

sage wind
low shard
low shard
sage wind
rare gobletBOT
#

Ayo? @sage wind level 1 !!! lfg

sage wind
#

i can even send screenshot but i dont have permission

low shard
low shard
sage wind
#

🔥🔥🔥🔥

#

ty yall

low shard
#

yw

chilly ridge
#

20s/epoch

dawn bronze
analog obsidian
low shard
#

Why would you want to convert them from .wav to mp3? mp3 is lossy and lower quality than wav

#

wav is way better, it's lossless

#

There's no reason to convert them,

If you really want to do that, you can just google any wav to mp3 converting site like this

#

But It's really not suggested

knotty moth
azure marshBOT
#
✍ Suggestions
  • Search for it in AI HUB Docs or Applio Docs. You will probably find your answer there 📚
  • Ask for help in #🔍│help-w-okada if it's related to real time voice changing but make sure to read #1297207135469305866 first
  • Ask for help in #✨│ai-help for general help, but use the command !howtoask first to learn how to structure your question properly and increase your chances of getting a reply
  • Last but not least, ask for help in #🔍│help-ai-art if it's related to AI images.
chilly ridge
#

but it's really really better

analog obsidian
#

anything below that use a batch size of 4

#

and pray for getting them less often

chilly ridge
#

okay, and if i try more epochs? like 400? will it just get worse?

analog obsidian
#

no audible difference + unnecessary risks

chilly ridge
#

ok i'll try to increase the dataset again

#

do you know how i can automatically extract the voice of a certain character from a movie file?

#

it look really long to do it manually

analog obsidian
chilly ridge
#

i tried with python and speechbrain but it's not convincing

rare gobletBOT
#

Ayo? @chilly ridge level 3 !!! lfg

analog obsidian
#

sadly is better to separate speakers manually

analog obsidian
chilly ridge
#

it just identified the whole movie as if it were the character i'm looking for

low shard
analog obsidian
low shard
chilly ridge
analog obsidian
#

they do have a paid version i havent tried

#

probably he tried that

low shard
analog obsidian
low shard
#

Nuh uh

#

Not paying for any shit

#

I love Open Source

analog obsidian
#

me 2

chilly ridge
#

i'll try again with diarization

#

maybe i did something wrong

low shard
low shard
low shard
#

but maybe could help, im saying it just for that

#

imagine having to watch a whole movie to make a model 😭

chilly ridge
#

any idea how i can export all the movie scene with that character ?

#

clearly i don't like the idea of doing it manually

chilly ridge
#

sad

#

for just 2min of audio :)

chilly ridge
glacial pollen
#

so, where is it

analog obsidian
glacial pollen
#

ah

#

I mean

#

It should be pretty simple

#

🤔

#

you lack a model ( a component of rvc, that is )

oak inlet
#

oh

chilly ridge
glacial pollen
#

@oak inlet

analog obsidian
glacial pollen
#

Dl all you lack from there

chilly ridge
#

thx

oak inlet
glacial pollen
#

hubert .pth goes into assets/hubert

#

rmvpe goes into assets/rmvpe
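A small sketch that checks whether those two files are where RVC looks for them. The relative paths come from the traceback and the log lines in this chat (`assets/hubert/hubert_base.pt`, `assets/rmvpe/rmvpe.pt`); other forks may use different locations, and `missing_assets` is a made-up helper name.

```python
from pathlib import Path

# Paths as they appear in the traceback and log output in this chat.
REQUIRED_ASSETS = (
    "assets/hubert/hubert_base.pt",
    "assets/rmvpe/rmvpe.pt",
)


def missing_assets(rvc_root: str, required=REQUIRED_ASSETS) -> list[str]:
    """Return the required files that do not exist under the RVC folder."""
    root = Path(rvc_root)
    return [rel for rel in required if not (root / rel).is_file()]
```

Run it against the RVC folder; an empty list means both files are in place.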

oak inlet
#

I DIDNT KNOW WHAT TO DO ON THAT TY

glacial pollen
#

oof

#

You see

#

always look at the last part of traceback

oak inlet
#

legend fr

glacial pollen
#

( + additionally, the root of the issue

#

but typically the last line tells you what is the main deal
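The "read the last line" habit as a tiny sketch: given pasted traceback text, pull out the final non-empty line, which in Python is normally the exception type and message. `last_traceback_line` is a hypothetical helper.

```python
def last_traceback_line(traceback_text: str) -> str:
    """Return the final non-empty line of a pasted traceback."""
    lines = [line.strip() for line in traceback_text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no text to inspect")
    return lines[-1]
```

For the error posted earlier, this returns `OSError: Model file not found: assets/hubert/hubert_base.pt`, which already names the missing file.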

glacial pollen
#

Alr, glad I could help. best of luck man

oak inlet
glacial pollen
#

yes

#

.pt or .pth are PyTorch model formats
~ for the record

oak inlet
#

oh

#

is it hubert base

glacial pollen
#

yes

oak inlet
#

oh ty

glacial pollen
#

rmvpe is for f0 extraction
hubert is for features

waxen kelp
#

hello how do i send pictures for help?

glacial pollen
oak inlet
#

its level 2

glacial pollen
#

oh, then 2 then. Seems like the threshold got lowered

oak inlet
#

i can do it rn

oak inlet
glacial pollen
#

or somethin'
to not clog the main chats

#

Anyway, I gotta go back to work now

oak inlet
#

yo i got another error now, there is no traceback, it processes for a few sec then it says error

#

2024-12-01 11:13:13 | INFO | fairseq.tasks.hubert_pretraining | current directory is C:\Users\vedant\Desktop\Retrieval-based-Voice-Conversion-WebUI-main
2024-12-01 11:13:13 | INFO | fairseq.tasks.hubert_pretraining | HubertPretrainingTask Config {'_name': 'hubert_pretraining', 'data': 'metadata', 'fine_tuning': False, 'labels': ['km'], 'label_dir': 'label', 'label_rate': 50.0, 'sample_rate': 16000, 'normalize': False, 'enable_padding': False, 'max_keep_size': None, 'max_sample_size': 250000, 'min_sample_size': 32000, 'single_target': False, 'random_crop': True, 'pad_audio': False}
2024-12-01 11:13:13 | INFO | fairseq.models.hubert.hubert | HubertModel Config: {'_name': 'hubert', 'label_rate': 50.0, 'extractor_mode': default, 'encoder_layers': 12, 'encoder_embed_dim': 768, 'encoder_ffn_embed_dim': 3072, 'encoder_attention_heads': 12, 'activation_fn': gelu, 'layer_type': transformer, 'dropout': 0.1, 'attention_dropout': 0.1, 'activation_dropout': 0.0, 'encoder_layerdrop': 0.05, 'dropout_input': 0.1, 'dropout_features': 0.1, 'final_dim': 256, 'untie_final_proj': True, 'layer_norm_first': False, 'conv_feature_layers': '[(512,10,5)] + [(512,3,2)] * 4 + [(512,2,2)] * 2', 'conv_bias': False, 'logit_temp': 0.1, 'target_glu': False, 'feature_grad_mult': 0.1, 'mask_length': 10, 'mask_prob': 0.8, 'mask_selection': static, 'mask_other': 0.0, 'no_mask_overlap': False, 'mask_min_space': 1, 'mask_channel_length': 10, 'mask_channel_prob': 0.0, 'mask_channel_selection': static, 'mask_channel_other': 0.0, 'no_mask_channel_overlap': False, 'mask_channel_min_space': 1, 'conv_pos': 128, 'conv_pos_groups': 16, 'latent_temp': [2.0, 0.5, 0.999995], 'skip_masked': False, 'skip_nomask': False, 'checkpoint_activations': False, 'required_seq_len_multiple': 2, 'depthwise_conv_kernel_size': 31, 'attn_type': '', 'pos_enc_type': 'abs', 'fp16': False}
2024-12-01 11:13:14 | INFO | infer.modules.vc.pipeline | Loading rmvpe model,assets/rmvpe/rmvpe.pt

oak inlet
#

oh

flint solar
#

its info

oak inlet
#

but it says error

#

in the webui

flint solar
waxen kelp
#

hi! i try to use RVC but in the end it says this.does anyone know whats the problem and how to fix it

oak inlet
flint solar
waxen kelp
#

?

#

i thought i have the latest version

distant turtle
#

-colab

waxen kelp
#

seems a lot of stuff

#

im not aware of much github

oak inlet
#

wait

#

make sure u have pip 24.0

#

3.10

#

rc1

waxen kelp
oak inlet
rare gobletBOT
#

Ayo? @oak inlet level 3 !!! lfg

oak inlet
#

wait

#

u wanna call or sm?

#

through dc

#

discord

waxen kelp
#

nah its aight i was just tryna make a silly cover

oak inlet
#

oh

waxen kelp
#

thank you very much helping tho!

rare gobletBOT
#

Ayo? @waxen kelp level 2 !!! lfg

low shard
#

and yea u shouldn't use rvc gui

waxen kelp
low shard
#

You can check your pc gpu via:
ctrl+shift+esc (task manager) -> Performance tab -> GPU
and here u should find the gpu memory

#

I have never heard of that gpu + can't find it online

gloomy lynx
#

I just realized my version of rvc gui is outdated too damn

rare gobletBOT
#

Ayo? @gloomy lynx level 1 !!! lfg

low shard
#

what's ur pc gpu

gloomy lynx
#

nvidia geforce rtx 3060 ti

low shard
gloomy lynx
#

Oh alr

#

what do you thinks better?

low shard
gloomy lynx
#

Alr lemme try it

chilly ridge
analog obsidian
chilly ridge
#

Both

#

10min and 25s

analog obsidian
#

thats not normal, it should be 1-3 minutes per epoch if your dataset has 1 hour of data

#

how much vram do you have

chilly ridge
#

I reach 48min without silence in audio

chilly ridge
#

I have an AI gpu normally

#

I don't know if it changes something

analog obsidian
#

oh that explains

chilly ridge
#

Quadro ADA RTX 4000 8 GB

analog obsidian
chilly ridge
#

Yep

#

I checked my cuda installation and everything looks ok

analog obsidian
#

idk sorry i have no idea, i don't use these types of gpu

chilly ridge
#

With the 10min dataset and 4batch it was 7s/epoch

#

I checked the "cache dataset in gpu" option

analog obsidian
#

this explains

#

caching the dataset in gpu with huge datasets causes massive vram usage

#

you're using system RAM

#

this is why it became so slow

#

system is using fallback memory

#

disable that on big datasets

chilly ridge
#

Ok so i need to uncheck this option ?

analog obsidian
#

yes

chilly ridge
#

Okay, i try

#

Thx

rare gobletBOT
#

Ayo? @chilly ridge level 4 !!! lfg

chilly ridge
#

Thx 🙏

analog obsidian
#

no problem

chilly ridge
#

I'll credit you if i finish my project one day

oak hearth
#

so uhwhere do i download

rare gobletBOT
#

Ayo? @oak hearth level 2 !!! lfg

oak hearth
#

my thingy flatlined, it TECHnically is going down and it isnt going up do i just download the latest .pth saved?

#

or do i let it go until it goes up?

simple ore
knotty moth
oak hearth
knotty moth
oak hearth
#

so uh is there any way to resume training without a g and d file

gloomy lynx
#

i dont think so?

#

I havent made a model in awhile so im not sure

chilly ridge
chilly ridge
#

Mb

knotty moth
chilly ridge
#

Haha, no, i prefer keep it

pastel fiber
#

can someone help

azure marshBOT
chilly ridge
timber hamlet
#

What’s the

pastel fiber
#

why does it say the input is the vb output, it also says output is vb input

timber hamlet
#

We’re are the buzzes poles

#

People

#

What’s a good room too be in

rare gobletBOT
#

Ayo? @timber hamlet level 1 !!! lfg

timber hamlet
#

How do u find that iam new🫤

#

Why u yell

meager comet
#

how much faster is an a100 compared to colab t4

knotty moth
chilly ridge
#

No no

#

I said i don't have an ada

#

Just a quadro rtx 4000

#

My boss's gpu is an rtx 4000 ada

knotty moth
knotty moth
chilly ridge
#

Haha yeah, it's this one

#

I thought I had the same computer as my boss

#

But it looks like he keeps the big gpu for himself

knotty moth
chilly ridge
#

I see, so, my laptop with the 3060 8 GB could be faster per epoch?

knotty moth
#

also RTX laptops are usually easier to overheat than desktop ones

chilly ridge
#

I put my laptop upside down in front of the AC 🤫

chilly ridge
knotty moth
#

it literally shows the same performance as 2060

chilly ridge
#

Okay, not bad for a free gpu

shut goblet
#

how the fuck do i fix this

simple ore
#

or cuda tools

shut goblet
#

it showed an error

#

is there a proper way to open the file?

simple ore
#

is failing to load because some dependency is missing

simple ore
#

from that open the dll shown on the screenshot, it will list what it needs

#

but my guess is either vc++ redist or CUDA toolkit

frank owl
wet fable
#

Are there any accessible applications that could have the same effect as RVC?

blazing solar
#

-cOLAB

rare gobletBOT
#

Ayo? @blazing solar level 1 !!! lfg

low shard
#

what’s ur pc gpu and what are u looking for?

wet fable
low shard
#

RVC is the best speech to speech program

low shard
knotty moth
#

-rvc

azure marshBOT
shut goblet
#

tf is a gradio

low shard
#

so a requirement for eddy’s UVR UI

#

you seem to be on windows, did you run UVR5-UI-installer.bat Without administrator?

low shard
# shut goblet Yes

Are you sure it’s in the C drive and there’s no special characters in whatever folder it is?

#

it doesn’t seem to be on the C drive
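A quick sketch of the two checks asked about above: is the install on the C drive, and does any folder name contain special characters. What counts as "special" is an assumption here (anything outside letters, digits, dot, underscore and hyphen, so spaces are flagged too); `install_path_problems` is a made-up helper.

```python
import re
from pathlib import PureWindowsPath

# Assumption: only these characters are treated as "not special".
_SAFE_PART = re.compile(r"[A-Za-z0-9._-]+")


def install_path_problems(path: str) -> list[str]:
    """List reasons a Windows install path might cause trouble."""
    parsed = PureWindowsPath(path)
    problems = []
    if parsed.drive.upper() != "C:":
        problems.append(f"not on the C drive (drive: {parsed.drive or 'none'})")
    # Skip the anchor (e.g. 'C:\\') and check each folder name.
    folders = parsed.parts[1:] if parsed.anchor else parsed.parts
    for folder in folders:
        if not _SAFE_PART.fullmatch(folder):
            problems.append(f"special characters or spaces in folder name: {folder!r}")
    return problems
```

An empty list means the path passes both checks.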

marsh schooner
#

how many hours do yall think is too many hours in a dataset to be used on a voice changer

#

like if i wanted to avoid it being overtrained what would i do

#

for the maximum

frosty coral
#

why does rvc keep crashing whenever i click start audio conversion? (not responding)

brittle wing
#

-hf

azure marshBOT
frosty coral
#

as in gpu?

marsh schooner
flint solar
#

But if I were to choose

#

I’d go for 7-10 min dataset of clean and diverse data

marsh schooner
#

minutes???

flint solar
marsh schooner
#

i was ab to go for like a 2-3 hours

#

dataset

flint solar
low shard
#

be sure to NOT follow yt tuts

flint solar
low shard
flint solar
#

Ik what he using

low shard
#

Can’t people just read our guides

#

its not that hard to read

flint solar
#

It’s the gui wit harvest and pm

#

The old ass one

low shard
low shard
#

fuck old yt tuts

marsh schooner
flint solar
knotty moth
flint solar
knotty moth
#

he also kept overtraining, perhaps till more than 1k epochs 💀

vale raptor
#

hi I have problem, I tried to import voice model to gui but got this error:
size mismatch for enc_p.emb_phone.weight: copying a param with shape torch.Size([192, 768]) from checkpoint, the shape in current model is torch.Size([192, 256])

how can I fix it?

#

I'm sorry for my mistakes, english is not my native language

flint solar
#

locally or colab

marsh schooner
flint solar
#

The gui ur using is very out of date

#

It’s only compatible with rvc v1 models
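The size-mismatch error can be read as a version mismatch. A sketch, assuming the last dimension of `enc_p.emb_phone.weight` is 256 for v1 models and 768 for v2; that mapping is inferred from the error message plus the remark that the old GUI only handles v1, and the function name is hypothetical.

```python
def rvc_version_from_embedding_shape(shape: tuple[int, ...]) -> str:
    """Guess the RVC model version from the enc_p.emb_phone.weight shape."""
    if not shape:
        raise ValueError("shape is empty")
    feature_dim = shape[-1]
    if feature_dim == 256:
        return "v1"
    if feature_dim == 768:
        return "v2"
    raise ValueError(f"unexpected feature dimension: {feature_dim}")
```

In the error above the checkpoint has shape (192, 768), so it reads as a v2 model being loaded into a GUI that expects (192, 256).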

flint solar
vale raptor
flint solar
azure marshBOT
# flint solar -colab

Suggestions for @vale raptor

📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

flint solar
vale raptor
flint solar
low shard
#

in case he has a good pc to run it on

flint solar
low shard
low shard
# flint solar True

yea bc i seen people using colab while they got an rtx 3060, just because the helper gave them a colab

#

The user should always be asked what hardware they got so the helper can give them the tools to use

amber fjord
flint solar
amber fjord
vale raptor
flint solar
#

No need to use colab

low shard
#

that’s why every staff should always ask for the hardware first

low shard
# vale raptor rtx 2060s

your pc is good enough to run it locally (use it on ur pc), colab is just a cloud service (run it on remote good pc, for people who got bad pc)

#

As you got a good PC, you can use RVC locally, you can choose between:

  • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
  • Mainline: The original RVC
low shard
amber fjord
#

rtx 4060

rare gobletBOT
#

Ayo? @amber fjord level 1 !!! lfg

low shard
amber fjord
low shard
#

im guessing you want realtime voice changer for calls?

amber fjord
#

yeah

low shard
#

Yea it's best to always be specific when asking for help

low shard
azure marshBOT
amber fjord
#

?

#

do i need to click it?

low shard
# amber fjord ?

1st link, it's the w-okada fork. w-okada is a program for using rvc (speech to speech) models in realtime for calls, and the fork is a modified version with better performance

#

the 1st link is a guide to install it (use the nvidia way, since you got an rtx)

#

By reading the guide, you have everything you need to know, if you run into any issues, ask in #🔍│help-w-okada as this isn’t the correct help channel for that

amber fjord
#

thank

low shard
#

yw

vale raptor
rare gobletBOT
#

Ayo? @vale raptor level 2 !!! lfg

flint solar
vale raptor
#

I see

vale raptor
#

thank you very much

simple ore
#

too much variety in the samples, i guess?

flint solar
flint solar
simple ore
#

it started with 35 and ended with 36.4?...so training duration made little difference

#

I have an opposite problem

amber fjord
#

what is it?

simple ore
#

I wish people stopped placing projects into 'special' folders

simple ore
#

in short, programs like voice changer are not designed with the best Windows SDK guidelines. VC executed by you, without admin permissions, is trying to write stuff into semi-protected Program Files folder, which Windows does not allow, because the software should utilize a user folder instead
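
A rough sketch of the "use a user folder instead" idea, assuming a program that wants to store data next to its install location. The function names and the `voice-changer` folder name are hypothetical and not taken from any real voice changer:

```python
import os
import tempfile
from pathlib import Path

def is_writable(directory):
    """Return True if a file can actually be created inside `directory`."""
    try:
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:  # covers PermissionError and missing folders
        return False

def pick_data_dir(install_dir, app_name="voice-changer"):
    """Use the install folder if writable, otherwise fall back to a per-user folder."""
    if is_writable(install_dir):
        return Path(install_dir)
    # LOCALAPPDATA exists on Windows; the home directory is the fallback elsewhere.
    base = os.environ.get("LOCALAPPDATA") or str(Path.home())
    return Path(base) / app_name
```

When the software does not do this itself, the practical workaround is the same thing done by hand: extract it somewhere under your own user folder instead of Program Files.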

flint solar
# simple ore

I never really understood what the fm graph is good for

opal verge
#

hey, I'm using the Google RVC colab and the added index didn't get created ); everything else goes well, what should I do?

low shard
opal verge
#

rvc V2

simple ore
#

since it does not compare the exact values, it is possible that one part of the generated audio gets better while the other gets worse, but the average difference stays about the same
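
A toy illustration of that point, with invented numbers (these are not real fm values from any training run): four segments of generated audio, each with its own difference from the target. Two segments improve, two get worse, and the average does not move.

```python
from statistics import mean

# Made-up per-segment differences between generated and real audio.
before = [35, 35, 35, 35]
after = [30, 40, 33, 37]

avg_before = mean(before)
avg_after = mean(after)
improved = [i for i, (b, a) in enumerate(zip(before, after)) if a < b]
worsened = [i for i, (b, a) in enumerate(zip(before, after)) if a > b]

print("average before:", avg_before, "average after:", avg_after)
print("improved segments:", improved, "worsened segments:", worsened)
```

The chart only shows the averaged value, so a flat line can hide this kind of trade-off between parts of the audio.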

amber fjord
#

maybe I don't understand something, but no matter where I put it, the program doesn't work for me

flint solar
simple ore
flint solar
#

Actually most of the time

simple ore
#

here I had a model try to produce a single sample

#

you can hear the difference and the FM chart reflects that

flint solar
flint solar
low shard
simple ore
#

yeah, best case I see something like

low shard
simple ore
#

cuz someone needs to rename the channel 🙂

#

help-rvc (not VC!)

#

help-w-okada (the VC!)

flint solar
#

I thought it was bad

simple ore
#

I think the value going up is expected as the model figures out the parameters to use, but it should stabilize and settle at some value or around it without much deviation

#

like that 'cement' example above

#

the discriminator used in RVC is not stellar, so in most cases the model just finds some local minima for parameters and settles around that

#

and it does not get out of that hole no matter how much more you train it

#

there's a chance that fm going up is just that hump on the chart after which the value would go down once the model learns to reproduce the audio better, but I've yet to train for that long to see it
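
One way to put "stabilize and settle around some value" into numbers is to look at how much the last few logged values spread around their own mean. This is only a sketch: the helper name, the window of 5 and the 2% tolerance are arbitrary choices for illustration, and the sample values are invented.

```python
from statistics import mean, pstdev

def has_settled(values, window=5, tolerance=0.02):
    """True if the last `window` values stay within `tolerance` (relative spread) of their mean."""
    if len(values) < window:
        return False
    recent = values[-window:]
    center = mean(recent)
    if center == 0:
        return all(v == 0 for v in recent)
    return pstdev(recent) / abs(center) <= tolerance

# Invented curves: one that rises and then flattens, one that keeps climbing.
flattening = [30.0, 32.0, 34.0, 35.5, 36.0, 36.3, 36.4, 36.4, 36.5, 36.4]
climbing = [10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0]

print(has_settled(flattening))
print(has_settled(climbing))
```

A settled value says nothing about whether it settled in a good place, which is the local minima problem described above.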

flint solar
simple ore
#

that would require a whole new set of pretrains

#

I've tested a new loss function that does not require much change, seems to be doing better

fallen grotto
#

does anyone have any colab i can use to make ai covers

rare gobletBOT
#

Ayo? @fallen grotto level 2 !!! lfg

azure marshBOT
# flint solar -colab

Suggestions for @fallen grotto

📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

flint solar
#

Use applio

fallen grotto
#

thanks ho

low shard
azure marshBOT
#

Not available yet

grim bay
#

Any order of model to use to extract vocals from a song to make AI cover?
I'm both using MVSEP and UVR

simple ore
rare bear
#

Sorry if this is the wrong place to ask this question but I'm brand new to AI voices and what can be done with them.
I was wondering if the w-okada voice changer is the only 'app' available for real-time changing of voices on a Mac or are there any other apps I could try RVC files out on a Mac with? Thanks.

rare bear
flint solar
simple ore
rare bear
rare gobletBOT
#

Ayo? @rare bear level 1 !!! lfg

rare bear
low shard
#

lol

flint solar
low shard
inner cloak
#

When I try to start Applio (colab), I am getting this:

An error occurred connecting to Discord: Could not find Discord installed and running on this machine.
Traceback (most recent call last):
File "/content/program_ml/app.py", line 90, in <module>
inference_tab()
File "/content/program_ml/tabs/inference/inference.py", line 418, in inference_tab
choices=get_speakers_id(model_file.value),
File "/content/program_ml/tabs/inference/inference.py", line 325, in get_speakers_id
model_data = torch.load(model, map_location="cpu")
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1004, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 456, in __init__
super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
OSError: [Errno 22] Invalid argument

It just stays like this with no link to gradio. Anyone else?

#

It was working last night, so I am guessing it needs to be fixed. Edit: nvm

marsh schooner
#

how do i train 2 datasets at once

#

im on a 4090

shut goblet
#

how do i fix this

#

why must you be this way

rare gobletBOT
#

Ayo? @shut goblet level 3 !!! lfg

simple ore
#

if I had to guess, some translation is messed up

spice wing
#

Where can i find Voice models for AllTalk Ai?
pth files dont work there

shut goblet
unique rock
#

Hey, I need to remove the breaths from my dataset, for example: the breaths when a singer is about to sing the verse and takes a breath or releases it when he finishes singing.

low shard
low shard
#

also @viscid moss

wild yoke
#

Why is .wav not recognized as a format?

simple ore
simple ore
marsh schooner
#

what does uvr denoise even do

#

i feel like it does nothing

viscid moss
viscid moss
oak hearth
#

What are pretrains? If I want to clone a voice, should I use a pretrain or spend more time making it myself? What will be more quality?

rare gobletBOT
#

Ayo? @oak hearth level 3 !!! lfg

oak hearth
#

Are pretrains just faster and better and should i use them to make the most lifelike realistic clone?

analog obsidian
# oak hearth What are pretrains? If I want to clone a voice, should I use a pretrain or spend...

pretrains are pre-made models trained on days worth of audio (in rvc case)
making them from scratch is a very hard process even for people that have the knowledge
always use a pretrain model when training on rvc
their purpose is to have a baseline during the finetuning process (when you are training a model)
They can affect the final result quality since audio upscaling is involved
If you train without a pretrain your model is going to sound like shit because rvc has no prior knowledge of sounds

analog obsidian
#

for a realistic model train on an expressive, non-monotone dataset of 30 minutes or more

the more data, the better and more realistic

be sure your dataset has variety, for example avoid monotone dialogue or audios repeating the same sentence/words etc

coral frigate
#

Can someone give me advice? I’m training a model with 15 hours of data and I don’t want to use a pretrain on it. Which one should I select? Custom makes me add my own pretrain I think but I’m not sure.

rare gobletBOT
#

Ayo? @coral frigate level 5 !!! lfg

analog obsidian
#

aim for the same amount of time as the vctk dataset

crude flame
coral frigate
#

I’m not aiming to make a pretrain. I just want to make a voice model with my dataset but I don’t want to use any of the pretrains applio offers

coral frigate
#

If that makes sense

analog obsidian
#

rvc will not be able to give you a good result

#

use a pretrain instead

#

when you remove a pretrain you remove the knowledge of sounds

coral frigate
#

Im still new to making models but everytime I’ve used a pretrain, the voice has sounded strange and very off. Is there a way I can fix that then if I have to use a pretrain?

analog obsidian
coral frigate
#

Yes with 150 epochs. It sounds nothing like the dataset and just sounds really off. It has been the case whenever I use a pretrain. In this case I used the contentvec pretrain.

analog obsidian
#

contentvec is an embedder, not a pretrain

crude flame
analog obsidian
#

for your huge dataset of 15 hours you can try batch 16, and it should take 1 or 2 days to give good results with a pretrain

coral frigate
coral frigate
coral frigate
#

Sorry if this is a dumb question but How do I use an original pretrain?

analog obsidian
#

yea that

#

leave it like that

coral frigate
#

Oh ok I just keep that unchecked and I’m good to go?

coral frigate
#

Ok thank you so much guys

coral frigate
coral frigate
#

Ok thank you

coral frigate
knotty moth
#

remember: quality > quantity

analog obsidian
#

the more the better but i agree 15 hours is too much 😭
also you have to be sure your dataset has no possible sounds that could cause problems later

coral frigate
#

Yeah sadly I did have to listen through all 15 hours like a podcast

analog obsidian
#

i mean technically a 15 hour dataset is better than a 30 min or a 1 hour one
but rvc already becomes realistic at the 30 minute mark

#

technically speaking it's not bad
but it's too much effort

coral frigate
#

Idk I’m dumb and still figuring rvc out. I’ve actually been trying to train this for 3 weeks now but applio hasn’t been able to train it. Just keep getting an error

analog obsidian
#

max 2 hours
higher than that is not bad, but not worth the effort

covert axle
#

-colab

azure marshBOT
# covert axle -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

coral frigate
analog obsidian
knotty moth
coral frigate
analog obsidian
#

@knotty moth is rmvpe on applio using gpu now?

#

or they're still using rmvpe cpu

#

if applio is still using rmvpe cpu, it's going to be super slow

#

if applio is using rmvpe gpu it should be done in 30 mins approx

analog obsidian
#

so the effort will at least be worth a little bit

analog obsidian
coral frigate
#

ive been getting this error in the cmd for a month and im probably doing something dumb thats failing the extraction. have any suggestions on how else i can troubleshoot it?

knotty moth
coral frigate
#

ill try that

analog obsidian
coral frigate
analog obsidian
#

maybe it has one but is very marginal

knotty moth
analog obsidian
#

staring at crepe

knotty moth
analog obsidian
#

crashed that bad

coral frigate
analog obsidian
coral frigate
#

Yh it’s at 4 now for both

#

Could it be because I have 33 files in the dataset folder maybe?

analog obsidian
#

so it's doing it on 3s samples

#

it runs out of vram when it's doing a lot of them

#

at the same time

coral frigate
#

This has been an issue for a while now. Idk what else I can do to fix the issue

analog obsidian
#

might be related to applio only

coral frigate
analog obsidian
coral frigate
#

Well then I’m doomed

analog obsidian
#

cpu cores only affect feature extraction speed when you're using a cpu based extractor

coral frigate
#

I’ll try that but I don’t think it’ll change much seeing that my cpu should’ve been able to handle 4

#

Maybe redownloading applio will do something

analog obsidian
#

the gpu is doing the extraction

#

hence why it runs out of vram

#

in older versions of applio rmvpe was actually cpu based

#

but it's very slow compared to rmvpe gpu

coral frigate
#

Even then my gpu should’ve been able to handle 4 without vram issues. I’ll try redownloading applio and hope that fixes something

analog obsidian
#

yeah exactly it should be able to do it, i find it weird it's running out of vram

#

i hope redownloading fixes the issue

coral frigate
#

Thanks. I hope so

rare gobletBOT
#

Ayo? @coral frigate level 6 !!! lfg

coral frigate
#

I’m out of options after that

glacial pollen
#

use 4 or 8 threads and switch to rmvpe ( non gpu variant )

#

btw, would be helpful if you provided info on your amount of ram, ( and vram if you actually did use rmvpe gpu variant )
Also, total length of the set and length per file ( can be avg )
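
A small sketch for getting those numbers (file count, total length, average length per file) out of a dataset folder. It assumes the files are plain PCM `.wav`, since the standard-library `wave` module does not read float or compressed audio, and the helper names are made up for illustration:

```python
import wave
from pathlib import Path

def wav_seconds(path):
    """Duration of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()

def dataset_summary(folder):
    """Return (file_count, total_minutes, average_minutes_per_file) for *.wav in folder."""
    durations = [wav_seconds(p) for p in sorted(Path(folder).glob("*.wav"))]
    if not durations:
        return 0, 0.0, 0.0
    total_min = sum(durations) / 60
    return len(durations), total_min, total_min / len(durations)
```

Calling `dataset_summary("path/to/dataset")` on the folder that is being fed to Applio would give the figures asked for here; the path is a placeholder.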

analog obsidian
#

it's weird he's getting oom with a 4090

glacial pollen
#

in any case, 4090 shouldn't have any ooms like that

analog obsidian
glacial pollen
#

only reasonable way out of it is.. the extraction's done on cpu and / or per-file length

#

oh

#

That's an overkill

#

what's the length per file? ( as I assume it's not one big sample but split )

coral frigate
glacial pollen
#

above 20 or maybe 25 ( but I'd stick to 20 per file ) mins, it can cause issues

analog obsidian
#

oh but even if he lets the slicer do it?

glacial pollen
#

oh huh

#

lemme think

analog obsidian
#

yea this is new to me too xD

glacial pollen
#

oh yea

#

that can surely be the case cause, 1 set where extraction is loading 15 hours of data vs sequential 1 by 1 sid processing ( even 40 or 100 hours )

#

is different

analog obsidian
#

interesting, rvc always giving surprises

glacial pollen
#

or so I suspect at least

#

there's no other way out of it

#

other than applio being jammed ( wouldn't be surprised at this point )

analog obsidian
glacial pollen
#

lemme prepare something

#

might have a 'temp' solution

#

@coral frigate Amount of threads you have?

#

I'll assume you can afford to use 8

coral frigate
#

How do I check? But I definitely would
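
One quick way to check, assuming a Python prompt is at hand. On Windows the same number is shown in Task Manager under Performance > CPU as "Logical processors".

```python
import os

# Logical CPUs (threads) visible to the OS; can be None on unusual platforms.
threads = os.cpu_count()
print(f"logical threads: {threads}")
```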

glacial pollen
#

that's alright then, gimme a sec

knotty moth
coral frigate
#

Ryzen 9 9950x

glacial pollen
#

okay so

#

now the question is, are you willing to spend a lil bit of time on that?

knotty moth
glacial pollen
#

@coral frigate Cause basically, you'd have to try a few things

  1. Try to feature extract on single 30-50 min file
  2. If that failed, you'd have to use another fork or just, mainline for the sake of extraction ( and then move stuff back to applio ) (( that'd confirm whether applio is the issue or just the situation itself isn't favored by rvc in general ))
  3. If that failed too.. rip, in that case it'd mean rvc's not optimized for single speaker 15h at once extraction
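
For step 1, a test file of the right length can be cut from one of the existing dataset files. A minimal sketch, assuming plain PCM `.wav` input; the function name and file names are hypothetical, and it reads the whole excerpt into memory at once, which is fine for a one-off test:

```python
import wave

def copy_first_minutes(src, dst, minutes):
    """Write the first `minutes` of PCM WAV `src` to `dst`; returns seconds written."""
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        wanted = int(minutes * 60 * reader.getframerate())
        frames = reader.readframes(min(wanted, reader.getnframes()))
    with wave.open(dst, "wb") as writer:
        writer.setparams(params)
        writer.writeframes(frames)  # the header frame count is corrected on write
    with wave.open(dst, "rb") as check:
        return check.getnframes() / check.getframerate()

# Example call with placeholder file names:
#   copy_first_minutes("full_recording.wav", "test_40min.wav", 40)
```

If the source file is shorter than the requested length, the copy simply contains the whole file.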
coral frigate
#

Ok how do I move the feature extract back into applio if that’s the issue ?