#✨│ai-help


oblique viper
#

do they have a more up to date guide somewhere?

analog obsidian
#

you choose batch size depending on how big your dataset is

#

there are two slicing methods for datasets: simple mode and automatic mode
automatic mode is the same slicer as mainline
simple mode slices every 3s (by default); it doesn't take silence into account, so you have to remove silence in Audacity using Truncate Silence first
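
A rough sketch of what fixed-interval slicing looks like, assuming the audio is already loaded as a plain list of samples. `slice_fixed` and its parameters are hypothetical names for illustration, not Applio's actual code. Because it cuts purely by time, any silence stays inside the chunks, which is why trimming it beforehand matters.

```python
def slice_fixed(samples, sample_rate, chunk_seconds=3.0):
    """Cut a sample sequence into consecutive fixed-length chunks.

    Silence is not detected; chunks are cut purely by time.
    The last chunk may be shorter than chunk_seconds.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk length must be at least one sample")
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


# 10 seconds of dummy audio at 8 samples per second (tiny rate for readability)
audio = list(range(80))
chunks = slice_fixed(audio, sample_rate=8, chunk_seconds=3.0)
print([len(c) for c in chunks])
```

With real audio the sample rate would be something like 40000 or 48000, but the cutting logic stays the same.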

oblique viper
#

I ran with the default settings and reached around 200 epochs with my model, I usually get caught by the errors that come from doing things that are said in the guide

analog obsidian
#

ah i have no idea about colab specific errors, i gave up on them, too many errors

oblique viper
#

ikr

analog obsidian
#

noobies is one of the people maintaining the colabs

#

he knows more

oblique viper
#

I tried to get zluda working because I am unfortunate enough to have 6700XT but that was a whole world of errors in itself, worse than colab

analog obsidian
#

ive heard zluda training speeds are extremely slow so it's better to use a cloud solution anyway

analog obsidian
oblique viper
crude flame
#

this might be flux or placebo but i feel like amd gpus train models weird and give bad models

analog obsidian
crude flame
#

like i could compare my model trained locally and a model trained on a nvidia gpu and even with everything being the same the amd one sounds worse

oblique viper
crude flame
#

LOL WHY IS THAT SO ACCURATE

simple ore
#

after you beat your head all sunday evening against a desk because someone can't follow basic instructions...

oblique viper
simple ore
#

unfortunately for 20 people who follow the instructions and get the results there's someone who skips steps and misreads everything

#

I blame the ipad generation

analog obsidian
simple ore
#

nobody teaches the computer basics in school any more

cobalt carbon
#

in cs2 voice is not working

#

i changed the mic

oblique viper
#

out of the 5+ AI tools that I've used for various reasons, this is the only one that I had to bash my head against so much

#

and I've used Applio back when the guide was still not outdated, back then everything went smoothly too

hallow thistle
oblique viper
#

and me being a beginner, the average joe would struggle even more than I am

hallow thistle
#

This is the place to discuss program issues, not to show off your ego. cat_seriously

simple ore
#

yeah, I dont know how to drive a push cart, lemme drive mclaren f1

#

great approach for AI

oblique viper
#

isn't this discord server and the guide meant to make this more accessible for beginners?

hallow thistle
#

Don't take what everyone here says about you seriously, it ain't that deep.

cobalt carbon
#

@low shard can u help me

crude flame
simple ore
crude flame
oblique viper
hallow thistle
#

Creepy.

hallow thistle
patent trellisBOT
# hallow thistle !howtoask

How To Troubleshoot AIHC_WaitWhat

__**GIVE CONTEXT.**__ 📝
  • Don't simply mention your issue, like "my rvc is not working".
  • Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
  • The more context, the better.
__**BE POLITE.**__ :matsuripray:
  • Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
  • It's okay if you're frustrated, but don't take it into this server.
  • Don't DM without prior consent.
__**BE PRODUCTIVE.**__ 🤝
  • Don't ask for every little instruction. Put your own effort & test things by yourself.
  • Don't ask to ask.
  • Check if your answer is a Google search away/on our guides website.
cobalt carbon
cobalt carbon
analog obsidian
#

don't use crepe with a hop different than 160

#

tho i would just use rmvpe

oblique viper
crude flame
analog obsidian
hallow thistle
# cobalt carbon bro are u blind

If I say yes, would you believe me? Ok, I know you are talking about W-Okada, the realtime voice changer, but you gave too little detail to go on.

analog obsidian
#

mangio-crepe was a silly idea

oblique viper
analog obsidian
crude flame
analog obsidian
#

even if u use crepe with a hop of 160, rmvpe is still better

crude flame
#

i take care of the ai hub docs now so ye

oblique viper
oblique viper
analog obsidian
#

they're there so you can monitor irregularities

#

or that g/total is inaccurate

oblique viper
crude flame
analog obsidian
#

but that is only in a specific branch

#

and even that doesn't help you in choosing an epoch

#

the loss graphs are really a bit useless

#

best metric is to hear your model

crude flame
analog obsidian
#

u can see adv gen loss going up yet g/total may still go down

oblique viper
#

I think a good step to help beginners is to have a strong suggestion that auto backups be turned on in the Applio colab tab, as I couldn't find any mention of turning auto backup on

Since it was in extras I assumed it's not necessary and I learned that the hard way when my colab ran out of GPU resources

crude flame
#

mel carries g/total iirc

analog obsidian
#

only time they go up is if u try silly stuff like loss balancer

analog obsidian
crude flame
hallow thistle
#

Text blurs when I look closely at it. How am I that blind? It's more like I lose focus on a small topic too easily, especially when there's an ongoing bigger topic in chat or channel. cat_wtf

analog obsidian
crude flame
#

anyway i havent updated the docs on the new logging stuff bec it isnt in the mainline branch of applio

simple ore
#

btw, removed hop length for crepe from UI in exp/f0 branch

analog obsidian
#

google doesn't really like local ai stuff so they don't care if a random update kills ai training/infer
they just want you to use their ai instead

analog obsidian
#

the colabs being buggy isn't really the applio guys' fault but more like google trying to sabotage everything

#

kaggle is another option but imo is way more broken than colab (they also hate any deep fake related ai stuff)

cobalt carbon
#

Can someone help me ? in cs rvc doesnt work what can i do ? i changed the input

oblique viper
#

@analog obsidian what should I do step by step in the no UI colab to continue training on my ApplioBackup? I've been using the UI colab this whole time so I'm a bit confused

hallow thistle
#

W-Okada not working with Counter-Strike 2 can happen for several reasons: using an older or original version of W-Okada, VB-Cable or Voicemeeter (both seem to cause issues with W-Okada on Windows), or your microphone.

analog obsidian
cobalt carbon
#

pls shut up

hallow thistle
cobalt carbon
#

u didnt help me so no need your help

#

👍

oblique viper
cobalt carbon
crude flame
#

💀

hallow thistle
cobalt carbon
#

im not using voicemeter or anything else

oblique viper
#

anyone who knows how the no UI colab works can help me figure out step by step on how to continue training on my ApplioBackup? I've been using the UI colab this whole time so I'm a bit confused

hallow thistle
analog obsidian
cobalt carbon
#

i tried to run it on steam web shift tab

#

it opens ui but browser doesnt have a mic perm

#

so it doesnt work with it too

hallow thistle
#

Mate, you said you didn't ask to me. Why did you switch up that fast?

cobalt carbon
analog obsidian
cobalt carbon
#

i did the same settings

#

ye bro

#

its working on dc

#

but not on cs

#

i think browser goes sleep mode when on cs

analog obsidian
#

does cs allow you to choose which mic you wanna use in game?

cobalt carbon
#

i dont know how to solve this

cobalt carbon
#

i did the settings

analog obsidian
#

hmm weird

cobalt carbon
#

when i alt tab it works again

analog obsidian
#

but even if the gui is frozen, the actual program is running in the cmd window

cobalt carbon
#

but in game it does not work

analog obsidian
#

see if restarting the voice changer fixes it

analog obsidian
#

if that doesn't work try restarting ur pc

#

could be some weird windows interaction

cobalt carbon
analog obsidian
#

or the browser is muting your mic when you close it/hide it

#

try a different browser

#

anything but operagx

cobalt carbon
#

oki

#

im using brave

analog obsidian
#

try chrome

cobalt carbon
#

oke

oblique viper
#

@simple ore could you explain which cells I have to run step by step in the No UI colab to keep training my backup? I'm getting this error
I've tried:
Mount google drive > Clone > Install > Load a Backup

analog obsidian
#

yeah this weird issue comes due to w-okada being written in javascript which is extremely buggy

#

so every browser reacts differently to the gui

#

some are fine with it, some can't run it properly

cobalt carbon
#

ye the problem is i think when im on cs browser is not using mic

#

same problem is on chrome too

hallow thistle
#

Some say the Javascript is a trash programming language, but that's it. anime_nom

analog obsidian
#

iirc the reason it runs in your browser is because the guy who made the fork noticed running it in the browser had better performance than running it in a window

cobalt carbon
#

i hate java.

analog obsidian
#

i may be wrong tho, that was long ago

analog obsidian
cobalt carbon
#

it worked with edge

#

lol

analog obsidian
#

xD

#

thats javascript for ya

#

buggy asf

cobalt carbon
#

but it works bad

#

i need to change my pc

analog obsidian
#

what gpu you have?

cobalt carbon
#

1650

#

trash

analog obsidian
#

ah yeah

#

i'd recommend a 4060 minimum

cobalt carbon
#

i plan to buy a 5070 or 5070ti

analog obsidian
#

nice, thats more than enough for this

#

at the moment you could try fcpe instead of rmvpe

#

fcpe is like a slightly less accurate rmvpe

#

but runs very fast

oblique viper
#

at this point I would've had better luck training on my cpu for 42 hours straight than using colab angerysad

analog obsidian
oblique viper
#

4 days of straight up 12 hours a day of trying to train a model using colab doesn't do good things to your brain

clever burrow
#

sorry i forgot to say thank you 😭

analog obsidian
glacial pollen
oblique viper
clever burrow
#

i have one more question though

#

i've tried to download a model on applio but it doesn't seem to be popping up even after refreshing

analog obsidian
oblique viper
#

first time I tried training with CPU, that went bad, reached like epoch 70, then had to start again on colab, didn't have auto backup on so lost progress

#

most of what I did is a blur at this point, I'm on 2 hours of sleep

analog obsidian
#

im pretty sure the autobackup option only saves g and d

#

you can convert the G file to a pth file actually

#

and use it as a normal model

oblique viper
analog obsidian
#

so uhm you wanna train 200 epochs?

oblique viper
#

I reached 235 epochs, I want to get to 300-500 to have as good quality as I can

analog obsidian
#

so in simple words
epochs = number of times the model has seen the whole dataset

#

if you force the ai to see the same thing a lot, it will believe it should only be able to clone the dataset and nothing more

#

it will quickly forget the pretrain knowledge

#

and become dumb

oblique viper
analog obsidian
#

and a lot of random weird problems

#

the model is going to try to clone your audio but it will not have the full knowledge of how to do it correctly

#

the only thing that takes 300-500 epochs, are pretrains

#

these are trained with like 50 hours of audio

#

your small dataset doesn't compare to that

#

so realistically speaking, you don't need more than 200 epochs

oblique viper
#

I have 12 minutes of audio

analog obsidian
#

most small models are done within the 100-150 epoch range

analog obsidian
analog obsidian
oblique viper
analog obsidian
oblique viper
#

okay then I have all the epochs I need

analog obsidian
#

there is a less biased method of choosing epochs but you're a beginner so stick to what i said, it's easier

oblique viper
#

time to test

oblique viper
analog obsidian
#

you wanna check spectrogram reproduction

oblique viper
analog obsidian
#

spek works too but rx11 allows for more precise analysis

oblique viper
analog obsidian
#

tho it's easy to spot when the model is generating noise instead of data

#

you'll see missing harmonics

#

also some random artifacting

#

alongside more spectrogram related issues

#

good epochs are able to do decent spectrogram reproduction

#

so they sound less robotic and more natural

oblique viper
#

I'll try to listen to some epochs ranging from 120 to 200 for now, maybe after I rest I'll try the spectrogram way

analog obsidian
#

sure, do some research about spectrograms in general

#

you need that knowledge

#

at least for rvc it's needed

#

you can also tell if your batch size was either too high or too low by analyzing your model's spectrogram

#

but in the end, small datasets (10 minutes and below) give very random results, so for example, you can get a very bad result in your first training run, but if you train the same dataset a second time, you may well get a better result than the first

oblique viper
#

what are these? @analog obsidian how do they affect talking?

#

also thanks a ton for being kind and willing to help, you're very down to earth kanna_heart

analog obsidian
# oblique viper what are these? @analog obsidian how do they affect talking?

index = a file where the accent of the dataset is stored. it is possible to use index files from another dataset, tho i only use the index of my own model
a safe value is 0.5, so 50% of the dataset's accent will be added to the result and the other 50% will come from the pretrain
too high values may introduce artifacting (glitching, voice cracks, weird sounds)

volume envelope = rms normalization. in applio this is bugged, so don't use a value different than 1

protect voiceless = supposedly decreases the amount of robotic sibilants and breaths, but a good model doesn't need this. in case you wanna play with it, start with a value of 0.33; bigger values decrease the protection and lower values increase it

#

a value of 0.5 disables protect voiceless

#

for analyzing epochs don't use the index file

#

you can then use the index file after you find your best sounding epoch

analog obsidian
#

personally i have noticed using the index makes the model sound more true to the dataset

#

so better resemblance between the model and the real voice (this is probably why rvc-boss, the author of rvc, added index files)

oblique viper
simple ore
#

the percentage is a blend between phonemes from the audio and the matching phoneme from the index file

analog obsidian
simple ore
#

0 - use phonemes from audio as is, 1 - use whatever the matches found in the index, anything between - blend the values

#

audio has 'th' phonemes, the closest faiss finds in the index is 'z', if you use 1 your model speaks english with german accent
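
As a rough sketch, the blend described above can be written as a plain linear interpolation between two feature vectors. The function name `blend_features` and the toy numbers are made up for illustration; real implementations may also average several nearest index matches before blending, so treat this as the idea rather than the exact code.

```python
def blend_features(audio_feat, index_feat, index_rate):
    """Linear blend: 0 keeps the audio features, 1 uses the index match."""
    if not 0.0 <= index_rate <= 1.0:
        raise ValueError("index_rate must be between 0 and 1")
    return [index_rate * i + (1.0 - index_rate) * a
            for a, i in zip(audio_feat, index_feat)]


audio_feat = [1.0, 0.0, 2.0]   # made-up feature vector from the input audio
index_feat = [0.0, 4.0, 2.0]   # made-up nearest match found in the index

print(blend_features(audio_feat, index_feat, 0.0))   # same as audio_feat
print(blend_features(audio_feat, index_feat, 1.0))   # same as index_feat
print(blend_features(audio_feat, index_feat, 0.5))   # halfway between the two
```

So a ratio of 0.5 does not switch accents outright; every feature lands halfway between what the audio had and what the index matched.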

oblique viper
#

so if I use 0.5, my model speaks slightly german accent?

analog obsidian
analog obsidian
#

obviously if your model has an english index, it'll have an american/british accent instead

simple ore
#

the model can still make an incorrect prediction of what the sound should be if it has not been trained on specific phonemes

#

there may still be a slight accent with 0 index

analog obsidian
simple ore
#

likely yes, it just finds a bad match/does not find anything

analog obsidian
#

o nice to know

oblique viper
#

where can I change the value of the index ?

simple ore
#

and then the model is unable to produce anything because it has never seen such phoneme

analog obsidian
analog obsidian
oblique viper
#

ohhh

#

feature ratio made it so much better

#

setting it higher

analog obsidian
#

0.75 = 75% of the dataset accent blended in the result

#

find a sweet spot where it works well for different audios tho, don't just use one

simple ore
#

with 200k slices the index creation runs a clustering algorithm to group close enough samples into an average, then it runs a trim if there are more than 4000. you can see it in the log with the minibatch output

simple ore
#

kmeans is 200k -> 4k
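
To make the "many samples down to a few centroids" idea concrete, here is a toy k-means in plain Python. It is only an illustration under simple assumptions (tiny 2-D points, first k points as starting centroids); the real index creation works on large feature arrays, and the minibatch output in the log suggests it uses a minibatch variant rather than this loop.

```python
def kmeans(points, k, iterations=10):
    """Tiny Lloyd's k-means on lists of floats; returns k centroids.

    Deterministic: the first k points are used as starting centroids.
    """
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest centroid (squared distance)
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            groups[nearest].append(p)
        for c, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(group) for dim in zip(*group)]
    return centroids


# six made-up 2-D "features" forming two obvious groups
features = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0],
            [1.0, 0.0], [10.0, 11.0], [11.0, 10.0]]
compact = kmeans(features, k=2)
print(len(features), "->", len(compact))
```

Each centroid is the average of the samples assigned to it, so near-duplicate samples collapse into one entry.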

analog obsidian
#

for bigger sets, let's say above 1 hour, is it not worth using faiss?

simple ore
#

at most you can have ~4k unique samples

#

that's a 3 hour set

#

and even then it will narrow it down to ~1200-1500

analog obsidian
#

i was about to train a 3hour set

simple ore
#

the largest index I saw was about 1GB

#

you dont need to train the whole thing, you can just preprocess and extract features, then run the index creation

analog obsidian
#

i see

analog obsidian
#

or the index doesn't need that much data?

#

if i remember well, rvc-boss added kmeans because there was a bug that prevented index file generation while using sets above 1 hour

analog obsidian
simple ore
analog obsidian
simple ore
#

i have not tested it, i need to find a big index and compare

swift thunder
sudden quail
#

@low shard is there a proper tutorial into using rvc i wanna use voice changer with discord

languid cliff
analog obsidian
languid cliff
analog obsidian
#

and i can't predict where your model is going to start to overtrain since thats random

languid cliff
#

or is it more of a gimmick

analog obsidian
crude flame
languid cliff
#

ah, so its useless then

analog obsidian
#

hearing overfitting/overtraining is kinda easy, at a certain point the model will sound very robotic

#

you just use anything before that happens

#

so let's say anything above 150e sounds very bad but anything prior to that is "alright"

languid cliff
#

Yeah fair enough, so i could just set the epoch to 400 and save every 20, and just see through all of them

analog obsidian
#

yeah basically

#

graphs don't tell when your model is "done"

languid cliff
#

the model i trained now sounded fine to me at 500 epochs, and now i added even more audio to it

analog obsidian
#

they're there so ppl can check for divergence issues and such

languid cliff
#

but doesnt longer audio usually mean you should use less epochs? or did i get that completely wrong

#

or is it more dependent on how varying your audio sample is

analog obsidian
#

depends on your batch size

languid cliff
#

with different words, tones, etc

analog obsidian
#

and how different the dataset is compared to the pretrain

#

the og pretrain is trained using very monotone speech

languid cliff
#

i see

analog obsidian
#

so if ur dataset is also monotone like the pretrain, the model will have an easier time learning ur set

languid cliff
#

my dataset is very varying in pitch and tone

analog obsidian
#

what you can do is to train 200 epochs and if you notice e200 sounds fine, you can continue training until the model starts to sound very metallic/robotic

#

tho i personally never train over 100 epochs

languid cliff
#

So im assuming its very easy to notice when its overtrained?

analog obsidian
#

yea u dont need to be an audio nerd to notice when the model sounds unnatural and robotic

#

is pretty obvious

#

has a particular ugly robotic sound

languid cliff
#

yeah fair enough. On my 500 epochs i noticed like a few words and pronunciations that sounded a bit bad, idk if that's because of lack of data, or too much training (too many epochs)

#

i guess i could test the 400 and 300 one and see if its better

analog obsidian
#

epochs = number of times the model has seen its full dataset

#

so your model has seen its own dataset 500 times

languid cliff
#

yeah

analog obsidian
#

smaller datasets don't have much data to begin with so they overfit pretty fast

#

for example a pretrain that has 50 hours is trained using 300 epochs

#

because there's a lot of stuff the model has to learn

#

but 300 epochs with a 5 minute dataset is overkill

#

more epochs don't mean better results

languid cliff
#

yeah

#

soo a general rule then is larger dataset = more epochs?

#

BUT also dependent on batch size?

analog obsidian
#

yup

#

you cant predict how many exactly

#

rvc is random

languid cliff
#

So with a larger batch size, you want less epochs?

#

yeah i see

analog obsidian
#

too many factors

languid cliff
#

ah okay, so its not like a rule of thumb for it

#

gotcha

analog obsidian
#

yup

#

depends on how hard the dataset is to learn

#

depends on a lot of things really

#

u can try two approaches to selecting epochs
you can either train 200e and save every 10
or train 200e, save everything, and hear all of them until you find one that sounds more natural to you
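
For the first approach, the list of checkpoints you end up auditioning is easy to work out ahead of time. A small sketch; `checkpoint_epochs` is just a hypothetical helper, not an Applio function.

```python
def checkpoint_epochs(total_epochs, save_every):
    """Epoch numbers at which a checkpoint would be saved."""
    return list(range(save_every, total_epochs + 1, save_every))


to_audition = checkpoint_epochs(total_epochs=200, save_every=10)
print(len(to_audition), to_audition[:3], to_audition[-1])
```

Saving every 10 epochs over a 200 epoch run leaves 20 checkpoints to listen through, versus 200 if every epoch is saved.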

languid cliff
#

okay ill try at 200

#

Also, for "silent training files", even if the audio has no background noise, do you usually always leave this on 2?

analog obsidian
#

yes, dont touch that

#

leave it set to 2

languid cliff
#

okay

#

The "Fresh training" option, do you always check this ON when making a new model? or is that if you are making pre-trains?

#

same with the "Dataset Creator"

analog obsidian
#

use it if you wanna start your training from 0

languid cliff
#

Okay sounds good

analog obsidian
#

but sure you can have it enabled when making a new model, nothing bad happens

#

just don't enable it when resuming training

#

otherwise all of the progress will be lost

languid cliff
#

yeah gotcha 👍

#

It's exciting trying to constantly improve the voice, the spectrogram stuff sounds exciting too, but also sounds like a lot of stuff to learn

languid cliff
tough fiber
#

anyone know if colab is broken atm? my voice on deiters fork is bad

#

cutting and distorted my voice

analog obsidian
#

voice changer inference works differently

languid cliff
analog obsidian
#

yep

#

if it works fine there, it'll work fine in realtime

languid cliff
#

Okay, sounds good! will try that then

#

Great

merry shore
#

Need help with the Okada Voice Changer, only Beatrice V2 models are producing audio output
I am using an app called audioRelay
To connect my phone to computer for microphone

light latch
#

Hi! I wanted help with the Colab Research link to create ai covers, Whenever I click on an old link it doesn't go through, does anyone know what the new link is?

sudden quail
#

does anyone have a tutorial for rvc/

#

?

knotty moth
stable tartan
glass summit
#

Hi anyone who used Foocus or any other ui who made lora can please join vc? i need a smal help

#

Please

reef haven
#

where can i find some good ai cover sites?

modest dagger
#

guys can anyone tell me free voice changer with rvc

#

live voice changer

red remnant
#

Guys, who knows why the program doesn't load, I've already tried everything, all the components, updated, a lot of things, why the program just doesn't load, and after 2-3 minutes it gives the error Error: Could not load Voice Focus estimator. and when I resume the same thing, does anyone know how to solve this problem?

red remnant
knotty moth
red remnant
red remnant
#

okay

wintry iron
#

pls help how to change epoch

outer wasp
#

why i cant choose model

simple ore
hallow thistle
#

!howtoask

outer wasp
modern crane
#

guys, the ai voice changer program opens start-https.bat and the console closes immediately what to do

hallow thistle
#

Chunk doesn't make the audio sound better; it mostly determines how much the audio is delayed. What improves quality is Extra. The GPU also contributes to converting audio in real time on W-Okada.

hallow thistle
modern crane
#

video card invidia gpu intel

#

MMVCServerSIO_win_onnxgpu-cuda_v.1.5.3.18a.zip

this file i download

hallow thistle
modern crane
hallow thistle
# modern crane nvidia

That's a brand name, not the full name of the GPU. In Task Manager, go to the Performance tab, find GPU 0 or GPU 1 on the left side, and click one of them to reveal its full name on the right side.

modern crane
#

intel xeon cpu e5-2650 v2

hallow thistle
#

For example, if your PC has NVIDIA GeForce RTX 4090, it's RTX 4090.

modern crane
#

nvidia gtx 1070

tight ether
hallow thistle
modern crane
#

Download NVIDIA on Windows
The latest version as of December 7th 2024 is: nvidia-b2332 (click here to download)
If you have a GTX 700 card or below, use AMD/Intel version instead.

this?

tight ether
tight ether
#

y.u.p.

hallow thistle
#

Yes.

modern crane
hallow thistle
hallow thistle
#

Um. Which W-Okada version are you using? And what is your PC GPU?

tight ether
#

namari is really busy in this channel.

hallow thistle
#

It's possible to lower the chunk size below 30 ms for less delay, keep extra at 2.7 s, and also force fp32.

#

I don't know how to explain this. cat_deaed

#

Damn. Although you can set extra up to 5 s, most of the time the audio will cut off a lot, so 2.7 s is best overall. If the audio quality is still low, it may be the voice model you're currently using.

mellow ermine
#

If I buy premium weights will I get a better quality image?

hallow thistle
tough gale
#

can i get some advice on making the ai voice changer sound "better". no matter what i seem to do with the pitch, formant shift or index it always seems to sound off.

languid cliff
#

Try a different model?

hallow thistle
tough gale
#

i swear iv tried so many lol

#

i still have not ruled out the ai not liking the british accent yet

analog obsidian
#

just train a better one

oblique viper
queen sapphire
#

yo whats the newest way to make ai covers of songs

#

havent done it since i used ilaria

#

??????

#

help pls

worldly ibex
#

can apollo do from youtube? or even if not from youtube, is there any that seperate the audio convert it and combine them together?

narrow portal
#

Is it just me, or is the overtraining detector not working?

#

Also I just pressed the Stop Training button and it's still training

edgy tangle
#

I think it is useless

#

You should just manually detect overtraining

#

And it looks like the model didn't learn anything after ~40k (I think :b)

narrow portal
#

So i think I might just stay with epoch 240-270 or something

edgy tangle
#

How long is your dataset?

narrow portal
#

29min

edgy tangle
#

Hmmm

narrow portal
#

All samples are 22hz so I used 30hz

edgy tangle
#

After 60k is overtrained

#

I recommend you to check lowest points between 40-60k steps (Not only the lowest of them)

narrow portal
#

Oh

#

I see

#

I don't have my PC rn so I will check that later

edgy tangle
#

Probably the lowest is the best one, but it might be just noise and not really a good step

#

Just compare them and find the best one
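
If the loss values are exported as (step, loss) pairs, picking out the dips inside a step window can be sketched like this. The numbers and the helper name `local_minima_in_window` are made up for illustration; the result is only a shortlist of checkpoints to listen to, not a verdict on which one is best.

```python
def local_minima_in_window(losses, lo_step, hi_step):
    """Return (step, loss) pairs that are lower than both neighbours.

    losses is a list of (step, loss) pairs sorted by step.
    Only minima with lo_step <= step <= hi_step are kept,
    ordered from lowest loss to highest.
    """
    minima = []
    for prev, cur, nxt in zip(losses, losses[1:], losses[2:]):
        step, loss = cur
        if lo_step <= step <= hi_step and loss < prev[1] and loss < nxt[1]:
            minima.append(cur)
    return sorted(minima, key=lambda pair: pair[1])


# made-up loss readings, one every 5k steps
log = [(35000, 21.0), (40000, 19.5), (45000, 20.2), (50000, 19.8),
       (55000, 18.9), (60000, 19.4), (65000, 19.0), (70000, 20.5)]
candidates = local_minima_in_window(log, 40000, 60000)
print(candidates)
```

A single very low reading can still be noise, so comparing the shortlisted checkpoints by ear is the deciding step.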

analog obsidian
languid cliff
#

Probably a dumb question, but is longer dataset always better? or does it at one point end up making the final voice worse with too long dataset?

simple ore
languid cliff
knotty moth
knotty moth
analog obsidian
languid cliff
#

There might be a slight dB level change between the clips, does that matter?

#

if so, i can try to normalize it to 1 level
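
Levelling clips to one loudness could look roughly like the sketch below, assuming float samples in the range -1.0 to 1.0. The -20 dBFS target and the function names are assumptions for illustration, and the sketch does not guard against clipping when a quiet clip is boosted.

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS for float samples in the range -1.0..1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def match_level(samples, target_dbfs):
    """Scale samples so their RMS level lands on target_dbfs."""
    current = rms_dbfs(samples)
    if current == float("-inf"):
        return list(samples)  # pure silence: nothing to scale
    gain = 10.0 ** ((target_dbfs - current) / 20.0)
    return [s * gain for s in samples]


quiet_clip = [0.05, -0.05, 0.05, -0.05]
loud_clip = [0.4, -0.4, 0.4, -0.4]
target = -20.0  # assumed target level, pick whatever suits your dataset
levelled = [match_level(c, target) for c in (quiet_clip, loud_clip)]
print([round(rms_dbfs(c), 2) for c in levelled])
```

An audio editor's normalize or loudness tool does the same job; the code only shows what "one level" means in numbers.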

languid cliff
languid cliff
languid cliff
# analog obsidian i would use 16

okay, 16 it is then 👍 . Idk if its too complicated to explain, but whats the technical reason for higher batch size for larger datasets?

dry marsh
#

does anyone know about spectrograms

analog obsidian
dry marsh
#

can any1 help w my forum

languid cliff
#

Oh damn, this is actually really fast with 16 batch size

analog obsidian
#

thats for 1 hour and above

languid cliff
#

Yeah

#

So 32 batch size for 2 hours and above?

#

Or does it not scale that way

analog obsidian
#

be sure your dataset is expressive enough and not monotone speech

languid cliff
#

Yeah, that's what i'm worried about, whether my dataset is varied enough, but it's definitely not "monotone" at least

languid cliff
analog obsidian
#

16 is good enough for 2 hours and above too

#

so dw

languid cliff
#

So thats usually whats stopping people from being able to do good 1hr+ datasets?

#

Yeah gotcha

analog obsidian
#

they take a hella time to clean

languid cliff
#

Its using 16,6/32GB right now, so

analog obsidian
#

i finished cleaning my 2 hour set in like 4 days

languid cliff
#

With like removing instrumentals, background noise, etc?

analog obsidian
#

yup

#

contentvec is very very noise sensitive

languid cliff
#

Yeah feel that. Im lucky with this one since its just talking, pure voice, with a really good mic

#

So it takes me like 1,5hrs to capture 1 hr of datasets

#

So i dont mind doing 2 hrs, as long as it doesnt "hurt" the dataset with more data

analog obsidian
#

the more you add, the more realistic the output

#

and better the results because you replace more stuff from the pretrain

#

if u rely too much in the pretrain (small datasets) things get weird

languid cliff
#

Yeah, i might just add on to it then, see how good i can get it

#

Im still missing data from like whispering etc, so i might have to see if i can find any data for that

analog obsidian
#

nono

#

dont add whispering

#

rvc hates it

languid cliff
#

Oh okay

analog obsidian
#

makes the whole model sound eww

languid cliff
#

What about like yelling and laughing? And like.. mouth sounds like popping, humming, etc? Is that all bad too?

#

Because i might want to clean up my dataset based on that

analog obsidian
#

rvc randomly adds those sounds in the results

#

idk why but i know it does that

languid cliff
#

Ahh okay

analog obsidian
#

just train clean speech, keep every breath (very important), remove unwanted sounds and noise

#

and.. thats rlly it

languid cliff
#

Yeah it might be like 1-2 minute of yelling out of 1 hour

#

Might have to remove it then

#

Wym keep every breath?

analog obsidian
#

rvc cant clone yelling and laughing so its a bit pointless to add them in the dataset

analog obsidian
#

so it needs breathing samples

languid cliff
#

I'm a bit confused on that one, aren't breathing sounds considered background noise? Because my dataset doesn't have any breathing sounds because of the noise gate, it's only clear speech

analog obsidian
#

no

#

breaths are part of the speech

#

they're the most important part of the dataset

#

without them, rvc wont be able to learn how to clone breathing

#

so your model will sound veeery robotic while trying to inference breath

#

never remove them

languid cliff
#

Oh ok, but i havent removed them, there just is none in the dataset. Do you mean like, the inhaling and exhaling type sounds before and after a sentence?

languid cliff
# analog obsidian yep these sounds

Hmm ok, yeah i dont think i have any clear sounds of that in my dataset because of the noise gate being used. I wonder if this is something i could artificially add to teach the model? Or would that be hella work?

analog obsidian
languid cliff
#

Ahh, ok. Thats a bummer. I cant recall if my dataset has any of these breaths or not, because i havent been listening for that, but ill check whenever i come back to my pc. If anything, if i add more data, should i try to hunt down audio clips which has these breaths, or does ALL of the dataset need not have them for consistency?

analog obsidian
languid cliff
#

If i were to add it to every sentence tho it would take hours hahah

analog obsidian
analog obsidian
#

but rvc kinda needs a lot of breathing samples in order to learn them

#

idk how many exactly

#

you'll need to experiment with that

analog obsidian
#

gather different unique breathing samples

graceful tinsel
#

I’m trying to run it through Google Colab if that makes any difference

knotty moth
#

otherwise 8 could be okay unless the dataset is diverse

simple ore
#

the larger the batch, the fewer gradient updates have to be done every epoch, so it's slightly faster

#

but I would not go higher than 8 for an hour long dataset

#

but it is up to you to experiment with
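
A rough sketch of the arithmetic behind the batch size advice, assuming the dataset is cut into fixed 3 s segments (the simple slicing default mentioned earlier in the channel) and that one training step consumes one batch. The function name is made up for illustration, and real trainers may bucket or drop segments, so treat the numbers as approximate.

```python
import math


def steps_per_epoch(dataset_seconds: float, batch_size: int, segment_seconds: float = 3.0) -> int:
    """Approximate number of optimizer steps in one epoch of a sliced dataset."""
    segments = math.ceil(dataset_seconds / segment_seconds)
    return math.ceil(segments / batch_size)


# One hour of audio at different batch sizes: bigger batch, fewer steps per epoch.
for batch in (4, 8, 16):
    print(batch, steps_per_epoch(3600, batch))
```

Fewer steps per epoch means each epoch finishes a bit sooner, but the model also gets fewer weight updates per pass over the data, which is one reason the batch size is worth experimenting with.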

languid cliff
languid cliff
languid cliff
wild storm
#

how ar u

dry marsh
#

hi! does anyone know about facefusion.. because when i process the video the output doesnt come out..

dry marsh
#

but the output doesnt come out

#

for some rsn

#

i used huggingface

pastel oak
#

ic you asked in the facefusion server anyway wait for an answer there

dry marsh
#

yea 😭

#

been dealing w this for 6 hours

#

do u maybe have any idea why this happened

knotty moth
dry marsh
#

[FACEFUSION.CORE] Processing step 1 of 1
Analysing: 100% (334/334)

#

literally stops after

stiff goblet
#

@glacial pollen this training is not going well, right ?

glacial pollen
#
  • I'm busy rn, watching some vtuber stream
stiff goblet
glacial pollen
#

and I will say once again, it is not my job on the server to keep guiding people on what's correct / good or bad / wrong

#

You see, when I was learning, I had to do all on my own, that's the point of learning and understanding

stiff goblet
glacial pollen
#

Well it is, but as you should know " avg 50 " is not my invention

#

@simple ore He made it

#

I only ported it

#

So if anything, direct questions about that metric to him

#
  • I am on hiatus / inactive on the server (taking a big break), hence would appreciate a lack of @s
stiff goblet
simple ore
runic heath
#

CPU too high usage when on cs2

river juniper
#

Close cs2

#

.. well either that or make sure your model is using your gpu because it should use minimal cpu if it is

gilded robin
#

"If you are still using CABLE instead of Line 1, I beg you to switch over because it is unironically better than CABLE in any way possible."

#

what line 1?

fair glade
#

hey

gilded robin
# simple ore

ah i use cable C & cable D im guessing i cant do that option?

simple ore
#

I believe the suggestion is to ditch vb cable and use the proper one

gilded robin
#

asio okadafork->reaper->peace->discord

fair glade
#

hey so

#

my voice sound unrealistic

#

how can I make it more realistic

languid cliff
#

try a different voice?

valid vine
#

would anyone want to help me figure out why my AIs I made to play tag continue to be IDIOTS no matter what I try? I'm using pytorch and learned it with chatGPT so it probably led me astray somewhere but even after redoing the entire program 4 times I still feel kinda lost

#

I can't send the two models directly here bc no files

#
```python
import torch
import torch.nn as neuro  # alias used in this snippet; `nn` is the conventional name


class TagStandardHide(neuro.Module):
    def __init__(self, Learning: bool = True, learnRate: float = 0.01):
        super(TagStandardHide, self).__init__()
        # 4 inputs (own x/y, seeker x/y) -> 8 outputs (one per movement direction)
        self.layer0 = neuro.Linear(4, 32)
        self.layer1 = neuro.Linear(32, 64)
        self.layer2 = neuro.Linear(64, 48)
        self.layer3 = neuro.Linear(48, 24)
        self.layer4 = neuro.Linear(24, 8)
        self.learning = Learning
        self.optimizer = torch.optim.Adam(self.parameters(), lr=learnRate)

    def forward(self, x):
        # Only layer1 and layer3 are followed by an activation here.
        x = self.layer0(x)
        x = torch.relu(self.layer1(x))
        x = self.layer2(x)
        x = torch.relu(self.layer3(x))
        x = self.layer4(x)
        return x

    def updateModel(self, stateTensor: torch.FloatTensor, nextStateTensor: torch.FloatTensor, action, reward: float, gamma=0.99):
        optimizer = self.optimizer
        # Add a batch dimension of 1 to both states.
        state = stateTensor.unsqueeze(0)
        next_state = nextStateTensor.unsqueeze(0)
        action = torch.tensor([action], dtype=torch.int64)
        reward = torch.tensor([reward], dtype=torch.float32)
        # Q-value of the action that was actually taken.
        current_q_values = self(state).gather(1, action.unsqueeze(-1)).squeeze(-1)
        # Best Q-value in the next state, detached so no gradient flows through the target.
        next_q_values = self(next_state).max(1)[0].detach()
        target_q_value = reward + gamma * next_q_values
        loss = torch.nn.functional.mse_loss(current_q_values, target_q_value)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    def getState(self, location: tuple[int, int], seekerLocation: tuple[int, int], width, height) -> list[int | float]:
        # Normalize both positions to the 0-1 range.
        locationNormalized = (location[0]/width, location[1]/height)
        seekerLocationNormalized = (seekerLocation[0]/width, seekerLocation[1]/height)
        gameState: list[int | float] = [locationNormalized[0], locationNormalized[1], seekerLocationNormalized[0], seekerLocationNormalized[1]]
        return gameState

    def getReward(self, gameState) -> float:
        # removed by the poster (message length limit); `reward` is computed here in the original

        return reward
```
#

I had to remove the reward function because characters but both of them look basically just like this

#

I've tried just an input and output, having 1 middle, having more middle layers with more and less neurons in each, none of that really seems to affect anything

simple ore
#

your forward is funny

valid vine
#

I've tried a bunch of random forwards none of them really seem to help

#

what specifically is weird about it though?

simple ore
#

why activation only for 1 and 3?

valid vine
#

is it normal for them all to use relu? /some other activation function? I figured just having it use the layer would be fine

simple ore
#

without an activation function there's no point in having separate layers as the math collapses them

#

there are different activation functions, each with a different nonlinearity (relu, tanh, sigmoid, etc.)
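
A small sketch of what "the math collapses them" means, written in plain Python so it runs without PyTorch. The weights and helper names are made up for illustration. Two stacked linear layers with nothing in between give exactly the same output as one merged linear layer, while putting a ReLU between them breaks that equivalence.

```python
import math


def linear(W, b, x):
    """Compute W @ x + b for a list-of-rows matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def matmul(A, B):
    """Matrix product of two list-of-rows matrices."""
    return [[sum(a * B[k][j] for k, a in enumerate(row)) for j in range(len(B[0]))] for row in A]


def relu(v):
    return [max(0.0, vi) for vi in v]


# Arbitrary example weights: layer 1 maps 2 -> 3 values, layer 2 maps 3 -> 2 values.
W1, b1 = [[1.0, -2.0], [0.5, 3.0], [2.0, 1.0]], [0.1, -0.2, 0.3]
W2, b2 = [[1.0, 0.0, -1.0], [2.0, 1.0, 0.5]], [0.0, 1.0]
x = [0.35, -0.12]

# Two linear layers applied back to back, no activation in between.
stacked = linear(W2, b2, linear(W1, b1, x))

# The same thing as ONE linear layer: W = W2 @ W1, b = W2 @ b1 + b2.
W = matmul(W2, W1)
b = linear(W2, b2, b1)
merged = linear(W, b, x)

same = all(math.isclose(s, m, abs_tol=1e-12) for s, m in zip(stacked, merged))
print("stacked == merged:", same)

# With a ReLU in between, the single merged layer no longer reproduces the output.
with_relu = linear(W2, b2, relu(linear(W1, b1, x)))
print("with relu:", with_relu, "merged:", merged)
```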

valid vine
#

like if it was just one neuron each and the weights were right, from the first being say 0.35 the second layer could make it -0.12 which on the next relu would be 0

#

well that's a bad example I guess because if there was another relu it would make it 0 which would also make the next one 0

simple ore
#

anyway, I cant imagine what that code is supposed to do

#

your model has 4 inputs and 8 outputs

#

btw, use standard aliases. torch.nn -> nn

valid vine
#

yeah I'll probably make it more readable whenever I do more

dreamy seal
#

is there any aihub docs?

craggy bough
dreamy seal
valid vine
#

I did it like that so they can move more like a regular person can since you can press w and a at the same time, for example

valid vine
wispy perch
#

it doesn't work, and whenever i talk it says this in cmd: 2025-05-29 05:22:15.7391809 [E:onnxruntime:, sequential_executor.cc:572 onnxruntime::ExecuteKernel] Non-zero status code returned while running Pad node. Name:'/rmvpe/mel_extractor/Pad' Status Message: CUDA error cudaErrorNoKernelImageForDevice:no kernel image is available for execution on the device

wispy perch
#

gtx 750

simple ore
#

what application/version you're trying to use?

#

using a voice changer on anything below nvidia's 1000 series is almost impossible

wispy perch
#

vcclient_win_cuda_2.0.78-beta

wispy perch
#

ok thx

valid vine
simple ore
#

I'm not familiar with this kind of model training, so I have no other advice

#

other than in order to learn something complex the model has to have enough capacity and with your current forward function it is essentially just 3 layers

valid vine
#

well it's basically two different models with one goal each

#

before I had it be just one model with a 5th parameter for if it was "it" or not and then changed the seeker location to be the nearest player but that wasn't working

#

so basically the hider model is just "go away from this point" and the seeker is "get as close as possible to this point"

#

and "this point" in each of them is just the 3rd parameter (x) and 4th parameter (y) and they're normalized to be 0-1
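
A hypothetical sketch of those two goals as distance-based rewards. The poster's real reward function was cut from the snippet, so this is not their code; it only assumes the 4-value normalized state `[x, y, target_x, target_y]` described above, and all function names are invented for illustration.

```python
import math


def distance(state):
    """Euclidean distance between own position and the target point in an [x, y, tx, ty] state."""
    x, y, tx, ty = state
    return math.hypot(tx - x, ty - y)


def hider_reward(prev_state, next_state):
    """Positive when the move increased the distance to the seeker."""
    return distance(next_state) - distance(prev_state)


def seeker_reward(prev_state, next_state):
    """Positive when the move reduced the distance to the target."""
    return distance(prev_state) - distance(next_state)


before = [0.5, 0.5, 0.6, 0.5]  # agent sits just left of the target point
after = [0.4, 0.5, 0.6, 0.5]   # agent stepped further left, away from it
print(hider_reward(before, after), seeker_reward(before, after))
```

Rewarding the change in distance per step, rather than the raw distance, is one common way to give the agent a signal on every move; whether it helps in this particular game is something to test.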

knotty moth
simple ore
warm dragon
#

-colab

patent trellisBOT
# warm dragon -colab
📒 Google Colab Notebooks

Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

• **Applio**

by IA Hispano
Google Colab

• **RVC Mainline**

by Hina
Google Colab

• **UVR5 NO UI**

by Eddy
Google Colab

• **UVR5 UI**

by Eddy
Google Colab

• **Wokada Deiteris Fork**

by Deiteris & Hina
Google Colab

• **Hina's Modified Original Wokada**
• **RVC-AI-Cover-Maker-WebUI**

by Shiro & Eddy
Google Colab

• **FaceFusion UI**

by Nick088
Google Colab

• **FaceFusion NO UI**

by Nick088
Google Colab

• **Music Source Separation Training (Inference)**

by Jarredou & Makidanye
Google Colab

languid cliff
#

Does this mean i started overtraining around 40k steps? if im reading the charts right?

languid cliff
simple ore
#

no avg_50 loss?
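
For anyone unsure what that graph is: a minimal sketch, assuming "avg_50" means a running mean over the 50 most recent logged loss values. That reading is an assumption, and the class below is illustrative rather than the trainer's actual code.

```python
from collections import deque


class RollingMean:
    """Mean of the most recent `window` values."""

    def __init__(self, window: int = 50):
        self.values = deque(maxlen=window)  # old values fall off automatically

    def update(self, value: float) -> float:
        self.values.append(value)
        return sum(self.values) / len(self.values)


avg = RollingMean(window=3)  # tiny window just to keep the demo short
history = [avg.update(loss) for loss in (1.0, 2.0, 3.0, 4.0)]
print(history)
```

Smoothing like this hides the step-to-step noise, which makes the overall trend of the loss easier to read than the raw curve.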

languid cliff
#

Pretty sure im looking at the tensorboard live

simple ore
#

how big is the batch size?

languid cliff
simple ore
#

too much... way too much for both

#

try a model from ~30k steps

languid cliff
#

like WAY overtrained right?

simple ore
#

no, it is just way too much for finetuning

languid cliff
#

nah im good

languid cliff
simple ore
#

when you train a model on top of a pretrain, it is technically finetuning

languid cliff
simple ore
#

you have a big set, 4-12x the common size for voice models

#

so it keeps using the same high learning rate for longer

#

and with batch 16 it generalizes the model quite a lot
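
A sketch of why a bigger set stays at a high learning rate for longer, assuming the trainer multiplies the learning rate by a fixed factor once per epoch (exponential decay). The starting rate and decay factor are placeholder values, not read from any real config.

```python
def lr_after_steps(total_steps: int, steps_per_epoch: int,
                   base_lr: float = 1e-4, decay_per_epoch: float = 0.999) -> float:
    """Learning rate after `total_steps`, when decay is applied once per finished epoch."""
    epochs_done = total_steps // steps_per_epoch
    return base_lr * decay_per_epoch ** epochs_done


# Same number of training steps, two dataset sizes.
small_set = lr_after_steps(30_000, steps_per_epoch=75)   # 400 epochs completed
large_set = lr_after_steps(30_000, steps_per_epoch=750)  # only 40 epochs completed
print(small_set, large_set)
```

With ten times more steps per epoch, the large set has gone through far fewer decay events by step 30k, so its learning rate is still close to the starting value.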

languid cliff
simple ore
#

I use 12-16 for my 55h vctk set

languid cliff
#

55 hour dataset?? damn

simple ore
#

it is a pretrain

languid cliff
#

yeah

simple ore
#

so yeah, try just a 30-60min set with 8

#

pick the best content

languid cliff
languid cliff
simple ore
#

then cut it in half/quarter

languid cliff
#

ooh okay

livid cosmos
#

So I've installed RVC AI Cover Maker and after double clicking run.bat, I am getting this error

Traceback (most recent call last):
File "F:\RVC-AI-Cover-Maker-UI-1.0.5\programs\applio_code\rvc\lib\tools\prerequisites_download.py", line 3, in <module>
from tqdm import tqdm
ModuleNotFoundError: No module named 'tqdm'
Traceback (most recent call last):
File "F:\RVC-AI-Cover-Maker-UI-1.0.5\main.py", line 1, in <module>
import gradio as gr
ModuleNotFoundError: No module named 'gradio'
An error occurred. Exiting...
Press any key to continue . . .

simple ore
#

you did not install requirements?

livid cosmos
#

I installed requirements

simple ore
#

'no module named' says otherwise

livid cosmos
#

If you download the .zip from the release (here) make sure to rename the folder from "rvc-ai-cover-maker-ui-v1.0.5" to just "rvc-ai-cover-maker-ui" otherwise you may run into missing dependencies issues.
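
A quick way to check this kind of "No module named" error: ask the Python interpreter that actually launches the app which packages it can see. This is a minimal sketch using only the standard library. It assumes you run it with the same interpreter that run.bat uses (for example an environment inside the app folder), which is a guess about how that launcher is set up.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("interpreter:", sys.executable)
print("missing:", missing_modules(["tqdm", "gradio"]))
```

If the printed interpreter path is not the one the requirements were installed into, the packages can be installed and still be reported as missing.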

livid cosmos
gilded robin
#

hey what's latest guide for making a voice model from scratch?

valid vine
gilded robin
#

and can you train a pre-existing model on more emotion? or do you just have to make it from scratch

topaz slate
#

hey is there anyone that can help me with the w-okada settings? i cant set it.

paper bloom
#

hey idk if thats the right channel to ask questions in

#

but i have a question^^

#

are there good male and female voices that also sound real ^^

#

the one i have is good but some ppl recognize it, and its very old. idk if that makes a difference, or are there new ones nowadays that are better?

silk sage
#

How can i make my friend voice to an RVC model Zip for ai

valid vine
slim schooner
#

hey guys its been a minute. just need some advice here.

if im going to train a voice model, should i use 32k, 40k, or 48k sample rate?

#

does higher sample rate require more training time?

simple ore
slim schooner
knotty moth
# valid vine becuase "it automatically finds the best path based on a known algorithm" doesn'...

imo Jump King speedrunning is the one you should try exploring on
https://www.youtube.com/watch?v=e-iOd42mF4g

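※ The final 5 minutes of the archive, lost because of how YouTube works, are filled in here → https://youtu.be/kyPb3-8bLMY

The "JumpKing endurance stream" I've wanted to do since before my debut!!!!!!
A Sunday at the end of the year!! The time has finally come, let's do it~!!!!!!!!!
Clear it within 12 hours...
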
valid vine
#

what

#

are you saying I should start by trying to make an AI speedrun jumpking?

hallow thistle
hallow thistle
patent trellisBOT
patent trellisBOT
valid vine
#

there are no known male and female voice models

slim schooner
#

worked just fine for a first run but tried to run another training session and got this, anyone know whats the issue? nothing changed

knotty moth
valid vine
hallow thistle
valid vine
#

(8 being the 4 cardinals + the diagonals)
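
A sketch of how 8 outputs can map to 8 movement directions. The ordering of the directions, the helper names, and the clamping to the board are all assumptions for illustration; the poster's real mapping is not shown in the chat.

```python
# Hypothetical ordering: N, NE, E, SE, S, SW, W, NW (screen coordinates, y grows downward).
DIRECTIONS = [(0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1)]


def best_action(q_values):
    """Index of the largest Q-value, i.e. the greedy action."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def move(location, action, width, height, speed=1):
    """Apply one of the 8 actions and keep the result inside the board."""
    dx, dy = DIRECTIONS[action]
    x = min(max(location[0] + dx * speed, 0), width - 1)
    y = min(max(location[1] + dy * speed, 0), height - 1)
    return (x, y)


q = [0.1, 0.9, 0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
action = best_action(q)
print(action, move((5, 5), action, width=10, height=10))
```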

knotty moth
valid vine
#

but also about an hour or two ago I made an even simpler test program that was just a single AI trying to get to a number that you could change and it DID learn how to find it pretty fast

valid vine
#

jumping seems like it'd be harder for the AI to understand

knotty moth
late flicker
#

Is this a final message of Mangio RVC local?
['extract_f0_print.py', 'C:\Users\Mike\Desktop\Mangio-RVC-v23.7.0_INFER_TRAIN\Mangio-RVC-v23.7.0/logs/test1', '22', 'rmvpe', '64']
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
no-f0-todo
['extract_feature_print.py', 'cuda:0', '1', '0', '0', 'C:\Users\Mike\Desktop\Mangio-RVC-v23.7.0_INFER_TRAIN\Mangio-RVC-v23.7.0/logs/test1', 'v2']
C:\Users\Mike\Desktop\Mangio-RVC-v23.7.0_INFER_TRAIN\Mangio-RVC-v23.7.0/logs/test1
load model(s) from hubert_base.pt
move model to cuda
no-feature-todo

hallow thistle
late flicker
#

I know about Applio, but sometimes it didn't work for me :(((

hallow thistle
late flicker
#

Ik

#

I will reinstall it

hallow thistle
#

Well, judging by your Mangio RVC folder path, you should never install any program directly on desktop. The path should be something like C:\Applio or D:\Applio if you use Applio.

late flicker
#

Okay

#

Good point

hallow thistle
#

Program shortcuts belong on the desktop, not full programs within folders.

late flicker
#

Makes sense

#

This is the first time, it worked cat_blush

gusty sierra
#

is this very bad? :D

simple ore
gusty sierra
#

g/total

simple ore
#

ouch.. is it like 1 minute set?

latent kettle
#

@viscid moss sorry to bother you but I want to know which models are best in UVR 5 UI to prepare a dataset from scratch. Like which are good for vocal and instrument separation, de-echo, de-reverb, de-noise, removing backing vocals etc.

viscid moss
#

This was made by our QCs for RVC model creation

#

And here's the best models according to Music Separation server guys

latent kettle
#

One more question, do these models work in UVR 5 GUI? The little windowed exe? Or can they only be used in UVR 5 UI, the browser one?

viscid moss
viscid moss
#

UVR5 UI does automatically

latent kettle
#

I see. Thank you a lot anime_giveheart

viscid moss
#

Ur welcome

river talon
#

Hello everyone, does anyone know where I could find some data for a chat bot ? i'm trying to make one using pytorch and don't really know where to start, if anyone can give me a lead to start I would be grateful

slim schooner
#

are these outputs good? It's for a 50min audio.

astral pine
#

Traceback (most recent call last):
File "client.py", line 22, in <module>
File "asyncio\runners.py", line 194, in run
File "asyncio\runners.py", line 118, in run
File "asyncio\base_events.py", line 687, in run_until_complete
File "main.py", line 140, in main
File "main.py", line 81, in runServer
File "uvicorn\server.py", line 69, in serve
File "uvicorn\server.py", line 76, in serve
File "uvicorn\config.py", line 434, in load
File "uvicorn\importer.py", line 19, in import_from_string
File "importlib\__init__.py", line 90, in import_module
File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
File "PyInstaller\loader\pyimod02_importers.py", line 384, in exec_module
File "app.py", line 17, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 384, in exec_module
File "voice_changer\VoiceChangerManager.py", line 26, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 384, in exec_module
File "voice_changer\RVC\RVCr2.py", line 9, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 384, in exec_module
File "voice_changer\embedder\EmbedderManager.py", line 3, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 384, in exec_module
File "voice_changer\embedder\OnnxContentvec.py", line 2, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 384, in exec_module
File "voice_changer\common\OnnxLoader.py", line 1, in <module>
File "PyInstaller\loader\pyimod02_importers.py", line 384, in exec_module
File "onnx\__init__.py", line 77, in <module>
ImportError: DLL load failed while importing onnx_cpp2py_export: A dynamic link library (DLL) initialization routine failed.

Press Enter to continue...

#

help

simple ore
analog obsidian
simple ore
#

it is supposed to be agv_gen

slim schooner
analog obsidian
brittle wing
#

Sorry if this is a stupid question but is there a way to make an ai cover with any voice model? If so how?

analog obsidian
slim schooner
# analog obsidian yup

ahh i see, i'll rerun training and check the graphs, what should i be looking for? when do i know training is enough?

analog obsidian
#

id recommend trying to train 100 epochs and save every 10

slim schooner
#

this is from my previous training

analog obsidian
analog obsidian
slim schooner
#

this one?

analog obsidian
#

yes but go to the scalars tab

#

ignore the grey graphs

slim schooner
#

you mean this one?

analog obsidian
#

scalars

#

read above

#

theres a big scalars name in the ui

slim schooner
#

ooohh mb lmao

#

this is what im looking for?

analog obsidian
slim schooner
#

so thats bad, too weak right? sorry im not an expert. do i change anything?

analog obsidian
#

train more and see if it goes down to 4.1-4.0

slim schooner
#

alright, i'll rerun the training and see if it improves

analog obsidian
#

but always listen to your model, since loss graphs most of the time keep going down even when the model has already started overtraining

slim schooner
#

thanks, i'll do that 👍

analog obsidian
#

hearing it every 10 epochs is fine

#

if ur model begins to overtrain you'll notice every epoch past a certain step count sounds robotic

slim schooner
#

i would hear those in the "audio" tab right?

#

theyre like 2 sec segments

analog obsidian
#

so inference some expressive audio and hear how it sounds

slim schooner
#

gotcha, do i need to move the saved models into the inference folder? it doesnt read them where they are normally stored

analog obsidian
slim schooner
languid cliff
#

Is finetuning a voice with bigger pretrain dataset generally gonna take longer time than one with a smaller one? Assuming same batch and epochs?

languid cliff
# analog obsidian no

Oh ok, maybe i did something wrong then, or it takes some time to ramp up. When i was training with OG dataset i did like 5it/s. And now with the KLM i do like 2it/s. But i only looked at the first few epochs before i left

analog obsidian
languid cliff
#

Yeah makes sense then

simple ore
unborn canopy
#

hello everyone can someone help me install the voicechanger with python i dont know what to do!?

simple ore
#

download the compiled version for your gpu, unzip, run

unborn canopy
#

@simple ore thanks

languid cliff
simple ore
#

the speed may go down if you run a game or something else that uses GPU and pushes the memory use into shared territory

jaunty shale
#

just as soon as the model was done.

#

do I have to wait? (I use kaggle mainline)

lucid creek
simple ore
#

or just download the baked model.

idle bramble
#

does converting to onnx affect the quality of the model?
also is rmvpe better than rmvpe onnx on nvidia? is it just better overall but more expensive to run?

safe echo
#

hey guys, any idea how to resolve these two? i reinstalled RVC but my perf is now 300, but it was 30~ before. (using F0 fcpe)

also, with my usual voice, when i use a word with PU in it, it cuts it out, like when i say anything that "pops". any idea? thank you Prayge

analog obsidian
analog obsidian
random hound
cosmic frigate
#

Why does my voice changer bugs when I play Roblox while doing voice chat on discord?

#

Like they can’t even hear me

crimson oyster
#

uhhh is there a place where i can ask someone to make a ai model?

vast relic
#

are people still using this version

peak path
#

hey
should i use crying and laughing in my dataset files (.wav)?

simple ore
peak path
summer cliff
#

how do i get less ping while using the voice changer?

foggy belfry
#

Why is this happeing with Applio? -I can't send picture

proud valley
#

I'm developing an rvc model with colab

#

ummmm............

#

Should I use the paid version of Colab to create an rvc model?

tight ether
proud valley
#

ummm....

#

Is there anything other than the extra section?

simple ore
proud valley
#

ok

#

Thank u

royal marsh
#

Hello i have question but i cant put photos here

jaunty trellis
#

Hi, I have a problem, idk why the app doesn't detect my microphone when I use an RVC model but when I use the "Beatrice jvs corpus" it does cat_deaed

pastel oak
#

!give-media-perms @royal marsh 5h

pastel oak
pastel oak
pastel oak
jaunty trellis
pastel oak
waxen root
#

Where do I get voice models which are working for the German language too cannot find any

indigo jacinth
#

hi i followed the guide but vc not working

still phoenix
#

did u figure out what was that?

pastel oak
cosmic frigate
#

I am using this one bro

pastel oak
#

Send screenshot of voice changer

cosmic frigate
#

Still bugs when I play Roblox

cosmic frigate
cosmic frigate
cosmic frigate
unborn canopy
#

hello, can someone go in a voice call with me and tell me how to install the voicechanger because i am too stupid to do it myself even with instructions

ashen solstice
#

My g/total is horizontal, with some down spikes

#

What could be the problem? 110e so far

languid cliff
#

i aint no expert, prob means you are starting to overtrain

patent trellisBOT
# hallow thistle !howtoask

How To Troubleshoot AIHC_WaitWhat

__**GIVE CONTEXT.**__ 📝
  • Don't simply mention your issue, like "my rvc is not working".
  • Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
  • The more context, the better.
__**BE POLITE.**__ :matsuripray:
  • Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
  • It's okay if you're frustrated, but don't take it into this server.
  • Don't DM without prior consent.
__**BE PRODUCTIVE.**__ 🤝
  • Don't ask for every little instruction. Put your own effort & test things by yourself.
  • Don't ask to ask.
  • Check if your answer is a Google search away/on our guides website.
craggy saffron
#

hi when i use a file as a input the output sounds good but when i use my own voice it sounds bad how can i fix this? or is this a microphone issue?
not about the pitch

#

i tried looking up if its about the microphone but the web says u dont need a better microphone

viral mason
river cairn
#

Is it possible to use an ASIO other than FlexASIO with Deiteris VCClient? I've tried the ASIO driver supplied by my Focusrite Scarlett 4i4 audio interface, as well as a virtual ASIO provided by VB-Audio Matrix, and both experience crackling and dropouts during realtime voice conversion. Buffer size 256, sample rate 48000 (as recommended by this guide: https://rentry.co/lessdelayasio)
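
For reference, a rough sketch of how much audio one buffer holds at those settings. It only covers the length of a single buffer; the real round-trip delay also includes driver overhead and the model's processing time, so the helper below (name made up) gives a lower bound, not a measurement.

```python
def buffer_latency_ms(buffer_frames: int, sample_rate_hz: int) -> float:
    """Duration of one audio buffer in milliseconds."""
    return 1000.0 * buffer_frames / sample_rate_hz


latency = buffer_latency_ms(256, 48000)
print(round(latency, 2), "ms per buffer")
```

A buffer this short leaves very little time to deliver each block, so if the voice conversion cannot keep up, a larger buffer is one thing to try against crackling and dropouts.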

marble vigil
#

is the illaria rvc vocal isolation tool also not working to anyone else?

silent condor
#

hey, im having fun making little ai covers on weights but the ai cover voice is kinda quiet, is there a way to fix it?

polar peak
#

Whats the exact google colab with the old gradio UI?

summer cliff
pastel oak
peak path
#

i have a problem with https://github.com/blaisewf/rvc-cli
note that i use Google Colab
if i want to use resume option, i've got this error on Training session.

Autobackup Enabled

Starting backup loop...

/usr/local/lib/python3.10/dist-packages/librosa/util/files.py:10: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  from pkg_resources import resource_filename

Backup Complete: 860 new, 0 updated, 0 deleted.
Backup Complete: 0 new, 1 updated, 0 deleted.

Files are up to date.

/usr/local/lib/python3.10/dist-packages/librosa/util/files.py:10: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  from pkg_resources import resource_filename

/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:558: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(

Checking saved weights...
Using HiFi-GAN vocoder
Starting training...

Loaded checkpoint '/content/Applio/logs/voos/D_2500.pth' (epoch 100)
Loaded checkpoint '/content/Applio/logs/voos/G_2500.pth' (epoch 100)
#
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:558: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.

  warnings.warn(_create_warning_msg(
/usr/local/lib/python3.10/dist-packages/librosa/util/files.py:10: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  from pkg_resources import resource_filename

terminate called without an active exception

Backup Complete: 1 new, 0 updated, 0 deleted.
Files are up to date.
simple ore
#

@peak path you dont need to use rvc-cli, if you're gonna train anything, just use the noUI colab.

#

dataloader warning is strange, have not seen it before

#

perhaps you failed to restore the dataset?

peak path
simple ore
#

colab does not access your google drive directly, you need to move the dataset from the backup to the colab node.

peak path
#

yes, i did it in resume tab.

#

there are 2 sections.

  1. load data
  2. set values
simple ore
#

okay, so in the filesystem browser (folder icon) there should be your folder '/content/Applio/logs/voos' with a bunch of stuff inside.. f0, f0_voiced, extracted, sliced_audio folders, etc

peak path
#

yes yes
they are there

#

/content/Applio/logs/voos

simple ore
#

thats the backup

#

i'm talking about colab side

peak path
#

yeah i know
i can see them in colab

#

let me do it again

simple ore
peak path
#

true

simple ore
#

okay, so you should be able to select a different max epoch (> 100 you have saved), and it should resume the process

peak path
#

my first session was 100

simple ore
#

obviously, otherwise you are at 100 as trained before
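
A tiny sketch of the arithmetic here, using the numbers from the log above (checkpoint `G_2500.pth` loaded at epoch 100). It assumes the number in the checkpoint filename is the global step count; the helper names are made up for illustration.

```python
def steps_per_epoch(global_step: int, epoch: int) -> int:
    """Steps in one epoch, inferred from a checkpoint's step count and epoch."""
    return global_step // epoch


def epochs_left(saved_epoch: int, target_total_epochs: int) -> int:
    """Epochs the trainer still has to run when resuming."""
    return max(0, target_total_epochs - saved_epoch)


print(steps_per_epoch(2500, 100))  # steps per epoch implied by G_2500.pth at epoch 100
print(epochs_left(100, 100))       # target left at 100: nothing to train
print(epochs_left(100, 300))       # target raised to 300: training continues
```

If the total epoch setting is left at the value already reached, the resumed session has no work to do and ends right after loading the checkpoints.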

peak path
#

oh my lord
let me do it

peak path
#

i don't have the link

peak path
#

ok, thank you so much
i thought that you have another option.
i used it before

simple ore
#

RVC-CLI is a command line interface for Applio, but it is kinda redundant

peak path
wide perch
#

Using RVC nvidia on Github.

As soon as I begin audio conversion, the entire process freezes and the command prompt is empty
Other people I talked to had this same issue
Anyone know how to fix it?

#

Fixed it, but the voice changer isn't working

#

Just getting "Audio Block Passed"

simple ore
wide perch
#

Also yes, I have both input and output on MME

#

RVC worked for me before, now its just spamming audio block passed and the voice changer isnt working at all

simple ore
#

link your "RVC nvidia on Github"

simple ore
#

so yeah, ancient

wide perch
simple ore
#

in AI terms, projects that have not been updated for 6+ months are hopelessly outdated, yours is like 2 years old

wide perch
#

Alr thanks

elfin dome
#

guys , i need this files

fierce pivot
#

Is there no longer a place to request someone create a model for you? Apparently I suck at it 🙂 and need someone to do it

wide perch
#

I set up my input and output correctly

simple ore
elfin dome
simple ore
#

unless you're using some outdated app that points at a non-existent repository

elfin dome
#

hmm

polar sage
#

Hello, what Colab are people using actually?

#

!colab

river cairn
winter burrow
#

I haven’t used w-okada for a while, is it still the best realtime voice changer?

analog shale
#

Hey, I'm trying to find a small language model that learns with prompts. Any suggestions?

polar sage
#

Hello, what Colab are people using actually?

slim schooner
#

what was the virtual audio cable that you need again for wokada?

#

I think it's called VIC or something, not sure

river cairn
#

By Muzychenko

slim schooner
#

thanks bro 👍

outer wasp
#

May I ask?

  1. What if I want to create a speech model from scratch on applio or any speech model (meaning without download any pre-existing other model data)?
  2. Is applio is right way to create a voice model?
  3. How much voice recording data does it take to create a voice model?
    Thank u for readinganime_giveheart
simple ore
silent condor
#

on weights is there a way to make the ai voice louder, its kinda quiet on covers that i do

outer wasp
simple ore
crude flame
median monolith
#

I have a question about the Weights voice model creation feature, could I maybe do it here?

also, wasnt this channel named something with "help"?, its been a while since I asked for a question in the server, and I swear this channel was named something like this. just wanted to know.

latent kettle
#

You can ask questions about Realtime voice changer or any other help like RVC or something

devout tulip
#

Which website should I use to make an AI cover?

soft tiger
#

So where can I do Google Veo 3 stuff for free?

latent kettle
viral ruin
#

Is there any working Colab to train RVC?

silent condor
#

please someone let me know if there's a way to make the AI cover song thing on Weights any louder, the voice isn't very loud

ancient portal
#

hello, I need the newest version of AICoverGen

edgy minnow
#

what does this mean?

#

okada w

#

How did you fix this

simple ore
edgy minnow
young dirge
#

when I use the AI voice changer, sometimes my voice kind of cuts out for a very short amount of time. Is there any way to fix this? It makes it sound way worse

simple ore
young dirge
# simple ore mic sensitivity? voice activation is generally bad

I believe my mic should be pretty good. I am pretty new to this voice changer, so idk if there is any specific mic sensitivity setting or how to check it. All I know is that when I tried to record my voice with OBS I could hear how the voice cuts off a lot

young dirge
mighty sinew
#

I'm using Applio and I'm having a problem where, with one certain voice, it won't produce an audio file but it says it's been inferred successfully. Any way to fix this? The voice seems to work for others and I want to use it

simple ore
simple ore
young dirge
simple ore
#

is there an issue when you use a voice changer and use mic as an input and headphones as an output?

young dirge
#

I use the cable input to be able to speak on Discord and more, and that is when the problems occur. I use my normal mic for the input and the cable input for my output

simple ore
#

we'll get to that, please answer the question above

young dirge
simple ore
#

okay, so now: mic as input, line1 as output, Discord line1 as input, push to talk = enabled

#

is there an issue when you use push to talk?

young dirge
#

gonna try it with friend so he can let me know if it sounds good

#

it's still quite laggy apparently when I try it

#

or like it cuts off a lot

simple ore
#

okay... is noise canceling enabled in Discord?

#

there are a few settings you can try

young dirge
young dirge
viral ruin
#

Is there any working Colab to train RVC? RVCDisconnected seems to be banned

young dirge
#

I finally fixed it on Discord but now it's just not working very well on TeamSpeak