#✨│ai-help

simple ore
#

yeah, the encoded phonemes and pitch

#

during training it builds a link between phonemes and the spectrogram; during inference it uses phonemes to build a spectrogram, and it "fills the gaps"

#

encoder training masks parts of the sequence and the model has to generate something that matches the original

glacial pollen
#

I don't think your explanations will be any good for newbies in here

simple ore
#

so the process repeats with different parts of the sequence being masked every step until the model can generate the entire sequence on its own
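
A toy sketch of the masking idea described above, assuming a plain list of numbers stands in for the encoded frames. `mask_span` and the `MASK` placeholder are hypothetical names for illustration; this is not RVC's actual training code, it only shows a different part of the sequence being hidden on each step.

```python
import random

MASK = None  # hypothetical placeholder marking a hidden frame


def mask_span(frames, span_len, rng):
    """Hide one contiguous span of frames; return (masked copy, hidden indices)."""
    start = rng.randrange(0, len(frames) - span_len + 1)
    hidden = list(range(start, start + span_len))
    masked = list(frames)
    for i in hidden:
        masked[i] = MASK
    return masked, hidden


rng = random.Random(0)
frames = [0.1, 0.4, 0.3, 0.9, 0.7, 0.2, 0.5, 0.8]
for step in range(3):
    masked, hidden = mask_span(frames, span_len=3, rng=rng)
    # A real model would predict frames[i] for every i in `hidden` and be
    # scored against the originals; here we only show what gets hidden.
    print(step, hidden, masked)
```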

#

yes, it is a bit advanced

#

kl almost always goes down and its contribution to training is rather small, so it can be ignored

steel forge
#

Use Suno to create a beat similar to the artist's, e.g. "2000s hip hop chipmunk soul, Chicago rap"

Extract the vocals and instrumental with UVR

Infer the vocals with Kanye RVC model in Applio

Combine new vocals and instrumental in audacity. Do some tweaks to the vocal mix and you're done

#

Just using kanye as an example btw

waxen jasper
#

hey everyone, i have a question: i wanted to make an ai voice sing a song, but when i use rvc, the ai voice also sings the instruments played in the song, which sounds weird. how do i fix that?

low shard
#

it doesn't automatically do that unless you use aicovergen or aicovermaker or weights.gg

knotty moth
#

seems like just another troll

brittle wing
#

How do I fix this with Applio?

simple ore
#

get a compiled version and unzip it into C:\Applio

brittle wing
knotty moth
simple ore
#

download 4.5gb zip

#

unzip to C:\Applio

#

not to some other weird folder

#

wait until unzip finishes

brittle wing
simple ore
#

yes, of course

brittle wing
simple ore
#

if you use 7zip, not that long, if you use windows, it is a slowpoke

#

so about a minute

crystal gull
#

🤔 I want to have my ai talk in voices using tts, what would be the best setup for that? What app has api support? All local 🙂

brittle wing
crystal gull
brittle wing
#

Okay, its working now. I had to wait a bit longer than expected.

#

thanks guys!

#

Whats better in Precision? fp16 or fp32?

#

what batch size should I apply with 30+ mins of datasets? I have an RTX 3090 24gb.

#

Pitch extraction algorithm, which one is better for singing?

#

and lastly, what Index Algorithm should I use?

analog obsidian
analog obsidian
crude flame
#

and that model is one of my best (in terms of most fire emojis)

analog obsidian
#
  • noisy graphs
brittle wing
crude flame
analog obsidian
#

models should converge at 200 epochs

crude flame
#

i dont have the logs anymore but it sounds fine so

crude flame
analog obsidian
#

so it just overfitted

analog obsidian
crude flame
brittle wing
#

is it possible to still game while training? if so, lower the resolution?

analog obsidian
#

ideally u want them to converge at around 170-200 epochs, then you train until they start to overtrain

crude flame
#

was 20 min not 30 but yk

analog obsidian
brittle wing
analog obsidian
#

a batch size that's too low causes the model to focus on learning one specific thing rather than trying to learn more

crude flame
brittle wing
# analog obsidian 16

Out of curiosity, how many epochs should I train for? 30+ mins of dataset with a pretrained like KLM, batch size 16

analog obsidian
analog obsidian
#

if you notice g/total just goes up for over 1 hour, stop training

#

and select your lowest point in the mel graph before the g/total rising
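
The rule above (stop once g/total keeps rising, then take the lowest mel point from before the rise) can be sketched roughly like this. It assumes you have exported both curves as one value per epoch; `pick_checkpoint` and the `patience` parameter are made-up names for illustration, not anything Applio provides.

```python
def pick_checkpoint(mel_loss, g_total, patience=3):
    """Return the epoch index with the lowest mel loss before g/total
    starts a sustained rise (rising `patience` epochs in a row)."""
    cutoff = len(g_total)  # default: no sustained rise found, use everything
    rising = 0
    for i in range(1, len(g_total)):
        rising = rising + 1 if g_total[i] > g_total[i - 1] else 0
        if rising >= patience:
            cutoff = i - patience + 1  # first epoch of the rising run
            break
    window = mel_loss[:cutoff]
    return min(range(len(window)), key=window.__getitem__)


# Made-up numbers: g/total bottoms out at index 3, then rises.
g_total = [30, 28, 27, 26, 27, 28, 29, 30]
mel_loss = [22, 19, 18, 17.5, 17.8, 17.0, 18, 19]
print(pick_checkpoint(mel_loss, g_total))
```

Note that the later, slightly lower mel value is ignored on purpose, because it falls inside the stretch where g/total is already rising. Treat the result as a candidate to test by ear, not a verdict.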

brittle wing
crude flame
brittle wing
#

Good example vs a bad example?

analog obsidian
#

past red circle = overtrained

analog obsidian
brittle wing
analog obsidian
#

when the graph rises, it means the model is getting confused

#

so nothing useful there

#

that is the margin of error, so you want the least error possible aka the lowest point

#

rising is more errors

brittle wing
#

Thank you

analog obsidian
crude flame
crude flame
#

i remember testing that and the model came out decent

#

i think i got lucky 😭

analog obsidian
#

you're just training the model with only noise at that point

crude flame
#

idk how it worked

#

the only downside i remember is it sounded wobbly

analog obsidian
#

then learned from that

#

and only that

crude flame
analog obsidian
analog obsidian
#

forcing the model to stay in one place

frozen ledge
#

-rvc

azure marshBOT
frozen ledge
#

-colab

azure marshBOT
# frozen ledge -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

analog obsidian
#

so thats why the model did not improve much more than just a couple of epochs

red cliff
#

Is there a known reason as to why sometimes if you're throwing a full song size vocal into RVC and there's a long silence somewhere in it, when the vocal comes back in, there can be a lot of artifacts? I was gonna look into splitting the audio up programmatically (I think Applio implemented that?) in hopes that it would help. Maybe it would speed up the overall inference too
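
One possible way to do the programmatic splitting mentioned above, as a minimal sketch: it assumes the vocal is already loaded as a list of integer PCM samples, and `split_on_silence`, `threshold` and `min_silence` are hypothetical names. Whether chunking like this actually removes the artifacts is not confirmed anywhere in this thread.

```python
def split_on_silence(samples, threshold, min_silence):
    """Return (start, end) index pairs of non-silent chunks (end exclusive).

    A run of at least `min_silence` samples quieter than `threshold`
    ends the current chunk; shorter quiet gaps stay inside the chunk.
    """
    chunks = []
    start = None  # start index of the chunk being built
    quiet = 0     # length of the current quiet run
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence:
                chunks.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Drop any trailing quiet samples from the last chunk.
        chunks.append((start, len(samples) - quiet))
    return chunks


print(split_on_silence([0, 5, 6, 0, 0, 0, 7, 8, 0], threshold=3, min_silence=3))
```

Each chunk could then be converted separately and placed back at its original offset.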

brittle wing
#

Should I Cache Dataset in GPU?

analog obsidian
brittle wing
#

Thank you

analog obsidian
red cliff
#

@analog obsidian I'll give it a shot but any idea why the issue occurs in the first place?

analog obsidian
brittle wing
#

I can't seem to get this training to work?

analog obsidian
brittle wing
#

Because I got an error again

azure marshBOT
brittle wing
analog obsidian
# brittle wing

hmm looks like its something else, im not an applio dev so i can't tell exactly whats going on here
you could try reinstalling the latest compiled version and try again

#

but better wait for a dev response

brittle wing
#

Yeah Im on the newest compiled version

glacial pollen
brittle wing
#

ApplioV3.2.8-bugfix

#

compiled version

glacial pollen
#

U used applio before?

#

if so, worked for ya?

brittle wing
#

Yeah from 3.2.6

glacial pollen
#

Not sure what's going on exactly with the mainline but, if you want, can recommend you my fork of it

#

a matter of 1 click install and running, is stable and has few nice things to help you train

glacial pollen
#

Gimme a sec

analog obsidian
glacial pollen
#

u fast 😩

#

was about to pack all into zip but guess it'll do

analog obsidian
glacial pollen
#

for now dl it that way

#

no zips atm

#

then run install and run fork bat

#

as always

brittle wing
#

Thanks, i will give Fork a try

glacial pollen
#

thx, lemme know of any potential issues
( should be stable tho

#

oh yea, and the new gimmicks you'll find at: Training tab, advanced settings and at the bottom:
@brittle wing

#

just in case

brittle wing
#

woaw thats new, whats the best option to pick?

analog obsidian
#

avg loss specifically is very good

glacial pollen
#

Basically, the warmup uhh

#

say you wanna train for 300 epochs ( approx )

#

then you could try 30 for warmup or 25

#

as for average.. recommend you first doing a test run for first few epochs and see how many steps you get per epoch

#

For example, if you have 40 steps per epoch, set the

#

to 10 or 12

#

can be even 8.
You get the point, it is some chunk of 1 epoch's total steps
not too much, not too little

#

it's to get an " avg " performance's metric from that epoch
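
A rough sketch of what averaging over a chunk of an epoch's steps could look like, assuming you have one loss value per step. How the fork computes its average running loss internally is not shown in this thread, so `chunked_average` is only an illustration of the idea.

```python
def chunked_average(step_losses, every):
    """Average per-step losses in consecutive chunks of `every` steps.

    A trailing partial chunk is dropped so every average covers the
    same number of steps.
    """
    averages = []
    for start in range(0, len(step_losses) - every + 1, every):
        chunk = step_losses[start:start + every]
        averages.append(sum(chunk) / every)
    return averages


# 40 steps per epoch with the frequency set to 10 gives 4 averaged points.
losses = [float(i) for i in range(40)]
print(chunked_average(losses, every=10))
```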

brittle wing
#

I'm training on 30+ mins of singing dataset with the KLM 3 32k pretrain. I assume 500 epochs is enough

glacial pollen
#

yeaaa, you can always pause earlier in anything so no issues here

#

in that case, try 35 for warmup

#

( or if you wanna stick to 10% of total epochs rule, do that but I'd recommend 35 at first )

#

effects of warmup on rvc in general aren't well tested in field yet ( on actual pretrains, that is. )

brittle wing
#

Great, ill try

glacial pollen
#

Neat

brittle wing
#

@glacial pollen I got an error for installing, and then got an error for running Fork?

glacial pollen
#

Huh, this shouldn't happen

analog obsidian
glacial pollen
#

oh, you gotta type in " total "

#

in scalars to see em ( in filters tag )

#

it's normal

#

@analog obsidian

analog obsidian
glacial pollen
brittle wing
#

C:\Users\PC\Desktop\codename-rvc-fork-3-main\logs?

glacial pollen
#

just highlight all in the console, paste into notepad, save as txt and send me

#

also before that

#

what windows u running?

#

11?

#

either way, we can discuss it in more details in dm

brittle wing
#

Yeah im on 11

glacial pollen
#

update: issue fixed. Case's closed

brittle wing
#

@glacial pollen You recommend 35 on the "warm up phase" for 30+ mins of datasets on 500 epoch? What should I put in the "Frequency of avg running loss"?

glacial pollen
#

@brittle wing

#

pretty much you first gotta run a lil test for few epochs

#

we need to know steps you get per epoch

brittle wing
glacial pollen
#

you can run for 1 epoch really ( but do 2 )

brittle wing
#

Okay, then should I set the "warm up" + "frequency" on default settings for now?

glacial pollen
#

you can keep both at 0 for now

#

or whatever def value i left there, doesn't matter just yet

#

none of the options will affect steps per epoch you'd get

brittle wing
#

on tensorboard

glacial pollen
#

an example

#

you wanna check S value for your epoch

#

in this example you can see it's 25 steps per epoch

#

consecutively, 25, 50, 75 etc
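
If you would rather compute it than eyeball it, the steps-per-epoch value can be derived from the step numbers logged at the end of each epoch. A small sketch, assuming you copy those numbers out of TensorBoard by hand; `steps_per_epoch` is a made-up helper name.

```python
def steps_per_epoch(logged_steps):
    """Infer steps per epoch from the step values logged once per epoch."""
    if not logged_steps:
        raise ValueError("need at least one logged step value")
    if len(logged_steps) == 1:
        # After a single epoch the logged step is the step count itself.
        return logged_steps[0]
    gaps = {b - a for a, b in zip(logged_steps, logged_steps[1:])}
    if len(gaps) != 1:
        raise ValueError("logged steps are not evenly spaced")
    return gaps.pop()


print(steps_per_epoch([25, 50, 75]))
```

Running two epochs instead of one gives a second point to confirm the spacing really is constant.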

#

Once you get what you have, we can think on what to try

simple ore
#

who does that?

lean chasm
#

where do i download the virtual audio cable?

low shard
low shard
#

what’s ur pc gpu?

#

because the vac is already in the written guides, but im feeling youre following some outdated yt tut

lean chasm
#

AMD Radeon 780M

#

i already downloaded the cable btw

#

i just don't know how to use the okada with it

brittle wing
#

Wav

low shard
azure marshBOT
low shard
#

The 1st link is wokada deiteris fork, its better in performance

#

the 2nd is original wokada

#

@lean chasm i would highly suggest u to use the wokada deiteris fork instead of yt tut one

lean chasm
#

wait so i have to uninstall the wokada and get this one instead? (they are the same?)

low shard
#

Could u send the link of the guide you used or tell me where you got it from?

low shard
#

All youtube tutorials are 1 year old

#

meaning you got an older version of the normal wokada

#

You should delete that one, and download the deiteris wokada fork

#

its way better in performance

#

you just have to read the guide

lean chasm
#

Download NVIDIA
Download AMD, INTEL and CPU

#

these are the two options, i got a graphic card too so does it count as nvidia

low shard
#

you told me your gpu is amd radeon 780m

#

or do you got another nvidia gpu?

lean chasm
#

yeah

#

i got 2 gpu

#

so should i use the graphic card gpu instead or the amd?

low shard
lean chasm
#

rtx 4070

latent pumice
#

I don't know if its here i should ask this, but i really need to know, Is there a way to use text to speech with RVC?

#

Cause i have a problem called "My pc is in my brother's room and i don't wanna wake him up" So i wonder if there is something that does the voice from RVC work with text to speech

lean chasm
#

i don't think there is

latent pumice
#

Welp, it was worth asking

low shard
#

You should get the Nvidia one then

arctic willow
#

Hello guys, I have some problem with the program, I have no sound

low shard
# latent pumice I don't know if its here i should ask this, but i really need to know, Is there ...

There are different Text To Speech (TTS) AIs:

GPT So Vits: RVC isn't as good as GPT So Vits for tts, but gpt so vits (few-shot tts, which means it needs just a lil training for models) can't use rvc models (and vice versa), and it's limited to english, chinese & japanese. if you wanna check gpt so vits instead, read https://docs.ai-hub.wtf/tts/gpt-sovits/

Freemium 11labs: An easy way to do TTS is https://elevenlabs.io/, you can't use RVC model on this but its a mostly premium easy way for good quality TTS

FishSpeech: FishSpeech is a 0 shot (no explicit training needed) TTS, if you got a good pc you can use it locally else use their site

With RVC Models:

RVC is natively for Speech To Speech, but forks such as ilaria rvc mainline & applio have built-in tts: they use Microsoft Edge TTS to generate a tts audio and then convert it with rvc (i suggest choosing a tts voice with the same gender and language as the rvc model you wanna use)

If you wanna do tts locally with RVC Voice Models (if you got a good pc):

  • You can get Applio in our docs
  • While Ilaria RVC Mainline here (no guide as of right now)

If you don't got a good pc you can do tts with RVC Voice Models on cloud:

  • Ilaria RVC Zero (Running on A100 GPU, free fastest rvc on cloud) and the guide

  • Use Applio UI Colab (with google colab T4 free daily limit gpu)

  • if you don't wanna use edge tts, you could try another tts ai from our tts index and use the output as an input in rvc

low shard
#

There are tons of different AIs

arctic willow
#

MMVC

low shard
#

If you mean the realtime voice changer for calls, Wokada, be sure to download the deiteris fork from the written guide and to not follow yt tuts

low shard
#

This is the wrong channel

#

And be sure to not follow yt tuts

arctic willow
#

thanks

low shard
#

Yw

lean chasm
#

@low shard i downloaded the nvidia version and its runs on the web?? i thought it was a application

low shard
#

WebUIs (like Gradio & Streamlit) are used A LOT in almost every single AI application

They are way easier & faster for developers to customize/build

#

And most importantly, it can be used on cloud (remote good pc), as many people like me don't got a pc good enough for AI
(A normal application program built with qt or tkinter wouldn't be possible to be shown on cloud)

#

Dw about it, uses your gpu

lean chasm
#

about the vb cable, i've done like the instruction but i can't use it in discord

#

i can use it on the web perfectly but discord is not receiving my audio

torpid loom
#

would yall say this is overtraining?

simple ore
#

go to the correct tab of the tensorboard (scalars)

#

and show all other related charts

glacial pollen
dim jewel
#

What is the difference between AI covergen/Mangio/Applio/Mainline?

glacial pollen
# dim jewel What is the difference between AI covergen/Mangio/Applio/Mainline?

Forks ( different takes on what rvc originally does )

  • Mainline is typically more mentioned in context of original rvc
  • Mangio was the first fork of rvc ( it's hella messy and outdated )
  • Applio is kinda a successor of Mangio but maintained by different people / team.
    Packs the most features and can be considered more useful, modern and advanced than rvc
#
  • covergen is probs only for covers and not training but I haven't used it so can't say for sure, I'd recommend to avoid it as it's most likely either old or too niche
dim jewel
#

Thank you for explaining it

glacial pollen
wintry torrent
#

My internet has been out for 26 hours but its finally back
i finished downloading the models and now when i run python src/webui.py it gives me this, should i be worried about anything

#

it opened the webui perfectly fine but will that affect anything

#

should i? and if so how do i

glacial pollen
brittle wing
#

Where should I stop training?

#

or should I kept it going?

glacial pollen
#

then you'd just pick an epoch from before the dip happens

#

( is why I recommend saving every single epoch during training )

brittle wing
#

Yeah I saved every 1 epoch

glacial pollen
#

oh ye, in that case, lemme do some example scenario for ya

#

It's just one of possible situations

#

naturally it isn't ( and won't be ) like that 1:1

#

but you get an idea

brittle wing
#

Okay, in this case just keep it training? it's maxed out 500 epoch

glacial pollen
#

But then yea, it's a pretty meh scenario anyways because

brittle wing
#

its already finished lol

glacial pollen
#

The loggings you see

#

are like

#

Okay so, remember how I mentioned an epoch can have N steps ?

brittle wing
#

Right, i remember

glacial pollen
#

Now, the problem is, applio and rvc are logging in a manner where the actual logging point

#

references only the last step from a given epoch

#

so it's biased because

#

Lemme get an example pic

#

The green circle, is how it logs

#

So in reality, an epoch could do awfully overall but the last logging ( last step ) could be " good "

#

or could be the exact opposite

#

Hence why I proposed average loss in my fork
``It's still not the most ideal approach

(( because in proper scenarios, training models has 1 extra phase in training where evaluation happens, where model's tested on unseen data and then scored appropriately ))

but def better than what it is rn``
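
To make the bias concrete, here is a tiny sketch with made-up loss numbers comparing what gets logged (the last step of each epoch) with the mean over the whole epoch. `last_step_vs_mean` is a hypothetical helper, not code from Applio or the fork.

```python
def last_step_vs_mean(step_losses, steps_per_epoch):
    """Return one (last_step_loss, epoch_mean_loss) pair per epoch."""
    rows = []
    for start in range(0, len(step_losses), steps_per_epoch):
        epoch = step_losses[start:start + steps_per_epoch]
        rows.append((epoch[-1], sum(epoch) / len(epoch)))
    return rows


# Two epochs of 4 steps each (invented values).
losses = [30, 31, 29, 20, 22, 21, 23, 28]
for last, mean in last_step_vs_mean(losses, steps_per_epoch=4):
    print(f"logged (last step): {last}   epoch mean: {mean}")
```

With these numbers, epoch 1 looks better than epoch 2 if you only see the last step, while the epoch means say the opposite.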

brittle wing
#

oh gotcha

glacial pollen
#

Yea

#

So I'd take it with caution, the metrics themselves

#

best to follow what I mentioned before + just actively testing the models by inferencing
just follow your ears

brittle wing
#

next time, instead of 500 epochs maybe do 700 or 1000, since it's still showing signs of training?

glacial pollen
#

I'd say, a good approach is, double or triple your expected training ( total epochs ) time

#

that's how I do at least

#

👀

brittle wing
#

👍

glacial pollen
#

Cause restarting the training has its drawbacks compared to just doing it all in 1 go

brittle wing
#

Should I wipe out any sorts of that data from the system and do a complete restart?

glacial pollen
#

essentially what matters ( files created during first training - wise ) is the:
G, D files and tfevents file + most recent epoch ( small weight: .pth model )

wintry torrent
wintry torrent
#

like a youtube link

glacial pollen
#

oh, welp.

wintry torrent
#

it automatically extracts the voice from the songs

#

is there anything else like that

glacial pollen
#

Keep in mind what you use

#

is heavily outdated
just so you know

wintry torrent
#

i can tell it keeps asking me to update stuff

glacial pollen
#

No in general, it is outdated

#

structure wise

wintry torrent
#

if im going to use UVR whats the best webui to use with it

glacial pollen
wintry torrent
#

whats that

glacial pollen
#

it's an audio separation site

#

associated with " voice separation " discord - ish

wintry torrent
#

Is it free

glacial pollen
#

It is, if you register an account you can use it as you please with some limits ( file size / length wise ) + you can't use ensembling but that's not important

#

only drawback ( but imo not as much ) is the queue

#

Nothing crazy to call it bad tho. That's your best quality bet anyways

#

bs-roformer / mel-roformer models for separation ( which mvsep does use ) are beasts

#

My recommended flow of work is:

    1. Get Applio ( or my fork if you intend to train in future )
    2. Use yt-dlp, a nice tool that lets you dl yt audio in the best quality yt's servers provide
    3. Upload the audio to mvsep to get your vocals and instru
    4. Use applio for covers
crude flame
wintry torrent
wintry torrent
glacial pollen
crude flame
wintry torrent
wintry torrent
crude flame
wintry torrent
#

Also, where do i get voice models

glacial pollen
#

Well, notebooks / colab is def harder to set if we speak of just drag-n-drop websites naturally

#

But sometimes it might be worth it
Esp in this case

wintry torrent
#

Should i download the applio desktop app?

#

Or the webui

#

Oh nvm the app is alpha

glacial pollen
wintry torrent
#

Does it have to be on my C drive

glacial pollen
#

C:\applio\applio's code n shit

#

ex. ^

wintry torrent
#

"C:\Applio" is fine right

glacial pollen
#

yup

wintry torrent
#

Okay

wintry torrent
glacial pollen
#

Pinokio?

wintry torrent
#

is it like stabilitymatrix?

glacial pollen
#

Not sure what you're referring to

wintry torrent
glacial pollen
#

I mean, I don't know this thing / neither used it so can't say much + this chat ain't for that

wintry torrent
#

oh okay sorry

glacial pollen
#

Best to avoid any 3rd party abstractions or whatever like that

#

as we only provide support for what we recommend

#

👀

wintry torrent
#

👍

simple ore
#

stability matrix / pinokio are products made by companies trying to take a niche "we make it easier for you to do x", but completely failing to keep things up to date and breaking shit that is not supposed to break

#

dont use them unless you're a complete dummy

#

ipad-generation idiot for whom a computer = screen and who can't tell .exe from .pdf

glacial pollen
wintry torrent
dim jewel
#

Hi, I have a question.
Regarding index algorithms: as I understand, Faiss is the default, while KMeans is used to decrease the file size for longer datasets. Is there a trade-off for using or not using KMeans on longer datasets?

glacial pollen
#

Pretty much

#

tho I personally always go for faiss I guess

#

( ignore the subpoints 3 and 4 from the 2nd ss)

#

But then, imagine if the index was 1-2 gigs ( you never know what kind of datasets people would want to use ) yea
rip memory, rip efficiency

#

The biggest I've tried to use faiss on, so far and without any issues, would be 48~ mins of data
Anything bigger wasn't attempted by me

glacial pollen
#

Don't remember, was half a year ago

dim jewel
#

Thank you for help again)

glacial pollen
#

yea it's alright, in fact I highly recommend gpt for explanation of more technical concepts

#

it's quite good ( and usually accurate ) in abstraction

simple ore
glacial pollen
#

yooo wtf, I can hardly think of having 200k samples 💀

wintry torrent
#

it has been downloading for 2 hours at 1mbps

#

I hate my country

#

Anyway, where do i get models

glacial pollen
#

rip on the dl speed. Feel ya tho, in 2009, in my small town, we'd have 64 kbps ( some darn awful wireless tube based )

#

lol

#

Imagine the pain back then 💀

glacial pollen
#

Either way, glad it works for ya now

wintry torrent
#

but i mean in 2009 everything was less than a few mbs

glacial pollen
#

Well true, ye

wintry torrent
glacial pollen
#

go ahead

wintry torrent
#

Do i use fp16 or fp32 (3070 8gb)

#

for inference

wintry torrent
glacial pollen
#

so I believe, there's no point doing it in fp32, not that it'd matter much anyway

#

you won't hear a difference in this scenario

wintry torrent
#

Okay

glacial pollen
#

As for plugins, I don't really use them so can't say for sure

#

uvr's meh, we got standalone uvr and / or mvsep / colabs

#

Elevenlabs I don't use

#

Basically, no point adding em unless you use elevenlabs, got api and stuff ( I'd assume

wintry torrent
#

Ill stick with mvsep and if i have issues ill download uvr

#

What about voice models where can i get them

crude flame
# wintry torrent What about voice models where can i get them

You can search rvc ai voice models at:

if there isnt one, you can:

earnest muskBOT
wintry torrent
#

Is it normal for the download link to be that long

#

Also why does everyone have by weights after their name 😭

analog obsidian
wintry torrent
glacial pollen
#

biggest thing between fp32 and fp16 is for training really

#

both stability wise and, well, 'max' potential you can squeeze out of the model ( quite likely given full precision and gradients' representation )

wintry torrent
glacial pollen
#

All audio that goes on yt undergoes compression and other postprocessing

wintry torrent
#

Idk why i said lossless

#

i meant high quality

glacial pollen
#

Best you can do is use yt-dlp

#

arg -x

wintry torrent
#

Yeah that thanks

glacial pollen
#

usually u get .opus

#

( -x makes it fetch the best audio for a given video the server has )
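
If you want to script that step, a minimal sketch could wrap the yt-dlp command line from Python. It assumes yt-dlp is installed and on PATH (its `-x` audio extraction also relies on ffmpeg being available); the URL is a placeholder and the function names are hypothetical.

```python
import shutil
import subprocess


def ytdlp_audio_command(url):
    """Build the yt-dlp command line for audio-only extraction (-x)."""
    return ["yt-dlp", "-x", url]


def download_audio(url):
    """Run yt-dlp for one URL; raises if yt-dlp is missing or fails."""
    if shutil.which("yt-dlp") is None:
        raise RuntimeError("yt-dlp is not installed or not on PATH")
    subprocess.run(ytdlp_audio_command(url), check=True)


# Placeholder URL; replace with the video you actually want.
print(ytdlp_audio_command("https://example.com/watch?v=VIDEO_ID"))
```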

wintry torrent
#

Also can you tell me what models to use because i have no idea what these do

glacial pollen
#

Now, I am a lil busy so can't respond rn

#

will be back in a bit

analog obsidian
wintry torrent
glacial pollen
#

😌

glacial pollen
#

in terms of that

#

Razer already sent you docs

wintry torrent
#

What output should i use

#

flac right

glacial pollen
#

But if you want to hear an opinion from me personally? I always go for bs-roformer

#

I go for wave
the fewer conversions / encodings, the better for me. like it raw

wintry torrent
#

Okay

wintry torrent
#

Also why is the output 68mb

glacial pollen
wintry torrent
#

Its gonna take 40 years to download

glacial pollen
#

unless you are handy with ( and have ) adobe audition

#

alternatively, fl studio or any other daw

wintry torrent
glacial pollen
wintry torrent
#

I literally cannot download anymore things

#

Egypt really sucks internet wise

#

Everything wise actually

glacial pollen
#

then the instrumental

#

and lastly, dl it all

#

Better to get audacity or such and call it a day

wintry torrent
#

Whats the smallest option other than mp3

#

flac?

glacial pollen
#

there's quite a few things better than mp3

#

aac, opus / ogg

#

then there's crappy mp3

#

as for lossless, there's flac

wintry torrent
#

They arent available on mvsep

glacial pollen
#

If it ain't going for training ( but you use it for idk, mixing or something ), you can use flac

wintry torrent
#

How big would a 3min file be if converted to flac

glacial pollen
#

depends on the individual case and compression ratio

wintry torrent
glacial pollen
#

Typically

wintry torrent
#

or produce anything

glacial pollen
#

" People often favor FLAC because it takes up significantly less space on their devices. FLAC files can be up to 70% smaller than the same WAV file. "

wintry torrent
glacial pollen
#

In that case, go for flac

#

no issues whatsoever
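
For a rough feel of the numbers, here is a back-of-the-envelope sketch. The WAV size follows directly from duration, sample rate, channels and bit depth; the FLAC figure only applies the "up to 70% smaller" best case quoted above, and the 1 Mbit/s link speed is an assumption about what "1mbps" meant earlier in the chat.

```python
def wav_bytes(seconds, sample_rate=44100, channels=2, bits=16):
    """Size of the raw PCM audio in a WAV file (header ignored)."""
    return seconds * sample_rate * channels * bits // 8


def download_seconds(num_bytes, megabits_per_second):
    """Transfer time for num_bytes over a link of the given speed."""
    return num_bytes * 8 / (megabits_per_second * 1_000_000)


wav = wav_bytes(180)           # 3 minutes, 44.1 kHz, stereo, 16-bit
flac_best = wav * 30 // 100    # best case from the quote: 70% smaller

print(wav / 1e6, "MB as WAV")
print(flac_best / 1e6, "MB as FLAC (best case)")
print(download_seconds(wav, 1), "s to download the WAV at 1 Mbit/s")
print(download_seconds(flac_best, 1), "s to download the FLAC at 1 Mbit/s")
```

Real FLAC sizes depend on the material, so the actual saving is usually smaller than the best case.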

wintry torrent
#

My problem is not with storage its with downloading the file

#

Is audacity easy to use

#

The thing i liked about aicovergen is that it did all of that for me

#

It separated and combined the voices by itself

glacial pollen
#

it's rather simple and generic so

#

if you get some basics, you should do well

but if alignment of 2 files is what you want ( no effects, compression and such - generally mixing )

#

i.e. Vocals and music

#

you just put em both in, align to the edge, export and call it a day
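
What that "put both in, align to the edge, export" step does under the hood is basically a sample-by-sample sum. A minimal sketch, assuming both tracks are 16-bit PCM sample lists at the same sample rate that start at the same moment; `mix` and `vocal_gain` are hypothetical names, and Audacity itself does much more than this.

```python
def mix(vocals, instrumental, vocal_gain=1.0):
    """Sum two sample lists aligned at index 0, clipping to the 16-bit range."""
    length = max(len(vocals), len(instrumental))
    out = []
    for i in range(length):
        v = vocals[i] if i < len(vocals) else 0
        m = instrumental[i] if i < len(instrumental) else 0
        s = int(v * vocal_gain + m)
        out.append(max(-32768, min(32767, s)))  # hard clip instead of wrapping
    return out


print(mix([1000, 2000], [500, 500, 500]))
```

If the sum clips a lot, lowering `vocal_gain` (or the instrumental level) before mixing is the usual fix.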

#

so

wintry torrent
glacial pollen
#

I can do it just fine

#

maybe try ctrl+a

wintry torrent
#

Yeah i figured it out

#

just had to pause

glacial pollen
#

a

#

yeee

#

you can't do it midway in auda

brittle wing
#

What’s a proper way to make 48k datasets into 32k?

#

For testing*

analog obsidian
# brittle wing What’s a proper way to make 48k datasets into 32k?

u can either resample the dataset to 32k using audacity/rx studio/whatever
or use a script that uses soxr_vhq, which is technically better than the above ^
or just select the 32k sample rate in applio and let it resample for you in the preprocessing (by default the resample is done using soxr_hq)
realistically speaking most people can't hear the difference between all of these options, so do whichever is easiest for you
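
For the script option, a minimal sketch using librosa's `soxr_vhq` resampler type might look like this. It assumes `librosa`, `soundfile` and `soxr` are installed; the function name and file paths are placeholders, and this is not the script anyone in this thread actually uses.

```python
TARGET_SR = 32000  # sample rate the dataset should end up at


def resample_to_32k(src_path, dst_path):
    """Resample one audio file to 32 kHz mono with the soxr_vhq resampler."""
    # Third-party imports stay inside the function so the sketch can be
    # imported without them: pip install librosa soundfile soxr
    import librosa
    import soundfile as sf

    # sr=None keeps the file's native rate (e.g. 48000) instead of
    # librosa's default of 22050.
    audio, native_sr = librosa.load(src_path, sr=None, mono=True)
    resampled = librosa.resample(
        audio, orig_sr=native_sr, target_sr=TARGET_SR, res_type="soxr_vhq"
    )
    sf.write(dst_path, resampled, TARGET_SR)


# Example call with placeholder paths:
# resample_to_32k("dataset_48k/clip01.wav", "dataset_32k/clip01.wav")
```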

brittle wing
analog obsidian
glacial pollen
#

SoX has the best currently known algo

brittle wing
glacial pollen
#

Ye, search up sox resampler
( the setting for quality would be 'vhq ' )

#

alternatively, use my fork as it has it in use

brittle wing
glacial pollen
brittle wing
glacial pollen
#

it's using librosa's default resampling algo, whichever it is

analog obsidian
glacial pollen
#

it does? 🤔

#

As far as I know, my forks were the only ones using soxr, no mainline no applio

#

well.. in any case, it's a matter of adding

#

in: root/rvc/lib/utils.py

glacial pollen
#

oh, well, it's commented out

analog obsidian
#

o

#

no way

#

😭

glacial pollen
#

also, last time I checked it wasn't there 🤔

#

or maybe I remember it wrong.. either way ye

#

can be easily added ✨

analog obsidian
#

i correct myself:
mainline almost got soxr

glacial pollen
sonic agate
#

@glacial pollen hi i'm using the correct channel now

#

help

glacial pollen
#

oh, you picked the wrong one 👀

#

gimme few mins, pushing the latest changes to ver 3

#

( cause you got version 1, that one's rvc based, not applio )

sonic agate
#

ohno

#

lol

#

i just noticed

glacial pollen
#

shhh

sonic agate
#

?

glacial pollen
#

🤫 let em not know lol

lean chasm
#

can i get a help with the vb audio cable? i followed the instruction for nvidia gpu okada and finished the setting but somehow the cable is not working in discord

latent kettle
#

How do I get Tensor board in Applio

glacial pollen
#

Then copy the link console gives you and paste in the browser's address bar

latent kettle
glacial pollen
#

if you're training

#

paste that in ur model's folder

#

then like so ^

#

Paste in the path and done. Gonna open up in browser

knotty moth
brittle wing
#

@glacial pollen I'm ready to try out Fork, I remembered you needed 2 epoch to figure out what to put in "Warm up Phase", and "Frequency running loss"? Is this what you meant?

brittle wing
# glacial pollen

Whenever you resample into 32k with Audacity, when pressing export, do you keep the vocals in mono or stereo?

simple ore
#

rvc uses mono

brittle wing
knotty moth
simple ore
#

both for training and for inference, all the audio gets converted to mono 16k

#

training does use the full sample rate files, but still mono
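
Since everything ends up mono anyway, the stereo/mono export choice mostly comes down to who does the downmix. A minimal sketch of a plain average downmix, assuming interleaved integer PCM samples (L, R, L, R, ...); `downmix_to_mono` is a hypothetical helper, and the exact downmix RVC or Applio applies is not shown in this thread.

```python
def downmix_to_mono(interleaved, channels=2):
    """Average each frame's channels into one sample (integer PCM)."""
    if len(interleaved) % channels:
        raise ValueError("sample count is not a multiple of the channel count")
    return [
        sum(interleaved[i:i + channels]) // channels
        for i in range(0, len(interleaved), channels)
    ]


print(downmix_to_mono([100, 300, -50, 50]))
```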

brittle wing
#

When should I lower batch size? lower the better? I'm using RTX 3090 24gb, I'm using 16 batch size.

uneven horizon
#

how to run tensor board? i don't see any bat file named such in mangio folder

simple ore
#

then pip install tensorboard

#

and after that you can use it from command line like tensorboard --logdir=X:\Applio\logs

simple ore
#

for a regular model that is a major overkill

#

and would only lead to shitty results

brittle wing
simple ore
#

what's your dataset size?

brittle wing
simple ore
#

I'm sure with a 3090 you can try 4, 6, or 8 and see which one gives the best result

simple ore
#

make one folder with a dataset for batch 4

brittle wing
#

I assume 4 would be stupid slow?

#

unless beefy gpu?

knotty moth
simple ore
#

batch size may affect the overall training speed

#

one epoch would take about the same time regardless of the batch size

#

batch 4 - 500 steps x 1s, batch 8 - 250 steps x 2s, same thing
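
The arithmetic behind that comparison, as a small sketch. The 2000-segment dataset size and the per-step timings are just the illustrative figures implied by the example above, not measurements, and `steps_for_batch` is a made-up helper name.

```python
import math


def steps_for_batch(num_segments, batch_size):
    """Number of optimizer steps needed to see every segment once."""
    return math.ceil(num_segments / batch_size)


num_segments = 2000  # implied by "batch 4 - 500 steps"
for batch_size, seconds_per_step in [(4, 1.0), (8, 2.0)]:
    steps = steps_for_batch(num_segments, batch_size)
    print(batch_size, steps, steps * seconds_per_step)
```

Both rows come out to the same epoch time, which is the point being made: halving the step count does not help if each step takes twice as long.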

brittle wing
#

on average

knotty moth
flint solar
#

As long as the gradients are synced

simple ore
#

also two cards may not actually get 2x faster training because of the sync

safe sparrow
#

how can i run tensorboard in a program that didnt come with tensorboard

pure gust
#

probably a silly question, but i downloaded a model and there is a json file, should that be uploaded somewhere or leave it?

simple ore
#

weights.gg has models named as model.pth and model.index, so json is needed to tell what the f it is

pure gust
#

theres only a pth which i used, no .index

safe sparrow
# simple ore

the directory, does it have to be the folder that has the events.out.tfevents? or the one just before it?

simple ore
#

either

#

if you want to see all logs, or a specific model's log in case you have 100 models there

safe sparrow
#

thank you

simple ore
#

100 model logs gonna take a lot of time to load

knotty moth
safe sparrow
#

actually im training a beatrice v2 model, and it has a tensorboard support, but the loss_g is just a straight line so shrug

#

nevermind, it just updates itself based on the checkpoints

flint solar
safe sparrow
#

what is?

flint solar
#

When choosing the lowest point u will choose it from the loss/g/mel graph

safe sparrow
#

Thank you sir!

uneven horizon
#

Do longer datasets take more time to train than smaller ones for the same number of epochs?

flint solar
wintry torrent
#

Like is there no credit system

flint solar
wintry torrent
#

Then whats the catch

#

How do they profit

flint solar
hallow thistle
hallow thistle
#

If you don't wanna wait in the very rare long queue and you're in a hurry, you can buy their premium.

flint solar
wintry torrent
flint solar
wintry torrent
#

Egypt

flint solar
wintry torrent
#

ah makes sense

knotty moth
knotty moth
uneven horizon
#

What’s the best batch size for 4060 ti 16gb?

knotty moth
uneven horizon
#

So a smaller batch size means less vram usage?

#

Also, which is better for overall training, a higher or a lower batch size?

knotty moth
#

mostly between 4 and 8, and fp16 (the default choice for RTX gpus) theoretically halves the vram usage of fp32

#

the difference is just that fp32 may offer slightly better quality and gradient stability, but it is also slower

simple ore
#

fp32 - better stability, less wild gradients (i've seen 30k+ with fp16)

#

1hr set fp32 batch 8

#

fp16 halves the vram used by the model / discriminators

#

but that would be something on top of ~4-5GB it takes anyway

#

so in this case it would be ~7-7.5GB instead of 9
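A ballpark version of that estimate, assuming only the part above the fixed overhead shrinks. The function name and the 5 GB overhead are illustrative; real usage also depends on batch size and whatever else is on the card.

```python
def fp16_vram_estimate(fp32_total_gb: float, fixed_overhead_gb: float) -> float:
    """Rough fp16 total, assuming only the memory above the fixed overhead is halved."""
    scalable = fp32_total_gb - fixed_overhead_gb
    return fixed_overhead_gb + scalable / 2


# 9 GB under fp32 with roughly 5 GB that does not shrink.
print(fp16_vram_estimate(9.0, 5.0))
```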

uneven horizon
#

I’m new to this but what is fp32 cause i don’t see any such options in mangio

simple ore
#

kindly delete mangio and install Applio

uneven horizon
simple ore
#

mangio is outdated and should not be used

uneven horizon
simple ore
#

in applio the default is fp16 (it is okay for finetuning), you can switch it to fp32 in settings, kill the terminal window and restart the app after

#

as for the batch size, it really depends on the size of the data set.. batch 40 may work with 100hr+ set, but it is excessive for 1hr set. Same as batch 4 may be okay for 10 min set, but not applicable for 10hr+

#

what's good for finding a tick in a matchbox is not good for finding a bowling ball in a potato field

uneven horizon
simple ore
#

>=4 and <=8 generally

knotty moth
signal bloom
#

Does anyone know what the most popular tts people are using?

#

currently using sapi5

analog obsidian
simple ore
#

i'd rather have a good model

knotty moth
signal bloom
knotty moth
signal bloom
knotty moth
simple ore
#

or you can install something locally

#

f5-tts, fish speech, xtts

#

first two may require some finetuning

#

also depends on a language

simple ore
#

buut somehow it happens during training from scratch with weird models

analog obsidian
knotty moth
analog obsidian
#

almost 12 hours of training no issues for me

simple ore
#

my attempt of "cement" with refinegan did not go right

knotty moth
#

(almost 0)

analog obsidian
#

last time i faced negative kl was training some very damaged dataset

simple ore
#

I think the formula is messed up or the values calculated by the encoders

#

it should not be possible to have a negative, and yet here we are

analog obsidian
tawny nexus
#

my bad thanks

flint solar
#

I’ve never had negative kl

#

😂

knotty moth
analog obsidian
#

i dont even know what causes negative kl

analog obsidian
simple ore
#
```python
    # fragment of the kl loss; the earlier lines are not in the message
    kl += 0.5 * ((z_p - m_p) ** 2) * torch.exp(-2.0 * logs_p)

    kl = torch.sum(kl * z_mask)
    loss = kl / torch.sum(z_mask)
```
knotty moth
analog obsidian
simple ore
#

i've seen logs_q being way too high that causes that
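A plain-Python sketch of why a high logs_q can push this below zero. The first line of `kl_term` is not in the snippet above; it is assumed from the upstream VITS-style `kl_loss`, so treat it as an assumption. As far as I can tell the expression is a sampled estimate rather than the closed-form KL, which is why individual terms (and the masked mean) can go negative even though a true KL cannot.

```python
import math


def kl_term(z_p: float, logs_q: float, m_p: float, logs_p: float) -> float:
    """Per-element KL estimate; scalar stand-in for the tensor code."""
    # Assumed opening line, taken from the VITS-style kl_loss (not shown in the message).
    kl = logs_p - logs_q - 0.5
    kl += 0.5 * ((z_p - m_p) ** 2) * math.exp(-2.0 * logs_p)
    return kl


# Narrow posterior, sample away from the prior mean: positive term.
ok = kl_term(z_p=1.5, logs_q=-1.0, m_p=0.0, logs_p=0.0)
# logs_q much larger than logs_p, sample on the prior mean: negative term.
bad = kl_term(z_p=0.0, logs_q=1.0, m_p=0.0, logs_p=0.0)
print(ok, bad)
```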

analog obsidian
#

that using noobies method

uneven horizon
#

Applio training showing 65-85% cpu usage with 80-100% gpu usage consuming about 5.3 gb vram. Is this normal?

flint solar
analog obsidian
knotty moth
analog obsidian
#

last time i got negative kl was in a compressed dataset

#

not saying compression causes that

flint solar
analog obsidian
#

0.5 sec slices

flint solar
#

Ohh

analog obsidian
knotty moth
flint solar
#

Ima try this later today

analog obsidian
simple ore
#

I'm using AMD GPU and because of that some stuff gets offloaded to CPU, but even there it is only 75% tops

analog obsidian
simple ore
#

dont comment out hipass filter

#

use it

analog obsidian
analog obsidian
#

and my cpu usage is fine

simple ore
#

and I wonder if 'Resizable BAR' affects it as well

knotty moth
uneven horizon
analog obsidian
#

oh so it is for removing dc-offset

#

i got lucky my dataset didnt have that

simple ore
#

you can try enabling that option and check the task manager's performance tab, as long as the shared memory is not used, you're good

#

and that should lower the CPU%... hopefully

alpine valve
#

anyone here with decent prompting experience, i need some quick help🙏

analog obsidian
#

@simple ore chunk_len=5.0, overlap_len=0.5 is this good? <

simple ore
uneven horizon
simple ore
simple ore
uneven horizon
simple ore
#

i've shown the code used in rvc

#

they've implemented something weird instead

knotty moth
#

other distance functions:

latent kettle
#

Is there any Applio Hugging Face space available?

distant turtle
#

-colab

azure marshBOT
# distant turtle -colab
📒 Google Colab Notebooks
ℹ️ Note

While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

flint solar
simple ore
#

need to close the app and restart it in a new window

uneven horizon
#

How do i use custom pretrained models in applio?

simple ore
#

check custom box, select custom G and D files

uneven horizon
simple ore
#

X:\Applio\rvc\models\pretraineds\pretraineds_custom

glacial pollen
#

in your case you can set avg value to 23 or 30
( and warmup is up to you, but I'd stick within a range of min 5% and max 10% of total epochs )
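The 5%-10% rule of thumb as a tiny helper. `warmup_range` is a made-up name for illustration, not a setting in any fork.

```python
def warmup_range(total_epochs: int, low: float = 0.05, high: float = 0.10) -> tuple[int, int]:
    """Suggested (min, max) warmup epochs as a share of the total epoch count."""
    return round(total_epochs * low), round(total_epochs * high)


# For a 300-epoch run this suggests somewhere between 15 and 30 warmup epochs.
print(warmup_range(300))
```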

latent kettle
low shard
low shard
#

the first one is made by @viscid moss , i think he updates his uvr much

#

the space should too ig

viscid moss
#

Well.. the HF space, not yet. I'm waiting to add the last missing model, to make a big release

#

Yesterday, 17 new models were added to audio-separator, which is the core of UVR5 UI. But there is one more that needs a workaround to work.

signal bloom
#

any recommendations for generating more natural sounding audio using edge tts? Looks like it doesn't support SSML

latent kettle
#

@simple ore can you please tell me how to see tensor board correctly?

latent kettle
#

There are too many graphs 📊

#

Some are going up some are going down

#

What to do ?

simple ore
#

Tensorboard is a series of graphs where we can monitor the progress of our model during training, but there are many graphs. We are only interested in the graph called 'g/total'. You can find this by clicking on 'inactive' and selecting 'scalars'. Then, go to the last page, where you will find it in the last graph.

latent kettle
ionic canopy
#

So, out of curiosity, can yelling be a part of a dataset?
Like let's say, Eren Jaeger's yelling mixed with his talking
Do I just need to put both together in their own group?
Like all the talking lines first, then the yelling?

latent kettle
#

I want to stop training on current epoch (65) how do I stop it in applio ?

simple ore
#

if you chose to only save the final model, then it wont be saved until the very end

latent kettle
#

It was just hiding. Now I see it.

#

So I have to wait for at least 100

simple ore
#

do you have .pth files other than D/G in your model's folder?

latent kettle
#

Does the overtraining detector work in applio?

latent kettle
ionic canopy
#

Please

simple ore
#

do not include yelling in one file with normal speech

#

otherwise the normal speech would normalize to nothing

loud condor
#

how to update applio?

simple ore
#

download new version, unzip to new folder, move audios and models over, delete old folder

loud condor
#

thanks

dim jewel
#

Hey guys, I have a question. In theory, If a model has an accent, can more epochs decrease it?

fast phoenix
#

oop

#

yeah im very beginner at this

brittle wing
#

10 hours in, keep training?

crisp void
#

Anyone help me resume training?

azure marshBOT
crisp void
#

!howtoask

patent trellisBOT
# crisp void !howtoask

How To Troubleshoot

__**GIVE CONTEXT.**__ 📝
  • Don't simply mention your issue, like "my rvc is not working".
  • Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
  • The more context, the better.
__**BE POLITE.**__
  • Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
  • It's okay if you're frustrated, but don't take it into this server.
  • Don't DM without prior consent.
__**BE PRODUCTIVE.**__ 🤝
  • Don't ask for every little instruction. Put your own effort & test things by yourself.
  • Don't ask to ask.
  • Check if your answer is a Google search away/on our guides website.
pseudo dagger
#

Was it here that people were working on a text to speech ai? cant find a good one that uses rvc models

brittle wing
idle stag
#

So my applio has been working perfectly fine till right now: "RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input." I saw someone saying to use the split audio option, but that didn't work. I ran the "install.bat" file again, but that didn't help either. What should I do?

#

Fixed it, I was just running out of memory

crisp void
#

The model has completed 300 epochs, and I would like to extend it to 500 epochs. How can I continue training from epoch 300 without starting over?

Could someone teach me?

simple ore
#

you're squeezing all the juices out of it

simple ore
brittle wing
#

They look similar

simple ore
#

(personally I'd stop at 30k lol)

pseudo dagger
#

How do i add rvc voices to applio in pinokio?

brittle wing
#

Can someone who speaks Spanish help me get a good voice changer with a girl's voice that doesn't sound sooo robotic? please

glacial pollen
#

we don't quite officially support it

pseudo dagger
glacial pollen
#

In applio, you'd place em in /logs/<model's folder>

#

within the folder, along with the index

pseudo dagger
#

My applio logs folder has only one folder called mute

pseudo dagger
glacial pollen
#

naturally

#

you see, if 90% of people use X1, hardly anyone will want to dive on their own into X2 to help 1 person

#

and as most of us or me, are against crappy automations ( pardon my lang ) the chances are even smaller

#

Imo Pinokio is for lazy people who aren't willing to learn a bit to do things right

#

That's a lil scary considering such people intend to work with artificial intelligence

#

it never was meant to be easy or easily accessible with no effort put into it tbf

#

That's like asking for trouble in 5-10 years

pseudo dagger
glacial pollen
#

ig, if reading up a bit of text that'd literally take ( at worst ) 10 mins is what you call programming lesson

#

then I suppose, you shouldn't be using AI for memes

#

¯_(ツ)_/¯

#

it's 2 clicks man, 2 clicks

#

1 .bat for installation, 1 for running

#

lol

#

And yet here you are asking about fixing N problem on an unknown site or a service
Taking you more time than it would have if you'd read a bit of instructions

#

I hope you get the point

pseudo dagger
glacial pollen
#

Well, never hurts to ask

#

or to check the repository man

glacial pollen
#

So you know how to deal with them and maybe even help other users, if at one point you felt like it

#

it's a basic skill anyone should have, problem solving

#

Man, I'm kinda scared what's gonna happen to this new generation

#

Can't imagine ( no offense ) such people operating atom/nuclear-powered facilities in 20 years

crude flame
glacial pollen
#

welp, good thing I'm not from US

#

nah, half joke

crude flame
glacial pollen
#

Either way, Soryu
If at one point you changed your mind and actually decided to give applio a go, let us know
Always open for support as long you need it

pseudo dagger
glacial pollen
#

I suppose? not that I'd expect much of support towards tools not associated with the server anyways

pseudo dagger
#

Also I found the solution, it works the same as in normal applio, I just needed to drop the file in the download section of the interface

glacial pollen
#

so if you wanna volunteer, go ahead

glacial pollen
#

same as it always was with rvc

#

But congrats on figuring it out ✨

pseudo dagger
brittle wing
#

anyone Spanish?

low shard
simple ore
#

how hard is what? Why do you need pinokio for that?

hallow thistle
hallow thistle
solemn shell
#

cant run inference on applio

#

i followed the steps for amd gpu

#

applio opened but it is stuck loading forever when I try to run inference, and it is not using my cpu or gpu

glacial pollen
solemn shell
#

it says: Compiling in progress. Please wait...

#

after I try to run inference

uneven horizon
#

If i’m using custom models in applio do i need to check custom in the embedder model tab?

mild oar
simple ore
simple ore
uneven horizon
#

So just leave it at contentvec?

uneven horizon
simple ore
#

contentvec is the default feature extractor

#

pretty much all models use that

knotty moth
glacial pollen
#

so yes, they are

#

well, or should I say " contentvec's 500class model "

uneven horizon
#

What is Hop Length and what does it do exactly?

glacial pollen
#

@uneven horizon
simplifying / abstracting it without going into too much details:

brittle wing
#

uhh, I left my PC on and activated sleep/hibernate mode on my PC and its still training?? When I went to work and thought nothing of it.

#

possible that it can still train during hibernate mode? My PC is chill and not hot or anything.

#

I'm 1005 epoch in lol

#

Here's 21 hours of training, when is it over training and should be stopped?

brittle wing
latent kettle
#

Read it

clear mesa
#

hey, I was curious, if you add more audio into the dataset, do you have to start training the model from scratch or can you just continue and enhance the existing model?

latent kettle
#

How do I resume Training on Applio

brittle wing
latent kettle
knotty moth
brittle wing
latent kettle
brittle wing
#

rtx 3090

knotty moth
latent kettle
brittle wing
latent kettle
#

But how

brittle wing
#

I left 1500 epochs on it

latent kettle
#

Can you tell Me how do i resume training

#

On applio

#

@knotty moth

brittle wing
flint solar
#

Use the same model name and sample rate, dont preprocess, dont extract features

#

Use same batch size and click train

latent kettle
#

Okay

flint solar
#

I believe

latent kettle
#

Ohh. Thank you. But I'm training on applio

brave ermine
#

is there any good tutorial to train ai model

low shard
brave ermine
low shard
brave ermine
#

6

low shard
#

You can try either Local or Cloud

Local:

  • Applio: A fork of RVC with some extra features like Applio TTS, kinda faster and simpler but same quality tho
  • Mainline: The original RVC
#

You can train RVC models on cloud (remote good pc):

  1. Prepare the Dataset
  2. Setup RVC:
    Choose a cloud way to use RVC,
  • Google Colabs (4 hours of daily gpu for free, not much hours, but easy to use):
  • Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
  • Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):

Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time

  3. Be sure to know about the tensorboard

If you are looking for the easiest way, and for free, use https://weights.gg, which ofc uses RVC

#

But I think Cloud would be the best

brave ermine
#

thanks ill take a look

latent kettle
#

What is Mel reformer?

sonic agate
#

a vocal separation model

neon rover
#

Voice cuts off… what to do?

sonic agate
#

where?

neon rover
#

on discord

low shard
low shard
glacial pollen
#

around 16k is more or less where you should stop, as past that the performance, as you can see, is regressing

#

if you want more accurate results, get my fork with averaged metrics

hallow thistle
#

Audio you converted with RVC can cut off to silence abruptly if you're using a bad RVC model.

glacial pollen
#

as then you can see avg performance throughout epochs themselves too

#

so you can "more or less" see how an epoch performed ( still, on its own data, but better that than what stock logging on its own is )

#

ofc, normal losses are still there too

#

but you can already see the differences

#

Stock logging records only the performance of a given epoch's last step, whereas averaging logs the mean loss over n steps ( of your choice ) within that epoch

#

reason I mention it is because stock loggings are hella inaccurate. Example;
Imagine your epoch is 67 steps, the logging takes place on step 67, that one could be great metrics wise but 80% of the steps in that epoch display rather mediocre or bad performance. You get the point
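The 67-step example in numbers. The loss values are invented to mimic "mostly mediocre, great last step"; this only illustrates the idea and is not how the fork actually computes its averages.

```python
from statistics import mean


def last_step_metric(step_losses: list[float]) -> float:
    """What stock logging would report: the loss of the final step only."""
    return step_losses[-1]


def averaged_metric(step_losses: list[float], n: int) -> float:
    """Mean loss over the last n steps of the epoch."""
    return mean(step_losses[-n:])


# Hypothetical epoch of 67 steps: 54 mediocre steps, 12 decent ones, 1 great last step.
epoch = [30.0] * 54 + [22.0] * 12 + [18.0]
last = last_step_metric(epoch)
averaged = averaged_metric(epoch, n=len(epoch))
print(last, round(averaged, 2))
```

The last step alone makes the epoch look far better than the average over all of its steps does.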

#

Naturally, having a proper evaluation phase during training would be ideal, where aside from the training losses, losses based on how the model performs on unseen data ( evaluation set ) are also measured. That'd showcase the model's generalization. We don't have that ( at least not yet )

glacial pollen
# brittle wing No, it's on Batch 6 of 30+ mins of Datasets.

ps. batch size of 6 might not be the most ideal option here ( esp for 30 mins), I'd highly recommend trying out 8, it's more balanced and since 8 is a number that is a power of 2, the performance of training is somewhat better as parallelism in a sense is in your favor

brave ermine
#

i successfully trained and tested my voice model, it worked well. i used 200 epochs and 5 minutes of data, but i wonder how many epochs and how much data is ideal? im looking for any tips for newbies

glacial pollen
brave ermine
#

i looked over that but i didnt understand anything

glacial pollen
#

once you learn to evaluate what's going on with ur model on graphs, you can def improve ur models' quality

glacial pollen
brave ermine
#

im not aiming to be a professional but i wish for better

#

im just using this for trolling my friends not business

glacial pollen
#

it is just what's used in all machine learning cases ( well, most, there's also keras stuff )

#

cause " I'll train for N epochs as I think it's good " was not and won't ever be a rule to follow sadly

brave ermine
#

thats why we save bunch of epochs

#

i see

#

i mean checkpoints

glacial pollen
#

I mean yea, saving every single epoch and testing em is an option, but a pain in the ass tbf

analog obsidian
glacial pollen
#

you see.. if you don't wanna go that far into metrics, you can just follow simple rules~

  1. Lower = better.
  2. if it keeps on rising and keeps that tendency for a while = bad
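Those two rules as a small sketch over logged (step, loss) pairs. The numbers are invented, loosely following the "stop around 16k" case above, and `best_step` / `rising_for` are hypothetical helpers, not part of Applio or tensorboard.

```python
def best_step(history: list[tuple[int, float]]) -> int:
    """Rule 1, lower = better: return the step with the lowest logged loss."""
    return min(history, key=lambda pair: pair[1])[0]


def rising_for(history: list[tuple[int, float]], patience: int) -> bool:
    """Rule 2: True if the last `patience` values each rose above the previous one."""
    values = [loss for _, loss in history]
    if len(values) <= patience:
        return False
    tail = values[-(patience + 1):]
    return all(later > earlier for earlier, later in zip(tail, tail[1:]))


# Hypothetical log: improves until 16k steps, then keeps climbing.
history = [(4000, 25.0), (8000, 22.5), (12000, 21.0), (16000, 20.2),
           (20000, 20.9), (24000, 21.4), (28000, 22.0)]
print(best_step(history), rising_for(history, patience=3))
```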
#

it

#

is a nobrainer once you dedicate like 15 mins of your time into understanding it ( even basics will do, and you're already well prepared for most ml trainings), my dude

#

Imo reading basic graphs is a basic skill most of us should have in 5-10 years

brave ermine
#

ill check few tutorials i guess

crude flame
brave ermine
#

or ill learn by trial and error

glacial pollen
analog obsidian
#

should take you a couple of minutes to understand them, don't worry, does not require years of machine learning knowledge to understand them

glacial pollen
#

read this up

#

but in short;
if you get my fork, evaluation of your models gets pretty easy

#

you get to more or less see how that one epoch does, in terms of performance

brave ermine
#

imma try my best

glacial pollen
#

Then you'd have like, two steps.
Normal graphs ( hypothetical scenario )