#Guide for Applio - Help me

pallid breach
#

no u gotta select it

#

ok so first choose spin here

#

enable custom pretrain

#

and use dat

#

done

#

now u training spin

#

important shit to remember
in the inference tab u also need to select spin

#

otherwise ur model kinda

#

dies

#

😭

scarlet rapids
#

keep in mind that models trained with spin also require the spin embedder in inference, otherwise the voice will speak gibberish
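
A tiny sketch of the rule above, just to make it concrete: the embedder picked at inference has to match the one the model was trained with. The function name and the embedder strings are made up for illustration. They are not part of Applio or any voice changer.

```python
# Hypothetical helper, NOT an Applio API: it only illustrates the
# "same embedder for training and inference" rule from the chat.
def check_embedder(trained_with: str, inference_embedder: str) -> None:
    """Raise ValueError if the inference embedder differs from the training one."""
    if trained_with.strip().lower() != inference_embedder.strip().lower():
        raise ValueError(
            f"model was trained with '{trained_with}' but inference uses "
            f"'{inference_embedder}': expect gibberish output"
        )


# Matching embedders pass silently.
check_embedder("spin", "spin")

# A mismatch is reported instead of silently producing broken audio.
try:
    check_embedder("spin", "contentvec")
except ValueError as err:
    print(err)
```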

pallid breach
#

click this

scarlet rapids
#

the inference option should also be set like that

pallid breach
#

now click that

#

and choose the finger spinner

#

also in case u wanna use your model in realtime, this doesn't work in w-okada
but works in another voice changer named vonovox

#

no i mean using the model as a voice changer

#

vonovox is literally a better wokada just in case

pallid breach
#

shit can run fine while playing games without having to decrease graphics

#

and

#

it doesnt open in ur browser

#

give it a try

#

spin is there

#

click the link sybau (😭 ) sent

#

same shit as applio

scarlet rapids
#

it also works for RTX 50-series, so no worries

pallid breach
#

download source (zip)

#

run setup.bat

#

let it install its shit

#

run start.bat

#

way better

#

trust

#

dev is known here

#

yea sure

#

vb audio cable is shit why u using it?

#

thats

#

actually quite rare

#

lmao

#

bro is lucky asf

#

pray vb audio cable works there anyway lmao

#

no one has ever tried it

#

but it should

#

lets say vb audio cable is maybe a tiny bit buggy

pallid breach
#

the same person who made vonovox also made spin for rvc

#

cool guy

scarlet rapids
#

at least the UI (tkinter) isn't bloated, since it doesn't use a chromium-based framework

pallid breach
#

block size is chunk ms

#

server mode wokada

#

use wasapi

#

to use spin

#

u always used client mode in w-okada?

scarlet rapids
#

that's the audio device type, WASAPI has less latency than MME, like the one used in wokada client mode

pallid breach
#

audio backend is the thing that sends audio between apps

#

client mode was a silly idea wok had at the time

#

it may be the reason why his voice changer app runs extremely badly

#

block size = chunk

pallid breach
#

lower block = less delay, higher gpu usage
higher block = more delay, lower gpu usage
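
Rough arithmetic behind that tradeoff, as a sketch. It assumes the block size is given in milliseconds (like chunk ms, as said above) and a 48000 Hz sample rate, which is only an example value. The function name is hypothetical, and "inference calls per second" is just a stand-in for GPU load, not something either app reports.

```python
# Illustrative only: block_stats is a made-up helper, and 48000 Hz is an
# assumed sample rate, not a value taken from Vonovox or w-okada.
def block_stats(block_ms: float, sample_rate: int = 48000) -> dict:
    """Return basic numbers for one audio block of block_ms milliseconds."""
    if block_ms <= 0:
        raise ValueError("block_ms must be positive")
    return {
        # audio samples collected before the model can run once
        "samples_per_block": round(sample_rate * block_ms / 1000),
        # you always wait at least one full block; real delay is higher
        "min_delay_ms": block_ms,
        # smaller blocks mean the model runs more often per second
        "inference_calls_per_sec": 1000 / block_ms,
    }


for ms in (50, 100, 200):
    print(ms, block_stats(ms))
```

Real delay also includes the time the model takes to run plus whatever the audio driver buffers, so treat these as lower bounds.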

pallid breach
#

play with the block size value until u find the one that suits u best

#

but dev is now working on that actually

#

bad for realtime

#

yea like u can try it after u finish ur training

#

if u set your index value to 0 wokada wont use it btw

#

index is bad because it can potentially add more delay and runs poorly on cpu

#

also messes up pronunciation

#

like super bad

#

the idea behind index was that models trained with super small datasets usually don't sound like the intended target, so rvc-boss made the index files so they can sound closer and truer to the original source

#

bigger datasets dont have that problem

#

g and d use 1gb of space together iirc

#

epochs are 54 mb

#

you're not saving every g and d, right? lmao

pallid breach
#

are they named g_23333?

pallid breach
#

ok nice so you're not saving every g and d

#

u cant, your model needs the g and d files in order to continue training

pallid breach
#

you're not saving every g and d

#

dont worry

#

"save only latest"

#

be sure thats enabled

#

never disable it
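
Why that setting matters, as a back-of-the-envelope sketch. It uses the rough sizes mentioned above (about 1 GB for a g + d pair, about 54 MB per saved epoch) and assumes the small epoch files are kept at every save either way. The function and parameter names are invented for this example.

```python
# Hypothetical estimate: sizes are the "iirc" numbers from the chat,
# and checkpoint_disk_mb is not a function from Applio.
def checkpoint_disk_mb(
    num_saves: int,
    save_only_latest: bool,
    gd_pair_mb: float = 1000.0,      # g + d together, roughly 1 GB
    epoch_weights_mb: float = 54.0,  # one saved epoch, roughly 54 MB
) -> float:
    """Estimate disk usage in MB after num_saves checkpoint saves."""
    if num_saves < 0:
        raise ValueError("num_saves must be non-negative")
    # with "save only latest" there is never more than one g/d pair on disk
    gd_copies = min(1, num_saves) if save_only_latest else num_saves
    return gd_copies * gd_pair_mb + num_saves * epoch_weights_mb


print(checkpoint_disk_mb(20, save_only_latest=True))
print(checkpoint_disk_mb(20, save_only_latest=False))
```

With 20 saves the difference is about 2 GB versus about 21 GB under these assumptions.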

#

for original rvc training (og pretrain, cvec)? u dont need to change anything

#

for spin, select spin where i told u, and use custom pretrained

#

ignore the rest of stuff

#

yeah

#

deletes everything and starts the training from 0

#

useful if u wanna train the same dataset but with a different batch size

#

btw for vonovox u need this

#

ah

#

im restarted

#

lmao

#

weird

#

no

#

applio doesnt install that

#

nor vonovox

#

also listen to ur epochs

#

machine learning has this thing named overtraining: like the name implies, if u train a model too much, it may not be able to make good predictions anymore

#

in rvc the only way to know if your model is overtrained is by listening to it

#

overtrained models sound very robotic and unnatural

#

yea trust, this is pretty ez

#

kaggle uses a gpu from 2016 lmao

#

i mean its free

#

ai shit is heavy what else i can say

#

my 5 hour dataset log folder is 15gb

pallid breach
#

the model also knows how to reproduce volume changes more naturally, and knows which sample it has to use depending on the context (pretty cool)

for realism i'd say 1-2 hours of data minimum is ok. if u want the best possible results obviously you need more than that, up to 24 hours max maybe. results are also better if the dataset is super expressive, in my case my dataset was super monotone KEKW

pallid breach
#

did u run start.bat as admin?

#

either you ran setup.bat as admin or start.bat as admin

#

dont run any bat as admin

#

just double click them

pallid breach
#

just in case reinstall the whole thing
this time be sure you dont run either of the two bat files as admin

pallid breach
#

delete the folder

#

xD

pallid breach
#

i mean u can just use audacity or something

#

im gonna tell the dev this suggestion

#

btw vonovox has a disc server too

#

the dude who made the app is there (dr87)

#

scroll down and find this

#

click where it says discord server

trim plume
#

Just set the output to your headphones