#Guide for Applio - Help me
ok so first choose spin here
enable custom pretrained
and use that
done
now you're training with spin
important shit to remember
in the inference tab u also need to select spin
otherwise ur model kinda
dies
😭
keep in mind that models trained with spin also require the spin embedder at inference, otherwise the voice will speak gibberish
^
the inference tab should also have an option like that
now click that
and choose the finger spinner
also in case u wanna use your model in realtime, this doesn't work in w-okada
but works in another voice changer named vonovox
no i mean using the model as a voice changer
vonovox is literally a better wokada just in case
check out the vonovox guide here https://docs.aihub.gg/rvc-voice-changer/local/vonovox/
Last update: June 2, 2025
shit can run fine while playing games without having to decrease graphics
and
it doesnt open in ur browser
give it a try
spin is there
click the link sybau (😭 ) sent
same shit as applio
download source (zip)
run setup.bat
let it install its shit
run start.bat
way better
trust
dev is known here
yea sure
vb audio cable is shit why u using it?
thats
actually quite rare
lmao
bro is lucky asf
pray vb audio cable works there anyway lmao
no one has ever tried it
but it should

lets say vb audio cable is maybe a tiny bit buggy

get VAC lite here https://docs.aihub.gg/rvc-voice-changer/local/deiteris-w-okada-fork/#a-virtual-audio-cable-vac-is-what-you-need-to-use-the-voice-changer-on-discord--games
Last update: May 5, 2025
at least the UI system (tkinter) isn't bloated since it doesn't use the chromium framework
block size is chunk ms
server mode wokada
use wasapi
to use spin
u always used client mode in w-okada?

it's the audio device type, WASAPI has less latency than MME, which is what wokada client mode uses
audio backend is the thing that sends audio between apps
client mode was a silly idea wok had at the time
it may be the reason why his voice changer app runs extremely badly

block size = chunk
haven't tried vonovox actually, but does it also have ASIO?
lower block = less delay, higher gpu usage
higher = u get the idea
yes
play with the block size value until u find one that suits best for u
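to put rough numbers on the block size tradeoff, here's a minimal Python sketch. it assumes the block size is given in ms (like the chunk setting), a 48000 Hz sample rate, and that the model runs once per block. the function names are made up for illustration, they are not part of vonovox or w-okada.

```python
def block_samples(block_ms: float, sample_rate: int = 48000) -> int:
    """Number of audio samples in one block of block_ms milliseconds."""
    return int(sample_rate * block_ms / 1000)


def blocks_per_second(block_ms: float) -> float:
    """How many blocks (and so model runs, by assumption) happen each second."""
    return 1000 / block_ms


# Compare a few block sizes; the 48000 Hz default is an assumed sample rate.
for ms in (50, 100, 200):
    print(f"{ms} ms block: {block_samples(ms)} samples, "
          f"{blocks_per_second(ms):.0f} model runs per second")
```

a block has to be fully recorded before it can be converted, so a smaller block means less waiting but more model runs per second for the gpu.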
but dev is now working on that actually
bad for realtime
yea like u can try it after u finish ur training
if u set your index value to 0 wokada wont use it btw
index is bad because it can potentially add more delay and runs poorly on cpu
also messes up pronunciation
like super bad
the idea behind the index was that models trained with super small datasets usually don't sound like the intended target, so rvc-boss made the index files so they can sound closer and truer to the original source
bigger datasets dont have that problem
g and d use 1gb of space together iirc
epochs are 54 mb
you're not saving every g and d, right? lmao
I think the index option should be provided anyway, some ppl with capable specs may find it useful for a better accent
are they named g_23333?
ok nice so you're not saving every g and d
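if u wanna check this without eyeballing the folder, here's a small Python sketch that counts the g and d files. the `G_*.pth` / `D_*.pth` name patterns and the example path are assumptions for illustration, adjust them to whatever your log folder actually contains.

```python
from pathlib import Path


def count_checkpoints(log_dir: str) -> dict:
    """Count generator (G_*.pth) and discriminator (D_*.pth) files in a folder."""
    folder = Path(log_dir)
    return {
        # File name patterns are assumed, not confirmed by the trainer's docs.
        "G": len(list(folder.glob("G_*.pth"))),
        "D": len(list(folder.glob("D_*.pth"))),
    }
```

something like `count_checkpoints("logs/my_model")` (hypothetical path) returning one G and one D would suggest only the latest pair is being kept.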
u cant, your model needs the g and d files in order to continue training
at least hide it behind a warning like that
you're not saving every g and d
dont worry
"save only latest"
be sure thats enabled
never disable it
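to see why "save only latest" matters, here's a back-of-the-envelope Python sketch using the sizes mentioned above (about 1 gb for a g and d pair, 54 mb per saved epoch). it assumes one small epoch file is kept per save and that disabling the option keeps a new g and d pair on every save. the function is hypothetical, not part of applio.

```python
def estimate_disk_mb(num_saves: int, save_only_latest: bool,
                     gd_pair_mb: int = 1024, epoch_weight_mb: int = 54) -> int:
    """Rough disk usage in MB after num_saves checkpoint saves."""
    # With "save only latest", a single G/D pair is overwritten on each save.
    gd_pairs = min(num_saves, 1) if save_only_latest else num_saves
    # Assumption: one small epoch weight file is kept per save either way.
    return gd_pairs * gd_pair_mb + num_saves * epoch_weight_mb


print(estimate_disk_mb(20, save_only_latest=True))
print(estimate_disk_mb(20, save_only_latest=False))
```

with the option off, the g and d term grows with every save, which is where the disk space goes.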
for original rvc training? (og pretrain, cvec) nothing
for spin, select spin where i told u, and use custom pretrained
ignore the rest of stuff
yeah
deletes everything and starts the training from 0
useful if u wanna train the same dataset but with a different batch size
btw for vonovox u need this

ah
im restarted
lmao
weird
no
applio doesnt install that
nor vonovox
also listen to ur epochs
machine learning has this thing called overtraining: like the name implies, if u train a model too much, it may not be able to make good predictions anymore
in rvc the only way to know if your model is overtrained is by listening to it
overtrained models sound very robotic and unnatural
yea trust, this is pretty ez
kaggle uses a gpu from 2016 lmao
i mean its free
ai shit is heavy what else i can say

my 5 hour dataset log folder is 15gb

but it's totally worth it, the more u add, the less robotic the output 
the model also knows how to reproduce volume changes more naturally, and knows which sample to use depending on the context (pretty cool)
for realism i'd say 1-2 hours minimum of data is ok (if u want the best possible results obviously you need more than that, up to max 24 hours maybe), better results if the dataset is super expressive, in my case my dataset was super monotone 
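if u wanna know how many hours your dataset actually has, here's a minimal Python sketch that adds up the length of the wav files in a folder. it assumes plain PCM .wav files sitting directly in one folder (the stdlib `wave` module can't read other formats), and the function name is just for illustration.

```python
import wave
from pathlib import Path


def dataset_hours(dataset_dir: str) -> float:
    """Total duration in hours of all .wav files directly inside dataset_dir."""
    total_seconds = 0.0
    for path in Path(dataset_dir).glob("*.wav"):
        with wave.open(str(path), "rb") as wav:
            # Duration of one file = number of frames / frames per second.
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 3600
```

then compare the result against the 1-2 hour minimum mentioned above.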
did u run start.bat as admin?
either you ran setup.bat as admin or start.bat as admin

dont run any bat as admin
just double click them
just in case reinstall the whole thing
this time be sure you dont run either of the two bat files as admin

i mean u can just use audacity or something
im gonna tell the dev this suggestion
btw vonovox has a disc server too
the dude who made the app is there (dr87)
scroll down and find this
click where it says discord server
Just set the output to your headphones