#✨│ai-help
Either getting rid of it or extracting just that
What
Either getting rid of it ( backing vocals ) and leaving just main / lead vocals or extracting just that ( Only backing vocals )
well
send me the file
can be dm
There is something wrong with the file ( or you didn't put it in there, next to ffmpeg? )
How can it be a gambling?
man
see this?
depending on the model, you always have boxes to check
instru, vocals, other, etc etc
i think its something wrong with the files
Alright, then
Now, depending on what the model does
maybe i should use something different to record voice messages
the 2nd option, in this case instrumental only
in case of, say, bve model
it doesn't output instrumental but just the backing vocals
Do you guys know where I could find a large number of CC0 free-use voice speaking samples with high quality recordings? I tried Mozilla's Common Voice but each speaker only has like 5-ish minutes and the recording quality sucks.
Ye, send it over dm
I'll analyze the file if you want and tell you how to fix it
Alternatively, just change the recorder
what recorder should i use
oh, wait, by voice messages you don't mean phone-related?
then what is it, you record discord or such?
no i just use the voice recorder app
anyway, obs or audacity ig?
Whatever you use, fucks the files up
stock-apps ah moment
Use audacity
ok i will thanks
Either way, rest is up to you to figure out
recorder fucks up the files, that's that
I gotta go for now so, gluck
alright
any audio editors/DAW are better than the built in windows, and it should allow saving in lossless wav format
@glacial pollen Why
Makindanyee hopefully explains you what's what n why - I am busy atm
- try converting the input file to wav first if not sure
- if you have only igpu, I'd recommend to uncheck "GPU conversion"
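A minimal sketch of the "convert to wav first" tip, assuming ffmpeg is installed and on PATH. The helper names `build_wav_command` and `convert_to_wav` are made up for illustration; `-y`, `-i` and `-c:a pcm_s16le` are standard ffmpeg options.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_wav_command(src: str, dst: Optional[str] = None) -> List[str]:
    """Build an ffmpeg command that converts `src` to 16-bit PCM WAV."""
    src_path = Path(src)
    # Default output: same name as the input, with a .wav extension.
    dst_path = Path(dst) if dst else src_path.with_suffix(".wav")
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", str(src_path),  # input in whatever format the recorder produced
        "-c:a", "pcm_s16le",  # uncompressed 16-bit PCM audio
        str(dst_path),
    ]


def convert_to_wav(src: str, dst: Optional[str] = None) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_wav_command(src, dst), check=True)
```

Calling `convert_to_wav("voice.m4a")` would write `voice.wav` next to the input. If ffmpeg itself errors out on the file, that points at a damaged recording rather than at the separation tool.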
HomiethePossum told me to ask you what's wrong with 20000 epochs on a model when it says that amount in config.json. I don't see anything wrong with it. If it says so in config.json then it should be correct
I mean it’s correct, it’s just not ideal. You always want to use tensorboard to look out for overtraining on your models. Depending on your dataset length overtraining can happen quicker or slower, but you need to watch out for it. If you need a tutorial for Applio or Tensorboard, someone else will be able to help.
How do I put a model on tensorboard
@formal wind will help he just msged me
Wait huh.
yeah you said you’d help him
Ok. I'll wait for an answer
Haz has free time he should be able to help
OK. I'll wait for an answer from him
srsly is it 20k epochs or steps?

epochs
Look
Thatll answer it
Then who does?
tensorboard pls
Then who does
@viral mason they have time
I never used it before. I'm new to voice model uploading and making so I need help
and the log interval seems cursed
I mean you could just look at the tutorials?
I'll still wait for an answer
wha
Where do I find tensorboard tutorials
@viral mason help a brother out
sure
good luck
I'm still waiting for an answer
worm give him an answer
You heard him
uhhhhh
Wayta?
@red kayak
don’t uhhhh worm you know exactly where they are
idk any tutorials
you are the tutorial
Help him out worm 🤣
please worm
You just don't want to say
EXACTLY right
he’s gatekeeping them from you
???
You heard the man
I'm genuinely confused
worm just give him the tutorials
You heard the man
Oh please
@viral mason I mean come on you can’t have all the tutorials to yourself? help a brother out.
Great. Now my 3 models were removed for no reason the 3rd time. I didn't do anything wrong. I don't know what's going on
damn
do u clean your datasets?
I don't know who keeps doing it. But I know it's not me
Seriously, I did nothing wrong
Also I reported and blocked you Your_Local_Worm. Because you offended me with that Bonnie gif
what???
L
that’s fair enough
I'm telling the truth. I did nothing wrong
I had to do that. Cause that was unnecessary
sent that bonnie gif
I'm gonna upload them one last time for tonight. But if it happens again, I'll post them on Uberduck model showcase instead
worm don’t anger him more
Yes. And that was so unnecessary
You heard the man
Worm. I don't appreciate you accusing me of doing something wrong, which I didn't
Great now I got banned for nothing. I didn't put any instrumentals on the demos. I wasn't against the guidelines at all. I did nothing wrong. There wasn't a single answer. So banning me was a huge mistake
I think it was the 20,000 epochs
I don't see that saying anywhere on the guidelines
you didnt get banned you had a role removed
it was removed for repeated low quality models
I don't see that saying anywhere on the guidelines
I mean when you think about it, it’s kinda pointless posting a model that’s low quality because then nobody will use it?
I'll try again and I'll try harder
Just because you passed the qc test doesnt mean you are allowed to post bad quality models
well you have to redeem yourself to even get your role back before you can post
Is there a way I can get the role back
make a good model, become model maker.
apply for model maker again
then dont post bad models
so it doesnt get removed again
I didn't lol
How can I make them good?
well I think u shouldnt post meme models using few second dataset, or try repeating it to at least around 2 mins, even that wouldnt be ideal enough
clean them and train them properly
Oh razer while you're here. I wanted to bring up my application for model master. It's been about 3 days and I haven't gotten a response yet. Is that normal
Where's a tutorial for that?
yes
Will UVR Denoise on X minus work?
i dont do those
Will UVR Denoise on X minus work?
i think litsa does them
depends
Bet thx
I'll try harder
that’s the spirit
and smarter
Where do I find the channel for apply again?
bro is not making it
don’t anger him again

That wasn't nice for you to say to him. Just for that I'll report and block you as well
I cant tell if youre jokin
I really am gonna do it cause that wasn't nice
no jokes allowed on ai hub
I'm not joking. I'm serious. I'm gonna do it
There. Reporting and blocking done
He wasn't even being mean. That guy was literally just being, not to be rude here, a little dumb
Either way, That was unnecessary
it’s not that deep
I'm sure that guy will get better at model making. You just gotta make practice and all you have to do is keep trying. Practice makes perfect
then you msg him and teach him?
You hear that, @rugged hinge?
I can try
go for it
While i try to figure myself out, I'm gonna post my models on the Uberduck model showcase instead in the meantime
I already did it and gave him some advice
good on you my friend
Thanks
Good idea
👍
wtf happened here
y'all gotta chill
🦈
How To Troubleshoot 
- Don't simply mention your issue, like "my rvc is not working".
- Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
- The more context, the better.
- Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
- It's okay if you're frustrated, but don't take it into this server.
- Don't DM without prior consent.
- Don't ask for every little instruction. Put your own effort & test things by yourself.
- Don't ask to ask.
- Check if your answer is a Google search away/on our guides website.
I can give you at least some basic advice on how to make a voice model sound good: sample rate and audio dataset quality, for example. But I don't have permission to approve a voice model for #1175430844685484042. Some advanced things go to engineers and model checkers. If you think you know things better than anyone else here, you can try to figure out for yourself how you want to go about it.
wtf happened im coughing the fuck up
someone yapped about a person with some problematic model
😭
basically it can help speeding up training by using the VRAM, but will cause a big vram usage
I'm not even sure if it would work with AMD, but 8gb or more vram is probably suggested
I didn't try that option honestly
not sure if it gives any difference in quality since I never tried that option
only for small datasets of 10 minutes
also if you use more vram than your gpu has you'll be using system ram (which is painfully slower)
nop
4060
dont use overtraining detector, its inaccurate and bad
ive asked noobies to remove it but he told me that 'you can just ignore it'

yesss
first one
u want to undertrain your model?
🦈
I never tried it, but can't it be adjusted ?
just pay someone to do the model for you i guess
nop
i mean yes but still the thing is bad lol
idk better ask him 🦈

i would do it for free bc u have the dataset
tho im not going to clean it and you cannot submit that model for your model maker role application
remember that also if the model sounds bad dont blame on me, i'll be using your exact dataset, no cleaning whatsoever
👀
removing reverb, removing noise
any dirt noise/sound that is not a voice
artifacting such as clicks, etc
did you truncate the silence of your dataset?
or kept the natural silences intact
hey just looking for like a simple answer since im not the brightest but what website can i find beatrice v2 models on
deleting the silence like this
no but like, you said you removed every pause
i need natural pauses
yes thats bad
coz rvc needs natural pauses
otherwise model results are meh
yes bc after that i do truncate the excessive silence and i keep the natural pauses
you have the original one?
without the pauses removed
ok so i can train that but
possible issues:
model will not be able to inference silence
longer training time
im looking at the spectrogram rn
to be fair i've trained models that have similar pauses to that
yes, rvc also injects silence specifically to help datasets like yours
to help them learn what silence is
okay i'll be training ur set
yes, the model will be able to generalize more
30 minutes is enough for good generalization
hold on
which separation model are you using?
also
wait
your dataset has harmonies
i cant train that bruh
only mono audio
no harmonies, chorus, whatever
minute 14 in your dataset
harmonies
yuh
no overlapping in the dataset
only 1 voice
🦈 👍
too old
use gabox fv4
download the uvr beta roformer version https://github.com/Anjok07/ultimatevocalremovergui/releases/download/v5.6/UVR_1_15_25_22_30_BETA_full.exe
after installing that exe, install the patch (DON'T INSTALL THE PATCH BEFORE INSTALLING THE MAIN VERSION)
https://github.com/TRvlvr/model_repo/releases/download/uvr_update_patches/UVR_Patch_1_21_25_2_28_BETA_small_rofo.exe
uninstall the old one
before installing this
after u install all you do this
(click install model)
and you follow the steps
pretty easy
download voc_fv4.ckpt
and voc_gabox.yaml
uvr will ask for these when installing the model
^
with these settings
u can decrease the overlap to 8 or 4 if you dont care losing some audio information
16 gives better SDR outputs
the speed
but its veeery slow
yea
u can speed up by decreasing to 8 or 4 but u got less sdr
but hold on
we have a colab for dis
no need to install all of this locally
if u have google colab premium u can make this very fast by using an a100
🦈
the spectrogram consistency
uhhh
pointless, the colab T4 is still not too slow for overlap 16, it's around half the audio duration
amd moment
technically true
1 sec
he has a colab actually
with all of this
everything is self explanatory
just read the cells
yess
so your output dont have noise introduced by 24bit wav
codec things
🦈
^ do this
after that you can clean your dataset, remove noise, harmonies, chorus, reverb, echo, whatever
what i need is only his voice alone
mel karaoke model instead of gabox fv4
is not perfect and gives muddy outputs
its up to you
tho if the audio is very muddy rvc will make the model sound robotic
karaoke after fv4
O
i thought mel karaoke worked bad on already isolated audio
using karaoke directly on mixture can work, less bleed but muddier
thx 🔥
which model do you recommend for dereverb?
after mel karaoke you have to use a dereverb model
yuh
yess
🔥
in case u forgot the settings just check this
works for every model you'll be using
that's too cheap for commissioning to a model master
also they would only accept through paypal etc. instead of such in game currency/tradable items
even that's too low
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- Hina's Mod AICoverGen WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Hina's Modified Original W-Okada's Realtime Voice Changer, Google Colab
- FaceFusion UI, by Nick088 Google Colab
- FaceFusion NO UI, by Nick088 Google Colab
- EasyGUI, by Rejekts Google Colab
- 🆕 Music Source Separation Training (Inference), by Jarredou & Makidanye Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
thank you 👑
reminder to also check #📰│dev-updates
its on the same colab
btw i forgot to mention, dont remove breaths from the dataset
keep them
yea its not perfect sadly
u just remove these
manually by silencing it
(don't delete, SILENCE it)
deleting it causes pops
silencing it does not give pops
rvc hates pops 
would i have better performance using internet than my 1650 yo
this probably isn't the right channel but been having issues with UVR5 UI huggingface space with it just not wanting to work
yes but u can easily avoid them by just silencing the parts you want to delete
audacity literally has a silencer
hold on i found something
silencing on audacity gives pops

@analog obsidian a sharp cut is only visible in the voiced parts of audio
and it is all across the spectrum
enable feathering in izotope RX to enable crossfade while doing silence
forgot to mention that
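A rough sketch of "silence it, don't delete it" with a short crossfade, in plain Python on a list of float samples. This is not how Audacity or iZotope RX implement it; the function name `silence_region` and the default `fade` length are assumptions for illustration only.

```python
def silence_region(samples, start, end, fade=64):
    """Return a copy of `samples` with [start, end) muted.

    Instead of a hard cut to zero, the audio fades out over `fade`
    samples before `start` and fades back in over `fade` samples after
    `end`, so the waveform has no sudden jump (which is what a pop is).
    """
    out = list(samples)
    n = len(out)
    start = max(0, min(start, n))
    end = max(start, min(end, n))

    # Mute the region itself; the clip keeps its original length.
    for i in range(start, end):
        out[i] = 0.0

    # Linear fade-out leading into the muted region.
    fade_out_len = min(fade, start)
    for k in range(fade_out_len):
        i = start - fade_out_len + k
        out[i] *= 1.0 - (k + 1) / fade_out_len

    # Linear fade-in coming out of the muted region.
    fade_in_len = min(fade, n - end)
    for k in range(fade_in_len):
        i = end + k
        out[i] *= k / fade_in_len

    return out
```

The length of the clip never changes, so natural pauses and timing stay intact; only the unwanted part goes quiet, and the ramps on both sides stand in for the feathering mentioned above.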
but
hes doing it on audacity

i would prefer to have a pop free dataset
and yea pressing delete could still cause popping
yes, you should not cut in the middle of
but the training happens on a small random slice of a segment and that can be anything
what i shown it was actually an example of something you shouldn't do 
what are your settings for feathering?
🥹
tldr of this is to avoid having pops in your set
rx11
yup
yes also very important
you have to disable audacity's dither
adds noise too
🦈
nice
yess nice
no pops
you also can see the spectrogram in audacity
pops would look like that
🦈
right click the track and click spectrogram
use that if you're unsure that you fixed the pop
🦈 🔥
do you need the "funds" for rx11?
32 bit float shouldnt have dithering at all
hi
After i Update the Applio Today (pulling it from Github), where tf is selecting training mode, like Hifi-gan & refinegan?? 
i can make a "donation" for rx if you want
1 min per epoch
6700xt is faster than a 6650xt
refinegan got deleted because it was having too much problems
well it is
6700xt is barely faster than colabs t4
just sell both the cards and get a nvidia
i see, but should just delete refinegan option only, not all option
hes fine
more like 200-250
if youre going to do ai its so worth
idk
im not on linux
mrf-hifigan at that state wasn't better than hifigan either
so what does it default to?
i can't select like RVC or Applio like Hifi-gan
refinegan has advantage on absence of spectrogram mirroring but unfortunately it still suffers more robotic sounds and some other problems
hifigan
by default the model should work in every rvc client (w-okada, mainline)
thank you
🦈
if u still stick on the amd card, I'd recommend running under rocm in linux, and 7000 series ones or 9070/XT are better in rocm support and optimization
just buy nvidia
windows also getting worse, and linux gaming is getting better thanks to valve
im not really a fan of multiplayer games that inevitably use anticheat that lacks linux support
and the 60gb of rvc datasets and stuff
i hate this fr
😔

I have made 400 gb free space after purging my old model datasets having done already
wild
lol
isnt it in mac/ios?
this is some theme i liked
ill remove it when you give me the cleaned dataset
30 mins to 1 hour max for a model
time to remove the harmonies i guess
just silence it and remove any pops
no need to delete the whole clip, you can apply silence to only the parts where you hear harmony
then do the thing to fix the pops
use rx
idk how to fix pops on audacity
yea remove those
for pops, rx can be used or if you're desperate, heavy eq on the lower end
30-40 is more than enough
yess
sometimes it might even be bad to go for too much of data, if it's too divergent or repetitive
🦈
a properly cleaned dataset sounds great so dont worry
delete the first one
second one is fine
only that
just delete everything that is very damaged bruh
🦈 delete that small part
or well silencing it
Anybody knows why its not loading the audio
holy shiet you're using a very very old version of applio
Whats the latest version?
what's ur pc gpu and what u tryna do
GPU NVIDIA GeForce GTX 1050 Ti
i am just converting a vocals to ai version, not training
it was working till yesterday, dont know what happened today
yes
Please help
were you using a local downloaded version or a cloud huggingface space?
Local
yeah u prob download some old youtube shit
Your Nvidia GPU is good enough to do inference (use models) locally (on ur pc), not the best to train (make models) even if still possible
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.com: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio UI Colab: max 4 hours daily, not granted, of GPU
- RVC-AI-Cover-Maker-UI Colab: Automatically separates the vocals and instrumentals, converts the voice and mixes it all back together
Easiest possible (automatically separates vocals & instrumentals) : weights.gg & rvc-ai-cover-maker-ui
easiest cloud: Ilaria rvc zero
easiest local: Applio
all the ones I sent are updated, you can choose based on your needs
Thanks mate, appreciate it
the other ones are more updated than mainline though
you're welcome
but those are cloud based right?
both of those have a cloud and local version
mainline colab is broken right now though #📰│dev-updates
i dont wanna train any model, just wanna use for conversion, which do u suggest?
are you going to make ai covers or just convert files?
Applio locally if u want to wait, or applio colab ui if u can deal with the gpu time and don't want to wait more time for conversions
I'm not sure how old version installation was, but it's pretty easy, you just gotta read the guide
save it as flac
yup
dw i'll resample it later

its fine just upload it somewhere else
Guys, I installed the Snowie v3 pretrain model and before I even started the training it ended with error. Any solutions?
It works with default pretrain model, but when i switch it to Snowie one - it breaks
you probably did not download it completely so the file is corrupted
download completely what?
the pretrain
I use built-in download pre-train feature
Where i can download pretrain model completely?
Oh ok i just downloaded it from another source
if thats what you really want
I think the issue is that because @scenic gale moved the files to another place, you get a huggingface error html trying to download 40 and 48k models
different sample rate? or the sample rate label (40000 vs 40k)
cant blend that
i must do a 40k and 40k?
yeah, only similar models can be blended that means same sample rate
ok
Btw, the slider's description is a bit misleading
It controls the weight / influence of model A ( left one ).
In short, 70% / 0.7 on slider means the model A gonna be dominant in 70%
No idea why would anyone declare the function as " the side you move the slider to, has more influence " ish
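A toy sketch of what such a slider would mean if the blend is a plain linear interpolation of the two models' weights. That is an assumption, not confirmed from Applio's code; `blend_weights` is a made-up name and plain floats stand in for tensors.

```python
def blend_weights(model_a, model_b, ratio):
    """Linearly blend two weight dicts; `ratio` is the share given to model A."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    if model_a.keys() != model_b.keys():
        # Models with different architectures or sample rates will not
        # have matching parameters, so they cannot be blended this way.
        raise ValueError("models must have the same parameter names")
    return {
        name: ratio * model_a[name] + (1.0 - ratio) * model_b[name]
        for name in model_a
    }
```

With `ratio=0.7`, every blended weight is 70% model A and 30% model B, which matches the "model A is dominant" reading of the slider.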
i just tried to blend a 32k model with 32k model but still get the same error
then it is probably the 32000 vs 32k label
Nah, it's just broken ( the picker, I mean, the path box
same thing on my fork
Gotta drag n drop the files
yeah i dragged and dropped the files
huh
but isnt that the same thing
when it compares values, it is not
^
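To illustrate why "40000" and "40k" are not the same thing when values get compared: as raw labels they differ, even though they describe the same sample rate. The helpers below are hypothetical, not Applio functions; they only show one way a label could be normalized before comparing.

```python
def normalize_sample_rate(label):
    """Turn '40k', '40K', '40000' or 40000 into the integer 40000."""
    text = str(label).strip().lower()
    if text.endswith("k"):
        return int(round(float(text[:-1]) * 1000))
    return int(text)


def same_sample_rate(label_a, label_b):
    """Compare two sample-rate labels by value instead of by spelling."""
    return normalize_sample_rate(label_a) == normalize_sample_rate(label_b)
```

A direct comparison like `"40k" == "40000"` is false, while `same_sample_rate("40k", "40000")` is true. A real 40k vs 48k pair still fails either way, and that is correct: those models cannot be blended.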
put the models in some folder
gotta edit the dict
use a path
O
today i learned... THIS
manually provide the path C:\models\model1.pth, C:\models\model2.pth
:' )
Yea, whoever does the translations, is kinda goofy
ok
yeah still just gives an error
then i have to make similar models and try
Try to restart applio
and try again
If that doesn't help, get my fork ( just copy the env folder to it )
you can do this
As long you use the repo's one and not precompiled, it'll work
@broken urchin
im using the applio from ai hub docs website
yea, the zip one then
thats what i use
Anyways, just do what noobies recommended
where is that?
you know
Applio folder and from there extend it to the path noobies provided
That's the rvc folder
yes
restart applio
send links to the models
in dm?
whichever way
ok i will
hm
whats wrong?
maybe it is my issue
regardless, the models you're trying to merge are different
they are 40 and 48k
the ones you sent me
Hello
I think no. Haven't touched those files for quite some time
Hello, does anyone know what software use to have the same voice as him (text to speech). The voice actor is called Michael McConnohie (HunterXHunter narrator) and go check the link to see what i'm trying to have pls (go at 15 seconds to hear the voice i want). https://www.tiktok.com/@opiumomen/video/7476499628079320366?is_from_webapp=1&sender_device=pc
Well, the workflow would be to first create an " rvc model " ( using applio )
NOTE: RVC models originally are for voice to voice conversions ( v2v )
( The way it works is: You input target vocals / acapella ( idk, some rapper's vocals extracted from some track ), and the model you trained on some voice, say, mickey mouse, gonna sing that in style of the original vocals. )
Then, once you have the model, the model would be used for real-time voice changing / conversion ( Software for that we call " W-okada " )
Alternatively, there is a chance someone already made a model with the voice you desire.
Worth checking: #🔍│find-models #1175430844685484042 and www.weights.com website for models
.
So these two are for, respectively, covers and real-time voice changing
Now, if your interest is in TTS voice cloning then there's a few solutions around, yet I personally would recommend " gpt-sovits "
There's other things as well. Xtts, tortoise, style-tts, etc etc
But I believe gpt-sovits is the most straightforward and easy to do
Thank you a lot, and yeah my goal is to use TTS because i will use the voice in Video and i think the audio quality is better in that mode than in covers or real time voice changing right?
In that case, gpt-sovits yes
However, if I may add something from myself
A really nice workflow with polishing the tts inputs is to use the cloned voice's output as input for rvc ( and rvc using the same voice model )
That way you'll boost up the quality
Here's an example:
Gpt-sovits due to gpt part of it ( you can think of the gpt component as the language / context understanding element )
learns how to speak in a manner and tone, style and so on, of the speaker
however there are some gimmicks and quirks in the voice cloning part itself
so
by doing that
u have more emotions?
in the voice of the character
its like his personality
was represented
essentially, gpt-sovits:
- clones the voice
- does learn to more or less reproduce the style, emotions and manners of the speaker
however getting the quality decent is a lil difficult and has implications
rvc however, being a voice to voice, uses that tts's output to enhance the quality
( provided the rvc model is done on the same voice / samples as your gpt-sovits model )
Pretty much. tl;dr, gpt-sovits would be your input audio source and rvc your refining / finisher / polisher tool
yeah i'm trying to understand how it works
Ofc, it involves learning 2 programs, but the end results are worth it
This is the invite for gpt-sovits repository's discord: dnrgs5GHfG
hey how do i find voice models
thank you
@harsh ravine
Search there
i did but i dont know how to activate the search
I don't remember which version but the applio I was using worked on blending two 32k models, as I manually entered both the model path
Perhaps, ye
tho what I meant is, manual-path-wise blending seems to be jammed compared to drag n drop
Chat is this overtraining or not i'm so confused
it stays roughly still for a while then goes down
I need help, which version of rvc should I download, I am using amd, I am also on windows 11. Should I download the vcclient_win_std_2.0.76-beta.zip? I want the latest release but the one that works for what im using
Inaccurate graph, sorry my dude
Not sure why you're still running old applio
Old applio?
yes
Depending on whether you go for Noobies approach or mine, Applio is now using new method for logging the metrics
As you can see, there's average loss incorporated
the prepackaged / pre-compiled zip one is still not updated I believe
Oh, Kaggle
Yeah lol
Good or bad change 
Either way...
( examples of how it could look like (( ofc, there's way more patterns than that, but you get the idea
graphs are inaccurate butttt, if you really wanna know then no
doesn't seem like overtraining to me
The change would be way too abnormal compared to the rest of graph
and from there it could " supposedly " get better or hover around flat ish with occasional drops
Orrrr, could flat out and go up
Great aid is also looking at discriminator really
If its loss ( for discriminator ) seems to be abnormally going up and then getting stable + dropping or dropping right away, then you know gen fails hard
I feel more confused than when I started
clears throat just refer to this
if you see no patterns of that sort, keep on training
Here's some real life example
You see the dip right?
between 4 and 5k
then there's a valley like and flat ish region ( 5.5k to 6+ k )
There ^
Until you get such patterns or those I demonstrated before, just keep on training.
In the end, you'll be ( hopefully ) testing lots of ckpts anyways to find the best fit ( From those around the " dip " on graph
Yeah listening is the most annoying part
Depending on how you look at it 👀
Better to make absolute sure it's that one than worry later and regretting, having that thought in the back of your head it coulda been better lol
ig
If you're struggling with graphs I think catching up on how Generator and Discriminator synergy works would help you
Most videos are pretty basic and max 10-15 mins ( some even demonstrate the loss dynamics
That's how I started initially anyway
There is way too much info for my brain to be processing
I might die right here right now
well, just look at the pictures, that's all
rest comes with time
I leave such information so others can benefit from it too ( those more advanced
or might be you gonna come back to it some time later yk

I'll try my best to understand this 
For now, just focus on spotting the dip and then a dramatic change in ur graphs
Its at 1k epochs and still looks like the image I showed you basically. It feels wrong
1k epochs..
Even pretrained / base models don't do that
uhhh
what is ur dataset's size and batch size
to begin with
cause like, 7k steps for 1k epochs 👀 crazy
3:05
batch size 4
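The arithmetic behind "7k steps for 1k epochs", as a small sketch. It assumes one step per batch and that a partial last batch still counts as a step; the segment count of 28 is a hypothetical value picked to match the numbers in the chat, not something read from the actual dataset.

```python
import math


def steps_per_epoch(num_segments, batch_size):
    """One training step per batch; a partial last batch still counts."""
    return math.ceil(num_segments / batch_size)


def total_steps(num_segments, batch_size, epochs):
    """Total number of steps after training for `epochs` epochs."""
    return steps_per_epoch(num_segments, batch_size) * epochs


# 7000 steps over 1000 epochs is 7 steps per epoch. At batch size 4 that
# means roughly 25 to 28 sliced segments, plausible for ~3 minutes of audio.
print(steps_per_epoch(28, 4))    # 7
print(total_steps(28, 4, 1000))  # 7000
```

So with a dataset this small each epoch is only a handful of steps, which is why the epoch count climbs so fast and why epoch numbers from other people's trainings do not compare directly.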
Stopp Im boutta crash out
aaaand that changes the whole situation, lemme look at the graph again
Ill send you the current one
Yep
set the smoothing to around that, ignore outliers, resize the D and G total graphs
also, rescale with the button tagged with number 2
for both G and D
and if ur graph's bottom is cut
Tensor board is TWEAKING 😭
well dang uhhh, can you send me the tf file?
Might take a look at it, seems messy as hell
tf file?
tensorboard file containing ur graphs
it is located in model's folder ( in logs
gonna be called tfevents something something
also, it should have a size that exceeds 1kb
if it's not directly in model's folder, then in eval folder ( don't remember how old applio builds handled it
theres 2 tfevents
Did you resume the training at any point?
Nope
both have size above 1kb?
both 88b
It just says 88 B
then the heavy one is the one
other is empty and useless ( newer applio builds don't generate that one
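A small sketch for picking out the useful tfevents file, assuming the model's log folder layout described above (file either directly in the model folder or in an eval subfolder). The function name and the 1 KB threshold parameter are illustrative.

```python
from pathlib import Path


def find_event_files(model_dir, min_bytes=1024):
    """Return tfevents files under `model_dir` larger than `min_bytes`, biggest first."""
    root = Path(model_dir)
    candidates = [
        p for p in root.rglob("*tfevents*")  # also searches subfolders such as eval
        if p.is_file() and p.stat().st_size > min_bytes
    ]
    return sorted(candidates, key=lambda p: p.stat().st_size, reverse=True)
```

An 88-byte file is filtered out as empty, and the first entry of the returned list is the one worth sending or opening in TensorBoard.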
Is there a newer version of applio on kaggle?
Not sure if someone done any updates
Wait so do you still want the file, I've got the download link for it copied
ye, can check it for you
Alr, gimme a bit. First gonna help one dude in dm installing applio
Thats alright. Take your time sorry for me bein' braindead 😭
Alr, lemme take a look
like said above, the avg graphs are more accurate and the avg g might have saturated or turned up earlier
jeez I thought I knew enough about training models yet I have no clue what saturated means in this context
Jesus
that's around where it ends ( hopefully )
because..
lemme showcase it graphically
wait
See how at first it goes rather " uniformly ish " down?
and then, the more it goes, more " flat like " it becomes, even if at first glimpse it's descending
Now, we can't be 100% sure because these graphs are inaccurate
yet my best bet would be to.. check this region
Which is this
Feels bad man
It's so darn messy that even I'd probs give up
not only the silence nodes
but so many of those points you can barely read it lol
Why won't you train locally btw
amd or rtx?
Its an AMD Radeon RX 6600
right
anyway.. If you asked me
I'd check those
Unfortunately those ultra deep dips are a hit or miss
so yea, no other choice but to listen to 'em all
And if you wanna know why old graphs are inaccurate
You know how an epoch has steps right?
yeah
Old graphs log only the last step
from a given epoch
at each " node " in the graph
New logging averages the result over all steps from a given epoch ( at least in my approach, noobies handles it differently
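A toy comparison of the two logging styles described here, on made-up loss numbers. Neither function is taken from Applio or the fork; they only show why a last-step graph can dip sharply while the per-epoch average stays calm.

```python
def last_step_per_epoch(step_losses, steps_per_epoch):
    """Old-style logging: keep only the final step's loss of each epoch."""
    return [
        step_losses[i + steps_per_epoch - 1]
        for i in range(0, len(step_losses) - steps_per_epoch + 1, steps_per_epoch)
    ]


def mean_per_epoch(step_losses, steps_per_epoch):
    """Per-epoch logging: average the loss over every step of the epoch."""
    return [
        sum(step_losses[i:i + steps_per_epoch]) / steps_per_epoch
        for i in range(0, len(step_losses) - steps_per_epoch + 1, steps_per_epoch)
    ]


# Two epochs of three steps each. The last step of epoch 1 happens to be
# an "easy" batch (for example mostly silence), so its loss is very low.
losses = [4.0, 4.0, 1.0, 3.0, 3.0, 3.0]
print(last_step_per_epoch(losses, 3))  # [1.0, 3.0]
print(mean_per_epoch(losses, 3))       # [3.0, 3.0]
```

The last-step graph suggests the loss went up from 1.0 to 3.0, while the averages show it was flat; the deep dip was only an artifact of which batch came last.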
So assuming by your response before, training locally wouldnt work
I can't say for sure as I never tried playing with rocm nor attempted training when I still had AMD
Noobies def knows more than me about it
as he used amd before
Oof
or, well, zluda
I think there's a part on it in the guide? in docs
should be, ye
search for zluda in there
the normal g graph have sharp dips because it logs batches that learn a mute file
^
Or so happened the last step ended up on the mute file
In any case, a hit or miss
yeah Imma give up on that model
Ive never had an issue like that so Imma try reinstalling applio
Wait you said don't download the Compiled Version earlier right?
Depending on how you wanna go about it, there's Applio from repo ( not the zip ) or my fork
They're in-line when it comes to core functionality but some things are changed in my fork
For instance the way it logs, the ui and stuff ( and some more
Noobies' approach is based upon averaging every N steps
mine's done on all steps from a given epoch basis
Well whats best in your opinion
looks like so
Well, there's no winner but
As I'm the owner of my fork, Imma recommend just that
Lol

Anyway, whichever pick you make, it'll be fine
but I think my logging's a bit more safe ( subjectively
Where do I get my grubby hands on your fork
Update - sync with applio changes;
All changes, fixes and updates are in sync with the current state of main applio repository.

There haven't been any zluda changes ( afaik ) for a while and so, all is in-line. Meaning it has to work ( if it's meant to work ) else, you'd have to ask Noobies for some help
So does the installation for normal applio match up with yours, or is it a different process?
(why not both)
Because it doesn't reflect exact epoch's performance
but more averaged / long-term one
having fixed 50 steps just isn't ideal imo ( when you want a per-epoch-monitoring strategy
Hence why it depends on what one prefers really
I like to epoch-scope so that's the approach I took
Matter of how you approach it tbf
well yea per epoch avg seems to make more sense
it's far from perfect for sure
but still tons better than what it originally was
Other than per epoch
I also incorporated per 5 epochs avg so there's also longer-tendency checkup
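A small Python sketch of the two averaging styles discussed here, under stated assumptions: the helper names are hypothetical, and the "per 5 epochs avg" is modelled as a trailing window over the per-epoch means, which may not be exactly how the fork computes it.

```python
from statistics import mean

def mean_every_n_steps(step_losses, n):
    """Average consecutive chunks of n steps, ignoring epoch boundaries."""
    return [mean(step_losses[i:i + n]) for i in range(0, len(step_losses), n)]

def trailing_epoch_mean(epoch_means, window=5):
    """Average each epoch's mean with up to window-1 preceding epochs."""
    out = []
    for i in range(len(epoch_means)):
        start = max(0, i - window + 1)
        out.append(mean(epoch_means[start:i + 1]))
    return out

# Invented numbers, only to show the shapes of the outputs.
chunked = mean_every_n_steps([1.0, 2.0, 3.0, 4.0, 5.0], 2)
long_term = trailing_epoch_mean([5.0, 4.0, 3.0, 2.0, 1.0, 1.0], window=5)
```

The fixed-N chunks can straddle two epochs, so a point on that graph doesn't map to one epoch; the per-epoch mean plus a longer trailing average keeps both the per-epoch view and the long-term tendency.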
What does it mean by:
"Using command line from the Applio folder run"
Does it mean run a command prompt from inside the folder and paste the line of code in
If so it doesnt work
I have a feeling they meant Applio folder's runtime
probably?
Wdym runtime
something like env\python.exe app.py --open
"Install Applio"
2nd step
Wdym
ye then that basically means:
you open up a cmd in applio, and execute these
just like so
Gonna open up cmd right there
o
I'm curious if there's eventually someone with 9070/XT willing to test
Good point
How to load models, locally which folder?
Should i rename them as "weights"?
Then hit refresh
nope
name doesn't matter
but for best sorting, I'd make a folder for each model, having there pth and index file
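If it helps, here's a hedged Python sketch of that layout: one folder per model, each holding a .pth and an .index file. The function name and the demo folder names are placeholders, not anything Applio itself defines.

```python
import tempfile
from pathlib import Path

def find_model_files(models_root):
    """Map each model subfolder to the first .pth and .index file inside it."""
    found = {}
    for folder in sorted(Path(models_root).iterdir()):
        if not folder.is_dir():
            continue
        pth = sorted(folder.glob("*.pth"))
        index = sorted(folder.glob("*.index"))
        found[folder.name] = {
            "pth": pth[0].name if pth else None,
            "index": index[0].name if index else None,
        }
    return found

# Demo on a throwaway directory; all names below are placeholders.
with tempfile.TemporaryDirectory() as root:
    complete = Path(root) / "my_model"
    complete.mkdir()
    (complete / "my_model.pth").touch()
    (complete / "my_model.index").touch()
    partial = Path(root) / "no_index_model"
    partial.mkdir()
    (partial / "voice.pth").touch()
    demo_result = find_model_files(root)
```

A `None` entry in the result is a quick way to spot a model folder that is missing its index file.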
Okay
would be impressed if it turns out it can match or outperform the 7900 XTX
ikr
But then, I'm unfortunately still team Nvidia
That is, until amd finally takes AI into its own hands and makes shit work as it should, natively on windows
yea eventually esp for linux users
yup
whats fcpe?
Generally, embedder depends on what model was trained on
but in reality, I'd say 99 or 98% of models are contentvec
Oh gotcha
as for rmvpe, fcpe etc, it's pitch extraction method
fcpe.. I haven't tested it much but it's probs faster or more lightweight than rmvpe?
In any case, rmvpe is the way to go as it's the most robust one
thanks mate
Np, cheers
a newer pitch extractor that's more lightweight than rmvpe (or you could just call it a "bootleg")
but is it better than rmvpe?
What does it mean by "edit the run-applio-amd.bat file and change the value from "0" to "1"." (I'm like so frickin' sorry for the hassle)
rmvpe is better for most cases, except fcpe might crack less on doubled pitches just like crepe
ah gotcha
first
It's assumed your primary AMD GPU has index 0. If your iGPU is listed first under 'Display Adapters' in Device Manager, edit the run-applio-amd.bat file and change the value from "0" to "1".
Is it 0 or 1
Or rather, Imma ask, do you have any igpu?
Wtf is an igpu
How do I know the answer to that question.
What's ur cpu
Intel(R) Core(TM) i7-10700F CPU @ 2.90GHz
Well, you typically know if you know ur cpu's details
F series do not have igpu
check task manager if it is labeled GPU 0 or 1
its 0
In that case, you don't have to edit anything
The whole idea is to change 0 to 1 in cases where you have an igpu: the igpu is seen as 0, so your main gpu ends up indexed as '1'
👀 Worry not, hope all works for you
Thank you Mr. 0🙏
✨
I have to hand it to you, your fork looks a lot cleaner than regular applio
The box where I have to put the path to the dataset. Do I put the path to the dataset that's in my file explorer? Cause I keep getting errors
What does the error say?
Also thanks. Am very proud of that theme lol, my baby
oh, that one's weird 🤔 haven't seen it yet lmao, the error
A new one... Why am I cursed
Anyway, try to put the dataset in:
assets/datasets/
so:
assets/datasets/my_pog_model/sample.wav or multiple .wav files
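A quick way to sanity-check that folder before pasting the path into the UI, as a Python sketch. The helper name is hypothetical, and the demo builds the `assets/datasets/my_pog_model` layout inside a temp directory instead of touching a real install.

```python
import tempfile
from pathlib import Path

def list_dataset_wavs(dataset_dir):
    """Return the sorted .wav file names found directly inside dataset_dir."""
    folder = Path(dataset_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"dataset folder not found: {folder}")
    return sorted(p.name for p in folder.iterdir() if p.suffix.lower() == ".wav")

# Demo: recreate the suggested layout in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    dataset = Path(root) / "assets" / "datasets" / "my_pog_model"
    dataset.mkdir(parents=True)
    (dataset / "sample.wav").touch()
    (dataset / "notes.txt").touch()  # ignored, not a .wav file
    wav_files = list_dataset_wavs(dataset)

    # A wrong path raises instead of silently returning nothing.
    try:
        list_dataset_wavs(Path(root) / "assets" / "datasets" / "typo")
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True
```

If the list comes back empty or the call raises, the path typed into the dataset box is the first thing to double-check.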
Btw, for errors it's better to check the console
gradio likes to simplify / obscure lots of things
Well the issue turned out to be that the console crashed
Lol
The error is fixed but its pretty late so I might get some shut eye
it sometimes pops up in applio kaggle, probably due to ngrok's "connection instability"
Makes sense, in that case
Ye, get some rest man
Tysm for tonight! Both of ya!
Np man, have a nice one, Gnight n sleep well
@glacial pollen Hello again, sorry for bothering you, do you know by any chance how to use the voice I talked to you about yesterday (Michael McConnohie) in ElevenLabs (Text to Speech)?
I don't find any options to import the voice tbh
But someone told me he cloned it
DK how
I think ( can be wrong ) that you need subscription to clone voices ( as in, custom )
iirc, free users are both limited in total usage time and limited to a specific library of voices
I am looking to run some of these voices locally for announcements-- can I use coqui with these models?
How do i install gpt_sovits? and can i write text in the app and use the cloned voice in it?
@glacial pollen
You gotta follow the instructions on repo
GPT-SoVITS-V2 Tutorial,Japanese model fine-tuning training,Onikata Yoshiko,TTS
GPT-SoVITS-V2 new version of the one-click package: https://pan.quark.cn/s/234d3e437526
GPT-SoVITS-V2 new version of the one-click package: https://pan.baidu.com/s/1VoTQTpx28TZKhiRjiGchJw?pwd=v3uc extraction code: v3uc
Official project address:https://github.com/RVC-Boss/GPT-SoVIT...
alternatively ^
yet as for specific values, configurations and all of that.. it's something you have to figure on your own
There's an almost complete lack of guidelines everywhere
i'm checking
I have like two versions of the software: the one from github ("GPT-SoVITS-main") and the one from the guide online ("GPT-SoVITS-v2-240821"). Is there a specific one to download? @glacial pollen
you can grab the v3 release
from releases
in repo?
yes
lemme check
where do you find it here?
oh?
wdym ohhhhh 
1 min voice data can also be used to train a good TTS model! (few shot voice cloning) - RVC-Boss/GPT-SoVITS
oh
🤨
mb bro
smh
RVC is STS, not TTS, so nope unless you firstly make an audio with TTS then use it as an input in RVC
Did you download the zip
Tho, why you even doing that
