#✨│ai-help


dim needle
#

how do i download the voice changer

hallow thistle
dim needle
#

5070ti

#

nvidia

#

nvidia gpu
im trying to use the voice changer
theres no issue i havent started

hallow thistle
dim needle
hallow thistle
#

-realtime

patent trellisBOT
# hallow thistle -realtime
🔊 Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Vonovox

A Realtime Voice Changer with similar performance to Wokada Tg-Develop Fork, with extra features, but it supports only Nvidia GPUs on Windows 10/11, unlike other options that have wider support, and it has no cloud options.

• Wokada Tg-Develop Fork

A personal fork (modified version) of Wokada Deiteris Fork that adds some Quality of Life improvements, like support for Spin Embedder and Audio Effects. Don't expect too much from it, since the creator originally made it as a personal project.

• Applio Realtime

A Realtime Voice Changer with similar performance to Vonovox & Wokada Tg-Develop Fork, with extra features.

• Wokada Deiteris Fork

Deiteris' fork (modified version) of wokada that doesn't get updates anymore.

⛔ Outdated/Discouraged

These options are not recommended for use.

• Original Wokada

Not suggested; the older versions shown in YouTube tutorials are even worse. GUIDE

• RVC GUI Mainline Realtime

The program is worse compared to the ones above, and much less updated. GUIDE

hallow thistle
viral mason
dim needle
#

in the vids i was watching it said they were free are they?

viral mason
dim needle
#

yeaa ik, one of them showed this dc so im asking in here cus they're all 2 years old

#

how can i go about downloading vonovox?

viral mason
viral mason
dim needle
dim needle
#

"Many Effects are Premium (paid), such as Low Quality Mic" is it mostly like this? i just want to use models from the models channel not really any effects will i be fine

hallow thistle
viral mason
#

those r optional

glass smelt
#

voice changer

dim needle
#

i downloaded it and its open but how do i use it just to hear myself

dim needle
#

how can i delete vonovox and voice cable so i can redownload it

hallow thistle
dim needle
hallow thistle
#

I don't know, I can't identify an issue from your words alone. Send your screenshot to here.

hallow thistle
# dim needle

Did you know? "Exclusive mode" is an audio mode in WASAPI/ASIO that lets a single program (like Vonovox) take sole control of an audio device, muting other programs that use the same device in the meantime. It's better to turn this mode off.

dim needle
#

when i go on discord and set my speaker to line 1 i see it moving but cant hear anything, and i turned exclusive off

hallow thistle
#

You should do this.

#

On Vonovox, press "Start" button to start converting.

dim needle
dim needle
hallow thistle
#

Why does Vonovox work for others, though? Did you at least follow the guide I sent you?

hallow thistle
#

Send your full screenshot of Vonovox.

dim needle
#

i fixed it

hallow thistle
#

You set the input device on Vonovox wrong.

dim needle
#

thats what you said to put it to

#

it only works when i set it that way and then do the opposite on discord

hallow thistle
#

On Vonovox: input is microphone, output is Line 1.
On Discord: input is Line 1, output is speaker.

dim needle
hallow thistle
#

Elaborate?

dim needle
#

when i do line 1 as input on vono and then line 1 as speaker in discord it works kinda

hallow thistle
#

That's not how it works.

dim needle
#

its the only way i can hear myself with the voice changer for me

#

the pitch is off though, is there any way to fix it? it sounds high pitched, i tried 2 different ones

hallow thistle
dim needle
#

how can i delete everything so i can restart

dim needle
#

what about voice cable

hallow thistle
#

If you make the same mistake again, you should question yourself. I gave you the most widely agreed approach, and you're literally doing the opposite.

dim needle
hallow thistle
#

If you set "Line 1" as the output on Vonovox and "Line 1" as the input on Discord, this is correct. But if you set "Line 1" as the speaker on Discord and "Line 1" as the input on Vonovox, this is incorrect, because you're going to send all the Discord sounds (including ping sounds) to Vonovox through Line 1, instead of sending Vonovox to Discord as intended. You're just confused, bud.
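
The routing described above can be written down as a small check. This is only a sketch: `Routing`, `check_routing`, and the device names "Microphone" and "Speakers" are made up for illustration, and "Line 1" is simply the cable name used in this chat. It does not talk to Vonovox or Discord; it only encodes which end of the cable each program should use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Routing:
    """Device choices in both programs (hypothetical model, not a real API)."""
    changer_input: str
    changer_output: str
    discord_input: str
    discord_output: str


def check_routing(r: Routing, cable: str = "Line 1") -> list:
    """Return a list of problems; an empty list means the routing is as advised."""
    problems = []
    if r.changer_output != cable:
        problems.append(f"voice changer output should be {cable}")
    if r.discord_input != cable:
        problems.append(f"Discord input should be {cable}")
    if r.changer_input == cable:
        problems.append("voice changer input is the cable, so it converts cable audio instead of the mic")
    if r.discord_output == cable:
        problems.append("Discord output is the cable, so Discord sounds are sent into the cable")
    return problems


# The setup recommended in the chat.
correct = Routing("Microphone", "Line 1", "Line 1", "Speakers")
# The reversed setup that "kinda" worked.
reversed_setup = Routing("Line 1", "Speakers", "Microphone", "Line 1")

print(check_routing(correct))
print(check_routing(reversed_setup))
```

With the reversed setup, every check fails, which matches the explanation: the cable carries Discord's output into the voice changer instead of carrying the converted voice into Discord.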

#

If you don't believe me, you can ask fellow members who used voice changer here. misc_shrug

hexed ruin
#

One message removed from a suspended account.

hallow thistle
# hexed ruin One message removed from a suspended account.

This AMD Athlon CPU (released in 2018) isn't really that old, though it is positioned below AMD Ryzen 3. AMD Radeon Vega 3 is an integrated GPU, so probably skip that. Do you mean you want to run the voice changer as CPU-only? Because of course it's gonna be slower.

hexed ruin
low shard
#

which rvc related program?
Elaborate:

  • your pc os
  • what are you trying to do: AI Covers, TTS, E Girl Trolling / Catfish or Roleplay
brittle wing
#

can someone help me

low shard
# brittle wing can someone help me

This is a General AI Discord Server, elaborate:

  • your pc gpu
  • your pc os
  • what are you trying to do: LLMs, AI Covers, TTS, E Girl Trolling / Catfishing or Roleplay
  • the tutorial link used
severe fiber
#

ive downloaded everything and tried to open MMVCServer and the cmd prompt comes up to download stuff, but ive done that 3 times now and the actual rvc prompt hasnt come up yet

#

4060 gpu

#

i5 8400

#

windows

tawny bane
#

hey are EaseUS Voicewave and Voice ai good?

#

I want to have different voices that sound good and not 2022 choppy. both male and female

hallow thistle
tawny bane
#

FivemRP

tawny bane
#

we don't really do anything but humans

hallow thistle
tawny bane
#

@hallow thistle

hallow thistle
#

<@&1159293140440723499> Hacked account in help channel.

severe fiber
#

so should i upgrade from wokada to vonovox?

severe fiber
#

is it the same delay orrr

tawny bane
hallow thistle
hallow thistle
tawny bane
#

understood ty

severe fiber
#

like whats the downside to changing

low shard
low shard
indigo python
#

Give a deep voice model

low shard
sage flume
#

is it an open-source model?

hallow thistle
untold marten
#

Hola,

GPU: rtx4070 ti super 16GB vram
OS: Fedora KDE Plasma 43
What I am trying to do: I installed wokada TG-develop fork and works with the model & want to link/send the output of the fork to another program like discord or anything else.
The Issue: Can select my mic as input but when it comes to select the output device, I see no virtual cable showing despite having portaudio installed (did I miss anything from the docs?).
**The tutorial link:** The link from the docs on the TG-develop fork (Realtime Voice Changer > Local > TG Develop's)

I used the deiteris fork on Windows before, with VAC, and it all worked nicely, but this is my first time trying the TG fork, and on Linux with portaudio. From what I heard, portaudio doesn't create a virtual cable? And you may need to use pipewire? If anyone knows better how to set this up, I would appreciate it a lot ^^

tender hedge
#

can you give me the link to download the realtime voice changer

untold marten
#

well it's in the docs, literally. Also Sapphire just gave the link for docs above

tender hedge
#

uhm can you teach me how to download it

nocturne mural
untold marten
#

GPU: rtx4070 ti super 16GB vram
OS: Fedora KDE Plasma 43
What I am trying to do: I installed wokada TG-develop fork and works with the model & want to link/send the output of the fork to another program like discord or anything else.
The Issue: when opening the web interface, audio processing is locked onto server (the first time it was working fine, but after trying to create a virtual cable with pipewire, it went haywire) and I cannot start the server or run the voice changer at all. Also, I set the sample rate to 48000 Hz but it gives errors and changes to 44100 while saying the input/output/monitor supports only 48000 Hz... I don't get what is wrong here... I tried 2 browsers (Firefox & Opera GX) and the troubleshooting steps from the TG fork docs for audio processing being locked onto server. At least if the server was working...
**The tutorial link:** The link from the docs on the TG-develop fork (Realtime Voice Changer > Local > TG Develop's)

sharp jungle
toxic perch
#

wall of shame asap

#

<@&1159293204038955078>

low shard
viral mason
viral mason
viral mason
untold marten
viral mason
#

The first download is the voice changer; the second one is a virtual audio cable that connects it to both Discord and any game you play using it

untold marten
#

I've seen vonovox and wish I could have tried but from what I understood in requirements or "pros/cons", it said that it works only on nvidia gpu with windows only or so I understood

viral mason
viral mason
#

In that case tho you'll need to switch to Wokada tg fork, idk which one is the Nvidia Linux version tho

arctic musk
#

I would like to know if there's a good voice changer, and also a good woman voice, as I run a VTTRPG and would like to stop hurting my throat and use a voice changer instead. I already have some male voices that I like, but I don't feel VCC from the Okada branch is working well for me. I used it for a while but was not able to configure a good output for it. I have an NVIDIA RTX 4060 with 6 GB, 16 GB of RAM and an Intel Core i9 14900HX. I'm running Win 11.

viral mason
untold marten
hardy yew
#

the way interface jumps from 48000 to 44100 is normal in Windows release of tg too, IDK why it happens but it has never caused any issue for me so I didn't really care

arctic musk
untold marten
#

I have no idea why...

hardy yew
#

sadly can't help with the virtual cable issue, I haven't ever run w-okada on Linux so I lack experience here

untold marten
#

also wanted to ask how I would install that VAC470lite on Linux, as it's, from what I can see, a Windows installer :))

#

I mean didn't try yet with wine so not sure how it works

hardy yew
#

Doubt it would work via Wine tbh

#

I would just search for a Linux alternative

arctic musk
untold marten
#

I tried to make a virtual cable with pipewire but it seemed to break client audio processing

viral mason
arctic musk
viral mason
#

Do you have VB cable or the one I sent? The one I sent you is recommended over VB cable

#

VB causes odd issues on windows sometimes

#

This one doesn't

arctic musk
#

It's the same, Just checked the one I have already installed

viral mason
#

Ah

arctic musk
#

Yea, VB caused a lot of issues for me

#

I will give vonovox a try! Also, where can I get pre trained models? I'd prefer them in spanish but I think I can work around with english ones XD

arctic musk
#

Thanks for the help!

viral mason
#

You're welcome!

#

If you need hell or have questions just ask me or a helper ^^

arctic musk
#

hell? cat_doom XD

viral mason
#

My bad

#

Typo plus still waking up

#

If you need help

#

Lol

arctic musk
#

God, this sounds a lot better than VCC anime_pray

viral mason
arctic musk
viral mason
#

<@&1159293140440723499> weird account

viral mason
#

As well as Applio real-time

#

That's somewhat newer like Vonovox

arctic musk
#

I see, might take a look at that one too

marsh galleon
#

hi i got an nvidia gpu (5090), does anyone have the okada fork? i used this one before but lost it

viral mason
#

what are you planning on using it for btw just curious

marsh galleon
#

hanging with friends, i like using solo leveling voices and sh

#

they sound so realistic, i used vonovox but its just not like tgfork

viral mason
#

I'll get u the downloads rq

#

I don't understand how people lose stuff like this, do you randomly delete it or what

marsh galleon
#

nah i needed to reset my pc and sh, i had too many files

#

sorry for taking up ur time to get the files im really thankful tho!

viral mason
marsh galleon
#

sirrr

#

i have vonovox, im asking for tg fork aha sorry for the confusionn

viral mason
#

ohh

#

how come?

#

vonovox gives much better quality and is better in general

marsh galleon
#

i had that one before, it was way easier, vonovox is so confusingg to me

viral mason
#

but the beta is easier than tg fork

viral mason
#

everything is done for you

marsh galleon
#

wait can u show me what the fork looks like because i lowk dont know if we're talking abt the same thing

viral mason
#

this is vonovox

#

this is Wokada tg fork

#

Vonovox isn't complicated at all

#

neither are

#

I wouldn't sacrifice quality just for one to be "easier"

#

that's just me tho

marsh galleon
#

oh wait now im weirded out so with wokada it would be on the browser and it be like sounding so nice but ig that must be fan made or smth

viral mason
#

?

#

what??

#

I'm confused 💔

marsh galleon
#

yea same w me

#

so i had one

#

that looked like okada the normal one

#

but it was on browser

#

sm guy gave it to me

viral mason
#

yea wokada tg is on browser too

#

vonovox tho no

marsh galleon
#

i'll just use vonovox, thank you, fuck me i must be confusing aha

#

is there a website for voice models

#

??

viral mason
viral mason
#

there's plenty here but also a site that has them too

marsh galleon
#

ohh what siteee

viral mason
#

this one!

#

I'd check here first as this place has a lot more quality control over good models

marsh galleon
#

thank you a lot

#

ur really helpful

#

lowk should be a mod

viral mason
#

people joining just because some random old yt video said they have them here makes me feel some kinda way

grand plinth
#

w-okada not working on rtx5060, a little help?

low shard
grand plinth
low shard
# grand plinth * windows 11 * real time voice changer (RVC) * tutorial link?

RVC doesn't mean realtime voice changer, it means Retrieval-based-Voice-Conversion.

This is a General AI Discord Server and many people confuse it, that's why I'm asking what you are trying to do, since there are different tools: AI Covers, E Girl Trolling / Catfishing or Roleplay?

Also, did you use any tutorial or download link for whatever you are using right now? What brought you here?

grand plinth
viral mason
viral mason
grand plinth
viral mason
#

You're welcome!

wicked bane
#

Anyone knows where can I find a feminine but not too feminine voice (femboy)

past rune
viral mason
#

why?

#

what's your pc gpu? (Nvidia or AMD) and what do u plan on using it for? just curious ^^

#

better not be with egirl models cat_seriously

#

u promise you're gonna use normal stuff like Goku or Darth Vader etc

#

ok

#

peak

#

here's the two downloads you need ^^

#

first is the voice changer, second is a virtual audio cable to use it in games etc

#

nah it's really easy

#

just extract both zip files, for vac lite (the virtual audio cable) just run the file called setup64 and then install driver

#

and for the voice changer run mmvcserversio

#

yea no weird setup like the old one

#

and it runs on browser

#

for the first time it could take a bit but it shouldn't take long

#

are you able to send a screenshot?

#

you have your settings wrong

#

input should be mic output should be line 1

#

you're using a different virtual cable but yea should be fine

slim mantle
#

Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (0, 26) at dimension 2 of input [1, 128, 6] tf is this 😭
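
For anyone hitting the same message: the wording looks like the check PyTorch applies to reflection padding, which requires the padding to be smaller than the dimension being padded. Here the last dimension has only 6 frames while 26 frames of padding were requested, which usually points at an input chunk that is too short. The message alone does not say which program setting causes it, so the sketch below only illustrates the constraint in plain Python; `reflect_pad` is a made-up helper, not a function from PyTorch or any voice changer.

```python
def reflect_pad(seq, left, right):
    """Pad a 1-D list by mirroring it around its edges (edge sample not repeated)."""
    n = len(seq)
    # Mirroring can only reuse samples that exist, so each pad must be < n.
    if left >= n or right >= n:
        raise ValueError(
            f"padding ({left}, {right}) must be smaller than the input length {n}"
        )
    left_part = seq[1:left + 1][::-1]
    right_part = seq[n - right - 1:n - 1][::-1] if right else []
    return left_part + seq + right_part


frames = [0, 1, 2, 3, 4, 5]          # 6 frames, like the last dimension in the error
print(reflect_pad(frames, 0, 2))     # a small pad works

try:
    reflect_pad(frames, 0, 26)       # the pad from the error message does not
except ValueError as err:
    print("rejected:", err)
```

If that reading is right, the fix is on the input side (a longer chunk of audio), since a 6-frame input cannot be mirrored 26 frames outward.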

brave quartz
#

I see many, but what is the best free one that can do TTS with an RVC model, same as Applio?

wild forge
#

i am setting up a full off grid property(for when shit hits the fan) and need help with the ai aspect to control (cameras,hydroponics,gates,water) i have been diving into it with ai and they recommend i start with MS-01 but i also want to run 120b models and just would like someone to talk to who knows a little more than me...

shadow cave
#

can i get help? every time i run the start this pops up

fleet marsh
#

-colab

patent trellisBOT
# fleet marsh -colab
📒 Google Colab Notebooks

Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

• **Applio**

by IA Hispano
Google Colab

• **RVC Mainline**

by Hina
Google Colab

• **UVR5 NO UI**

by Eddy
Google Colab

• **UVR5 UI**

by Eddy
Google Colab

• **Wokada Deiteris Fork**

by Deiteris & Hina
Google Colab

• **Hina's Modified Original Wokada**
• **RVC-AI-Cover-Maker-WebUI**

by Shiro & Eddy
Google Colab

• **FaceFusion UI**

by Nick088
Google Colab

• **FaceFusion NO UI**

by Nick088
Google Colab

• **Music Source Separation Training (Inference)**

by Jarredou & Makidanye
Google Colab

fleet marsh
#

its the third time this happens

viral mason
#

you should probably be using Applio btw on Kaggle, google colab kinda stinks

hallow thistle
fleet marsh
#

rn i was trying on colab, the link that should send me to the ui didnt work

fleet marsh
fleet marsh
narrow coyote
#

Anyone able to help me with an ai voice model?

#

Curious what ai voice trainer thingy it is

clear depot
#

hello!! I've been using this and it stopped working, so I was wondering if there was a new version <3 vcclient_win_cuda_2.1.4-alpha

viral mason
frigid spindle
#

jai

viral mason
#

what is your pc gpu (Nvidia or AMD) and what do u plan on using it for

frigid spindle
#

hai

viral mason
#

you should use Vonovox, I have to go very soon so I'll get you the downloads

#

here ya go

#

first link is for the voice changer second one is a virtual audio cable (it's recommended to use it over vb cable)

clear depot
#

yessss I have vb cablee !!

viral mason
#

the second link tho is recommended to use instead of vb cable, it does the same thing but sometimes vb cable is buggy for no reason

torn edge
#

so im switching out of deiteris fork to a different program

#

which ones the better option performance wise

#

tg-develop fork or vonovox

viral mason
#

Since Vono is Nvidia only

hardy yew
#

damn

#

I thought this was a funny joke, repeating after each other

#

but instead it turns out they're all the same bots cat_doom

#

@low shard triple kill here

low shard
low shard
#

This is a General AI Discord Server and there are many voice changers, elaborate:

  • your pc gpu
  • your pc os
  • what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
  • the tutorial link used
brave quartz
fallow heron
#

is there any free website alternative than weights gg that allows the use of custom rvc models ?

ember tapir
#

hi, question: what is the group's stance on the limits of AI assistance when it comes to writing in research papers? Is it the provenance of the ideas or the style of writing and prose of the human ideas that are being written?

Basically, where do you see the limits of what AI assistance should not cross?

low shard
low shard
brave quartz
carmine palm
#

Help me

brittle wing
#

-colab

patent trellisBOT
# brittle wing -colab
📒 Google Colab Notebooks

Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

• **Applio**

by IA Hispano
Google Colab

• **UVR5 NO UI**

by Eddy
Google Colab

• **UVR5 UI**

by Eddy
Google Colab

• **Wokada Tg-Develop Fork**

by Tg-Develop
Google Colab

• **Wokada Deiteris Fork**

by Deiteris & Hina
Google Colab

• **RVC-AI-Cover-Maker-WebUI**

by Shiro & Eddy
Google Colab

• **FaceFusion UI**

by Nick088
Google Colab

• **FaceFusion NO UI**

by Nick088
Google Colab

• **Music Source Separation Training (Inference)**

by Jarredou & Makidanye
Google Colab

left gale
#

Hi everyone! Does anyone know how to make endless streams?

finite wind
#

hey what do I do with the D and G pretrain thingys from my training?

#

I thought an index and my model pth would be the only result if I'm being honest

low shard
#

there isn't any tts that can use rvc models other than the way i explained

low shard
# carmine palm Help me

This is a General AI Discord Server and there are many voice changers, elaborate:

  • your pc gpu
  • your pc os
  • what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
  • the tutorial link used
low shard
finite wind
viral mason
low shard
#

the guides are made to be read to understand how a program works, else why even spend so much time making them

viral mason
#

Both programs are very easy to use and set up, though I should probably tell them each time how to run each

low shard
viral mason
#

Fair, but Vonovox specifically has only 2 settings that ever need to be touched: block size and pitch

#

I guess for Wokada tg fork it's a little bit more, just chunk size and extra time

storm holly
#

OH SHIT NEW APPLIO COLAB

craggy brook
#

How can we create the sound we want here?

viral mason
#

colab gives like 4 at max

#

why would you do that, that's weird

#

<@&1159293140440723499>

brave quartz
fleet marsh
viral mason
#

applio works fine

#

I use it everyday

fleet marsh
viral mason
#

this is how I use it, I use it on Kaggle

fleet marsh
#

is it for training?

viral mason
#

yea that's for training you don't have to add any datasets

fleet marsh
#

oka

#

niceee

viral mason
#

to import a model tho should be the same

#

just go to download section then paste the link from huggingface

fleet marsh
viral mason
#

idk how to use Applio on colab

#

but applio's interface is the same on all softwares

#

local, kaggle, colab

#

should be the same

fleet marsh
#

o

#

ok

#

btw, are there any new models to train my datasets with? or im ok with Ov2?

#

that was the newest one last time i made one

viral mason
#

please don't use OV2

#

it's like

#

bad

fleet marsh
#

why?

#

it was really good back then

viral mason
#

titan, ov2, Ren3, any of those are super old and bad because they cause harmonic distortions that we didn't know about back when we first used them

fleet marsh
#

dont tell me i have to do some models again? D:

#

like

viral mason
#

yea 😭

fleet marsh
#

10 of my models use ov2

fleet marsh
#

also what harmonic distortions, do u have an example?

viral mason
inland pagoda
#

Hi! What coding LLM is best for 12 GB VRAM atm?

fleet marsh
carmine siren
#

I am looking for something like the Real-ESRGAN model, is there any alternative for upscaling images?

carmine siren
fleet marsh
carmine siren
fleet marsh
#

what number of gpu should i be putting here?

white pasture
#

Why won’t it let me generate, bruh

#

It won’t lemme send pic wth

#

I’m trying to generate an image yet it says it’s not permitted no matter what I delete

white pasture
#

I’m trying to make an image of Maxie and Mega who are two Pokemon characters

#

Gives me this “Content that violates our community guidelines was detected in your generation. Your gems have been refunded. Please try again with different parameters.”

#

Even though there’s no nsfw

#

I just put “Maxie from pokemon ORAS with a younger guy with black hair, red eyes, fluffy black collar, black and red shirt uniform, red cape”
Ain’t nothing wrong with that

low shard
low shard
storm holly
#

I wasn't here for a couple months

low shard
carmine siren
#

The VibeVoice TTS model, which is developed by Microsoft, is one of the best.

abstract comet
viral mason
#

just use kaggle for applio 💔

wanton parrot
#

I use a dual nvidia gpu setup, is it possible to make Vonovox use a specific gpu?

wanton parrot
#

Thanks!

viral mason
#

what are you talking about? you should specify

#

I cannot call at the moment sorry

#

whatever you have is outdated then

#

what is your pc gpu (Nvidia or AMD) and what are you using the voice changer for?

#

super outdated yea

viral mason
#

No, just run setup64 for the virtual audio cable then install driver
And for Vonovox just run setup

#

Any yt tutorials are outdated for voice changers

#

Sure

viral mason
#

Extract both after download and run the files I said there

#

Pitch at 0 works for most models if you're a guy, but if you're using a female voice pitch it up some until it sounds right
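
A rough sketch of what a pitch offset does, assuming the pitch setting is counted in semitones (the chat does not state the unit, so that is an assumption). Each semitone multiplies the frequency by 2^(1/12), so +12 doubles it and -12 halves it. `pitch_ratio`, `shifted_frequency`, and the 110 Hz example voice are made up for illustration.

```python
def pitch_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift given in semitones (assumed unit)."""
    return 2.0 ** (semitones / 12.0)


def shifted_frequency(base_hz: float, semitones: float) -> float:
    """Fundamental frequency after applying the pitch offset."""
    return base_hz * pitch_ratio(semitones)


# Example: a low speaking voice around 110 Hz (illustrative value only).
for offset in (0, 6, 12):
    print(offset, "semitones ->", round(shifted_frequency(110.0, offset), 1), "Hz")
```

This is why 0 tends to fit when your own voice and the model sit in a similar range, and why a larger upward offset is needed when the model's voice is much higher than yours.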

#

I personally have my block size at 0.50 but it works well at 0.30 which is default

#

It's alright

#

Excuse me?

#

I don't do that

#

<@&1159293140440723499> weirdo

#

I won't be helping you further

narrow coyote
#

what ai is run in the terminal?

feral saffron
#

i cant run the start_http.bat file for the voice changer, any tips

viral mason
small violet
#

anyone know why it wont work after following the audio cable and mic steps for input and output? Like on roblox i cant hear anything through my mic?

misty marlin
#

why the hell did the creator change it like this now i cant choose an index

#

nvm its text to speech

misty marlin
#

the ai vc client doesnt work at all it doesnt make any sounds i checked all configs

#

the old versions worked fine but i dont wanna use old version i want new ones

#

im on win_ cuda 2.1.4 alpha

#

i have amd cpu

#

and nvidia graphic card

#

i am trying to speak

#

well

#

yea

viral mason
swift thunder
#

Does anyone here use Kaggle who can help me? I trained a model, everything was going perfectly, 275/300, then it started throwing errors and stopped training, and everything started throwing errors.

ember nymph
#

yo

tired aspen
#

so im tryna set up the vcclient from w-okada after a while of not using it [i had deleted it] and im on a new version trying to set it up with voicemod, i genuinely cant figure it out. i already have the cable stuff and whatnot

abstract comet
#

SOMEONE PUT FLOWMATCHING INTO RVC

#

PLEASE

viral mason
#

what's ur pc gpu?

#

I'll get u the download but I have to leave soon

tired aspen
#

nvidia

#

thinks its a 3060

viral mason
#

@swift thunder idk how to fix your error but look at this short tutorial I made in case you did something wrong

viral mason
tired aspen
#

pretty sure i already have the cable stuff

#

unless it got a few updates since ive downloaded it

viral mason
#

are you using Vac lite or VB cable, they're two different softwares but do the same thing just VB cable causes issues sometimes

tired aspen
#

i honestly dont know, it just says cable, and its been a few years since ive downloaded it

#

nevermind its vb

viral mason
#

I'd recommend the one I sent then just in case

tired aspen
#

yeahhh

#

k both folders are done downloading

#

do i unzip both and install the new cable?

viral mason
#

Yep

#

For vac lite just run setup64 (not as admin)

#

And most likely you won't need to restart your pc either

tired aspen
#

oo

#

in that case should i try to find a way to delete vb cable?

viral mason
#

If you like you can uninstall it the same way you installed it

#

But you don't have to

tired aspen
#

ah

#

ok i have what i need i think, how do i set it up with voicemod now?

viral mason
#

What are you using?

#

There are no default voice models that come with it, whatever you're using is outdated

#

What's your PC gpu? (Nvidia or AMD)

#

And what do you want to do with the voice changer, just curious

viral mason
#

Wdym by this?

edgy quiver
#

I need help T-T, i am looking for some good male voice models for realtime, do u guys know any good ones

finite wind
#

man the hard truth about clipping audios on your own from 1 to 10seconds wasn't giving me good results at all

#

same goes for letting applio cut audios on its own 😔

#

can def say og 48k pretrain is giving noticeably worse results than legacy 1.5 48k pretrain

#

are there any documents on what needs to be avoided or kept when I isolate a dataset?

#

like, for example, regardless of whether it's true or not

#

if audio utilizes stereo heavily (sound from left or right), you should turn it into mono (just an example not confirmed to be true)

#

if you can cut audios on your own from 1 to 5 seconds(RVC limitation), it's better to cut on your own to make better quality dataset (just an example not confirmed to be true)

#

why are there so few to no documents on how you SHOULD process a dataset?

finite wind
#

WHY IT WASN'T ON THE AIHUB DOC

#

AHHHHHHHHH

#

MY 8 HOURS

hardy yew
#

Your stereo data was simply downmixed to mono at preprocessing step xd

finite wind
#

yeah....

hardy yew
#

I thought there was a warning about this but maybe not

finite wind
#

I was wondering why I got irregular volume sometimes with my model

#

apparently model also studied the lowest volume part of the audio when the music was coming from only left or right

hardy yew
#

It just takes an average of both channels
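
A small sketch of why averaging the channels produces irregular volume. The sample values are made up, and the helper names are hypothetical; the only thing taken from the chat is that mono is the average of left and right. A voice present in both channels keeps its level, while a voice that sits in only one channel comes out at half amplitude, roughly 6 dB quieter.

```python
import math


def downmix_to_mono(left, right):
    """Average the two channels sample by sample."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def peak_db(samples):
    """Peak level in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0 else float("-inf")


voice = [0.8, -0.8, 0.4]
silence = [0.0, 0.0, 0.0]

centered = downmix_to_mono(voice, voice)      # same signal in both channels
panned = downmix_to_mono(voice, silence)      # signal only in the left channel

print("centered peak:", round(peak_db(centered), 2), "dB")
print("panned peak:  ", round(peak_db(panned), 2), "dB")
```

So a dataset that mixes centered and hard-panned passages ends up with quieter stretches after the downmix, even if the stereo file sounded even.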

#

What about normalization?

finite wind
#

I let applio do the normalization but

#

wasn't able to tell huge difference on my own when I heard the dataset after the auto normalization from applio

hardy yew
#

There's also some debate on pre vs post normalization

#

If your data has noisy silence removed then post should be quite good

finite wind
#

afaik, pre is normalization before cutting and post is after cutting

hardy yew
#

Yeah

finite wind
#

I just see no reason to go for pre

#

unless your dataset is suffering from low quality audio issues

hardy yew
#

If there was lots of dirty silence in your dataset, post would blow it up
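
To make the "blow it up" point concrete, here is a sketch under two assumptions that the chat does not confirm: that normalization means simple peak normalization, and that "post" normalizes every slice on its own after cutting. How Applio actually implements it may differ. The sample values and helper names are made up.

```python
def peak_level(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


def peak_normalize(samples, target_peak=0.9):
    """Scale the samples so the loudest one reaches target_peak (assumed method)."""
    level = peak_level(samples)
    if level == 0:
        return list(samples)
    gain = target_peak / level
    return [s * gain for s in samples]


speech = [0.5, -0.45, 0.3]            # a slice with actual voice
dirty_silence = [0.01, -0.008, 0.005]  # a slice that is only background noise

# "Pre": normalize the whole recording once, then cut it.
whole = peak_normalize(speech + dirty_silence)
pre_silence = whole[len(speech):]

# "Post": cut first, then normalize each slice independently.
post_silence = peak_normalize(dirty_silence)

print("noise peak with pre: ", round(peak_level(pre_silence), 3))
print("noise peak with post:", round(peak_level(post_silence), 3))
```

With one gain for the whole recording the noise stays far below the voice, but a noise-only slice normalized by itself gets pushed to the same peak as speech, which is why removing dirty silence first matters for post.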

#

But yeah, other than that it's seemingly better

finite wind
#

idk how you deal with the brief silences between lyrics or speech though

#

like 0.1 to 0.4 seconds dirty silences

#

someone who knows def should put it on the docs

modern hornet
#

why did the mmvc file stop opening for the voice thing it was opening yesterday now gotta reinstall

hardy yew
#

a) ignore them and deal with it
b) manual cutting
c) smartcutter (my go-to)

#

Though I mostly train with video game voiceover. It's clean out of the box.

finite wind
#

see I do either
a) manually cutting them to close the silence gap
b) completely silence that dirty silence part without closing the gap

patent plover
#

hello

finite wind
#

but I can't tell which is better or should be avoided

patent plover
#

i have problems with mmvcservice

#

no works :,v

#

and i have all

#

the VB- virtual cable input, input 16inch and output

hardy yew
patent plover
#

from one day to the next it stopped working

hardy yew
#

How you're gonna do that is a separate thing

patent plover
#

i reinstalled but it doesn't work :,vv

finite wind
latent kraken
#

anyone wanna help me make an AI voice website

I can't pay anyone but I kinda wanna see how hellish this could be

I don't know how tf to do anything ;-;

finite wind
#

really gotta put it on AIhub documents though

hardy yew
#

I think the main problem is there's lots of uncertainty around dataset preparation

#

Lots of aspects for which people have different approach

finite wind
#

at least the ones that are generally good to do should be listed up on the doc

hardy yew
#

So it's hard to tell "this is the way. This is the only and right way"

finite wind
#

instead of having nothing should be better

#

the info of something generally good + the reason why it's generally good = an easy step for anyone, can logically think from there to guess and try some better ways to do things

#

rather than shooting themselves in the foot

patent plover
hardy yew
#

Definitely, agree

patent plover
#

:c

hardy yew
#

This is weird

#

Especially that another person above just had the same issue

finite wind
#

just one favor to ask, can you share a screenshot of any of your processed audio file because I want to see how you processed it?

#

preferably with spectrum

#

like this is a random pic from online but I usually edit out these clicking or tearing parts of the spectrum to process but NOTHING else because I have no info for anything else

#

just the pure silencing out part like you and I discussed a little bit earlier

hardy yew
#

this is one of my datasets (from a game, too)

#

other than concatenating all of it and silence truncation i didn't do much here

hardy yew
finite wind
#

yeahhhh

#

yours looks vastly different from mine I think I got a better idea now

#

thanks again

hardy yew
#

they don't always look that similar, I guess various timbre might turn out quite different

#

though obviously clean human voice will have lots of shared properties in the spectrograms

finite wind
#

this is from the very first dataset I did, I think that higher frequency parts needs to be cleared out? or is it just only in your screenshot case idk

#

some dirty silence parts can be seen in here too

hardy yew
#

it might also be my spectrogram settings TBH, not exposing too much dirt in the highs

finite wind
#

only major difference is just I added silences in between sentences

hardy yew
#

those settings are what I mainly use when looking at the harmonics

finite wind
#

ahhh

hardy yew
#

this is how the same data looks with default amplitude range

finite wind
#

maybe mine was showing 48k spectrum I think

#

I figure because yours is showing til only 15k but it's just my guess

hardy yew
#

my data is 32k so it can only peak at 16kHz, hence the range

#

in your case it can go up to 24k
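
The numbers above follow from the Nyquist limit: a signal sampled at a given rate can only represent frequencies up to half that rate. A one-line check, with `nyquist_hz` as an illustrative name:

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


for rate in (32000, 48000):
    print(rate, "Hz sample rate ->", nyquist_hz(rate), "Hz max frequency")
```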

finite wind
#

yeppp

#

oh and just one more thing before I go

#

I let applio cut my one long audio file for a test

#

and it chopped some of the last parts from the first sentence and then put it into the second sentence's first part

#

is that a problem or not at all?

#

or abruptly cutting it mid sentence?

hardy yew
#

TBH not sure how it affects the final model

finite wind
#

damn

hardy yew
#

Personally I don't mind it and just do use the autoslicing on my concatenated data

#

but

#

slicing phonemes in half

#

definitely can have some negative impact compared to when the samples would simply go from silence, to audio, to silence again

finite wind
#

ohhhh

hardy yew
#

Haven't ever done any research on this but that's what I would expect

finite wind
#

at least it's much better than having no answer

hardy yew
#

Whether it has a massive impact or little-to-no impact at all? No idea, maybe someone else knows

finite wind
#

I will go for manual cutting from 1 to 5 on my own and see if it helps any better

#

apparently 5 is the max for applio

#

to process while training

hardy yew
#

There is one more problem with it though

#

(and i guess the main reason for equally-lengthed 3s clips)

finite wind
#

hmm?

hardy yew
#

The training pipeline utilizes 3s segments and cuts off the rest. So if you provide it a sample of e.g. 4s, it will still only use the 3s and ignore the 1s.
If you provide a sample of 8s, it will process 2x 3s samples and discard the remaining 2s

#

(or at least that's how I understand it, recently saw a discussion on this)
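
The arithmetic described above can be sketched as follows. This only mirrors the behaviour as described in the chat (whole 3 s segments kept, remainder dropped), which the speaker says is unverified; `usable_segments` is a made-up helper, not part of any training code.

```python
def usable_segments(clip_seconds, segment_seconds=3.0):
    """Return (number of full segments, seconds discarded) for one clip."""
    full = int(clip_seconds // segment_seconds)
    discarded = clip_seconds - full * segment_seconds
    return full, discarded


print(usable_segments(4))  # one 3 s segment, 1 s dropped
print(usable_segments(8))  # two 3 s segments, 2 s dropped
```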

finite wind
#

ohhhh

hardy yew
#

So eventually the outcome is often similar - cutting words in half and discarding some info

finite wind
#

I swear I saw it somewhere in this discord that applio can process up to 5 sec hmm

#

gotta go for 3s to be safe I guess

hardy yew
#

This needs further verification I suppose 🤔

#

Don't want to state i'm 100% sure of something when i'm not

finite wind
#

gotcha thanks a lot

astral tangle
#

hello everyone im new to ai stuff, i wanted to change a voice to another to make some ai song covers, i've tried using RVC but i have an AMD GPU (RX 6700XT) and cant get it to work, could someone help me getting it to work, or maybe guide me towards another ai i could use to change one voice to another? any help would be appreciated.

My specs are:
Rx 6700XT gpu
Windows 10

lone smelt
#

hi, anyone know why google colab keeps crashing or disconnecting when i’m generating roblox assets? not sure if it’s a GPU limit thing or what.

shrewd nest
#

Hi, anyone know how to create realistic TTS with human nature voices? Like breathing laughing?

thick patrol
#

Hello! evening, morning to everyone! im just curious about why my Odaka starts to slow down and genuinely starts being unresponsive, is it an internet thing?

it only started to act like this after a few seconds tops

low shard
#

This is a General AI Discord Server and there are many voice changers, elaborate:

  • your pc gpu
  • your pc os
  • what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
  • the tutorial link used
low shard
low shard
low shard
# misty marlin im on win_ cuda 2.1.4 alpha

This is a General AI Discord Server and there are many voice changers, elaborate:

  • your pc gpu
  • your pc os
  • what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
  • the tutorial link used
low shard
low shard
low shard
abstract comet
low shard
low shard
# patent plover no works :,v

This is a General AI Discord Server and there are many voice changers, elaborate:

  • your pc gpu
  • your pc os
  • what are you trying to do: TTS, AI Covers, E Girl Trolling / Catfishing or Roleplay
  • the tutorial link used
low shard
low shard
low shard
#

I should make the Sapphire Message about the Guidelines and Elaborating more visible, not sure why it's so ignored

edgy quiver
#

Yeah, trying to get a good eboy one, im kinda tired of getting catcalled and all that. I currently have wokada set up (my gpu is a 3060)

ember nymph
low shard
wanton parrot
#

Hello, im trying to run tg-develop w okada fork on ubuntu server but when i run the server it does not accept arguments

kat@kat-server:~/Voice-Changer/MMVCServerSIO$ ./MMVCServerSIO --launch-browser false --https true
usage: MMVCServerSIO [-h] [--log-level {debug,info,warning,error,critical}] [--launch-browser]
MMVCServerSIO: error: unrecognized arguments: false --https true
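
The usage line above shows `--launch-browser` without a value and lists no `--https` option. That pattern can be reproduced with a small argparse parser. This is a reconstruction from the pasted usage text only, assuming the server uses Python's argparse; it is not the server's actual source.

```python
import argparse

# Parser rebuilt from the usage line in the error output (assumption).
parser = argparse.ArgumentParser(prog="MMVCServerSIO")
parser.add_argument(
    "--log-level",
    choices=["debug", "info", "warning", "error", "critical"],
)
# A store_true flag: it is either present or absent and takes no value.
parser.add_argument("--launch-browser", action="store_true")

# parse_known_args returns the leftovers instead of exiting with an error.
args, leftover = parser.parse_known_args(
    ["--launch-browser", "false", "--https", "true"]
)
print(args.launch_browser)  # True, because the flag was present
print(leftover)             # ['false', '--https', 'true']
```

With a flag like this, browser launch is disabled by leaving the flag out rather than by passing `false`. The leftover list matches the "unrecognized arguments" in the error message.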

finite wind
#

because I certainly didn't make my dataset to have voids all around the spectrum

#

not in spacing between audios but spaces inside of them like swiss cheese

hardy yew
#

well, as i said, it was already clean as it's from a game so it's kind of an "easy" dataset

#

so lots of samples concatenated with 100ms breaks in between

#
  • smartcutter on top of that to possibly clean up silences from within the samples
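
Joining samples with 100 ms breaks, as described above, could look roughly like this on raw sample lists. `concat_with_gaps` is an illustrative name, and digital silence (zeros) for the gap is an assumption.

```python
def concat_with_gaps(clips, sample_rate, gap_ms=100):
    """Join clips, inserting `gap_ms` of digital silence between them."""
    gap = [0.0] * int(sample_rate * gap_ms / 1000)
    out = []
    for i, clip in enumerate(clips):
        if i:  # no gap before the first clip
            out.extend(gap)
        out.extend(clip)
    return out


# Two dummy clips at 32 kHz: the gap is 3200 samples long.
joined = concat_with_gaps([[0.1] * 10, [0.2] * 5], sample_rate=32000)
```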
low shard
finite wind
#

so these are from the original samples or smartcutter did that for you?

analog obsidian
# finite wind

rvc fills the voids when the model upscales the audio, it doesnt matter

finite wind
#

BUT

#

do I want to make those voids on purpose is the question

analog obsidian
#

no

finite wind
#

hmm okay

#

so voids on only spacing for now

#

I thought it could make clearer voice output idk

analog obsidian
#

there will be always quality loss due to how rvc works, it will never be 1:1 with the dataset

finite wind
#

yep and we're just trying to mitigate the losses as much as we can

#

and this is the part of the thing too but it's a shame I guess

hardy yew
finite wind
#

yeah somehow I thought those voids inside of the audio spectrum not the ones from spacing could enhance outcome quality

hardy yew
#

this is smartcutter

#

and this is original

#

(slightly different scale too because the second one is before downsampling to 32k)

finite wind
#

ah so original already had those voids in

hardy yew
#

yeah, looks like it

finite wind
#

oki

#

lyery also confirmed it does not enhance outcome so

hardy yew
#

I admire the attention to details, I don't look that precisely usually xD

finite wind
#

gotta do what I was doing just now

#

hey I just wanna make my dataset processing worth a while

analog obsidian
#

the only trick to enhance the quality of a rvc model is to get a better dataset, recorded with a decent mic, low noise, and no editing at all

hardy yew
#

what i usually do is slap the data i got into training and filter it later, if the model develops some flaws

finite wind
#

there are more questions to ask after I realized what is a better dataset in general

analog obsidian
#

raw wav audio files

finite wind
#

like hmm

analog obsidian
#

like a recording of yourself, without any editing to that audio clip

#

no mp3 compression

#

no "voids" in the spectrum

finite wind
#

should I avoid putting chest voiced audio when the majority of the audio is modal voiced? kind of stuff

finite wind
analog obsidian
#

yep you want consistency, in both timbre and audio quality

finite wind
#

yeeahhh

#

I was confused because I read you need diverse audios but at the same time you can't put drastically or moderately different styled audio (in terms of speaking / singing)

analog obsidian
#

yea with diverse they mean pitch variety, not monotone audio

finite wind
#

gotcha

analog obsidian
#

you have to teach the ai the whole voice range of your speaker

finite wind
#

I just gotta undo my voids on spectrums I did for the past 30 minutes

#

and I put too much trust on index since I tested out british accent model as a realtime model

hardy yew
# analog obsidian no "voids" in the spectrum

why not actually? I mean if it was to be added manually then sure, perhaps not worth the hassle. But assuming it's inserted automatically, isn't it better to discard the noisy silence?

analog obsidian
hardy yew
#

(even if the noise is low anyway)

finite wind
#

and it only sounded somewhat decent when I mimicked the british accent decently

#

so a lil disappointment there but we carry on

analog obsidian
#

or you're talking about the silences between samples? lol

finite wind
#

not the gaps but like holes as if it's swiss cheese

hardy yew
#

mb

analog obsidian
#

ah ok compression

#

yes avoid them

#

rvc dont like that

finite wind
#

ffghhhhh

#

I gotta undo

analog obsidian
#

i mean, it fills them up with random shit but it's not ideal

#

better to have real data there

finite wind
#

we really gotta update AIHUB documents

hardy yew
# analog obsidian yes avoid them

hmm in theory i could fill those with RX's spectral reconstruction. Wonder if it's better or worse than leaving it untouched.
Interesting thing to check I guess, not sure how that reconstruction thing performs

finite wind
#

probably better off leaving it untouched unless your dataset is low quality in that terms

analog obsidian
hardy yew
#

Yeah, just wondering, might try it

#

I mean, this model turned out nice the way it is

analog obsidian
#

i know pretrains dont like upscaled data tho

#

but finetuning is different... soo

analog obsidian
finite wind
#

what's your take on non-verbal audio for a dataset though

#

like sighing, laughing(moderate not high pitched), humming sounds like hmm or etc

hardy yew
#

out of those three laughing is the worst i think

#

wouldn't expect occasional sighing/humming to break the model

finite wind
#

oh now it makes sense

#

my british accent model had A LOT of laughing or giggling

#

f the singer ig

hardy yew
#

adding lots of noises like that will probably cause the model to insert them into normal speech

#

which is rather undesired xD

finite wind
#

yeeeep

hardy yew
#

one of my lazy trainings was Ellie from TLOU which had lots of shouting/screaming/growl-ish angry voicelines, beside normal speech

#

and it got quite audibly rendered into speech in the model

finite wind
#

I can see that

#

why it could happen ye

hardy yew
#

especially "heavier" speech with stronger emphasis turned out raspy like the screams

#

"soft" speech was more-or-less unaffected

#

but yeah, it was an experiment to see how it affects the model and turned out as expected, it's rather bad

finite wind
#

now I think the most difficult thing to do in the processing dataset step is which theme of the audio you want to mainly use as a dataset

#

you can't just put everything and hope for the training to turn out good at everything right?

hardy yew
finite wind
#

so I gotta choose what kind of audio I mainly want to train

#

like, for someone who doesn't know what a consistent data is

#

I, myself would put shouting, crying, singing, grumping, sarcastical speech, screeching, mocking etc in the same dataset

#

and it wouldn't be able to make generalized voice like I expect it to

#

for example like the one you've just said from TLOU

hardy yew
finite wind
#

that's the thing

#

we can't do that YET

hardy yew
#

yeah, hopefully some day

finite wind
#

RVC limitation

#

is uh

hardy yew
#

although a part of me doesn't want it

#

due to how common catfishing and other stinky use cases are

finite wind
#

if I have a dataset of crying, angry, annoyed then I have to choose one

#

oh it's just an example

#

even the singing method can vary for the same person

#

and we have to choose only one of them to make a model for now

#

at least realtime or retrieval tech isn't going anywhere to be developed further at the moment

#

TTS industry is going to be advancing just fine and at least it's not for catfishing

viral mason
#

Especially when used with a good pretrain like Legacy core 1.5 or 1.6

finite wind
viral mason
#

I usually keep singing out of a dataset if it's mostly a talking model

finite wind
#

that I gotta agree

#

so any model that sounded less robotic with fewer artifacts had this monotone-like feeling in my experience

hardy yew
finite wind
#

maybe

#

my mind says 9:1 can disrupt a lot than 5:5 ratio

viral mason
#

What does that mean

#

I just woke up and in general I'm kinda slow

hardy yew
#

screaming only = bad
screaming just a bit = not as bad

viral mason
#

Good example is my model of Doey

finite wind
#

for simple explanation would come from talking models

#

let's say you want a model to sound like a specific game character in general

#

but you have a dataset that consists of just talking, shouting, being grumpy, or idk mocking

#

ratio wise idk what ratio that can f up the model

cedar rock
viral mason
#

It's mostly for General Grievous from star wars

finite wind
#

idk we will have to find out ourselves

#

both ratio and the fact that coughing is in the dataset in the first place is a problem or not

viral mason
#

Hmm

finite wind
#

ideally it reproduces the same cough every time you cough

#

but I can imagine the model to blend that coughing into normal speech and f up the whole speech you're trying to say

#

idk if it's prevented from the training phase I don't really know

hardy yew
#

that's pretty much my Ellie case i think and for that the answer is right there

#

but i would expect that with not-so-much shouting it would be way better

finite wind
#

that is, in this case a lot of coughing is in the dataset of that general grievous

hardy yew
#

so maybe same with coughing

finite wind
#

I think moderate tone differences express happiness, sadness, and annoyance can be done and I've seen a few

#

but above that, I don't think we can do that

hardy yew
#

that for sure, nothing wrong with expressions

finite wind
#

but people expect more than just little expressions to fall into the category of "oh I should put this in the dataset"

viral mason
#

His voice is usually this really gravely somewhat robotic voice but for the most part it's pretty human sounding

#

Not sure how the sound of his voice could affect the training

finite wind
#

so did I for one or two first models I trained

hardy yew
finite wind
#

like glados and etc characters

#

I think his voice alone isn't a problem at all

#

it is that god damn non-verbal things are always the problem

hardy yew
#

it's quite harsh at times

finite wind
#

not to mention the "too much expressive" speech if you're training a character model

hardy yew
#

i'd worry a bit that RVC could exaggerate this after training

#

but usually it does well with all kinds of funky voices

finite wind
#

yeep

hardy yew
#

whether clean human speech or something very artificial

#

e.g. both robotic-ish voices i tried training had some flaws resulting from RVC learning some parts of it too well and exaggerating them

#

one was resulting from low frequency content in the voice so that was more or less architectural limitation

#

the other was a 'cyborg' voice which is 90% human with slight electronic buzzing

#

and it affected the model a bit too much too

#

doesn't sound great even though it's actually not so far from the original

viral mason
hardy yew
#

example

viral mason
#

I love making silly models

finite wind
#

I heard it and it sounds like high pitch sounds are blended with his serious tone of voice

analog obsidian
#

me too

viral mason
#

Still struggling to make a new one that I actually like

finite wind
#

I just wanted to point that out because it might have been the very thing we were talking about the dataset ratio or the

#

too much diverse expressions generate lower quality model

viral mason
#

Makes sense

finite wind
#

Doey have serious and silly voices which are drastically different so I figure

#

That might be the reason why e-girl models were thriving back then

analog obsidian
#

rvc dont learn expressions, it learns to predict mel and features

finite wind
#

that sounds about right

viral mason
viral mason
#

I find it strange tho how rvc can make a model like this somewhat where it can change pitch to match the random voices this character changes to

#

But struggles with a character that just has a lot of range in their voice like going from serious dark voice to higher more bubbly voice

finite wind
#

The limitation is real and I'm coping

viral mason
#

I hope some rich dude will show up and improve this stuff

hardy yew
viral mason
#

I've tried many times but never was left satisfied

finite wind
#

one day I can say unhinged shit as glados that sounds so natural to the point others might think it could've been from the actual game

viral mason
finite wind
#

I used one from here for a gartic phone session and it was fun

hardy yew
#

splitting by features is probably not a way

#

but pitch maybe?

#

like, having some vastly different samples of a low voice and high-pitched completely different voice

finite wind
#

could try with a deadpool model I was gonna try training

hardy yew
#

and perhaps training with some low batch size to make it not generalize so well on purpose

finite wind
#

since he got that normal unc voice and his silly voice lines

edgy quiver
viral mason
hardy yew
finite wind
#

Say, does Kaggle Applio still work?

#

or I should just try that google colab

viral mason
edgy quiver
#

Idk how i would do that as a woman but yeah

hardy yew
#

xDD

viral mason
finite wind
#

imagine using female model as a female

#

see where that takes you

finite wind
#

since the age of AI I cannot imagine any provider without paying a server

viral mason
#

Nope, 30 hours free

#

A week

finite wind
#

WHAT

#

what spec

#

I NEED TO KNOW

hardy yew
#

one of those two

edgy quiver
finite wind
edgy quiver
#

Wdym

finite wind
hardy yew
#

it's free 30h weekly

finite wind
#

where are they pulling their cash from wth

hardy yew
#

i'd say it's amazing xD

finite wind
#

it's google level of flex in terms of a provider that is

viral mason
# hardy yew

-# I delete my acc every time my time is too low and just use one of my other emails and it resets the time each time forever giving me infinite training time

viral mason
#

Totally don't do what I said

finite wind
#

so they hadn't even blocked account refreshing yet

#

Kaggle is so unreal

viral mason
#

Kaggle is W

#

Love it

#

Colab is Doo Doo because the time limit for anything is like 4 hours max

finite wind
#

though how long does it usually take for a single epoch to train?

#

my local training time for an epoch would be 24 seconds

hardy yew
finite wind
#

ok hm

#

let's say 1 epoch = 50 steps

#

how long did it take you on Kaggle

#

so it's around like

#

15 to 18 minutes of data

viral mason
#

Really depends on dataset length

hardy yew
#

anran_klm_dc_32k_4b | epoch=139 | step=7367 | Current time: 23:50:47 | Time per epoch: 0:00:25

#

this is from epochs with 53 steps

#

25s
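
A quick way to turn that log line into a time budget, using the 30 free hours per week mentioned earlier. The regex and helper names are illustrative. The estimate ignores preprocessing, feature extraction and any session limits, so treat it as an upper bound.

```python
import re

# Log line copied from the message above.
LOG = ("anran_klm_dc_32k_4b | epoch=139 | step=7367 | "
       "Current time: 23:50:47 | Time per epoch: 0:00:25")


def seconds_per_epoch(line):
    """Extract the 'Time per epoch' field (H:MM:SS) as seconds."""
    match = re.search(r"Time per epoch: (\d+):(\d+):(\d+)", line)
    if match is None:
        raise ValueError("no 'Time per epoch' field found")
    hours, minutes, seconds = (int(x) for x in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def epochs_in_budget(line, budget_hours):
    """How many epochs of this length fit into a GPU-hour budget."""
    return int(budget_hours * 3600 // seconds_per_epoch(line))


print(seconds_per_epoch(LOG))     # 25
print(epochs_in_budget(LOG, 30))  # 4320
```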

finite wind
#

so that's similar to my spec with acceleration option on

#

damn

#

really similar I tell you

#

3060ti with i5-14gen 32gb ram

viral mason
#

Shhhh

hardy yew
#

good bot

viral mason
finite wind
#

okay do I install applio on Kaggle or

finite wind
viral mason
#

Kaggle is my favorite

#

Best option for non local

hardy yew
#

i prefer to not train locally just because it's a waste of money when i can do the same in the cloud for free xd
and also no need to keep the PC running for additional hours and blocking me from doing something

viral mason
#

Real

hardy yew
#

the last time i trained a lot on my PC, my energy bill reflected it cat_doom

viral mason
#

Yikes

hardy yew
#

the only issues emerge in case of huge datasets that exceed the disk space available on kaggle for free xd

viral mason
#

Good thing I've never gone over an hour of audio

#

Well there was that one time with Kratos

finite wind
#

hey I know we're simping Kaggle here but can I get a link or an explanation on how to set up applio on Kaggle

viral mason
#

I have a whole video

hardy yew
finite wind
#

on the contrary

viral mason
#

This is how to do it

finite wind
#

who is interested in limbus company character models

#

Imma do it

viral mason
#

I've heard of the game but have no idea what it is

finite wind
#

if you'd watch a video about it it would be more confusing ngl

viral mason
#

😭

hardy yew
# viral mason This is how to do it

I like scripting, I just have a script with a couple variables that takes all the necessary configuration and it does all the magic with one click cat_sunglasses

finite wind
hardy yew
#

entire training procedure in one short string

viral mason
#

Explain what you mean

finite wind
#

set to 32k 250 epoch batch 4 and 1.6 legacy core pretrain bat file?

#

wowwie

hardy yew
#

yeah, runs preprocessing, feature extraction, index generation and then runs the training

#

afterwards compresses all data into zip

viral mason
#

No need for such specific epoch

hardy yew
#

and then i just download it
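
The one-click flow described above (preprocess, extract features, build the index, train, zip) might be structured like this. It is a skeleton only: every step is a placeholder that writes a marker file, and no real Applio commands or flags are shown, since the actual script was not shared.

```python
import shutil
import tempfile
from pathlib import Path


def run_pipeline(steps, workdir, archive_base):
    """Run each (name, callable) step in order, then zip the work dir."""
    for name, step in steps:
        print(f"running: {name}")
        step(workdir)
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(workdir))


def placeholder(label):
    """Stand-in for a real step; just drops a marker file (hypothetical)."""
    def step(workdir):
        (Path(workdir) / f"{label}.done").write_text("ok")
    return step


STEP_NAMES = ["preprocess", "extract_features", "build_index", "train"]

with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "work"
    work.mkdir()
    steps = [(name, placeholder(name)) for name in STEP_NAMES]
    # The archive is written next to, not inside, the work directory.
    archive = run_pipeline(steps, work, Path(tmp) / "model_bundle")
    archive_name = Path(archive).name
```

In a real script each placeholder would call the actual tool, and the configuration (sample rate, epochs, batch size, pretrain) would be a handful of variables at the top.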

finite wind
#

I mean you can resume training from where you left off too right?

hardy yew
finite wind
#

I guess 200, 300, 350 all good unless it's overtrained

hardy yew
finite wind
#

you can resume from where you left off

hardy yew
#

i can later reupload the data and restore it before continuing training

#

if needed

#

but i usually just pick a large number of epochs to "ensure" i won't need to resume training later

#

though sometimes i still do it later

finite wind
#

sounds about right

hardy yew
#

in my case i don't have data persistence so it's not like all the trained data stays there between runs

#

that's why i need to reupload the necessary files if i want to resume

#

not much work anyway

finite wind
#

yep

hardy yew
#

i prefer to avoid the GUI whenever i can

#

especially when the work is repetitive

finite wind
#

isn't a GUI's whole point to make it less of a chore to navigate

hardy yew
#

each to their own, I guess

#

for me making a script and then just running one command is more convenient than opening a GUI and clicking through all the things

#

but GUIs are definitely convenient for lots of people

#

I'm spoiled by the linux world

finite wind
#

ahh

#

I didn't get it at first because locally applio saves the latest settings you've used

tame oracle
finite wind
#

@viral mason oh yeah, when you put silences in both start and the end of a 3s clip

#

do you include those silences in the 3s in total or exclude them from the total of 3s

viral mason
finite wind
#

hmmm oki

#

I was gonna manually slice clips into 3s

#

and I was wondering if silences in a clip should be counted as total seconds

viral mason
#

Yea no need to manually do all that nonsense

analog obsidian
#

the model learns silence thanks to the mute files, don't manually add silence to the samples

viral mason
#

Applio automatically slices your audio pretty sure

#

That's why I put one entire audio file

finite wind
#

yeah but I tested it and it cuts the sentence mid way into two clips that it sounds just weird

viral mason
#

hmm

finite wind
#

I wanted to manually slice into 3s myself

analog obsidian
#

and during training it cuts that audio even more

#

by 0.36 secs

finite wind
#

got it

analog obsidian
#

model takes that 3s audio, then learns using segments of 0,36 secs, it doesnt learn the 3s at once
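
A sketch of taking one 0.36 s training segment out of a 3 s clip. The segment length comes from the message above; picking the start position at random is an assumption made for illustration, and `random_segment` is not a function from RVC or Applio.

```python
import random


def random_segment(samples, sample_rate, segment_seconds=0.36, rng=random):
    """Pick one window of `segment_seconds` from a longer clip."""
    seg_len = int(round(sample_rate * segment_seconds))
    if len(samples) <= seg_len:
        return list(samples)
    start = rng.randrange(0, len(samples) - seg_len + 1)
    return samples[start:start + seg_len]


clip = [0.0] * (3 * 32000)  # a silent 3 s clip at 32 kHz
segment = random_segment(clip, 32000, rng=random.Random(0))
print(len(segment))  # 11520 samples, i.e. 0.36 s at 32 kHz
```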

finite wind
#

no manual silence pauses in the dataset and no manual cutting

analog obsidian
#

truncate the silence in audacity then use simple slicing

viral mason
#

Simple slicing?

analog obsidian
viral mason
#

Ah

#

I've always used automatic

finite wind
#

truncate the silences and how long is that supposed to be after that?

#

30ms? 50ms? 100ms?

analog obsidian
#

300 ms
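
Capping every silent stretch at 300 ms can be sketched as below. This is a simplified amplitude-threshold version and not Audacity's Truncate Silence algorithm; the `threshold` value and the function name are assumptions.

```python
def truncate_silence(samples, sample_rate, threshold=0.01, max_silence_ms=300):
    """Shorten every run of quiet samples to at most `max_silence_ms`."""
    max_run = int(sample_rate * max_silence_ms / 1000)
    out, run = [], 0
    for s in samples:
        if abs(s) < threshold:
            run += 1
            if run > max_run:
                continue  # drop silence beyond the allowed length
        else:
            run = 0
        out.append(s)
    return out


# One second of silence between two loud samples at a 1 kHz sample rate.
signal = [0.5] + [0.0] * 1000 + [0.5]
shortened = truncate_silence(signal, sample_rate=1000)
print(len(shortened))  # 302: two loud samples plus 300 ms of silence
```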

finite wind
#

300ms? sounds a lot more generous than I thought

analog obsidian
analog obsidian
#

i train pretrains, thats what i use and works

viral mason
analog obsidian
#

0.1, 0.2, 0.3

viral mason
#

Does it really affect it at all?

#

Training

analog obsidian
#

yes

viral mason
#

I'll try 0.3 then

viral mason
# analog obsidian

Do I need to enable truncate tracks independently if I'm using one audio file or nah

analog obsidian
viral mason
#

Ah

#

🤫

finite wind
#

when audio is 48k but when you look into the spectrum it's only 19k on chart so it's 38k joe_weird

#

gotta love mp3 bro

viral mason
#

Use Wav

analog obsidian
#

if the file is already mp3 he cant do much

finite wind
#

yeah

viral mason
#

Sad

analog obsidian
#

converting it to wav is going to preserve the mp3 compression

finite wind
#

it's a lost cause

viral mason
#

Do you use YouTube dlp?

#

It's peak

finite wind
#

I do but not often since they do that too

viral mason
#

Do what?

finite wind
#

youtube compression and all that

viral mason
#

Ah

finite wind
#

I often get lower quality ones

finite wind
#

when people neatly edit an entire video of a character voice lines but download it with dlp

#

it's garbage

finite wind
#

I wish they could upload it somewhere else that can be lossless

low shard
viral mason
#

Ah

#

Ok, was making sure since you sent Vonovox as an option