#✨│ai-help

crude flame
#

or minecraft building stuff

analog obsidian
#

if ur dataset is above 2 hours, try using batch size 32

#

but 16 works too

#

i trained mine using batch 24

#

and 2x d passes

#

L1 Mel Loss

winter dew
#

ooo ok

#

ill keep that in mind as well

#

do you have any more tips I should know? about to sleep

analog obsidian
#

models can only do things similar to what they have in their dataset

#

so remember ur model is only gonna be able to do things similar to what u are hearing

winter dew
#

i see

#

even with a larger dataset?

analog obsidian
#

i can show u

winter dew
#

ohhhhh

analog obsidian
#

extreme example ^

winter dew
#

yea I get you

analog obsidian
#

now u kinda understand why most models sound robotic hehe

#

the more u fill those gaps, the less robotic results u get

winter dew
#

yup I understand that now

analog obsidian
#

use applio F0-spin branch because that allows u to disable multi-scale mel

#

L1 mel is more natural (in my opinion)
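
Rough idea of what "L1 mel" means, as a sketch: it is the mean absolute difference between the mel spectrogram of the real audio and the mel spectrogram of the generated audio. Multi-scale mel does the same comparison at several spectrogram resolutions and combines the results. This is not Applio's actual code. The spectrograms below are tiny made-up lists, `l1_mel_loss` and `multi_scale_mel_loss` are hypothetical names for illustration, and the multi-scale version here simply averages the per-scale losses (real implementations may sum or weight them differently).

```python
def l1_mel_loss(mel_real, mel_fake):
    """Mean absolute difference between two mel spectrograms (lists of frames)."""
    total, count = 0.0, 0
    for frame_real, frame_fake in zip(mel_real, mel_fake):
        for a, b in zip(frame_real, frame_fake):
            total += abs(a - b)
            count += 1
    return total / count


def multi_scale_mel_loss(pairs):
    """Average the L1 mel loss over several (real, fake) pairs, one per resolution."""
    return sum(l1_mel_loss(real, fake) for real, fake in pairs) / len(pairs)


# Made-up 2-frame, 3-bin spectrograms just to show the call.
real = [[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]]
fake = [[0.0, 0.5, 2.0], [1.0, 2.0, 1.0]]
print(l1_mel_loss(real, fake))
```

A lower value means the generated spectrogram is closer to the real one.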

winter dew
#

sounds good

#

screenshotting all of this for later lol

analog obsidian
#

my settings ^

#

applio > rvc > train > train.py

winter dew
#

okay

#

do you have any good guides for it?

#

bc im gonna have to watch a guide prob to even get started lmao

crude flame
analog obsidian
#

these work but ye this is a bit more advanced

winter dew
#

ah

#

okay

crude flame
#

voc_fv4 the goat

analog obsidian
#

oh yes don't use gaming streams

#

only just chatting or whatever

winter dew
#

ooooo

#

ok

#

yea that’s helpful

analog obsidian
#

for removing room reverb i use this

winter dew
#

unfortunately I really gotta sleep but you guys are the goats

#

ty

analog obsidian
#

dw just ask later

#

im usually here most of the time

crude flame
analog obsidian
winter dew
#

sounds good thanks

crude flame
#

yea but like

#

yk

analog obsidian
#

its good

#

its not ai based tho

crude flame
#

did you 🏴‍☠️ or

analog obsidian
#

Acon DeVerberate 3

#

"buy" it

#

is very decent imo

#

my set had a pretty loud reverb

crude flame
#

i pretty much never have reverb in my sets

#

so idk if ill use it

#

dialogue isolate is all i need 😎

analog obsidian
quasi iris
#

I've got a mac Intel that I'm running the mac deiteris w-okada file on - I've managed to quarantine the files so I can open the application, but it crashes after a few minutes of loading - could someone help me with this?

knotty moth
quasi iris
#

Thanks! I'll give it a whirl

#

I'm a little lost on what to do after hitting copy and edit in kaggle - is there a video or anything I could watch?

knotty moth
quasi iris
#

Unfortunate - I'll see if I can figure it out and ask more oof

tough elbow
#

helloo, can someone tell me how can i download w-okada voice or is there better options?

#

🙏

sturdy vigil
#

How to download w okada

sullen lion
#

its fixed on dev

#

now im not at 100% with the results im after (character with heavy style influence to the point that i can prompt the character in different outfits at full weight and still keep the art style on offshoot models like wai)

#

but im at like 70%

#

as opposed to like 45% before (gens were still very booru outside of base illy)

#

one thing im noticing in all the other models and idk how much it even rly affects me

#

but kohya gets rid of the noise offset field when i select multires noise

#

and all the models i look at off civit has a 0.1 noise offset

#

is it possible for me to get the noise offset AND the multires noise params like these models?

#

i wanna be at like near 100% parity cus i want these same kind of results

elder coral
#

litsa whyy

sand bison
#

?????

sturdy vigil
elder coral
#

why rejoin on may 12 of this year

sturdy vigil
elder coral
#

how do i know the epochs of the model from my weights

#

i have to submit

#

-colab

patent trellisBOT
# elder coral -colab
📒 Google Colab Notebooks

Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

• **Applio**

by IA Hispano
Google Colab

• **RVC Mainline**

by Hina
Google Colab

• **UVR5 NO UI**

by Eddy
Google Colab

• **UVR5 UI**

by Eddy
Google Colab

• **Wokada Deiteris Fork**

by Deiteris & Hina
Google Colab

• **Hina's Modified Original Wokada**
• **RVC-AI-Cover-Maker-WebUI**

by Shiro & Eddy
Google Colab

• **FaceFusion UI**

by Nick088
Google Colab

• **FaceFusion NO UI**

by Nick088
Google Colab

• **Music Source Separation Training (Inference)**

by Jarredou & Makidanye
Google Colab

sturdy vigil
sturdy vigil
elder coral
knotty moth
knotty moth
# elder coral yes

I suppose there should be other QC staff members that could respond on

#

perhaps wait till they're online

pastel oak
#

Use kaggle or upgrade

elder coral
#

qc i need help

#

how do i know the epoch of my weights voice model

viral mason
#

I don't think u can check, I've made some on weights to test, btw don't use weights for model making it's bad

low shard
#

there are different ones, choose based on the things i said

stark zephyr
#

what do i do after i install the realtime voice changer?

low shard
wet lantern
#

when i install the voice it says u cant download that type why

stark zephyr
#

my gpu is an rx 6600, i wanna use the voice changer ig, and no tutorial link, i just got the discord link from Duckus's youtube vid

low shard
low shard
#

forget everything you got from it

stark zephyr
#

okayy

low shard
#

-realtime

patent trellisBOT
# low shard -realtime
💻 Local Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Wokada Deiteris Fork

Most suggested. GUIDE

• Original Wokada

ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE

• **RVC GUI Mainline Realtime**

Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated

low shard
#

read 1st link

stark zephyr
#

i already downloaded it

low shard
#

as i said

stark zephyr
#

ik

low shard
#

forget that duckus tutorial even existed

stark zephyr
#

i downloaded this 1 already

low shard
low shard
stark zephyr
low shard
#

@winter iron slurs arent allowed.

winter iron
low shard
#

!give-media-perms 1h @stark zephyr

stark zephyr
#

this 1?

#

i extracted already

low shard
stark zephyr
#

done

wet lantern
low shard
low shard
stark zephyr
low shard
#

RVC = Retrieval-based-Voice-Conversion, the best Few Shots Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models. Technically, Mainline RVC does have a go-realtime.bat (aka RVC-GUI), but it's pretty messy and outdated so it's extremely not suggested for realtime.

Wokada = uses RVC for realtime inference. There's 2 main versions, Original made by Wok, and the most suggested one is Deiteris Fork (modified version)

#

which do you need

low shard
stark zephyr
#

don3

#

done

stark zephyr
#

Its doing its thing

#

ig

wet lantern
#

i use voice ai

low shard
stark zephyr
low shard
low shard
low shard
wet lantern
#

cuz i want a live sound

stark zephyr
#

@low shard its stuck here what do i do lol

low shard
low shard
#

holy shit ur internet is slow asf

stark zephyr
#

ik

#

im playing roblox

#

val

#

and calling with someone

#

at the same time

low shard
#

just be safe to not fuck it up so hard that it doesnt download

stark zephyr
#

okayy

#

this is taking a while

wet lantern
low shard
#

alright

#

delete everything you got off youtube and voice.ai

#

-realtime

patent trellisBOT
# low shard -realtime
💻 Local Realtime RVC

Guides for Programs that use RVC Models in Realtime for Calls/Games

• Wokada Deiteris Fork

Most suggested. GUIDE

• Original Wokada

ONLY the latest alpha comes close to the Deiteris Fork performance, older versions in youtube tuts are way worse. GUIDE

• **RVC GUI Mainline Realtime**

Unavailable, the guide is outdated and the program is worse compared to the ones above, and much less updated

low shard
#

get wokada deiteris fork, read the 1st link

wet lantern
#

i installed it but it has many things

cedar quest
#

Did those postmodern jukebox ai covers get DMCAed by the YouTube channel (frank Sinatra covers for example ) or are they generally okay with people uploading AI covers of their covers as long as their channel is credited? Because last time I had a bunch of PMJ Brittany Murphy covers they never got takedown at all , I just deleted them from my main YouTube in case the channel turned against me so I wouldn’t lose my main YT content that was non AI.

#

I can’t find any info about this online about PMJs stance on the ai covers and whether they issued takedowns

#

Obviously I’m not famous enough to have dialogue with someone from PMJ to ask for permission to have Brittany covers of their music on my channel.

#

I was going to upload them to an alternative YT channel and not my main.

wet lantern
stark zephyr
#

should i re install?

brittle wing
#

i have a question

#

.......................

#

will someone answer

craggy bough
#

@brittle wing ^

#

oh wait

viral mason
craggy bough
brittle wing
#

I

#

wanna know if w okada would work well with a 3050, 4050 and a 4060 all with 16gb ram

viral mason
#

I could help u set it up

brittle wing
#

i dont have it yet so i just wanted to know

viral mason
#

idk how well it would run tho

brittle wing
#

would it work?

#

what specs do i need for it to work decently

viral mason
#

I'm not sure but I use this gpu and it works well

brittle wing
#

idk much about gpus is that better than what i said

hasty stump
#

Hi, i wanted to use the ai voice changer thing to make a song, i had downloaded it a while back but i unfortunately forgot how and what do i need to install and properly work it out, anyone can help me?

viral mason
#

but again I can help u download it and see if it runs well

craggy bough
viral mason
hasty stump
#

(purely for my purpose not for profit ofc)

viral mason
#

but if you do want the voice changer u can talk to me in dms about it

viral mason
viral mason
hasty stump
viral mason
hasty stump
viral mason
#

@brittle wing u still interested?

low shard
low shard
#

i got 2 strikes in less than half a day once

low shard
low shard
viral mason
#

like me

#

words don't do much if they don't read

little timber
#

so basically im looking for a realtime live voicechanger with a smaller delay and more realistic is there any good alternatives?

tall cedar
#

Can anyone here help me with the rvc voice

#

i tried to download it but got told i had 2 pay

alpine sable
flint anvil
#

ive been experimenting with a lot of different voice models and the realism on some is real hit or miss (robotic sounding, artifacts, etc.), what should i be looking for in voice models in #1175430844685484042? i usually filter by rmvpe, english, and rvc
i already have chunk size at 74ms and extra at 2.7, f0 extractor rmvpe, with force fp32 and a dedicated gpu on deiteris' fork (so i can up the settings but the models themselves just have artifacts and stuff)
so how can i find good models or how can i up the quality of the realtime voice changer? i'm just looking for a normal talking model not singing

crude flame
flint anvil
crude flame
#

f0 method you should either look for rmvpe or fcpe

#

pretty much everyone uses rmvpe so you dont really have to filter for it

sand bison
sand bison
pastel oak
sand bison
sand thunder
sand bison
pastel oak
tall cedar
#

So i got it to work but it dont work in vrchat anyone who can help

low shard
#

!howtoask

patent trellisBOT
# low shard !howtoask

How To Troubleshoot

__**GIVE CONTEXT.**__ 📝
  • Don't simply mention your issue, like "my rvc is not working".
  • Describe the step you are on, what you're trying to do, the RVC you're using, a screenshot, etc.
  • The more context, the better.
__**BE POLITE.**__
  • Don't be desperate. You can ping a Helper, but if they ignore, they aren't available/don't know the answer.
  • It's okay if you're frustrated, but don't take it into this server.
  • Don't DM without prior consent.
__**BE PRODUCTIVE.**__ 🤝
  • Don't ask for every little instruction. Put your own effort & test things by yourself.
  • Don't ask to ask.
  • Check if your answer is a Google search away/on our guides website.
crimson kiln
#

Hey everyone, quick question. Does stability matrix support .onnx models?

stark zephyr
#

@low shard

#

It worked I think.

#

I left it overnight.

timber merlin
#

Can someone please help me with w-okada rvc? I recently upgraded my gpu to amd and i’m stuck on the cuda version, i don’t know how to switch to another version

The reason i want to switch to the other version is because next to the “gpu” processing option it only shows the CPU and overworks it to the point the voice changer stops working

#

I had help set this voice changer up a long time ago so i don’t know much or if the gpu change is the cause for the problems im facing with it, could be that or its an outdated version

winter dew
#

what exactly does it mean to merge voices?

#

i heard you can make it sound more realistic

gilded robin
#

in vonovox is there a way to add more effects? more vst plugins in the same line?
or do i just have to keep using things like reaper for that

silent stratus
#

if my dataset is over 2 hours should I just train from scratch

winter dew
#

also does anyone know why it feels like the outputted sound feels like its talking faster than the original?

#

it feels like im talking too fast after its converted

analog obsidian
#

2 hours is too small for a pretrain from scratch

silent stratus
timber merlin
vapid dirge
timber merlin
#

I do but super like corrupted, late, and laggy

#

And sometimes it comes through sometimes not

#

In the guide it says its not recommended for cpu to do the work so yeah

wet lantern
silent stratus
#

sorry for the questions i havent trained a model close to this big ever

#

or should i go 16

analog obsidian
silent stratus
winter burrow
#

is wokada still the best real time voice changer? its been a minute since ive used it

lethal depot
#

I’ve been trying to find the best place to clone voices I’ve been using Kits.AI and it’s pretty good but all they want is to take your money for things that shouldn’t even be charged for. What is the best way to clone a singer that sounds really good?

#

And I don’t mean Weights I’m sorry but it’s just not good in my opinion

odd isle
soft stratus
#

How do i use Applio RVC to train my voice

latent kettle
soft stratus
#

But the issue is that I can't find where to upload my voice

latent kettle
latent kettle
full drum
#

what do i put here if i wanna use the voice changer on discord ( i already have virtual cable downloaded)

soft stratus
full drum
#

euhh i cant send images but i mean for audio input and output

latent kettle
full drum
#

what about the input and output for the actual voicechanger

full drum
#

its still on default idk which one to change

latent kettle
full drum
#

thanks it works

soft stratus
#

Can it work for android

latent kettle
full drum
latent kettle
soft stratus
latent kettle
#

Real time voice changer

#

You said "can it work for Android " what are you talking about?

#

@soft stratus

soft stratus
latent kettle
soft stratus
#

Google colab

latent kettle
#

Yes it works but check it

#

I mean you directly have to upload your dataset file to a folder named "Dataset"

fair hearth
#

i dont have access to server mentioned in voyage contest

rotund cairn
#

can anyone help I use okada voice changer but my voice lags a bit and mumbles a lot

vapid dirge
#

does anyone actually have good success with w-okada?

#

i feel like i havent seen anyone with it working well

knotty moth
#

!howtoask

patent trellisBOT
# knotty moth !howtoask

vapid dirge
#

im genuinely curious

#

cause ive been trying and it just doesnt sound realistic

knotty moth
vapid dirge
#

although i havent done it completely

#

im a little lost in connecting light host to the voice meter

knotty moth
vapid dirge
#

k ill try those models too

#

i just wanna see someones that works

knotty moth
#

it may depend on what your voice sounds like on the mic

vapid dirge
#

i got a blue yeti is that good?

knotty moth
#

so try several models and find which of them work

#

also don't forget to adjust pitch as needed

vapid dirge
#

i do

timber merlin
# knotty moth it may depend on what kind of your voice on the mic

Can i get some support too? My problem i believe is that im on the cuda (nvidia) version when i should be on another since i upgraded to amd, because i noticed in the processing options for gpu it only shows my cpu and overloads it, i downloaded this link but i dont know what the next steps are, can i get a step by step on what to do after i download it in winrar?

#

(Dm me pls on what to do if a step by step guide is against the rules here, i’m worried i’ll mess up for the fourth time)

simple ore
timber merlin
knotty moth
# timber merlin

definitely wrong version, should get the AMD one as you showed above

timber merlin
#

okay i downloaded that link idk what to do next...

#

and what should i do with the old one?

#

is it fine to keep?

simple ore
#

You can delete the old one.

timber merlin
golden walrus
#

Guys. Can i ask about what is spin 7 12 ? And how can i use it

simple ore
simple ore
timber merlin
#

okay im deleting everything in it

knotty moth
timber merlin
#

well i just deleted the folder so now i got this one which i assume is the correct one?

golden walrus
simple ore
#

it should be 7-12 one

timber merlin
#

(i dont mind sharing my screen in help vc to speed it up the process)

golden walrus
#

And i have no idea if i can pair pretrain from Seoul Streaming Station for spin, the KLM4. Because i got some cracking issues with KLM6

golden walrus
#

i thought they are different

simple ore
golden walrus
#

i have no idea, but it is in the fork somehow

simple ore
#

so I guess it just uses whatever is in rvc/models/embedders folder

golden walrus
#

do i have to download it again ?

#

i mean, it's here

simple ore
#

certutil -hashfile pytorch_model.bin MD5

#

for comparison

golden walrus
#

i don't get this, too advanced for me

simple ore
#

it is a command that calculates the checksum of the file, so you can find which version you've actually got

#

since the names are the same

#

open command line in the spin folder and run it
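
For anyone not on Windows (or who would rather not use certutil), roughly the same check can be done in Python. This is a minimal sketch, assuming the file is called `pytorch_model.bin` like in the command above; `file_md5` is just a helper name made up for this example. It prints an MD5 hex digest that you can compare against the hash of the version you expect.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB pieces so a large model file is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Assumed filename, taken from the certutil command above.
    model_path = Path("pytorch_model.bin")
    if model_path.exists():
        print(file_md5(model_path))
    else:
        print(f"{model_path} not found, run this from the spin folder")
```

Two files with the same name but different digests are different versions, which is the whole point of the comparison.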

golden walrus
#

ahhhhhhhhh

#

so 7-12

#

ahhhhhhhh thank you so much

timber merlin
#

i got the voice changer but why is it a web version? is there a non-browser version?

timber merlin
#

okay ill test it

simple ore
#

if you are running it locally, change to server

#

for direct access to hardware without the browser limitations

timber merlin
#

oh okay

#

oh ok

#

it switched

#

which one of these should i use? the WDM?

#

or mme? idk the difference

golden walrus
#

ah, Noobie sir, do you know if refineGAN can be used in realtime yet ? i read in KLM5 article, and someone said it wasn't suitable for realtime

empty sundial
#

help applio says no api found 😦

golden walrus
#

ah nvm, i will wait for SSS release KLM6v3

simple ore
simple ore
#

the only api there is core.py command line

golden walrus
#

ah

#

klm6 use hifi

simple ore
#

singing is an issue because it renders harmonics too well, so they end up mirroring like crazy

golden walrus
#

like, it is not working well when singing ? because i don't remember any models can sing

stark zephyr
#

How to get the RVC to work in discord and games?

simple ore
golden walrus
lucid stratus
#

hello. I recently got a new gpu and wanted to ask about gaming and w-okada.
With the 5070 ti it sounds quite good without gaming, but when i play games such as hunt showdown it starts to sound a bit worse.

Is the gpu still not good enough or can i adjust settings and it will improve? I don't have that much knowledge about it. (as in reducing ingame graphics and maybe increasing the delay from 0.4sec to 1sec?)

simple ore
golden walrus
#

i knew itttttttt

#

but what if i don't really sing

simple ore
simple ore
golden walrus
#

i just wonder if it can process breathy or scream

#

it's based on my data too right ?

lucid stratus
simple ore
#

but 5070ti is decent

#

unfortunately there's no way to prioritize the gpu resources

timber merlin
#

i got it to work finally!!!

#

thanks for the help everyone 💙

#

though is there a way to increase/decrease chunks?

#

because its locked for me

#

also i cant click on sup1 or sup2

#

oh nvm i just had to turn it off im stupid lol

elder coral
#

does this model have artifact problems

#

-colab

patent trellisBOT
# elder coral -colab

signal crater
#

guys, which tool should i use for below use case?
lets say i have a video of a person (a movie scene)
i want it to say something some custom dialogue
which tool i should use?

low shard
timber merlin
#

1 small issue, i can't mess with the noise suppression

stark zephyr
#

It works in discord now.

low shard
#

also you can optionally use force fp32 mode in advanced settings for better quality at the cost of some delay

low shard
stark zephyr
#

it sounds so bad tho

low shard
timber merlin
stark zephyr
low shard
# timber merlin yes sir!!!

set extra to 2.7
chunk to 200ms

the reason why you cant use noise/echo suppression is because you're in server mode

client = can use noise/echo suppression, but can have more delay and is easier

server = harder to use, less delay and can't use noise/echo suppression

if you need noise/echo suppression, you need to use client or use a 3rd party tool for noise suppression

also, you can optionally turn force fp32 mode on in advanced settings for a bit more delay and a bit more quality

stark zephyr
#

It sounds good enough for me.

low shard
# stark zephyr

set extra to 2.7, chunk to 128

ALWAYS check the triangle when youre changing settings on AMD

#

the triangle will be your life saver for AMD

stark zephyr
#

Which triangle

low shard
stark zephyr
#

oh

#

right

#

i cant make it 128.

low shard
#

hover your mouse over it, it will tell you what to do

low shard
#

like it doesnt matter if u set it to 127 or 129 instead

low shard
stark zephyr
#

Okay.

#

Now im switching back to the gpu.

#

No triangle anymore

#

Is that good?

low shard
#

also, if you click advanced settings, then set force fp 32 mode on, you can get more quality with a bit more delay

low shard
stark zephyr
#

.

#

This sounds way worse?

#

Its like cracking up.

low shard
stark zephyr
#

Okay.

low shard
#

its always better to try many models

timber merlin
low shard
# timber merlin also i cant set it to 200ms only 210.7ms is the closest because i can't just type it by the looks of it

yeah you cant type it unfortunately, but it doesn't really matter dw, just as long as its close and not that you put like 560 instead

low shard
stark zephyr
#

Nah thats it

#

Tysm.

timber merlin
low shard
low shard
timber merlin
#

ahh okay so it'll have probably a larger delay but it wont lag out or be inaudible

low shard
#

else it will lag because you're forcing your gpu to run at a lower delay than it can handle while in game

timber merlin
#

does using quest 3 pcvr mode also affect it?

low shard
#

its suggested to play at the lowest settings possible btw

low shard
#

vr games might be a bit more intensive tho?

timber merlin
timber merlin
#

ahh okay

#

thanks a lot for this info, i was struggling yesterday

#

trying to understand a lot of it

low shard
timber merlin
#

i think that's all ill be messing with it and will try the client side instead of the server

low shard
low shard
timber merlin
#

like i hear myself but at lower volume

timber merlin
#

it didnt change anything

low shard
timber merlin
#

i mean i hear myself its just quieter when i switch to client i dont think its a headphones issue

#

but its fine its not so low its inaudible, its still fine, i just noticed it

#

oh!

#

what fixes it is the out and in (even tho both are 100 in client and server, in client its quieter so it should be increased in my case)

#

all good thank you again for the help that's all

fickle minnow
#

when i download a voice it doesnt sound good at all and its buggin

patent trellisBOT
# hallow thistle !howtoask

hallow thistle
#

Which W-Okada or RVC program are you trying to use? And what is your PC GPU?

hallow thistle
fickle minnow
low shard
#

delete everything you got off video tutorials, they are old

#

the version that namari gave you is the latest wokada deiteris fork, which is the best

#

if you have vb audio cable, uninstall it

fickle minnow
hallow thistle
# fickle minnow why?

VB-Cable gives random issues for Windows users when trying to use it with W-Okada, as its settings are kinda complicated to get working properly.

fickle minnow
hallow thistle
#

There's an alternative to that one. It's Virtual Audio Cable lite, works out of the box. It doesn't always have to be VB-Cable.

fickle minnow
#

alright thanks, i deleted vb cable but it still appears in my output settings

#

might have to restart my pc

fickle minnow
latent kettle
#

@simple ore sorry to bother you, but is it possible for a normal user to train a custom pre-trained model. If yes, please elaborate how.

simple ore
#

find 200+ hours of various audio from different speakers, should be same quality

simple ore
#

make up to 109 files

#

just train from scratch

hallow thistle
#

Pre-trained RVC model? I think it sounds possible, but you would need to gather a lot of audio datasets to make one, more than for a typical normal voice model.

latent kettle
simple ore
#

55h set on 4070 runs ~30min/epoch, you need to get like 300-600 epochs
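
Quick arithmetic on those numbers, as a sketch: at ~30 min per epoch, 300-600 epochs comes to about 150-300 hours of GPU time on that 4070, so roughly 6 to 12 and a half days running nonstop. `total_training_hours` is a hypothetical helper name, and it assumes the time per epoch stays constant for the whole run.

```python
def total_training_hours(minutes_per_epoch, epochs):
    """Total wall-clock hours, assuming every epoch takes the same time."""
    return minutes_per_epoch * epochs / 60


for epochs in (300, 600):
    hours = total_training_hours(30, epochs)
    print(f"{epochs} epochs -> {hours:.0f} h (~{hours / 24:.1f} days)")
```

A slower card or a bigger set scales the per-epoch time, so plug in your own measured minutes per epoch.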

#

pretrain is a base

#

like a blank sheet cake you buy from the store to write 'happy birthday' on it

latent kettle
hallow thistle
latent kettle
#

I got 4060

hallow thistle
latent kettle
#

Maybe possible on kaggle?

simple ore
#

if you pay for something better than 2x 10-year old T4s

#

but there are cheaper options around

idle osprey
#

how to make a model not sound robotic

light latch
#

Does anyone have the Google Colab link to make the covers?

marsh acorn
#

Yo can anyone come in VC support to check if my Overwatch Winston voice changer is working ?

#

just ping me if you can

latent kettle
simple ore
#

Dont think so... a pretrain can help to expand the range of the model a bit, but default hifigan kinda sucks

#

you risk replacing the pretrain data with just your finetune if you over-do the train

#

and a bigger pretrain dataset does not seem all that better than smaller

#

sure, having some singing data and dynamic range helps, but...

humble ermine
#

how do i package my rvc model using a command here ?

viscid moss
#

HF is down btw

analog obsidian
forest vector
#

Im getting problems getting the RTX 5090 to run on deiteris' optimized W-Okada Fork Voice Changer.
I installed both versions, the one for the 50-series doesnt work, and gives "voice not selected" error messages.
tested the 50-series with a 4080M card and getting the same error.

downloaded the non 50-series version and it worked for CPU.

crude flame
forest vector
#

yes, it is selected, made sure to upload a pic on the voicemodel to make sure

crude flame
#

Did you click the voice?

forest vector
#

yes, multiple times

crude flame
#

Did you try switching to a different voice then back to that voice

forest vector
#

im certain it is selected, I know how to work with the non-50 version

forest vector
solar sinew
#

How do I train a model?

golden walrus
#

but gambatteeeeeeeeee

#

i can only wish you the best

golden walrus
#

but we will get there eventually

analog obsidian
golden walrus
#

somehow i can't use KLM 4. or 5. or 6. correctly

#

it keeps losing voice and having lots of cracking

analog obsidian
#

you're trying to infer singing with speech datasets?

golden walrus
#

no lah, i don't have singing, only speech data

#

like, i can't use it for real time for whatever the reason

analog obsidian
#

welp you answer urself why your models cant infer singing

golden walrus
#

i mean, not singing, just talking normally

#

and it has cracking and lose voice mid sentence

analog obsidian
golden walrus
#

1 hour of pure speech

analog obsidian
#

could be your mic

golden walrus
#

but with normal pretrain, i can speak normally

#

same dataset

analog obsidian
#

ahhh

#

ye klm is not that great for speech

golden walrus
analog obsidian
#

imo just stay with the og pretrain

golden walrus
#

but but

#

in the pretrain

#

i heard those samples were pretty neat

analog obsidian
#

it was ok for me but i noticed speech sounds unnatural

#

the og pretrain is great

#

only issue is that it cannot sing because the pretrain was trained with speech

golden walrus
#

i can't sing irl so

#

i don't expect AI to sing well

analog obsidian
#

multi scale mel loss seems to increase vocal range but it made my models not resemble the original voice too much

#

but the true solution would be training a pure singing pretrain, then finetuning singing
(refinegan may be a better choice for this)

#

the spin embedder seems to be better at handling noise so in theory breathing should be better

#

but i havent compared that yet

analog obsidian
crude flame
#

i dont think so

austere hollow
#

how do i unlink my discord from weights

#

i got a new acc

sullen lion
#

yknow what i should ask the general question of what are some recommended settings for training cartoon style loras for usage on non-base illustrious checkpoints

#

im like 85% there but i wanna be closer to 90-95%

#

i see plenty of loras that supposedly accomplish this but trying to follow their training params has been only Okay

#

if i have to change other settings in my genner (i use automatic1111) to get closer i can do that too

#

ideally i wanna get the style to remain when i do gens on wai-nsfw (what a lot of the models i see use to prove style rigidity)

crude flame
#

but i wouldnt know im so new to training loras 😔

#

if someone could actually answer that question that would be very helpful to him and me

#

i mean ig you can also try a lycoris

#

like a locon or a glora

elder coral
#

how do i do this qc

#

the demo audio

austere hollow
#

send it in drive format

#

i did that

knotty moth
sullen lion
#

i share an instance with my friend but he doesnt seem privy to go back to pony

#

trust me if it was up to me id run crying back to pony training on that was a blessing

#

but i KNOW its possible on illustrious

#

i SEE models that do it all the time

gilded robin
#

hi theres an issue with vonovox and flexasio setup i dont know why if anyone could help

Input Device: FlexASIO
Output Device: FlexASIO
Configured Sample Rate: 48000
Error creating audio stream: Error opening Stream: Invalid number of channels [PaErrorCode -9998]
Critical error in start_vc: Error opening Stream: Invalid number of channels [PaErrorCode -9998]
Traceback (most recent call last):
  File "gui\\gui.py", line 1229, in gui.gui.GUI.start_vc
  File "gui\\gui.py", line 1270, in gui.gui.GUI.initialize_voice_conversion
  File "gui\\gui.py", line 1324, in gui.gui.GUI.start_stream
  File "core\\audio\\audio_processors.py", line 797, in core.audio.audio_processors.AudioDeviceManager.create_stream
  File "core\\audio\\audio_processors.py", line 787, in core.audio.audio_processors.AudioDeviceManager.create_stream
  File "C:\Users\legen\Desktop\Vonovox-1.4.5\runtime\Lib\site-packages\sounddevice.py", line 1825, in __init__
    _StreamBase.__init__(self, kind='duplex', wrap_callback='array',
  File "C:\Users\legen\Desktop\Vonovox-1.4.5\runtime\Lib\site-packages\sounddevice.py", line 909, in __init__
    _check(_lib.Pa_OpenStream(self._ptr, iparameters, oparameters,
  File "C:\Users\legen\Desktop\Vonovox-1.4.5\runtime\Lib\site-packages\sounddevice.py", line 2796, in _check
    raise PortAudioError(errormsg, err)
sounddevice.PortAudioError: Error opening Stream: Invalid number of channels [PaErrorCode -9998]
#

shows up in the cmd
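
For context on the traceback above: PaErrorCode -9998 is PortAudio's "invalid channel count" code, so one thing worth ruling out is a request for more input or output channels than the device reports. Below is a minimal sketch of that check, assuming a device-info dict shaped like the one `sounddevice.query_devices()` returns (with `max_input_channels` and `max_output_channels` keys). The `pick_channels` helper and the sample values are hypothetical and not taken from Vonovox.

```python
def pick_channels(device_info, want_in=1, want_out=2):
    """Clamp requested channel counts to what the device reports.

    device_info is assumed to look like one entry from
    sounddevice.query_devices(): a dict with 'max_input_channels'
    and 'max_output_channels'.
    """
    max_in = device_info["max_input_channels"]
    max_out = device_info["max_output_channels"]
    if max_in < 1 or max_out < 1:
        # A duplex stream needs at least one channel in each direction.
        raise ValueError("device cannot be used for duplex audio")
    return min(want_in, max_in), min(want_out, max_out)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    fake_device = {"max_input_channels": 1, "max_output_channels": 2}
    print(pick_channels(fake_device, want_in=2, want_out=2))
```

If the clamped values differ from what the app is configured to request, that mismatch is a plausible suspect, though it may not be the only cause.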

elder coral
sand bison
#

So, are the only two ways to create AI models with Applio and Kaggle? I used a Colab that said it was called RVC v2. Disconnected - Colab. Sorry for the inconvenience.

stark zephyr
#

hi how do i fix this

#

I clicked enter and it doesnt pull out the RVC client on my chrome

unreal linden
#

How many epochs should i train for if i have 15 min of voice with clear sound?

peak path
#
  1. should i use mono or a stereo wav file for my dataset files?
  2. is it possible to use flac instead of wav for dataset files? which one is better?
analog obsidian
pastel oak
pastel oak
#

Client uses MME, an audio API, by default, which is extremely outdated
Server lets you choose which audio API to use. You can use MME there too, but there's obviously no point when WASAPI is newer. WASAPI has less delay (faster audio processing), to name one benefit

pastel oak
#

Applio is not a category like colab and kaggle, you would use applio inside kaggle you get me

stark zephyr
unreal linden
pastel oak
#

?

#

Youre describing something different

#

One has to be ticked by default

gilded robin
#

but it works on w-okada just fine just not on vonovox, is there a way to contact who made vonovox?

knotty moth
#

adjust chunk according to the gpu capability

gilded robin
#

whats your extra & chunk & gpu?

pastel oak
#

Gpu goes idle maybe. Try lowering chunk a little bit more like 150, if it doesnt fix, then run the "force gpu clocks.bat" file outside of the mmvc folder

pastel oak
#

For everything

small jacinth
#

hey yall
do any of you use AI in your projects?
i want to integrate AI into my project
but i dont know how to do it for free

pastel oak
#

48k is fine

#

Higher frequencies getting picked up

pastel oak
#

Which doesnt matter cause wokada downscales to 32k anyway

#

But your mic inputs everything into wokada to its fullest range this way

pastel oak
#

Means your mic picks up your headphones output and loops it
Move mic further away from your headphones, lower volume on your headphones, move In. Sens. further to the right

broken urchin
#

hello

#

when running deiteris fork voice changer i was wondering if i can change the name of the run file from MMVCServerSIO to anything else and the voice changer folder name to something else

#

if i change the name of these files will the voice changer still be able to run?

paper locust
#

not sure if heres where i should ask, but im having this issue with the voice changer, when i use it i have it set so it inputs my mic, outputs VAC, and then in discord inputs VAC and outputs my headset, what happens is when other people talk loud enough, the voice changer for some reason picks up their voices slightly and then replays it, but my mic normally doesnt do that so idk what im doing wrong

paper locust
#

hm, my headphones do that but they are old so the system or something for it probably broke

#

arctis 7p+

#

the weird issue is that ive done things like this before but ive never had this specific issue happen, my mic normally doesnt pick up noises from my computer, but for some reason the program does

knotty moth
#

if you have someone irl, tell them to be quiet

paper locust
#

and i cant make it audio loop on itself by screaming loud either

#

which is the weirdest part

paper locust
#

so im forced to use standard

#

so i guess i can use noise suppression on the voice changer itself alongside discord standard noise suppression

pastel oak
shrewd verge
#

does anyone know how to create your own voices for the VCClient? there's probably a whole load of tutorials out there im just looking for a point in the right direction

#

i have a sample for the voice all ready to go i just dont know how to use it lol

latent kettle
shrewd verge
#

much appreciated

shy spruce
median monolith
#

when making a voice model of a character that doesnt have much dialogue (for example: a character which only has 1 minute of unique dialogue in total or maybe even only a 1 second audio clip), how long should the dataset recommendably/ideally be? should it be only the unique instance(s) of voice/dialogue or should it be an audio of said dialogue/voice repeated until you have an audio file with certain duration? if the second case, I wonder what duration should be enough.

analog obsidian
cosmic vigil
#

#ask Index Rate what for ?

analog obsidian
#

values higher than 0 will begin to blend the model's accent with your accent

cosmic vigil
#

finally the best simple answer that i can understand with lower iq, thank you 🙏

#

i think for ideal real time voice on games, using higher index will be better 🤭
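
The index rate answer above can be pictured as a simple linear mix. This is only a sketch, assuming the setting interpolates between features taken from your own input and the nearest features looked up in the model's index. The `blend_features` name and the toy numbers are hypothetical, and real implementations work on large feature arrays rather than short lists.

```python
def blend_features(input_feats, retrieved_feats, index_rate):
    """Linearly mix input features with features retrieved from an index.

    index_rate = 0.0 keeps the input features unchanged;
    index_rate = 1.0 uses only the retrieved (model-side) features.
    """
    if not 0.0 <= index_rate <= 1.0:
        raise ValueError("index_rate must be between 0 and 1")
    if len(input_feats) != len(retrieved_feats):
        raise ValueError("feature lists must have the same length")
    return [
        index_rate * r + (1.0 - index_rate) * x
        for x, r in zip(input_feats, retrieved_feats)
    ]


if __name__ == "__main__":
    # Toy values for illustration only.
    print(blend_features([1.0, 2.0], [3.0, 4.0], 0.5))
```

Under this assumption, higher values pull the output further toward what the model learned from its dataset and further away from the speaker's own pronunciation.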

median monolith
# analog obsidian 1. 10 minutes minimum 2. no, don't repeat the audio, its going to give very bad ...

well, in that case, the character im trying to make a model of only has like 1 minute of usable unique dialogue... Im not expecting it to be a SpongeBob or Michael Jackson quality type of model, but still, I wanted to know, would it then just be "better" for me to use only this 1 minute long audio of him "singing" than repeat it like 5 or 10 times? (to at least make it less mediocre) its what I understood.

analog obsidian
#

the model requires actual 10 minutes of diverse data

#

not the same thing over and over again

#

technically you can train anything, even something below 10 mins

#

but don't expect good results
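
To see where a dataset stands against the 10 minute figure mentioned above, the total length of the WAV files can be added up with the standard library. A minimal sketch, assuming the dataset is a folder of uncompressed `.wav` files; the function names and the `my_dataset` folder are made up for illustration, and this says nothing about how diverse the audio is.

```python
import wave
from pathlib import Path


def wav_seconds(path):
    """Return the duration of one WAV file in seconds."""
    with wave.open(str(path), "rb") as w:
        return w.getnframes() / w.getframerate()


def dataset_minutes(folder):
    """Sum the duration of every .wav file directly inside folder."""
    total = sum(wav_seconds(p) for p in sorted(Path(folder).glob("*.wav")))
    return total / 60.0


def meets_minimum(folder, minimum_minutes=10.0):
    """True if the folder holds at least minimum_minutes of audio."""
    return dataset_minutes(folder) >= minimum_minutes


if __name__ == "__main__":
    folder = "my_dataset"  # hypothetical folder name
    if Path(folder).is_dir():
        print(f"{dataset_minutes(folder):.1f} min, enough: {meets_minimum(folder)}")
```

Note that repeating the same clip would raise this number without adding any new information, which is the point made in the messages above.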

median monolith
median monolith
analog obsidian
#

sounds interesting, what if you try it?
im currently training my hifi pretrain, breaths are still robotic :<

median monolith
#

now im thinking about the same thing, but instead of different volume, it would be different pitch and "time".

analog obsidian
#

gpt told me about different volumes, pitch and time being viable as data augmentation

#

i dont think there have been tests of data augmentation here

#

adding more breaths technically may also improve them

crude flame
#

hifigan needs natural audio right? If so then thats why we never used data augmentation

analog obsidian
crude flame
#

according to gpt it may perform poorly

analog obsidian
crude flame
analog obsidian
#

i think for this to work the augmented data should be shorter than the original data

#

like adding 5 minutes of augmented data to a 10 min set?

#

but I'm not sure, someone should try it

crude flame
#

i can try it rq

analog obsidian
#

try what noobies said
duplicate the dataset but with lower volume

#

i can try after this pretrain learns how to reproduce breaths yt_nails

crude flame
#

how much volume lowering do i do?

analog obsidian
#

good question misc_trolley emoji_40

#

-3db?

crude flame
#

ill try -3, -6, and -10
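
For reference, the dB values being discussed map to plain amplitude multipliers through gain = 10^(dB/20). A small sketch of that conversion follows; the `db_to_gain` and `apply_gain` helpers are illustrative names, and samples are assumed to be floats in the -1.0 to 1.0 range.

```python
def db_to_gain(db):
    """Convert a decibel change into a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def apply_gain(samples, db):
    """Scale a list of float samples by the given dB amount."""
    gain = db_to_gain(db)
    return [s * gain for s in samples]


if __name__ == "__main__":
    for db in (-3, -6, -10):
        print(f"{db} dB -> x{db_to_gain(db):.3f}")
```

So -3 dB is roughly 0.71 of the original amplitude, -6 dB roughly half, and -10 dB roughly 0.32. Whether any of these helps as augmentation is still untested here.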

wheat egret
#

i just joined to try a specific ai voice made by MartinFLL and i wanted to know how can i use it? do i have to host an ai on a local device?

crude flame
#

i love my 5060 ti

analog obsidian
wheat egret
wheat egret
#

i don't think there is TTS

crude flame
crude flame
#

thats a rvc model you cant use it for tts

wheat egret
#

alright

crude flame
#

unless you make some audio with a diff tts then infer with rvc

wheat egret
#

oh alright

#

i see

wheat egret
crude flame
wheat egret
#

thank you

crude flame
#

^

wheat egret
crude flame
wheat egret
#

ok

#

thank you for your attention

gilded robin
# shy spruce there is a channel setting in flexasio when you set the devices. make sure both ...

ive genuinely tried everything idk, (1,2) (0,1) (0,0) (1,1) (2,2) input output channels on vonovox & flexasio, i checked the sound settings to make sure what amt of channels they have aswell,
it worked for the first few times on 0 channels for flexasio & vonovox but just stops randomly, it works like genuinely 1/10 times without reason.
if it helps when i tried to re-open setup.bat it worked for 1 single time but then stopped right after i clicked stop and start again

brittle wing
#

-colab

patent trellisBOT
# brittle wing -colab
📒 Google Colab Notebooks

Google Colab is a Cloud (Remote Good PC) Service. While the Free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.

• **Applio**

by IA Hispano
Google Colab

• **RVC Mainline**

by Hina
Google Colab

• **UVR5 NO UI**

by Eddy
Google Colab

• **UVR5 UI**

by Eddy
Google Colab

• **Wokada Deiteris Fork**

by Deiteris & Hina
Google Colab

• **Hina's Modified Original Wokada**
• **RVC-AI-Cover-Maker-WebUI**

by Shiro & Eddy
Google Colab

• **FaceFusion UI**

by Nick088
Google Colab

• **FaceFusion NO UI**

by Nick088
Google Colab

• **Music Source Separation Training (Inference)**

by Jarredou & Makidanye
Google Colab

crude flame
#

weird thing is that they all had different step counts

#

idk why

#

i changed nothing

analog obsidian
#

@crude flame try this

vagrant flare
#

Hey does anyone know a really realistic image generator?

#

I heard about midjourney but i want to be sure it will be really realistic before i spend money

median monolith
#

Thats strange..., I did everything the guide for Applio Colab (cloud) said, but when I reach the part where I have to press the "start training" button, it doesn't seem to work for some reason.

crude flame
#

i dont really got any audio with much volume difference

lean rover
#

Hi, could someone help me solve this error?
Looks like there's a bit of a problem.
unknwon message

If you clear the information being managed by this app, it may be recoverable.

Initialize

Reload without initialize

Error
unhandledrejection
no error stack
Error: Could not load Voice Focus estimator.
Error: Could not load Voice Focus estimator.
at http://127.0.0.1:18888/index.js:2:1042547
at Generator.throw (<anonymous>)
at s (http://127.0.0.1:18888/index.js:2:1039349)

lean rover
#

and how do I do that?

#

Yesterday when I used it everything was normal, and today it won't let me open the application.

wooden girder
#

Does anyone know why when I change protocol to rest my pref stays 0 but if it's on sio I can see it jump and change accordingly when speaking for wokada?

gilded robin
#

i have this really good model that i like a lot but i dont like the voice that much, is there a way to fix it? merging?
is there a guide to merging?

#

or a way to make it softer?

fluid lion
#

what are some of the most convincing/best sounding voice models out right now?

fluid lion
#

RVC

#

unless there a better software

fluid lion
gilded robin
fluid lion
#

if it sounds the most "realistic" then sure, im trying to see how good it sounds now, i tested it a year ago or so

#

like live voice to voice

gilded robin
#

there are always more and more settings to finetune if you want it as realistic as possible

#

such as EQ, bitcrush etc

#

a year ago or so also wasnt rmvpe instead it was crepe iirc?

fluid lion
#

i dont remember

#

i am using RVC

gilded robin
#

dm

peak shadow
#

For some reason, RVC in Google Colab does not generate the necessary files for saving AI voice.

peak path
wheat egret
#

how can i use these two .INDEX files for RVC?

#

they only have the "added" file

peak path
wheat egret
#

ok

severe hawk
#

I'm using Kaggle with Applio and I have finished training my thing, but when I press "Restart Applio" it continues training (I can't stop it no matter what I press) and there is no added_ file. How do I fix this?

severe hawk
#

great, now my ngrok ended

#

smh

#

ima try a different method then

#

I was seeing that response to people having the same problem as me, but I couldn't find the Generate Index button for the life of me...

#

Ohhhh, that one

#

alright, thank you so much

hallow thistle
latent kettle
#

How to add mel reformer models in UVR GUI

scenic atlas
#

hey when i try to write something to chat gpt he just doesnt answer me or gives me an error does any one know how i can fix this ?

toxic ingot
low shard
#

the servers are down

#

either wait or use something else like gemini

scenic atlas
low shard
low shard
#

RVC = Retrieval-based-Voice-Conversion, the best Few Shots Speech To Speech AI Models (on v2), Inferences (use models) pre-recorded audio (ai covers) and train (make) models. Technically, Mainline RVC does have a go-realtime.bat (aka RVC-GUI), but it's pretty messy and outdated so it's extremely not suggested for realtime.

Wokada = uses RVC for realtime inference. There's 2 main versions, Original made by Wok, and the most suggested one is Deiteris Fork (modified version)

#

what's ur pc gpu? what do u want to do?

plain orchid
#

can please someone help me with image generation either chatgpt or midjourney

#

i want to use a certain style but can't figure out how to make it match

frigid stone
#

I got Applio from the compiled Windows ApplioV3.2.9.zip on HuggingFace but the run-tensorboard.bat isn't working
It throws the error ImportError: cannot import name 'notf' from 'tensorboard.compat' (D:\ApplioV3.2.9\env\lib\site-packages\tensorboard\compat\__init__.py)

#

I might've fixed it by moving to the C: drive, testing by moving it back to my D: drive

steady thorn
#

hello
i need some help
im looking to create an ai influencer ( im trying and im overwhelmed ) with comfyui, can someone help me with a workflow that could run on my laptop? low vram ( 6gb vram ) rtx 3060 laptop
text to image workflow + lora stacker

frigid stone
#

Welp, Tensorboard is now open
But it still throws those errors

#

I guess they weren't important

#

There're 5 or 6 of these in a row

#

This is the second time unzipping it, not sure how I can unzip it more properly

#

11

prisma kettle
#

Got this error in Replay and went to the website, is there any way to use the ones it suggests instead?

#

Oh boy

#

What does this mean? I'm on a 5070 Ti

#

...suffering from success man, wtf

frigid stone
#

Bizarre, not sure what I can change

prisma kettle
#

Ooooooh yeah that's right

#

I never installed it back in April, thx for the reminder

#

I forget what I was trying to even do

#

Well now is not a good time for me to install torch but back in April that's what you said to do lol

#

Oh it was UVR that I was trying to get to work

#

Is there a torch installation guide handy

#

This is what you said back in April but I'm a dummy and don't want to mess up an install

#

I can just run that in cmd?

#

Don't think I have any old versions of torch since I never installed the nightly build

#

Sweet thx

#

Wdym by activate sorry

#

I am a noob

#

The rest of it meaning the other lines above?

#

Alright thank you

#

I'll start with this

frigid stone
#

Re-extracted Applio again for the third time, didn't change anything
Not sure how to debug this

#

Going to try with 3.2.8-bugfix

prisma kettle
#

Restarting PC lol, had to install Python

frigid stone
#

The website does open, I assume it works but I haven't checked if a graph shows up yet

#

3.2.8-bugfix has no error messages

prisma kettle
#

Ok python installed and restarted PC

#

Nooo I'm still getting this error.

#

Now "The system cannot find the path specified."

#

I am so out of my depth sorry

#

Will this also make it work for Replay? Both are broken rn bc I don't have torch

#

What is that sorry

#

What link are you looking for

#

this is cool

#

Oh this is the error I got earlier

#

Idk I'm just going with what weights gives me

#

the weights one still gets updates

#

Anyway, I'm in the UVR folder in cmd now, so you're saying I should install torch there?

#

My folder is just called Ultimate Vocal Remover, unless I'm in the wrong one

#

That's weird, I just installed it and rebooted

#

I'm in the AppData/Local/Programs/Ultimate Vocal Remover folder, but it seems like that's not where I should be

#

I did

#

I'm computer noob I use exes

#

it's what I do

#

Oh maybe that is what i did

#

I thought I did that actually but it's been a bit

#

Nah I ran the exe

#

Why can't I use the exe

#

Alr

#

Thanks

#

I do have a torch folder in this Ultimate Vocal Remover folder

#

this is all that's in there tho

#

Can I install torch for Replay?

#

Hmm

#

I just got it from their website and then here

#

That's the version that the official website links to 🤷

#

Thank you

#

Did Anjok abandon UVR

#

Thanks for helping me sorry I don't know what I am doing at all

#

I'm installing it through the bat now

#

The website said to run the bat

#

It's already going

#

Oh

#

Shit

#

Lol

#

I moved too fast

#

I'm dumb

#

Idk if I should just let this play out or what

#

Idk how to uninstall any of what I just ran

#

Ok

#

Unzipped into a C: drive folder

#

Don't think I have ffmpeg on this pc so I will install

#

But I can't rn

#

thank you for bearing with me

#

I have to get back to work but I’ll send you a screenshot, it just may not be until later or tomorrow. Can I ping you when I’ve got it

median monolith
#

(Trying to train a voice model on Applio colab) I genuinely do not know what im doing wrong. yes, im not using the "GPU" thing since its time limit has ended T-T, and the video quality is dogshit because I didnt want the file size to be heavy.

median monolith
#

i may be wrong, but apparently the problem is with the audio itself somehow? since the log says "no wav file found" and "not enough data present in the training set", but idk how, it has more than a second of duration, its put correctly in my drive folder, and put its path in the dataset path box. the log also states an error regarding an "attempting to register factory for plugin" thing. not sure if this matters or not, but still pointing it out.

median monolith
latent kettle
median monolith
latent kettle
#

thats all you can get ?

median monolith
latent kettle
median monolith
#

and repeating this 1 minute source audio until i have a 10 minute one doesnt seem to be a good idea

latent kettle
#

just reduce batch size to 2

#

also try to train on kaggle. kaggle provides you 28 hours of runtime per week

#

you can either utilize it in one day or 2 days or in a week

median monolith
median monolith
latent kettle
#

everything including g and d checkpoints logs eventfiles etc

latent kettle
median monolith
median monolith
latent kettle
latent kettle
median monolith
paper bloom
#

hey is there a way to have a better accent on wokada

#

like sometimes it wont pronounce the L

frigid stone
#

In the codename fork for Applio, there's no g/total graph on tensorboard, there's only a generator_total graph
These are the same thing, right?

median monolith
#

thats strange, theres no accelerator option to change

frigid stone
#

Thanks!

light latch
#

Does anyone have the Google Colab link to make the ai covers?

median monolith
#

do I need to say I did everything the docs said and I still get stuck :´)

median monolith
#

im pretty sure I did but alright, m a y b e

#

oh wait... I was supposed to replace the literal word "token" on the thing with mine... now I get it, thanks for help :>

median monolith
flint anvil
#

ok chat i want to start making models cuz i just cant find rly good models with realistic qualities, where do i start? like how do i set everything up (im familiar with programming just tell me what to download and start with if you can)
assuming i can train locally which i will likely do

next mist
#

any idea when RVC 3.0 will be released?

crude flame
dark ginkgo
#

I cant seem to use my flux on forge ui since upgrading to my rtx 5060ti

modern surge
#

-colab


opaque sparrow
#

How can I train using my own voice recordings to always output some other voice model? like this one https://www.weights.com/models/clroz1aic012sjmfug54yft0u
I have a 9800x3d and an rx 7900 xtx
if I train it locally how long would it take to get a really good model for Deiteris' W Okada Fork?

opaque sparrow
#

what's confusing about it?

#

is my esl english confusing? I want to train a voice model on my voice to the target model, without messing with the tune, index, pitch in the UI, or having to speak at a certain volume. If it's not possible I don't know cat_wtf

opaque sparrow
#

Ahhh, I see, it's never explained anywhere: RVC does not need to be trained on your voice specifically, it turns any audio input to the target voice anime_shrug

#

Grok:
While not required, there are scenarios where including your voice in training could be considered:
Custom Fine-Tuning (Optional): If the model doesn’t sound natural with your voice (e.g., due to extreme pitch differences), you could fine-tune the model by adding a small dataset of your voice to improve compatibility. This involves:
Recording 5–10 minutes of your voice.

Fine-tuning the existing model using RVC-WebUI with your voice data to adjust the model’s mapping for your specific vocal range.

This is advanced and rarely needed for general use.

Improving Robustness: If you have a unique accent or speech pattern, fine-tuning with your voice can help the model handle your input better, but this is typically unnecessary for pre-trained models designed for broad compatibility.

jovial kraken
#

How can I run rvc in kaggle?

latent kettle
#

@simple ore Is it over fitting? I mean it increased in last 100 epochs. The blue circle represents 200 epochs and the final point is 300 epochs. There is no loss

latent kettle
#

How to look at them?

#

More loss = good quality?

#

Okay. But how to analyze it ?

#

I see

#

I stopped training

#

Mb

#

The dataset was very poor and I didn't expect good results. So ya, that's fine

runic kraken
#

what setting refers to making voice more understandable?

dry sable
#

Installing pre-dependencies...
ERROR: Could not find a version that satisfies the requirement faiss-gpu (from versions: none)
ERROR: No matching distribution found for faiss-gpu
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 261.0/261.0 kB 7.1 MB/s eta 0:00:00
Preparing metadata (pyproject.toml) ... done
Building wheel for pyworld (pyproject.toml) ... done
Installing dependencies from requirements.txt...
ERROR: Ignored the following versions that require a different python version: 1.21.2 Requires-Python >=3.7,<3.11; 1.21.3 Requires-Python >=3.7,<3.11; 1.21.4 Requires-Python >=3.7,<3.11; 1.21.5 Requires-Python >=3.7,<3.11; 1.21.6 Requires-Python >=3.7,<3.11
ERROR: Could not find a version that satisfies the requirement onnxruntime-gpu==1.13.1 (from versions: 1.15.0, 1.15.1, 1.16.0, 1.16.1, 1.16.2, 1.16.3, 1.17.0, 1.17.1, 1.18.0, 1.18.1, 1.19.0, 1.19.2, 1.20.0, 1.20.1, 1.20.2, 1.21.0, 1.21.1, 1.22.0)
ERROR: No matching distribution found for onnxruntime-gpu==1.13.1
Successfully installed all packages!

#

is there any way to fix it?
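
One readable clue in the log above is the `Requires-Python >=3.7,<3.11` note, which suggests the pinned packages were built for an older interpreter than the one the notebook is running. A minimal sketch of checking the running interpreter against that range follows; the `satisfies` helper is a made-up name, and the bounds are copied from the log lines rather than from any project documentation.

```python
import sys


def satisfies(version, minimum=(3, 7), below=(3, 11)):
    """True if (major, minor) of version is >= minimum and < below."""
    return minimum <= tuple(version[:2]) < below


if __name__ == "__main__":
    current = sys.version_info
    print(f"Python {current.major}.{current.minor}: "
          f"in range for the pinned packages = {satisfies(current)}")
```

If this prints False, the install step probably needs either an older Python or newer package pins. The final "Successfully installed all packages!" line appears to be printed regardless of the earlier errors.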

hot lagoon
#

Why are we doing loss_avg_50 charts now rather than just g/total?
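
On the question above, one likely reason is that an averaged curve is easier to read than a raw per-step loss. A sketch of the idea, assuming `loss_avg_50` means a running average over the most recent 50 logged values (the name suggests this, but it is not confirmed here); the `moving_average` helper is illustrative only.

```python
from collections import deque


def moving_average(values, window=50):
    """Running mean over the last `window` values (fewer at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    total = 0.0
    averaged = []
    for v in values:
        if len(recent) == window:
            # The oldest value is about to be pushed out of the window.
            total -= recent[0]
        recent.append(v)
        total += v
        averaged.append(total / len(recent))
    return averaged


if __name__ == "__main__":
    noisy = [5.0, 3.0, 6.0, 2.0, 5.5, 2.5]
    print(moving_average(noisy, window=3))
```

The smoothed series hides step-to-step noise, so a sustained rise or fall over many epochs stands out more clearly than it does in the raw total.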

dark ginkgo
#

what command should I use? Every time, in the past and now, anything involving the command line for python fails to install

#

command I tried to run

#

also should I update my pip?

#

I have cuda 12.9, what version of Cuda do I need then?

#

or do I need to somehow update the ai softwares? Wasnt they supposed to auto update?

#

Before I do this, can anyone verify if this would work?

#

google gemini seems to want me to install the 12.1 version instead of 12.8 since it believes it's forward compatible apparently and stable. Like it recommend to not use the nightly build due to "stability issues"

#

but I did get this after pushing it for cu128

prisma kettle
#

@simple ore Here's the inside of the Replay folder in AppData

granite python
#

guys,i have [Voice Changer] Pipeline is not initialized.
how do i fix that

brittle wing
#

could someone help me with setting recommendations? for the clearest least robotic sound?

viral mason
brittle wing
#

can i send u a screenshot of my current settings?

viral mason
#

of course! my dms are open

dark ginkgo
#

still not working

#

I updated to the one you said

pine zealot
#

hi guys, can anyone tell me where can i upload the voice models i downloaded?

#

i have no idea how i use these models

jaunty shale
#

I use Kaggle mainline and in logs I accidentally removed the folder named "mute"

should I be worried?

latent kettle
pine zealot
#

sorry i mean if theres any app or something i should download

#

i have the models but i dont have any app or program to run it

naive furnace
#

Hi, is there any AI RVC that works with 5080/5090? Tried installing Applio, OpenVoice, codename-rvc, spent half of the day trying to bypass python compatibility with those gpu's and nothing, it doesnt work.

dark ginkgo
#

so how would I do this? like in the folder where forge is?

hearty viper
#

hey, is there a good voice changer i could use without a good gpu?

#

colab gives me an error when i try upload a voice model

dark ginkgo
#

yeah

#

just type run pip install?

#

or type what you had earlier

#

already done that before

#

I did that then did what you told me to earlier

#

it installs for system wide

#

so all the other ai software works now but not forge and not comfyui

#

is there not an updated version of forge? if automatic1111 is working, not forge or comfyui,

#

flux work with it now?

#

wait how? I got forge because at the time, automatic 1111 was incompatible